CudaVision Lab - Summer 2025

Welcome to the documentation for the deep learning course assignments and project from the CudaVision Lab, Summer 2025, at the University of Bonn.

Overview

This repository contains implementations and experiments from a deep learning course covering fundamental neural network architectures, computer vision, generative models, and transformer-based approaches. The course progresses from basic neural networks to transformer architectures for video understanding.

Course Structure

The course is organized into seven assignments and one course project, each building on the concepts introduced before it:

Assignments 1-2: Fundamentals

Assignment 1: Neural Network Fundamentals - Building and training basic MLPs and CNNs from scratch
Assignment 2: Transfer Learning and Fine-tuning - Leveraging pre-trained models for custom tasks

Assignments 3-4: Sequential and Generative Models

Assignment 3: Recurrent Neural Networks - Implementing RNNs from scratch for action recognition
Assignment 4: Variational Autoencoders - Generative models for image reconstruction and generation

Assignments 5-6: Advanced Generative and Self-Supervised Learning

Assignment 5: Generative Adversarial Networks - Implementing DCGAN and CDCGAN
Assignment 6: Self-Supervised Learning - Contrastive learning with Siamese networks and SimCLR

Assignment 7: Vision Transformers

Assignment 7: Vision Transformers for Action Recognition - Applying transformer architectures to video understanding

Course Project: Video Prediction

Course Project: Video Prediction with Object Representations - Advanced transformer-based video prediction using holistic and object-centric representations

Quick Start

Installation

Clone the repository:

git clone https://github.com/Cuda-Vision-Lab/CudaVisionSS2025.git
cd CudaVisionSS2025

Install dependencies:
```
pip install -r requirements.txt
```
Navigate to a specific assignment: each assignment is self-contained in its own directory under src/.
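To see which assignment directories are present after cloning, a small helper like the one below can list them. This is only a sketch: it assumes the layout described above (one subdirectory per assignment under src/), and the function name `list_assignment_dirs` is hypothetical, not part of the repository.

```python
from pathlib import Path


def list_assignment_dirs(repo_root="."):
    """Return the sorted names of the subdirectories under <repo_root>/src.

    Hypothetical helper: it makes no assumption about how the assignment
    directories are named and simply reports what it finds.
    """
    src = Path(repo_root) / "src"
    if not src.is_dir():
        # Not run from the repository root, or the layout differs.
        return []
    return sorted(p.name for p in src.iterdir() if p.is_dir())


if __name__ == "__main__":
    for name in list_assignment_dirs():
        print(name)
```

Run it from the repository root; an empty result suggests the script was started from a different directory.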

Requirements

Python 3.8+
PyTorch 2.0+
CUDA-capable GPU (recommended)
Jupyter Notebook
See requirements.txt for the complete dependency list
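Before opening the notebooks, it can help to confirm that the environment matches the requirements listed above. The following is a minimal sketch rather than a script shipped with the repository: the function name `check_environment` is an assumption, and it only reports the installed PyTorch version instead of enforcing the 2.0+ minimum.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 8)):
    """Return a dict describing whether the stated requirements appear to be met."""
    report = {
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        # Imported lazily so the check still runs when PyTorch is missing.
        import torch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A GPU is listed as recommended rather than required, so `cuda_available: False` is a warning about training speed, not an error.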

Technical Stack

The projects use a consistent technical stack across all assignments:

Framework: PyTorch
Visualization: TensorBoard, Matplotlib, Seaborn
Data Processing: NumPy, PIL, torchvision
Evaluation: scikit-learn metrics, custom evaluation scripts
Documentation: MkDocs with Material theme

Documentation

For detailed information about each assignment, please refer to the Repository Structure page, which provides links to comprehensive README files for each assignment and the course project.

Each assignment includes:

Complete implementation details
Training configurations
Experiment results
Visualization examples
Code usage instructions

Contributing

Contributions are welcome! If you have ideas to improve this project, find a bug, or want to add new features:

Open an issue to discuss your suggestions or report problems
Fork the repository and submit a pull request with your changes
Please follow best coding practices and include relevant tests and documentation

Support

If you found this project helpful, you can support the work by:

This repository represents a comprehensive journey through modern deep learning, from basic neural networks to advanced transformer architectures for video understanding.