CudaVision Lab - Summer 2025
Welcome to the documentation for the Deep Learning course assignments and course project from the CudaVision Lab, Summer 2025, at the University of Bonn.
Overview
This repository contains implementations and experiments from a comprehensive deep learning course, covering fundamental neural network architectures, computer vision, generative models, and advanced transformer-based approaches. The course progresses from basic neural networks to state-of-the-art transformer architectures for video understanding.
Course Structure
The course is organized into 7 assignments and 1 course project, each building upon previous concepts:
Assignments 1-2: Fundamentals
- Assignment 1: Neural Network Fundamentals - Building and training basic MLPs and CNNs from scratch
- Assignment 2: Transfer Learning and Fine-tuning - Leveraging pre-trained models for custom tasks
Assignments 3-4: Sequential and Generative Models
- Assignment 3: Recurrent Neural Networks - Implementing RNNs from scratch for action recognition
- Assignment 4: Variational Autoencoders - Generative models for image reconstruction and generation
Assignments 5-6: Advanced Generative and Self-Supervised Learning
- Assignment 5: Generative Adversarial Networks - Implementing DCGAN and CDCGAN
- Assignment 6: Self-Supervised Learning - Contrastive learning with Siamese networks and SimCLR
Assignment 7: Vision Transformers
- Assignment 7: Vision Transformers for Action Recognition - Applying transformer architectures to video understanding
Course Project: Video Prediction
- Course Project: Video Prediction with Object Representations - Advanced transformer-based video prediction using holistic and object-centric representations
Quick Start
Installation
- Clone the repository.
- Install dependencies (listed in requirements.txt).
- Navigate to specific assignments: each assignment is self-contained in its respective directory under src/.
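Since each assignment lives in its own directory under src/, a quick way to see what is available is to list those directories. The following is a minimal sketch, assuming it is run from the repository root; the helper name `list_assignment_dirs` is hypothetical and not part of the repository.

```python
from pathlib import Path


def list_assignment_dirs(src_root="src"):
    """Return the sorted names of the immediate subdirectories of src_root.

    Hypothetical helper for illustration; the actual directory names
    depend on the repository layout.
    """
    root = Path(src_root)
    if not root.is_dir():
        return []  # not in the repository root, or the layout differs
    return sorted(p.name for p in root.iterdir() if p.is_dir())


if __name__ == "__main__":
    for name in list_assignment_dirs():
        print(name)
```

An empty result most likely means the script was started outside the repository root.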
Requirements
- Python 3.8+
- PyTorch 2.0+
- CUDA-capable GPU (recommended)
- Jupyter Notebook
- See requirements.txt for the complete dependency list
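Before opening the notebooks, it can help to confirm that the environment meets the requirements listed above. The following is a minimal sketch, not a script shipped with the repository: the function name `check_environment` and the keys of its report are assumptions made for illustration. It checks the Python version, whether PyTorch is installed with major version 2 or higher, and whether a CUDA device is visible.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 8), min_torch_major=2):
    """Report whether the interpreter and PyTorch meet the listed requirements."""
    report = {
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_ok": False,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check also runs without PyTorch

        # Versions look like "2.1.0+cu118"; only the major number matters here.
        major = int(torch.__version__.split(".")[0])
        report["torch_ok"] = major >= min_torch_major
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

A GPU is only recommended, so `cuda_available: False` is not an error; training will simply be slower on CPU.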
Technical Stack
The projects use a consistent technical stack across all assignments:
- Framework: PyTorch
- Visualization: TensorBoard, Matplotlib, Seaborn
- Data Processing: NumPy, PIL, torchvision
- Evaluation: scikit-learn metrics, custom evaluation scripts
- Documentation: MkDocs with Material theme
Documentation
For detailed information about each assignment, please refer to the Repository Structure page, which provides links to comprehensive README files for each assignment and the course project.
Each assignment includes:
- Complete implementation details
- Training configurations
- Experiment results
- Visualization examples
- Code usage instructions
Contributing
Contributions are welcome! If you have ideas to improve this project, find a bug, or want to add new features:
- Open an issue to discuss your suggestions or report problems
- Fork the repository and submit a pull request with your changes
- Please follow best coding practices and include relevant tests and documentation
This repository represents a comprehensive journey through modern deep learning, from basic neural networks to advanced transformer architectures for video understanding.