Introduction to Deep Learning — DML

Course description:
This is an introductory course on Deep Learning. We will cover both the theoretical aspects of designing neural networks and the practice of training and using them. We will start by recalling classical machine learning tasks (along with a reminder of Python/NumPy) and then proceed to Deep Learning proper. We will present Deep Learning architectures and approaches while learning how to use the PyTorch library. We will also study well-known models and learning algorithms such as LeNet, ResNet, Proximal Policy Optimization (PPO), and ChatGPT. Depending on time and interest, we will study some additional topics (see Optional topics below).


Topics:

  • Basic Machine Learning tasks: Linear regression and classification. Basic concepts: losses, gradient descent, train-test-validation split, under- and overfitting, grid and random search. This first topic will also include a quick review of Python/NumPy.
  • Deep Neural Networks: Architectures (e.g., multilayer perceptrons, MLPs), activation functions, backpropagation, momentum, Adam, exploding and vanishing gradients, initialization, dropout. In this topic we will start using the PyTorch library (a minimal training-loop sketch follows this list).
  • Principles of Geometric Deep Learning: Exploiting symmetries in data to boost efficiency via parameter sharing and to obtain equivariant/invariant models. Deep Sets.
  • Convolutional Neural Networks (CNN): Definition as translation equivariant/invariant models; valid convolution, padding, and stride (see the convolution sketch after this list). Pooling layers. Normalization layers: batch norm, layer norm, instance norm. Skip connections. Optional: Object detection, image generation with CNNs, DeepDream.
  • Transformers: Text preprocessing, tokenization, embeddings. Multihead attention (see the attention sketch after this list). Absolute and relative positional embeddings. Encoders and decoders. Pretraining and finetuning tasks. Alignment with human and AI preferences (e.g., ChatGPT). Optional: Efficient transformers, LoRA, bitsandbytes.
  • Deep Reinforcement Learning: Markov Decision Processes, Bellman equation, temporal difference learning. Policy gradients, Proximal Policy Optimization (PPO).
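
To give a taste of the practical side of the first two topics, here is a minimal, illustrative PyTorch sketch: a small MLP trained with gradient descent on synthetic regression data. The data, architecture, and hyperparameters are arbitrary choices for illustration, not part of the course material.

    # Minimal PyTorch training loop (illustrative sketch, synthetic data).
    import torch
    from torch import nn

    torch.manual_seed(0)
    X = torch.randn(256, 3)                                      # inputs
    y = X.sum(dim=1, keepdim=True) + 0.1 * torch.randn(256, 1)   # noisy targets

    model = nn.Sequential(nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 1))  # a tiny MLP
    loss_fn = nn.MSELoss()                                       # squared-error loss
    optimizer = torch.optim.SGD(model.parameters(), lr=0.05)     # plain gradient descent

    for step in range(200):
        optimizer.zero_grad()              # reset accumulated gradients
        loss = loss_fn(model(X), y)        # forward pass and loss
        loss.backward()                    # backpropagation
        optimizer.step()                   # gradient descent update

    print(loss.item())                     # final training loss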
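The effect of valid convolution, padding, and stride from the CNN topic can be seen directly in the output shapes. Again just a sketch; the input size and channel counts are arbitrary.

    # How padding and stride change feature-map sizes (illustrative).
    import torch
    from torch import nn

    x = torch.randn(1, 3, 32, 32)                     # (batch, channels, height, width)
    valid = nn.Conv2d(3, 8, kernel_size=3)            # valid convolution: no padding
    same = nn.Conv2d(3, 8, kernel_size=3, padding=1)  # "same" padding keeps height/width
    strided = nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1)  # halves height/width

    print(valid(x).shape)    # torch.Size([1, 8, 30, 30])
    print(same(x).shape)     # torch.Size([1, 8, 32, 32])
    print(strided(x).shape)  # torch.Size([1, 8, 16, 16])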
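Finally, multihead attention from the Transformers topic is available as a built-in PyTorch module. The sketch below runs self-attention (query = key = value) on random stand-ins for token embeddings; all sizes are arbitrary.

    # Multihead self-attention with PyTorch's built-in module (illustrative).
    import torch
    from torch import nn

    batch, seq_len, d_model = 2, 10, 32
    x = torch.randn(batch, seq_len, d_model)   # random "token embeddings"

    attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=4, batch_first=True)
    out, weights = attn(x, x, x)               # self-attention: query = key = value = x

    print(out.shape)      # torch.Size([2, 10, 32])
    print(weights.shape)  # torch.Size([2, 10, 10]), averaged over the 4 heads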

Optional topics:
  • Recurrent Neural Networks (RNN): Stateful neural networks, Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU).
  • Variational Auto-Encoders (VAE): Latent representations, Evidence Lower Bound (ELBO).
  • Diffusion models.
  • Graph Neural Networks (GNN).
  • Metric Learning: Triplet loss, Hard negative mining, Circle loss.
  • Point Cloud Learning.
  • Tensor deep learning: Tensor decompositions, bottlenecks.
  • Imbalanced datasets: Under- and oversampling, focal loss.

Note: You need a modern laptop, but it does not need a powerful GPU (Graphics Processing Unit): if your laptop cannot handle a computation, you can run it on Google Colab (https://colab.research.google.com/).