Machine learning 1000-135UM
(I) Classical machine learning
0. Basic concepts, tasks, methods.
• Concepts: data – sample from population, loss function, risk/error, prediction, parameters, hyperparameters, learning/training/fitting/estimation, learning error, generalization error.
• Tasks: unsupervised learning: feature dimension reduction (PCA) and data dimension reduction (clustering); supervised learning: prediction = regression or classification.
• Learning by means of (penalized) empirical risk minimization.
1. Unsupervised learning.
• Variable dimension reduction: Principal Component Analysis (PCA) via SVD.
• Clustering using the k-means algorithm as an example. Soft k-means for a mixture of normal distributions.
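As an illustration of the PCA-via-SVD topic above, a minimal NumPy sketch (function and variable names are illustrative, not taken from the course materials):

```python
import numpy as np

# Minimal sketch of PCA computed via the SVD of the centered data matrix.
def pca_svd(X, k):
    Xc = X - X.mean(axis=0)                     # center each variable
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:k].T                      # projections onto first k components
    explained = S[:k] ** 2 / np.sum(S ** 2)     # fraction of variance explained
    return scores, Vt[:k], explained

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                   # 100 observations, 5 variables
scores, components, explained = pca_svd(X, 2)
```

The rows of `components` are the orthonormal principal directions; `scores` gives the data in the reduced coordinates.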
2. Linear prediction. Fitting a logistic model using Iteratively Reweighted Least Squares.
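A minimal sketch of fitting a logistic model by Iteratively Reweighted Least Squares (the Newton step written in IRLS form; data and names are illustrative):

```python
import numpy as np

# IRLS for logistic regression: repeatedly solve the weighted
# least-squares normal equations (X^T W X) dw = X^T (y - p).
def fit_logistic_irls(X, y, n_iter=25):
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))        # current fitted probabilities
        W = p * (1 - p)                          # diagonal of the weight matrix
        H = X.T @ (W[:, None] * X)               # X^T W X
        w = w + np.linalg.solve(H, X.T @ (y - p))
    return w

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
true_w = np.array([-0.5, 2.0])
y = (rng.uniform(size=200) < 1 / (1 + np.exp(-X @ true_w))).astype(float)
w_hat = fit_logistic_irls(X, y)
```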
3. Penalization of linear prediction.
• Variable selection, or L0 penalization.
• Ridge Regression, or L2 penalization. Penalty tuning via SVD.
• Lasso, or L1 penalization. Model fitting using Coordinate Descent. Penalty tuning using cross-validation.
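The Lasso item above can be sketched with a short coordinate-descent loop (minimizing (1/2)·||y − Xw||² + λ·||w||₁; names and the toy data are illustrative):

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# Coordinate descent for the lasso: cycle over coordinates, each update
# is a soft-thresholded univariate least-squares fit to the partial residual.
def lasso_cd(X, y, lam, n_iter=100):
    n, p = X.shape
    w = np.zeros(p)
    col_sq = np.sum(X ** 2, axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - X @ w + X[:, j] * w[j]     # residual with coordinate j removed
            w[j] = soft_threshold(X[:, j] @ r_j, lam) / col_sq[j]
    return w

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 10))
y = X[:, 0] * 3.0 + rng.normal(scale=0.1, size=100)   # only variable 0 is relevant
w = lasso_cd(X, y, lam=50.0)
```

With this λ, the irrelevant coefficients are driven to zero while the relevant one is shrunk toward its least-squares value; in practice λ is tuned by cross-validation.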
4. Enhancement of linear prediction using kernel transformations of explanatory variables: Kernel Regularized Least Squares and Support Vector Machines (SVM). Fitting SVM using Gradient Descent (GD) and quadratic optimization with constraints.
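The gradient-descent route to SVM fitting can be sketched as subgradient descent on the regularized hinge loss (a linear SVM on toy data; step sizes and names are illustrative):

```python
import numpy as np

# Subgradient descent on  lam/2 ||w||^2 + (1/n) sum max(0, 1 - y_i (w.x_i + b)).
def svm_gd(X, y, lam=0.01, lr=0.1, n_iter=500):
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(n_iter):
        margins = y * (X @ w + b)
        mask = margins < 1                       # points inside or beyond the margin
        grad_w = lam * w - (y[mask, None] * X[mask]).sum(axis=0) / n
        grad_b = -y[mask].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.concatenate([-np.ones(50), np.ones(50)])
w, b = svm_gd(X, y)
acc = np.mean(np.sign(X @ w + b) == y)
```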
5. Tree-based prediction.
• Location, dispersion and dependency measures based on probability densities.
• Greedy tree generation and pruning using L0 penalization.
• Random Forests.
6. Enhancement of linear and tree-based prediction using the Gradient Boosting method.
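A minimal sketch of Gradient Boosting for squared-error regression, using one-feature decision stumps as base learners (for squared loss the negative gradient is simply the residual; all names and data here are illustrative):

```python
import numpy as np

# Greedy stump: best threshold split of x minimizing the residual SSE.
def fit_stump(x, r):
    best = None
    for t in np.unique(x):
        left, right = r[x <= t], r[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    _, t, cl, cr = best
    return lambda xx: np.where(xx <= t, cl, cr)

# Boosting: repeatedly fit a stump to the current residuals and add it
# with a small learning rate (shrinkage).
def boost(x, y, n_rounds=100, lr=0.1):
    pred = np.full_like(y, y.mean())
    for _ in range(n_rounds):
        h = fit_stump(x, y - pred)               # fit the negative gradient
        pred = pred + lr * h(x)
    return pred

x = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * x)
pred = boost(x, y)
mse = np.mean((y - pred) ** 2)
```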
(II) Introduction to deep neural networks.
7. Basic concepts: network, architecture, layer, backbone, data batch, learning epoch.
8. Multi-layer perceptron (MLP) – fitting using the Stochastic Gradient Descent algorithm with automatic differentiation (back-propagation).
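A minimal sketch of a one-hidden-layer MLP trained by mini-batch SGD with hand-written back-propagation (the architecture, target, and hyperparameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, (256, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)[:, None]   # XOR-like target

W1 = rng.normal(0, 0.5, (2, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, (16, 1)); b2 = np.zeros(1)
lr, batch = 0.5, 32

for epoch in range(300):                 # one pass over shuffled batches = one epoch
    idx = rng.permutation(len(X))
    for s in range(0, len(X), batch):
        xb, yb = X[idx[s:s + batch]], y[idx[s:s + batch]]
        h = np.tanh(xb @ W1 + b1)                        # forward pass
        p = 1 / (1 + np.exp(-(h @ W2 + b2)))
        dz2 = (p - yb) / len(xb)                         # backward pass (cross-entropy)
        dW2, db2 = h.T @ dz2, dz2.sum(0)
        dz1 = (dz2 @ W2.T) * (1 - h ** 2)                # chain rule through tanh
        dW1, db1 = xb.T @ dz1, dz1.sum(0)
        W2 -= lr * dW2; b2 -= lr * db2                   # SGD updates
        W1 -= lr * dW1; b1 -= lr * db1

h = np.tanh(X @ W1 + b1)
acc = np.mean((1 / (1 + np.exp(-(h @ W2 + b2))) > 0.5) == y)
```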
9. MLP for data representation. An autoencoder network (nonlinear equivalent of PCA) with application to classification of handwritten digits.
10. Convolutional Neural Networks (ConvNets) with application to image classification. Convolution with local or global kernels. Computation of discrete cyclic convolutions (in one and two dimensions) with a global kernel using the Fourier transform (convolution theorem and FFT).
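The convolution theorem mentioned above can be verified in a few lines: the cyclic convolution of two sequences equals the inverse FFT of the pointwise product of their FFTs (a minimal one-dimensional check):

```python
import numpy as np

# Direct cyclic convolution: (a * k)[i] = sum_j a[(i - j) mod n] k[j].
def cyclic_conv_direct(a, k):
    n = len(a)
    return np.array([sum(a[(i - j) % n] * k[j] for j in range(n)) for i in range(n)])

# Same convolution via the convolution theorem and the FFT.
def cyclic_conv_fft(a, k):
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(k)))

a = np.array([1.0, 2.0, 3.0, 4.0])
k = np.array([1.0, 0.0, -1.0, 0.0])
c_direct = cyclic_conv_direct(a, k)
c_fft = cyclic_conv_fft(a, k)
```

The FFT route costs O(n log n) instead of O(n²), which is what makes global-kernel convolutions tractable.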
11. Transformers with application to sequence processing (text, speech, time series). Embeddings of words and of positions (natural numbers) in R^512, and an averaging transformation called attention.
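The "averaging transformation called attention" can be sketched as single-head scaled dot-product attention: each output vector is a softmax-weighted average of the value vectors (dimensions here are small stand-ins for the R^512 of the lecture):

```python
import numpy as np

# Scaled dot-product attention: rows of the output are convex
# combinations (weighted averages) of the rows of V.
def attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                        # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # row-wise softmax
    return weights @ V, weights

rng = np.random.default_rng(5)
X = rng.normal(size=(6, 8))          # 6 tokens embedded in R^8
out, w = attention(X, X, X)          # self-attention: Q = K = V = X
```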
Type of course
Prerequisites (description)
Course coordinators
Learning outcomes
Student:
Understands the basic concepts, tasks and methods of machine learning.
Understands and uses unsupervised learning algorithms such as PCA or k-means.
Understands and uses L0, L1 and L2 penalizations of linear prediction.
Understands and uses kernel enhancements of linear prediction.
Understands and uses tree-based prediction.
Understands and uses the Gradient Boosting algorithm.
Understands basic concepts related to deep neural networks.
Understands and uses the automatic differentiation algorithm and Stochastic Gradient Descent.
Understands and uses networks such as MLP, ConvNet and Autoencoder to classify digits.
Understands and uses simple Transformers for sentiment classification.
Assessment criteria
Final grade based on homework solutions and a written exam on the lecture content.
Bibliography
- Hastie, T., Tibshirani, R., & Friedman, J. H. The elements of statistical learning: data mining, inference, and prediction. Springer 2009.
- Shalev-Shwartz, S., & Ben-David, S. Understanding machine learning: From theory to algorithms. Cambridge University Press 2014.
- Bishop, C. M., & Bishop, H. Deep learning: foundations and concepts. Springer 2024.
Additional information
Information on the level of this course, the year of study and semester in which it is delivered, and the types and number of class hours can be found in the course structure diagrams of the appropriate study programmes. This course is related to the following study programmes:
Additional information (registration calendar, course instructors, location and schedule of classes) may be available in the USOSweb system: