Statistical machine learning (in Polish: Statystyczne uczenie maszynowe) 1000-1M18SUM
In recent years, the view has become established that deep neural networks are effective when there is a lot of data, the predictors have some spatial or temporal organization (as in signals, texts or images), and the signal-to-noise ratio is high. However, when there is little data, no special organization, the signal is weak, or the results should be interpretable, modern statistical methods such as the lasso or gradient boosting work better (Robert Tibshirani in his presentation after receiving the ISI award, May 2021). The lecture is an introduction to supervised learning, in other words statistical prediction, focused on such modern methods and partly based on "The Elements of Statistical Learning" by Hastie, Tibshirani and Friedman.
In the first part, I discuss the basic prediction method for a continuous response, that is the linear regression model, together with ridge regression, the lasso and best subset selection. In the second part, I present linear prediction methods for a discrete response, that is classifiers such as Fisher's linear discriminant analysis, logistic regression and support vector machines. In the third part, I discuss universal, non-linear predictors, such as the k-nearest neighbors method and decision trees. The fourth part is devoted to regularization of learning and boosting of predictive power. In particular, I discuss penalization of the prediction error, kernelization of explanatory variables and boosting (linear combinations of "weak" predictors). In the last part, I discuss the use of linear methods and deep neural networks (ConvNets, Vision Transformers) for image prediction tasks such as classification and segmentation.
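To make the first part concrete, here is a minimal sketch of ridge regression, one of the penalized linear methods named above. The synthetic data and all variable names are illustrative assumptions, not course material; the method minimizes the squared error plus a quadratic penalty on the coefficients, which admits a closed-form solution.

```python
import numpy as np

# Ridge regression minimizes  ||y - X b||^2 + lam * ||b||^2,
# with closed-form solution  b = (X^T X + lam I)^{-1} X^T y.
# Synthetic data below is an assumption for illustration only.

rng = np.random.default_rng(0)
n, p = 50, 5
X = rng.standard_normal((n, p))
beta_true = np.array([3.0, 0.0, -2.0, 0.0, 1.0])
y = X @ beta_true + 0.1 * rng.standard_normal(n)

def ridge(X, y, lam):
    """Ridge estimator: normal equations with an added lam*I term."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

beta_hat = ridge(X, y, lam=1.0)
```

Increasing the penalty weight `lam` shrinks the coefficients toward zero, trading a little bias for reduced variance.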
The lecture focuses on important machine learning methods which arise as solutions of penalized loss minimization on the training data. In this way, we obtain sets of predictors indexed by a "hyperparameter" (e.g. the weight of the function which penalizes the predictor parameters), whose value is selected on additional validation data or by means of cross-validation on the training data. I will devote a lot of attention to rigorously explaining popular validation procedures (also called model selection).
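The hyperparameter-selection scheme described above can be sketched as K-fold cross-validation over a grid of penalty weights. This is a minimal illustration, assuming ridge regression as the penalized method; the data, grid and function names are assumptions for the example.

```python
import numpy as np

# K-fold cross-validation for choosing the penalty weight ("hyperparameter")
# of ridge regression: for each candidate lam, average the held-out squared
# error over folds, then keep the lam with the smallest average error.

rng = np.random.default_rng(1)
n, p = 60, 4
X = rng.standard_normal((n, p))
y = X @ np.array([2.0, 0.0, -1.0, 0.5]) + 0.2 * rng.standard_normal(n)

def ridge_fit(X, y, lam):
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def cv_error(X, y, lam, k=5):
    """Mean held-out squared error of ridge with weight lam over k folds."""
    folds = np.array_split(np.arange(len(y)), k)
    errs = []
    for fold in folds:
        train = np.ones(len(y), dtype=bool)
        train[fold] = False
        b = ridge_fit(X[train], y[train], lam)
        errs.append(np.mean((y[~train] - X[~train] @ b) ** 2))
    return float(np.mean(errs))

grid = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam = min(grid, key=lambda lam: cv_error(X, y, lam))
```

Note that the cross-validation error estimates the prediction error of each candidate; the final error of the selected predictor should still be estimated on separate test data, as the lecture emphasizes.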
Main fields of studies for MISMaP
mathematics
Type of course
Mode
Classroom
Prerequisites (description)
Course coordinators
Learning outcomes
Knowledge and skills:
1. Comprehends the basic methods of prediction.
2. Is able to fit a prediction function to the training data, select its hyperparameter on validation data, and estimate the prediction error on test data.
Social competence:
Can use prediction to study natural or social phenomena.
Assessment criteria
The final grade will be equal to the maximum of:
- grade for activity in the classroom (e.g. detecting an error in calculations, alternative proof or derivation of a prediction method),
- grade for homework solved during the course,
- grade for the oral exam or the programming project.
Bibliography
1. http://statweb.stanford.edu/~tibs/ftp/ISI.pdf
2. Hastie T., Tibshirani R. and Friedman J. The Elements of Statistical Learning, Springer 2009.
3. Shalev-Shwartz S. and Ben-David S. Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press 2014.
4. Bishop C. M. and Bishop H. Deep Learning: Foundations and Concepts, Springer 2023.
Additional information
Information on the level of this course, the year of study and semester in which the course unit is delivered, and the types and number of class hours can be found in the course structure diagrams of the appropriate study programmes. This course is related to the following study programmes:
Additional information (registration calendar, class conductors, location and schedule of classes) may be available in the USOSweb system: