Statistical machine learning 1000-2M21US
The detailed program
Exploratory statistics (2-3 lessons)
Basic summary statistics (sample mean, median, sample variance, etc.)
Visualization of data (histograms, box plots, kernel density estimation)
Principal component analysis
Clustering: hierarchical clustering, k-means, k-medoids
Statistical theory (4-5 lessons)
Basic definitions (statistical models, statistics, likelihood, etc.)
Estimation theory (maximum likelihood estimators, efficiency, mean squared error, bias-variance trade-off, confidence intervals)
Statistical hypothesis testing (type I and type II errors, power of a test, significance, p-value)
Problems with p-values (effect size, multiple hypothesis testing)
Bayesian inference (prior distribution, posterior distribution, Bayes risk and Bayes estimator, credible intervals)
Distances between probability measures (Kullback-Leibler divergence, total variation distance, etc.)
Simple regression and classification models (3-4 lessons)
Linear regression
Classification: logistic regression, LDA, QDA
Cross-validation and bootstrap
Model selection and regularization: lasso, ridge regression, forward-backward stepwise selection
Advanced models (3 lessons)
Tree-based models, bagging, random forests, boosting
Support vector machines
Non-linear models: splines, generalized additive models
Type of course
Learning outcomes
Knowledge: the student
* has an in-depth understanding of the branches of mathematics necessary to study machine learning (probability theory, statistics, multivariable calculus, and linear algebra);
* has theoretically grounded, well-organized knowledge of the fundamental statistical techniques used in modeling and data analysis.
Abilities: the student is able to
* construct mathematical arguments;
* express problems in the language of mathematics;
* apply techniques of modern statistical data analysis.
Social competences: the student is ready to
* critically evaluate acquired knowledge and information;
* recognize the significance of knowledge in solving cognitive and practical problems, and the importance of consulting experts when they cannot find a solution on their own;
* think and act in an entrepreneurial way.
Assessment criteria
Final test and a graded programming assignment.
Bibliography
1. Trevor Hastie, Robert Tibshirani, Jerome H. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, Berlin
2. Andrew Ng, Machine Learning Yearning, https://www.deeplearning.ai/machine-learning-yearning/
Additional information
Information on the level of this course, the year of study and semester in which the course unit is delivered, and the types and number of class hours can be found in the course structure diagrams of the appropriate study programmes. This course is related to the following study programmes:
- Bachelor's degree, first cycle programme, Computer Science
- Master's degree, second cycle programme, Computer Science
Additional information (registration calendar, instructors, class locations and schedules) may be available in the USOSweb system.