Statistical machine learning 1000-317bSML
The detailed program
Exploratory statistics (2-3 lessons)
Basic summary statistics (sample mean, median, sample variance, etc.)
Visualization of data (histograms, box plots, kernel density estimation)
Principal component analysis
Clustering: hierarchical clustering, k-means, k-medoids
Statistical theory (4-5 lessons)
Basic definitions (statistical models, statistics, likelihood, etc.)
Estimation theory (maximum likelihood estimators, efficiency, mean squared error, bias-variance trade-off, confidence intervals)
Statistical hypothesis testing (type I and type II errors, power of a test, significance, p-value)
Problems with p-values (effect size, multiple hypothesis testing)
Bayesian inference (prior distribution, posterior distribution, Bayes risk and Bayes estimator, credible intervals)
Distances between probability measures (Kullback-Leibler divergence, total variation distance, etc.)
Simple regression and classification models (3-4 lessons)
Linear regression
Classification: logistic regression, LDA, QDA
Cross-validation and bootstrap
Model selection and regularization: lasso, ridge regression, forward-backward stepwise procedure
Advanced models (3 lessons)
Tree-based models, bagging, random forests, boosting
Support vector machines
Non-linear models: splines, generalized additive models
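Illustration (not part of the official program): a minimal sketch of two of the topics listed above, basic summary statistics and a percentile bootstrap confidence interval for the mean. The choice of Python, the helper names summary and bootstrap_ci, and the data values are assumptions made for this example only; the syllabus does not prescribe a language or an implementation.

```python
import random
import statistics


def summary(xs):
    """Return the sample mean, median, and unbiased sample variance."""
    return {
        "mean": statistics.mean(xs),
        "median": statistics.median(xs),
        "variance": statistics.variance(xs),  # divides by n - 1
    }


def bootstrap_ci(xs, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(xs).

    Hypothetical helper: resamples xs with replacement n_resamples times,
    recomputes the statistic each time, and reads off empirical quantiles.
    """
    rng = random.Random(seed)
    n = len(xs)
    estimates = sorted(stat(rng.choices(xs, k=n)) for _ in range(n_resamples))
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up illustrative values, not course data.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.8, 2.7, 3.1]

if __name__ == "__main__":
    print(summary(data))
    print(bootstrap_ci(data))
```

The percentile method is only one of several bootstrap interval constructions; it is used here because it is the shortest to state.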
Type of course
Course coordinators
Learning outcomes
Knowledge: the student
* has in-depth understanding of the branches of mathematics necessary to study machine learning (probability theory, statistics, multivariable calculus, and linear algebra) [K_W05];
* has theoretically grounded and well-organized knowledge of fundamental techniques of statistics used in modeling and data analysis [K_W07].
Abilities: the student is able to
* construct mathematical reasoning [K_U06];
* express problems in the language of mathematics [K_U07];
* apply techniques of modern statistical data analysis [K_U10].
Social competences: the student is ready to
* critically evaluate acquired knowledge and information [K_K01];
* recognize the significance of knowledge in solving cognitive and practical problems and the importance of consulting experts when difficulties arise in finding a self-devised solution [K_K02];
* think and act in an entrepreneurial way [K_K03].
Assessment criteria
Impact on the final grade: final test 60%, two programming assignments 40%, in-lab activity 10%.
Bibliography
1. Trevor Hastie, Robert Tibshirani, Jerome H. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, Berlin
2. Andrew Ng, Machine Learning Yearning, https://github.com/ajaymache/machine-learning-yearning
Additional information
Additional information (registration calendar, class instructors, location and schedule of classes) might be available in the USOSweb system: