Machine Learning 2: predictive models, deep learning, neural networks 2400-DS2ML2
1. Decision trees in regression: methods of building trees, splitting algorithms (CART, C5.0), splitting criteria, interpretation of results, the problem of overfitting, stopping criteria, pruning the trees, the cost-complexity parameter, cross-validation, testing, building predictions using trees
2. Decision trees in classification: splitting criteria and measures of node impurity, the Gini coefficient, Information Gain, node entropy, primary and surrogate splits
3. Bagging and random forests: bootstrapping and resampling methods, sampling with replacement, subagging, predictor randomization, averaging the trees, out-of-bag errors, parameter tuning
4. Boosting: weak learners vs. strong learners, gradient boosting, extreme gradient boosting, regularization methods, adaptive boosting
5. Ensembling: simple and weighted averaging, majority voting, weighted voting, stacking of bottom-layer models combined by a top-layer model
6. Neural Networks: artificial neurons, topology of neural networks, input layers, hidden layers, output layers, weights, biases, activation functions, the backpropagation algorithm, methods of weight modification
7. Convolutional Neural Networks: filters, receptive fields, activation maps, the structure of a CNN, convolutional layers, ReLU layers, pooling layers, dropout layers, stride, padding, transfer learning, methods of data augmentation, the backpropagation algorithm
8. Recurrent Neural Networks: sequence processing, the idea of an internal loop, recurrent connections, the state variable, the vanishing gradient problem, LSTM/GRU layers, the carry dataflow, recurrent dropout, stacking of recurrent layers, bidirectional recurrent networks, time-series forecasting using RNNs, sequence classification, sentiment analysis
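Topic 1 above can be illustrated with a minimal sketch in scikit-learn: a regression tree whose cost-complexity parameter (`ccp_alpha`) is chosen by cross-validation over the tree's own pruning path. The dataset and the subsampling of the candidate alphas are illustrative assumptions, not part of the course materials.

```python
# Sketch: cost-complexity pruning of a regression tree, with the
# pruning parameter selected by 5-fold cross-validation.
# Synthetic data; parameter choices are illustrative.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)

# Candidate pruning strengths taken from the cost-complexity pruning path
path = DecisionTreeRegressor(random_state=0).fit(X, y).cost_complexity_pruning_path(X, y)
alphas = np.unique(path.ccp_alphas)
alphas = alphas[alphas >= 0.0][:-1][::10]  # drop the root-only alpha, thin the grid

search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid={"ccp_alpha": alphas},
    cv=5,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

pruned_tree = search.best_estimator_   # tree refit with the best alpha
preds = pruned_tree.predict(X[:5])     # building predictions using the tree
```

Larger `ccp_alpha` values prune more aggressively; cross-validation picks the value that balances fit and tree complexity.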
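The impurity measures from topic 2 can be written out by hand in a few lines of NumPy; the tiny label arrays below are purely illustrative.

```python
# Sketch: node-impurity measures used as splitting criteria in
# classification trees, computed directly from class labels.
import numpy as np

def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2; zero for a pure node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels):
    """Node entropy: -sum_k p_k log2(p_k); zero for a pure node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent, left, right):
    """Entropy reduction achieved by splitting `parent` into two child nodes."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

pure = np.array([1, 1, 1, 1])    # a pure node: impurity 0
mixed = np.array([0, 0, 1, 1])   # a maximally mixed binary node
```

A perfect split of `mixed` into `[0, 0]` and `[1, 1]` yields an information gain of one full bit.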
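For topic 3, a random forest in scikit-learn makes the listed ingredients explicit: bootstrap sampling with replacement per tree, predictor randomization via `max_features`, and the out-of-bag error as a built-in validation estimate. The dataset and hyperparameter values are illustrative assumptions.

```python
# Sketch: a random forest with out-of-bag error estimation.
# Synthetic classification data; hyperparameters are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

forest = RandomForestClassifier(
    n_estimators=200,     # number of bootstrapped trees to average
    max_features="sqrt",  # predictor randomization at each split
    oob_score=True,       # score each tree on its out-of-bag observations
    random_state=0,
).fit(X, y)

oob_error = 1.0 - forest.oob_score_  # honest error estimate without a holdout set
```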
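Topic 4 can be sketched with scikit-learn's gradient boosting, where shallow trees act as the weak learners and shrinkage plus subsampling serve as simple regularization methods; extreme gradient boosting (e.g. the xgboost library) follows the same scheme with additional regularization. Data and parameter values below are illustrative assumptions.

```python
# Sketch: gradient boosting with shallow trees as weak learners.
# Synthetic data; hyperparameters are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_features=10, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)

booster = GradientBoostingClassifier(
    n_estimators=100,   # number of sequentially fitted weak learners
    max_depth=2,        # shallow trees are weak learners
    learning_rate=0.1,  # shrinkage: a basic regularization method
    subsample=0.8,      # stochastic gradient boosting
    random_state=1,
).fit(X_tr, y_tr)

test_accuracy = booster.score(X_te, y_te)
```

Adaptive boosting (AdaBoost) differs mainly in reweighting misclassified observations rather than fitting gradients of a loss.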
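The ensembling schemes of topic 5 map directly onto scikit-learn: majority voting combines bottom-layer predictions directly, while stacking trains a top-layer (meta) model on their out-of-fold predictions. The choice of base models and data here is an illustrative assumption.

```python
# Sketch: majority voting and stacking of heterogeneous base models.
# Synthetic data; the choice of models is illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=8, random_state=2)

base_models = [
    ("tree", DecisionTreeClassifier(max_depth=3, random_state=2)),
    ("forest", RandomForestClassifier(n_estimators=50, random_state=2)),
    ("logit", LogisticRegression(max_iter=1000)),
]

# Majority voting over the bottom-layer models
voter = VotingClassifier(estimators=base_models, voting="hard").fit(X, y)

# Stacking: a top-layer model learns how to combine the bottom layer
stack = StackingClassifier(
    estimators=base_models,                # bottom-layer models
    final_estimator=LogisticRegression(),  # top-layer (meta) model
    cv=5,                                  # out-of-fold predictions feed the meta-model
).fit(X, y)
```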
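For topic 6, the backpropagation algorithm is easiest to see written out in NumPy for a one-hidden-layer network: a forward pass through the layers, an error gradient propagated backwards, and gradient-descent weight modification. The XOR toy problem, learning rate, and layer sizes are illustrative assumptions.

```python
# Sketch: training a tiny network by backpropagation, in plain NumPy.
# Toy XOR data; architecture and learning rate are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)  # input -> hidden weights and bias
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)  # hidden -> output weights and bias
lr = 1.0

for _ in range(5000):
    # Forward pass: activations layer by layer
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the error gradient through the layers
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Weight modification by gradient descent
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

final_loss = np.mean((out - y) ** 2)
preds = (sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) > 0.5).astype(int)
```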
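The CNN building blocks of topic 7 (filters sliding over receptive fields to produce activation maps, a ReLU layer, max pooling, stride and padding) can be sketched in NumPy without any deep-learning framework; the image and filter values are illustrative assumptions.

```python
# Sketch: the basic CNN operations, written out by hand in NumPy.
# The input image and the filter are illustrative toy values.
import numpy as np

def conv2d(image, kernel, stride=1, padding=0):
    """Slide one filter over the (optionally padded) image; each output
    entry is the weighted sum over one receptive field."""
    if padding:
        image = np.pad(image, padding)
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1
    ow = (image.shape[1] - kw) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

def relu(x):
    """ReLU layer: keep positive activations, zero out the rest."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Max pooling: downsample by taking the maximum in each window."""
    oh, ow = x.shape[0] // size, x.shape[1] // size
    return x[:oh * size, :ow * size].reshape(oh, size, ow, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)
edge_filter = np.array([[1.0, -1.0], [1.0, -1.0]])  # simple vertical-edge filter

activation_map = relu(conv2d(image, edge_filter, stride=1, padding=1))
pooled = max_pool(activation_map, size=2)
```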
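Finally, topic 8's internal loop, recurrent connection, and state variable fit in a few lines of NumPy: a simple (non-gated) recurrent cell applied step by step to one sequence. Shapes and weight values are illustrative assumptions; LSTM/GRU layers add gating on top of this scheme.

```python
# Sketch: one simple recurrent layer processing a sequence step by step.
# Random sequence and weights; shapes are illustrative.
import numpy as np

rng = np.random.default_rng(0)

n_steps, n_features, n_hidden = 10, 3, 5
sequence = rng.normal(size=(n_steps, n_features))  # one input sequence

W_x = rng.normal(size=(n_features, n_hidden))  # input-to-hidden weights
W_h = rng.normal(size=(n_hidden, n_hidden))    # recurrent (hidden-to-hidden) weights
b = np.zeros(n_hidden)

state = np.zeros(n_hidden)  # the state variable, initially empty
states = []
for x_t in sequence:  # the internal loop over time steps
    # Recurrent connection: the previous state feeds back into the cell
    state = np.tanh(x_t @ W_x + state @ W_h + b)
    states.append(state)

last_state = states[-1]  # summarises the sequence, e.g. for classification
```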
Type of course
Course coordinators
Term 2024Z: | Term 2023Z: |
Learning outcomes
After completing the course, students will have structured and reliable knowledge of decision trees and neural networks and will be able to apply them to both regression and classification problems. They will also understand how convolutional neural networks and recurrent neural networks work. They will know the theoretical foundations of these algorithms and will have the programming skills to deploy the models in practice, including in a cloud framework. They will also know how to interpret the results and explain how the models work to non-technical audiences.
K_W01, K_U01, K_U02, K_U03, K_U04, K_U05, KS_01
Assessment criteria
Take-home project
Bibliography
1. Chollet F., Allaire J.J. (2018), “Deep Learning with R”, Manning Publications
2. Chollet F. (2017), “Deep Learning with Python”, Manning Publications
3. Géron A. (2018), “Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems”, O’Reilly
4. James G., Witten D., Hastie T., Tibshirani R. (2017), “An Introduction to Statistical Learning: with Applications in R”, Springer-Verlag
5. Kuhn M., Johnson K. (2013), “Applied Predictive Modeling”, Springer-Verlag
6. Hastie T., Tibshirani R., Friedman J. (2009), “The Elements of Statistical Learning”, Springer-Verlag
7. Zheng A., Casari A. (2018), “Feature Engineering for Machine Learning: Principles and Techniques for Data Scientists”, O’Reilly
8. Lantz B. (2013), “Machine Learning with R”, Packt Publishing
Additional information
Additional information (registration calendar, class conductors, location and schedule of classes) may be available in the USOSweb system: