Data mining 1000-2M03DM
1. Introduction to KDD and data mining; templates and patterns
2. Transaction data analysis and association rules; main algorithms for association rule generation: Apriori, AprioriTid, FP-tree.
3. Classification problem and classifier evaluation methods; case-based methods, naive Bayes classifiers, Bayesian networks. Improving nearest-neighbor classifiers.
4. Entropy measure and decision tree methods.
5. Clustering problem and clustering algorithms
6. Computational learning theory
7. Rule-based classifiers
8. Data cleaning and data preprocessing techniques
9. Hidden Markov models and their applications
10. Searching for sequence patterns in time series data
11. OLAP and data mining
12. Web mining and text mining
Main fields of studies for MISMaP
mathematics
Learning outcomes
Knowledge and skills:
1. Knows the basic classes of problems related to data mining and knowledge discovery.
2. Knows the methods of market basket analysis and the algorithms for finding frequent itemsets, and is able to apply them in practice.
3. Knows and is able to apply basic ML algorithms.
4. Can evaluate the effectiveness of ML models in classification, regression, and clustering problems.
5. Knows the basic techniques of text processing for the construction of ML models and is able to apply them in practice.
6. Can construct simple recommendation systems and understands how they work.
7. Knows the basic methods of constructing predictive models for time series. Can apply them to real-world data sets and assess their actual effectiveness.
8. Knows the current major trends in the fields of science related to machine learning and knowledge discovery from databases.
Social competence:
1. Is able to prepare a report on exploratory data analysis presenting the most important information using data visualization techniques.
2. Can present the results of the analyses conducted.
Assessment criteria
The final grades are based on the sum of points from the laboratory and the exam.
Additionally, doctoral students may pass this course by preparing a special project that involves participating in an international data mining competition.
Bibliography
1. "Data Mining: Concepts and Techniques". J. Han and M. Kamber. Morgan Kaufmann Publishers. 2001.
2. "Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations". I. Witten and E. Frank. Morgan Kaufmann Publishers. 2000.
3. "Advances in Knowledge Discovery and Data Mining". Eds.: Fayyad, Piatetsky-Shapiro, Smyth, and Uthurusamy. The MIT Press. 1995.
4. "Mining of Massive Datasets" (2nd ed.). J. Leskovec, A. Rajaraman, and J. D. Ullman. Cambridge University Press. 2014.
Additional information
Information on the level of this course, the year of study and semester in which the course unit is delivered, and the types and number of class hours can be found in the course structure diagrams of the appropriate study programmes. This course is related to the following study programmes:
- Bachelor's degree, first cycle programme, Computer Science
- Master's degree, second cycle programme, Computer Science
- Master's degree, second cycle programme, Mathematics
Additional information (registration calendar, class instructors, location and schedule of classes) may be available in the USOSweb system.