Information Refining 2700-M-ZBD-D2RI
The course covers digital information and data collection. Students learn to identify available sources of information, with the main focus on Big Data and the value hidden in such resources. Students learn about information refining, a method of obtaining information. The course also covers sentiment identification, models of the studied phenomena, and predictive models. Tools such as regular expressions and string splitting are used. Case studies are presented, including the results of applying information refining and practical aspects of data refining such as data cleansing.
The classes are carried out in the form of lectures and workshops.
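The regular expressions and string splitting mentioned in the description can be illustrated with a minimal Python sketch of a data cleansing step. The input lines, the semicolon-separated layout, the field names and the `refine` helper are all hypothetical and were invented for this illustration; they are not taken from the course materials.

```python
import re

# Hypothetical raw input, invented for illustration only.
RAW_LINES = [
    "  2021-03-05 ; Jan KOWALSKI;jan.kowalski@example.com ;  great   service!! ",
    "2021-03-06;Anna Nowak;  anna.nowak@example.com;slow delivery",
    "not a valid record",
]

# Simple validation patterns (deliberately loose, not full standards).
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")


def refine(line):
    """Split one raw line into fields and return a cleaned record, or None."""
    # String splitting: cut the line on ';' and trim each field.
    fields = [field.strip() for field in line.split(";")]
    if len(fields) != 4:
        return None  # wrong number of fields, reject the line
    date, name, email, comment = fields
    # Regular expressions: keep only lines with a plausible date and e-mail.
    if not DATE_RE.match(date) or not EMAIL_RE.match(email):
        return None
    return {
        "date": date,
        "name": name.title(),                      # normalize letter case
        "email": email.lower(),
        "comment": re.sub(r"\s+", " ", comment),   # collapse repeated whitespace
    }


records = [record for record in map(refine, RAW_LINES) if record is not None]
for record in records:
    print(record)
```

In this sketch, malformed lines are dropped rather than repaired; whether to reject or repair bad records is a design decision that depends on the data source.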
Type of course
Course coordinators
Learning outcomes
After completing the course, the student:
KNOWLEDGE:
- knows what information refining is,
- knows how to look for information in science, business, politics, media and other areas of human activity,
- knows refining methods for structured and unstructured data,
- is familiar with example case studies of data refining.
SKILLS:
- can apply the knowledge needed for data refining,
- can construct regular expressions,
- can choose software for refining data from websites and other data sources.
OTHER COMPETENCES:
- is competent to research and refine a wide spectrum of raw information. These competences are useful in research teams, analytical teams, and business units that process large sets of structured and unstructured data.
Assessment criteria
Lectures: assessed on the basis of a project.
Workshops: graded on the basis of continuous assessment of the student's work.
The final grade is a weighted average:
60% - project,
30% - activity in class,
10% - attendance at classes.
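The weighted average above can be sketched in Python as follows. The weights come from the list above; the component names, the `final_grade` helper, the example scores and the assumption that all three components are scored on the same numerical scale are hypothetical, since the syllabus does not specify the scale or any rounding rule.

```python
# Weights taken from the assessment criteria above.
WEIGHTS = {"project": 0.60, "activity": 0.30, "attendance": 0.10}


def final_grade(scores):
    """Return the weighted average of the three component scores.

    Assumes (hypothetically) that every component uses the same scale.
    """
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must contain exactly: " + ", ".join(WEIGHTS))
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical example scores, invented for illustration.
example = final_grade({"project": 4.5, "activity": 4.0, "attendance": 5.0})
print(round(example, 2))  # 0.6*4.5 + 0.3*4.0 + 0.1*5.0 = 4.4
```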
Bibliography
Primary Literature
• Bowles M., “Machine Learning in Python. Essential Techniques for Predictive Analysis”, Wiley, 2015
• Breiman L., “Statistical Modeling: The Two Cultures”, Statistical Science, 2001, Vol. 16, No. 3, 199–231
• Brownlee J., “Deep Learning for Natural Language Processing”, MLM, 2017
• Dinsmore T., “Disruptive Analytics, Charting Your Strategy for Next-Generation Business Analytics”, Springer, 2016
• Goodfellow I., Bengio Y., Courville A., “Deep Learning”, MIT Press, 2018
• Leonelli S., Tempini N. (eds.), “Data Journeys in the Sciences”, Springer (open), 2020
• Trovati M., Hill R., Anjum A., Zhu S., Liu L. (eds.), “Big-Data Analytics and Cloud Computing. Theory, Algorithms and Applications”, Springer, 2015
Secondary Literature
• Gogołek, W. „Technologie informacyjne mediów”. Warszawa: Oficyna Wydawnicza ASPRA-JR, 2006.
• Kleppmann, M., Walczak T., „Przetwarzanie danych w dużej skali: niezawodność, skalowalność i łatwość konserwacji systemów”. Gliwice: Helion, 2018.
• Mayer-Schönberger, V., Cukier K., Głatki M., „Big data: rewolucja, która zmieni nasze myślenie, pracę i życie”. Warszawa: MT Biznes, 2014.
Additional information
Additional information (registration calendar, class instructors, location and schedule of classes) may be available in the USOSweb system: