Tools Supporting Data Analysis in Python 1000-1M20NPD
1. Introduction to the Python programming language, code writing standards.
2. Working with the command line on Linux - often very useful, for example for remote access.
3. Git - code repositories, versioning, and synchronization.
4. Modules - better code organization, greater readability, and ease of code maintenance.
5. The argparse library and other options for configuring the program state and handling configuration files - how to run a program on different data (and get clear messages about incorrect parameters) without modifying its code.
6. Jupyter - an interactive console in the browser that allows convenient remote work, easy preview, and convenient presentation of results.
7. Software testing - good practices and useful tools.
8. Using the debugger while testing - how to speed up finding and understanding the source of program errors.
9. Python packages - how to prepare code for sharing between projects.
10. The NumPy library - support for multidimensional arrays.
11. The pandas library - support for tabular data.
12. Profiler - how to find the parts of a program that actually slow down its execution.
13. Continuous integration and tox - how to verify that the project is well defined and described, by automatically checking that no errors have been introduced into the project (e.g. by adding new functions).
14. Cython, numba.jit - how to speed up program execution or use a library written in C that has no Python interface.
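As an illustration of topic 5, the following is a minimal sketch of how argparse lets the same program run on different data and report incorrect parameters clearly. The program name `analyze` and the options `--column` and `--limit` are hypothetical, chosen only for this example; they are not taken from the course materials.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Describe the command-line interface of a hypothetical 'analyze' program."""
    parser = argparse.ArgumentParser(
        prog="analyze",
        description="Summarize one column of a CSV file.",
    )
    # Positional argument: converted to a Path by argparse.
    parser.add_argument("input", type=Path, help="path to the input CSV file")
    # Optional arguments with defaults, so the code never needs editing.
    parser.add_argument(
        "--column",
        default="value",
        help="name of the column to summarize (default: %(default)s)",
    )
    parser.add_argument(
        "--limit",
        type=int,
        default=None,
        help="read at most this many rows",
    )
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse argv (or sys.argv when argv is None) and validate the values."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # Extra validation: parser.error prints the usage text plus the message
    # and exits with status 2.
    if args.limit is not None and args.limit <= 0:
        parser.error("--limit must be a positive integer")
    return args


# Example call with an explicit argument list instead of sys.argv.
example = parse_args(["data.csv", "--column", "price", "--limit", "100"])
print(example.input, example.column, example.limit)
```

Passing a non-integer to `--limit`, or omitting the input file, makes argparse print a usage message and exit, so the user sees what was wrong without reading the source.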
Type of course
Course coordinators
Learning outcomes
After completing the course the student knows:
- Python to a degree sufficient to create their own medium-sized applications,
- commonly used tools for data analysis,
- commonly used tools for teamwork.
Assessment criteria
Small programming tasks during the semester (30%).
Exam in the form of discussing the final assignment (70%).
The so-called "zero exam" can be taken only by students who have been awarded at least 90% of the points from the small programming tasks. The form of this exam is exactly the same as that of the regular exam.
Bibliography
- Python for Data Analysis, Wes McKinney, 2nd ed., 2017.
- Python Crash Course: A Hands-On, Project-Based Introduction to Programming, Eric Matthes, 2nd ed., 2019.
- Programming Python, Mark Lutz, 4th ed., 2011.
- Pro Git, Scott Chacon and Ben Straub, 2nd ed., 2014.
- https://docs.python.org/.
- Internet documentation for tools presented during this course.
Additional information
Information on the level of this course, the year of study and semester in which the course unit is delivered, and the types and number of class hours can be found in the course structure diagrams of the appropriate study programmes. This course is related to the following study programmes:
Additional information (registration calendar, class instructors, location and schedule of classes) might be available in the USOSweb system.