Advanced Programming Techniques for Bioinformaticians 1000-713ZTP
Part I – Programming Tools
How to develop software professionally?
- Getting started: configuring a virtual environment, PyCharm, and uv. Basics of bash for running computations on a server.
- Building modules and CLI scripts using the argparse library.
- Advanced tools for interactive work: IPython and Jupyter Notebook.
- Using debuggers and profilers for analyzing Python code.
- Software testing: best practices and tools such as pytest.
- Python in the industry: type hints, data modeling and validation, static type analysis, code quality, formatting, and documentation generation using typing, pydantic, mypy, flake8, black, and pydoc.
- Source code management tools: Git, GitHub, and GitLab. Continuous integration and tox.
- Building larger packages, managing inter-package dependencies, and working with pip and conda.
- Generative models supporting coding tasks: ChatGPT, GitHub Copilot.
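As an illustration of the kind of material covered in Part I, the following is a minimal sketch of a small CLI script built with argparse and annotated with type hints. The function names (gc_content, build_parser, main) and the --percent flag are hypothetical and chosen only for this example; they are not taken from the course materials.

```python
import argparse
from typing import Optional, Sequence


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def build_parser() -> argparse.ArgumentParser:
    """Define the command-line interface (hypothetical options)."""
    parser = argparse.ArgumentParser(description="Compute the GC content of a sequence.")
    parser.add_argument("sequence", help="nucleotide sequence, e.g. ATGC")
    parser.add_argument("--percent", action="store_true", help="print a percentage")
    return parser


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Parse arguments, print the result, and return an exit code."""
    args = build_parser().parse_args(argv)
    value = gc_content(args.sequence)
    print(f"{value * 100:.1f}%" if args.percent else f"{value:.3f}")
    return 0
```

Passing argv explicitly, for example main(["GGCC", "--percent"]), makes the entry point easy to exercise from a pytest test without touching sys.argv; in a real script it would typically be called from an `if __name__ == "__main__":` guard.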
Part II – Data Analysis Tools
How to carry out a scientific project and share it with the world?
- Working with tabular data and multidimensional arrays. Broadcasting and vectorization. Using pandas and numpy.
- Statistical and scientific tools in the SciPy package.
- Advanced plotting with seaborn, plotly, bokeh, and geoplot.
- Processing structured files: BeautifulSoup, JSON, and YAML.
- Automated retrieval of online data via the requests library.
- Implementing and deploying APIs with Selenium, FastAPI, and uvicorn.
- Containerization.
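To make the broadcasting and vectorization topic from Part II concrete, here is a minimal sketch using numpy. The small expression matrix is made-up illustrative data, and the variable names are assumptions for this example only.

```python
import numpy as np

# Hypothetical expression matrix: 3 samples (rows) x 2 genes (columns).
counts = np.array([[1.0, 10.0],
                   [2.0, 20.0],
                   [3.0, 30.0]])

# Column means have shape (2,); broadcasting stretches them across all 3 rows,
# so no explicit Python loop over samples is needed.
col_means = counts.mean(axis=0)
centered = counts - col_means

# Vectorized scaling: each column is divided by its own standard deviation.
col_std = counts.std(axis=0)
z_scores = centered / col_std
```

The same per-column standardization written with nested loops would be longer and usually much slower on large arrays, which is the main motivation for the vectorized style.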
Part III – Selected Bioinformatics Tools
What does Biopython offer and what are the alternatives?
- Reading and processing biological sequences.
- Processing and visualizing structural data using Bio.PDB and PyMOL.
- Calling external subprocesses in Python applications using Bio.Application and subprocess.
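As a sketch of the last topic in Part III, the example below calls an external process with the standard-library subprocess module. The helper name run_tool is hypothetical, and a child Python interpreter stands in for a real bioinformatics tool so that the example runs without any extra installation.

```python
import subprocess
import sys
from typing import Sequence


def run_tool(command: Sequence[str], timeout: float = 30.0) -> str:
    """Run an external command and return its standard output as text."""
    result = subprocess.run(
        list(command),
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
        timeout=timeout,      # avoid hanging forever on a stuck tool
    )
    return result.stdout.strip()


# Stand-in for a real tool: a child Python process that reverses a sequence.
reversed_seq = run_tool([sys.executable, "-c", "print('ACGT'[::-1])"])
```

Passing the command as a list of arguments rather than a single shell string avoids quoting problems with file names and sequence identifiers.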
Learning outcomes
Upon successful completion of the course, the student:
- Has a working knowledge of the Python programming language and its libraries sufficient to develop medium-scale applications.
- Is familiar with commonly used tools for data analysis, visualization, and general data science workflows.
- Understands widely adopted tools and practices for collaborative software development and project management in both academic and industrial settings.
- Is able to clearly present the structure, tools, and rationale behind a software project.
- Has gained practical experience through the staged development of their own bioinformatics project, applying the introduced tools and methods in a real-world context.
Assessment criteria
Throughout the semester, students will work on one or two programming projects developed in stages. Projects may be completed individually or in small teams. Detailed grading criteria will be provided in the course materials on the faculty's Moodle platform.
The final examination will take the form of a presentation and discussion of the completed project. A prerequisite for passing the course is demonstrating a thorough understanding of the presented project during the exam.
Bibliography
1. Luciano Ramalho, Fluent Python (2nd edition).
2. David M. Beazley, Python Distilled.
3. Brett Slatkin, Effective Python: 90 Specific Ways to Write Better Python (2nd edition).
Additional information
Additional information (registration calendar, instructors, class locations and schedules) may be available in the USOSweb system.