Introduction to computational biology 1000-2N03BO

1. Basic knowledge of molecular biology, structure of nucleic acids and proteins, transcription and translation.
2. Molecular sequence analysis: sequencing by hybridization, algorithms for global and local alignment of two sequences.
3. Mathematical models of molecular evolution: Jukes-Cantor and Kimura models for DNA sequences, PAM and BLOSUM substitution matrices for proteins.
4. Multiple sequence alignment: dynamic programming, greedy algorithms, efficient heuristics (CLUSTALW, T-Coffee, MUSCLE).
5. Hidden Markov Models and their applications to molecular sequences: Viterbi and Baum-Welch algorithms.
6. Searching sequence databases: BLAST algorithm, statistical significance of alignment scores.
7. Finding motifs in DNA sequences, functional enrichment analysis of gene sets.
8. Introduction to phylogenetics: reconstructing phylogenetic trees of single genes and reconciling them.
9. Introduction to genomic data analysis: mapping reads to a reference genome, genome assembly, metagenomics.

The course will be given in Polish unless non-Polish-speaking students register for it.

Course coordinators

Aleksander Jankowski

Type of course

elective courses

Prerequisites

Algorithms and data structures
Probability theory

Learning outcomes

Knowledge:
1. Has a general knowledge of the problems of contemporary computational biology.
2. Has basic knowledge of mathematical models and computational methods used in the description of molecular sequences.

Skills:
1. Can implement fundamental bioinformatics analyses of molecular sequences.
2. Can use advanced bioinformatics tools to analyze experimental data.

Competences:
1. Knows the limitations of their own knowledge and understands the need for further education (K_K01).
2. Is able to manage their time, make commitments, and meet deadlines (K_K05).
3. Is able to use interdisciplinary literature.

Assessment criteria

Theory test (30 points), programming assignments (20 points), programming homework (20 points). At least 50% of the sum of points must be obtained to pass. Oral exam (30 points).

Doctoral students completing the course must additionally read an original research article that is close to the current research front and discuss it with the lecturer.

Bibliography

1. Durbin, R., Eddy, S. R., Krogh, A., & Mitchison, G. (1998). Biological sequence analysis: Probabilistic models of proteins and nucleic acids. Cambridge University Press.
2. Pevzner, P. A. (2000). Computational molecular biology: An algorithmic approach. MIT Press.
3. Ewens, W. J., & Grant, G. R. (2001). Statistical methods in bioinformatics: An introduction. Springer.
4. Campbell, A. M., & Heyer, L. J. (2007). Discovering genomics, proteomics, and bioinformatics (2nd ed.). CSHL Press.

Additional information

Information on the level of this course, the year of study and semester in which the course unit is delivered, and the types and number of class hours can be found in the course structure diagrams of the appropriate study programmes. This course is related to the following study programmes: