- Inter-faculty Studies in Bioinformatics and Systems Biology
- Bachelor's degree, first cycle programme, Computer Science
- Bachelor's degree, first cycle programme, Mathematics
- Master's degree, second cycle programme, Bioinformatics and Systems Biology
- Master's degree, second cycle programme, Computer Science
- Master's degree, second cycle programme, Mathematics
Machine Learning Methods for Biomolecular Modeling and Discovery 1200-UMPB-OG
1. Fundamentals of Machine Learning
1.1 Machine learning methods; neural network architectures; data encoding; training and validation.
1.2 Applications in biomolecular modeling and drug discovery.
1.3 Introduction to the representation of proteins, ligands, and chemical compounds. Overview of data formats such as SMILES, PDB, and FASTA.
1.4 Basics of cheminformatics; models describing chemical compounds, calculation of molecular properties (QSAR/QSPR).
Demonstration: Using the RDKit library in Python to parse SMILES strings and convert them into molecular graphs. Visualizing structures and calculating basic properties such as molecular weight, hydrogen-bond donor and acceptor counts, and polarity.
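A minimal sketch of what such a demonstration might look like is given below. It is not the course material itself: the helper name `describe` is hypothetical, and topological polar surface area (TPSA) is used here only as an assumed stand-in for "polarity".

```python
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski


def describe(smiles):
    """Parse a SMILES string; return its molecular graph and a few properties.

    `describe` is a hypothetical helper written for this illustration.
    """
    mol = Chem.MolFromSmiles(smiles)  # returns None if the SMILES cannot be parsed
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    # Molecular graph: heavy atoms are nodes, bonds are undirected edges.
    nodes = [atom.GetSymbol() for atom in mol.GetAtoms()]
    edges = [(b.GetBeginAtomIdx(), b.GetEndAtomIdx()) for b in mol.GetBonds()]

    return {
        "nodes": nodes,
        "edges": edges,
        "mol_wt": Descriptors.MolWt(mol),              # average molecular weight
        "h_donors": Lipinski.NumHDonors(mol),          # hydrogen-bond donors
        "h_acceptors": Lipinski.NumHAcceptors(mol),    # hydrogen-bond acceptors
        "tpsa": Descriptors.TPSA(mol),                 # assumed proxy for polarity
    }


ethanol = describe("CCO")
print(ethanol["nodes"], ethanol["edges"])
print(round(ethanol["mol_wt"], 2), ethanol["h_donors"], ethanol["h_acceptors"])
```

The node and edge lists are the kind of graph representation that a graph neural network would take as input; a 2D drawing can be produced separately with RDKit's `rdkit.Chem.Draw` module.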
2. Machine Learning in Biomolecular Modeling
2.1 Simple classification and regression models; predicting ligand binding to proteins.
2.2 Advanced network architectures: convolutional, recurrent, and graph neural networks.
Demonstration: Using the PyTorch library to predict small-molecule binding to cytochrome P450 1A2 (CYP1A2).
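As a rough illustration of the kind of model such a demonstration could build, the sketch below trains a small feed-forward binary classifier in PyTorch. Everything about the data is an assumption: the inputs are random bit vectors standing in for molecular fingerprints, the labels are synthetic, and no real CYP1A2 measurements are used. The fingerprint length `N_BITS` and the layer sizes are likewise arbitrary.

```python
import torch
from torch import nn

torch.manual_seed(0)

N_BITS = 64  # hypothetical fingerprint length

# Synthetic stand-in data: random bit vectors with labels from a hidden linear rule.
# In a real setting X would hold molecular fingerprints and y measured activity labels.
X = torch.randint(0, 2, (200, N_BITS)).float()
true_w = torch.randn(N_BITS)
scores = X @ true_w
y = (scores > scores.median()).float().unsqueeze(1)  # 1 = "binder", 0 = "non-binder"

# Small multilayer perceptron that outputs one logit per molecule.
model = nn.Sequential(nn.Linear(N_BITS, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.BCEWithLogitsLoss()  # applies the sigmoid internally
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for epoch in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Convert logits to predicted binding probabilities.
with torch.no_grad():
    probs = torch.sigmoid(model(X))

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

For brevity the sketch evaluates on the training data only; a realistic workflow would hold out a validation set, in line with the training and validation topics in point 1.1.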
3. Generative Neural Networks
3.1 Theoretical introduction to generative neural networks and their applications in molecular modeling.
3.2 Generating new chemical molecules, optimizing drug properties, and analyzing chemical space.
3.3 Examples of generating chemical compounds and proteins using conditional variational autoencoders (cVAEs).
Demonstration: Using cVAEs to generate new molecules conditioned on a target property, such as toxicity or a pharmacokinetic parameter.
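The core idea of a conditional VAE can be sketched in a few lines of PyTorch. The block below is an assumption-laden illustration rather than the model used in the course: molecules are represented as fixed-length bit vectors, the condition `c` is a single scalar property value, and the class name `CVAE`, the function `cvae_loss`, and all dimensions are hypothetical.

```python
import torch
from torch import nn
import torch.nn.functional as F


class CVAE(nn.Module):
    """Minimal conditional VAE: encoder and decoder both receive the condition c."""

    def __init__(self, x_dim=64, c_dim=1, z_dim=8, h_dim=32):
        super().__init__()
        self.z_dim = z_dim
        self.enc = nn.Linear(x_dim + c_dim, h_dim)
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        self.dec = nn.Sequential(
            nn.Linear(z_dim + c_dim, h_dim), nn.ReLU(), nn.Linear(h_dim, x_dim)
        )

    def forward(self, x, c):
        h = F.relu(self.enc(torch.cat([x, c], dim=1)))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.dec(torch.cat([z, c], dim=1)), mu, logvar

    def sample(self, c):
        # Generation: draw z from the prior and decode it together with the condition.
        z = torch.randn(c.shape[0], self.z_dim)
        return torch.sigmoid(self.dec(torch.cat([z, c], dim=1)))


def cvae_loss(logits, x, mu, logvar):
    """Negative ELBO per example: reconstruction term plus KL divergence to N(0, I)."""
    recon = F.binary_cross_entropy_with_logits(logits, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return (recon + kl) / x.shape[0]


torch.manual_seed(0)
model = CVAE()
x = torch.randint(0, 2, (4, 64)).float()  # stand-in molecule encodings
c = torch.rand(4, 1)                      # stand-in property values
logits, mu, logvar = model(x, c)
loss = cvae_loss(logits, x, mu, logvar)
generated = model.sample(torch.tensor([[0.1], [0.5], [0.9]]))
```

After training, changing the value passed to `sample` is what steers generation toward a desired property value. Decoding the output back into a valid molecule (for example a SMILES string) is a separate step not shown here.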
4. Protein Design Using ML
4.1 Concepts of protein design, including de novo design, sequence modifications, and practical applications.
4.2 Leveraging generative ML methods for generating new protein sequences and structures.
4.3 Machine learning methods used in protein design: RFdiffusion, AlphaFold, ProteinMPNN.
Demonstration: Designing a globular protein using existing online models.
Type of course
Mode
Blended learning
Prerequisites (description)
Course coordinators
Learning outcomes
The student understands:
Fundamental concepts of machine learning and their applications in biomolecular modeling and drug discovery.
The functioning of neural networks and the differences between basic architectures.
The student is able to:
Analyze and process biomolecular data, including representations of proteins and chemical molecules.
Critically evaluate the application of machine learning in biomedicine, considering ethical challenges and technological limitations.
Design de novo proteins, optimizing their functionality using deep learning models and generative networks.
Assessment criteria
To pass the course, you must:
1) attend at least 2/3 of the classes (with 15 classes per semester, at most 5 absences are allowed);
2) submit a solution to the final project.
Practical placement
N/A
Additional information
Information on the level of this course, the year of study and semester in which it is delivered, and the types and number of class hours can be found in the course structure diagrams of the appropriate study programmes. This course is related to the following study programmes:
- Inter-faculty Studies in Bioinformatics and Systems Biology
- Bachelor's degree, first cycle programme, Computer Science
- Bachelor's degree, first cycle programme, Mathematics
- Master's degree, second cycle programme, Bioinformatics and Systems Biology
- Master's degree, second cycle programme, Computer Science
- Master's degree, second cycle programme, Mathematics
Additional information (registration calendar, class instructors, location and schedule of classes) might be available in the USOSweb system: