Introduction to topological data analysis 1000-1M24TDA
In this lecture, we will discuss the following topics:
1. Broad overview of data analysis; the difference between qualitative/descriptive and statistical methods. Focus on clustering methods, clustering coefficients and similar concepts. Metrics and similarity measures for data sets. Geometric features in data, role of topology in data analysis. High-dimensional data, limitations of metrics for high-dimensional data (curse of dimensionality). Dimension reduction and variable selection methods.
2. Overview of topology: Mathematical and computational preliminaries. Use cases for topology when precise measurements of distances are not available. Topological notions of equivalence. Combinatorial topology: simplicial/cubical/(regular) CW complexes. Nerve complexes and the nerve lemma.
3. Graphs from data and descriptive TDA: Reeb graphs, merge trees, cover complexes, and Mapper-type algorithms, from the point of view of plots of relations (A, f(A)), where A is a subsample of a high-dimensional set.
4. Standard mapper and Ball mapper, cluster graphs.
5. Complexes from data/Approximating shapes: How to obtain complexes from trees, graphs (both abstract and embedded), and point clouds. Examples: filtered simplicial complexes (Vietoris-Rips, Čech, Alpha). Homotopy equivalence to sublevel sets of single or multivariable functions. Introduction of efficient data structures to store those complexes (simplex tree).
6. Chains and cycles as generalizations of paths and cycles in graphs, simplicial persistent homology (in particular with Z2 coefficients), the reduction algorithm. Barcodes.
7. Finite representation of persistent homology: Decomposition theorem. Persistence diagrams, distances between them. Discussing further requirements for statistics: stability and vectorization.
8. A stability theorem for persistent homology.
9. Vectorizations of persistence: persistence images, persistence landscapes, Betti curves, Euler characteristic curves.
10. Topological Goodness-of-Fit tests: example of an application to time series analysis
11. Motivation for and limits to multi-parameter persistence, Euler characteristic curves and profiles.
12. Optimization of computations: Introduction to Discrete Morse Theory (DMT), connection of DMT and filtrations / persistent homology.
13. Iterated Morse complexes as a way to compute (persistent) homology with field coefficients using Morse theory.
14. TDA and machine learning
15. Overview of applications: Medical applications (Brain function, breast cancer, diabetes, lung structure in COPD), Industrial applications (classification of materials, applications to economics and political sciences, market prediction).
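As an illustration of topic 6, the following is a minimal sketch of the standard column reduction over Z2 applied to a small filtered simplicial complex. The function names `reduce_boundary_matrix` and `barcode`, the input format, and the example filtration (a triangle built up one simplex at a time) are assumptions made for this sketch, not material taken from the course.

```python
import math


def reduce_boundary_matrix(columns):
    """Column reduction over Z2; columns[j] holds the row indices of the nonzero entries."""
    reduced = [set(col) for col in columns]
    pivot_owner = {}  # pivot row (largest index in a column) -> column holding it
    for j, col in enumerate(reduced):
        # Add earlier columns while the pivot of this column is already taken.
        while col and max(col) in pivot_owner:
            col ^= reduced[pivot_owner[max(col)]]  # symmetric difference = addition mod 2
        if col:
            pivot_owner[max(col)] = j
    return pivot_owner


def barcode(simplices):
    """Return sorted (dimension, birth, death) bars.

    simplices: list of (filtration value, vertex tuple), ordered so that
    every face appears before the simplices containing it (assumed input format).
    """
    index = {frozenset(s): i for i, (_, s) in enumerate(simplices)}
    columns = []
    for _, s in simplices:
        if len(s) == 1:
            columns.append(set())  # vertices have empty boundary
        else:
            columns.append({index[frozenset(s) - {v}] for v in s})
    pivot_owner = reduce_boundary_matrix(columns)

    bars, paired = [], set()
    for birth, death in pivot_owner.items():
        paired.update((birth, death))
        b, d = simplices[birth][0], simplices[death][0]
        if b < d:  # skip zero-length bars
            bars.append((len(simplices[birth][1]) - 1, b, d))
    # Simplices that are neither a pivot row nor a pivot column give infinite bars.
    for i, (value, s) in enumerate(simplices):
        if i not in paired:
            bars.append((len(s) - 1, value, math.inf))
    return sorted(bars)


# Hypothetical example: a triangle filled in at filtration value 4.0.
filtration = [
    (0.0, ("a",)), (0.0, ("b",)), (0.0, ("c",)),
    (1.0, ("a", "b")), (2.0, ("b", "c")), (3.0, ("a", "c")),
    (4.0, ("a", "b", "c")),
]
print(barcode(filtration))
```

In this example the barcode contains one infinite bar in dimension 0 (the single connected component) and one bar in dimension 1 for the loop that appears when the third edge is added at 3.0 and disappears when the triangle is filled in at 4.0.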
Course coordinators
Prerequisites (list of courses)
Learning outcomes
Basic understanding of topological data analysis, in particular the ability to identify appropriate use cases of TDA in real-world applications and the ability to use basic TDA tools in practice in Python. In addition, the ability to read and understand basic scientific articles about TDA.
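As an indication of what using basic TDA tools in Python can look like, here is a minimal standard-library sketch that computes a Betti-0 curve of a point cloud: the number of connected components of the graph joining points at distance at most r, for a list of radii. The function name `betti0_curve`, the convention that an edge appears when the distance is at most r, and the sample points are assumptions made for this sketch. In practice a library such as Gudhi would typically be used instead.

```python
import itertools
import math


def betti0_curve(points, radii):
    """Count connected components of the r-neighborhood graph for each r in sorted(radii)."""
    # All pairwise distances, sorted so that edges can be added in increasing order.
    edges = sorted(
        (math.dist(p, q), i, j)
        for (i, p), (j, q) in itertools.combinations(enumerate(points), 2)
    )
    parent = list(range(len(points)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    curve, components, k = [], len(points), 0
    for r in sorted(radii):
        # Add every edge of length at most r that has not been processed yet.
        while k < len(edges) and edges[k][0] <= r:
            _, i, j = edges[k]
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_i] = root_j
                components -= 1
            k += 1
        curve.append(components)
    return curve


# Hypothetical sample: two well-separated groups of points in the plane.
points = [(0, 0), (1, 0), (0, 1), (10, 0), (11, 0)]
print(betti0_curve(points, [0.5, 1.0, 5.0, 10.0]))
```

For these points the curve drops from five components to two once the radius reaches 1.0, and to a single component once the radius exceeds the gap of 9 between the two groups.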
Assessment criteria
Programming or theoretical semester project (50%),
Oral exam (50%).
To pass the class, both components must receive at least a passing grade.
Bonus points towards the semester project may be earned via homework.
Literature
Herbert Edelsbrunner and John Harer, Computational Topology: An Introduction, AMS 2011.
Paweł Dłotko, Applied and computational topology Tutorial,
https://arxiv.org/abs/1807.08607
Tomasz Kaczynski, Konstantin Mischaikow and Marian Mrozek, Computational Homology, Springer 2004.
Gunnar Carlsson and Mikael Vejdemo-Johansson, Topological Data Analysis with Applications, Cambridge University Press, 2022.
Gudhi library: gudhi.inria.fr
More information
For more information about the level of this course, the year of study (and/or semester) in which it is taught, and the type and number of class hours, see the study plans of the relevant programmes. This course is related to the following programmes: