Causal Machine Learning for Spatial Data 2400-ZEWW899
The main aim of the course is to familiarise students with a broad variety of data science methods for spatial data. The course is divided into two parts. The first will be taught by a visiting scholar, Dr Kevin Credit, and the second by prof. Katarzyna Kopczewska and mgr Maria Kubara.
The course will be taught as an intensive workshop over two weeks, with daily meetings. Students are asked to bring their own laptops with R (version 3.3.0 or later) and RStudio Desktop installed so that they can take an active part in the live coding exercises during class.
---------
The first part of the course:
Instructor:
Dr Kevin Credit
Assistant Professor
National Centre for Geocomputation
Maynooth University
Email: kevin.credit@mu.ie
Web: Maynooth University profile
Course Description
Machine learning (ML) approaches are increasingly being used across the social sciences to answer questions related to transportation, urbanisation, housing, neighbourhood change, and economic development. While earlier iterations of these methods focused primarily on predictive outcomes, recent extensions are now being used to assess problems of causal inference, explicitly integrate spatial information, and provide insight into the explanatory relationships driving model results.
The purpose of this intensive short course is to provide a comprehensive introduction to causal ML and its application to spatial data. To do this, we will complete a series of practicals in R that analyse the causal impact of constructing a new elevated pedestrian and cycling path in Chicago (called the 606) on adjacent 1) construction activity, 2) new business creation, and 3) on-road CO₂ emissions. All of the code and data will be made available for open distribution and use.
This part of the course is organised around five 3-hour sessions, each of which contains three 45-minute modules (with breaks in between). Each session will begin with two lecture modules describing the conceptual foundations of the topic before moving into a practical module where students will work with the data on their own. The sessions are designed to build on one another progressively, so that by the end of the course students should have a strong theoretical and methodological foundation in causal inference and ML.
Some of the topics covered in the course include:
• Exploratory spatial data analysis and mapping
• Potential outcomes model
• Directed Acyclic Graphs (DAGs)
• Spatial causal models
• Heterogeneous treatment effects models (HTE)
• Causal forest model
• Spatial T-learner model (STL)
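As a rough illustration of the T-learner idea behind the last topics in this list, the sketch below fits one outcome model per treatment arm and takes the difference of their predictions as the conditional treatment effect. It is a plain, non-spatial T-learner written in Python on synthetic data. It is not the Spatial T-learner taught in the course and not the course's R code. The function names (`fit_line`, `t_learner`), the single covariate, and the simulated effect are all assumptions made for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def t_learner(xs, treated, ys):
    """Fit a separate outcome model for each arm and return a function
    that estimates the conditional treatment effect at a given x."""
    a1, b1 = fit_line([x for x, t in zip(xs, treated) if t],
                      [y for y, t in zip(ys, treated) if t])
    a0, b0 = fit_line([x for x, t in zip(xs, treated) if not t],
                      [y for y, t in zip(ys, treated) if not t])
    # Effect estimate = predicted treated outcome - predicted control outcome
    return lambda x: (a1 + b1 * x) - (a0 + b0 * x)


# Synthetic data (hypothetical): x is a single covariate on [0, 2] and
# treatment is assigned at random, so the two arms are comparable.
rng = random.Random(0)
xs = [rng.uniform(0, 2) for _ in range(400)]
treated = [rng.random() < 0.5 for _ in xs]
# Simulated true effect is 3 - 1.5 * x: it shrinks as x grows.
ys = [1.0 + 0.5 * x + (3.0 - 1.5 * x) * t + rng.gauss(0, 0.1)
      for x, t in zip(xs, treated)]

cate = t_learner(xs, treated, ys)
for x in (0.0, 1.0, 2.0):
    print(f"x = {x:.1f}: estimated effect = {cate(x):.2f}")
```

Because the two arms are fitted separately, the estimated effect is free to vary with the covariate, which is what makes this a heterogeneous treatment effects model. In practice the linear fits would usually be replaced by a flexible learner such as a random forest.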
Course Objectives
1. Develop an understanding of the fundamental principles of causal inference, basic model assumptions, and sources of bias.
2. Develop an understanding of more advanced applications of causal inference, including spatial causal models, heterogeneous treatment effects models, and causal ML models.
3. Learn how to think about - and diagram - causal pathways, and apply those diagrams to empirical models.
4. Employ basic and advanced causal methods in the context of analysing an empirical problem of spatial-causal inference using available data.
Organisation and Materials
• The course will consist of five consecutive 3-hour sessions, each of which contains three 45-minute modules (with breaks in between).
• The module will use email and Google Drive as the main communication mechanisms.
• All data and code for the practicals will be provided by the instructor. Students will need access to their own computer or laptop with R v.3.3.0+ and RStudio Desktop installed to complete the practicals.
• Readings will be provided by the instructor through links to Google Drive on the course schedule (https://docs.google.com/document/d/1wLTCeGVdUWithLmq0RTCWdofAXAyQmNHLmx1ifM8n1s/edit).
---------
The second part of the course:
Instructors:
dr hab. Katarzyna Kopczewska, prof. ucz.
mgr Maria Kubara
This part of the course will focus on applying machine learning techniques to spatial data. The topics include:
• Spatial machine learning – challenges and opportunities
• Spatial data clustering – techniques and applications
• Geographically Weighted Regression and Spatial Random Forest
• Spatial artificial neural networks
• Recurrent neural networks in a spatial setting
The list may be extended, depending on the initial skills and interests of the group.
The two parts of the course are complementary and together give students a broad overview of spatial machine learning techniques and their applications in R.
Learning outcomes
After this course the student:
- is familiar with the challenges of working with spatial data
- knows a range of machine learning techniques and can apply them to spatial data in R
- knows the practices of causal inference
K_U02, K_U05
Assessment criteria
The final grade will be based on the exam result (one assessment for both parts of the course).
Additional information
Additional information (registration calendar, instructors, class locations and schedules) may be available in the USOSweb system.