Language in social communication: Introduction to machine learning for linguistic applications 3201-LST-OC-LSC4
Introduction to Machine Learning in Linguistics
● Week 1: Course overview; Introduction to ML and its relevance to linguistics.
Essential ML Concepts
● Week 2: Overview of corpora in linguistics.
● Week 3: Language parsing and annotation techniques.
● Week 4: Fundamentals of Recurrent Neural Networks (RNNs), Seq2Seq, and attention
mechanisms.
● Week 5: Word embeddings, Transformers, and language models; their role in
understanding language.
Building Linguistic Corpora and Data Generation
● Week 6: Data generation techniques for linguistic applications (Part 1): grammars, web
crawling, and manual annotation.
● Week 7: Data generation techniques for linguistic applications (Part 2): LLMs.
Machine Learning Pipeline
● Week 8: Steps to train ML models; Model evaluation metrics.
● Week 9: Case studies and best practices in ML model training.
Applications in Linguistics
● Week 10: Sentiment analysis in social media and reviews.
● Week 11: Speech recognition systems and challenges.
● Week 12: The process and challenges of machine translation.
Project and Revision
● Week 13: Student project proposals and initial presentations.
● Week 14: Project development workshop.
● Week 15: Final project presentations and course wrap-up.
Type of course
Mode
Course coordinators
Learning outcomes
● Understand the basics of machine learning and natural language processing (NLP).
● Learn to build and evaluate ML models for linguistic tasks.
● Apply ML to practical linguistic problems such as sentiment analysis and machine translation.
Assessment criteria
Assessment Methods:
● Weekly homework assignments (30%)
● Midterm project proposal (20%)
● Final project (40%)
● Participation and class activities (10%)
Additional information
Additional information (registration calendar, class instructors, location and schedule of classes) may be available in the USOSweb system: