Linguistic seminar: An introduction to corpus linguistics 3304-1DZXW-KJ-11
Corpus linguistics is the study of language based on large collections of authentic language use, selected according to specific criteria. These collections (corpora) can be explored with search engines that use CQL (Corpus Query Language). The data obtained in this way make it possible to describe various linguistic phenomena more objectively than introspection or questionnaire research allows, and to notice new, previously unobserved regularities in the language. The seminar aims to familiarize participants with the theoretical assumptions of corpus linguistics and with examples of how French and Polish corpora can be used in research on vocabulary, syntax, phraseology, etc. In addition to reading articles devoted to selected issues, the seminar includes practical classes in which participants acquire and practice the ability to search for data in various corpora, plan and conduct corpus research, and analyze the results obtained.
Course coordinators
Learning outcomes
Knowledge: the student knows the theoretical assumptions of corpus linguistics, the typology of corpora, the features of a good corpus, the characteristics of selected French and Polish corpora, and the possibilities of their use in various types of linguistic research and in language teaching (glottodidactics).
Skills: the student is able to use corpus search engines, formulate a query in CQL, download and process corpus data, and interpret the results of the research.
Social skills: the student is able to cooperate in a group, share knowledge, and organize the work of a team.
Assessment criteria
The seminar ends with a grade. The following elements will be taken into account in the grading process:
- active attendance (absences may not exceed 50% of classes, including no more than 2 unexcused absences);
- homework (exercises and individual corpus research).
The form and criteria for passing the course may change depending on the current epidemiological situation. Equivalent conditions for obtaining credit will be established in accordance with the guidelines in force at the University of Warsaw, in consultation with the class participants.
Bibliography
Baude O. (et al.) (2006), Corpus oraux, guide des bonnes pratiques, Orléans, CNRS Éditions/Presses Universitaires d'Orléans. URL:
Cavalla C., Hartwell, L. (2018) « L’enseignement et l’apprentissage de l’écrit académique à l’aide de corpus numériques », Lidil, 58. URL:
Cori, M., David, S. (2008), « Les corpus fondent-ils une nouvelle linguistique ? », Langages, 171(3), 111-129. URL:
Frérot, C., Karagouch L. (2016) « Outils d’aide à la traduction et formation de traducteurs : vers une adéquation des contenus pédagogiques avec la réalité technologique des traducteurs », ILCEA. URL:
Habert, B., Nazarenko, A., Salem, A. (1997), Les linguistiques de corpus, Paris, Armand Colin.
Pincemin, B., « Concordances et concordanciers : de l'art du bon KWAC », dans : XVIIe colloque d'Albi Langages et signification - Corpus en Lettres et Sciences sociales : des documents numériques à l'interprétation. URL:
Yan R., Tutin A. & Tran T.T.H. (2018), « Routines verbales pour le français langue étrangère : des corpus d’experts aux corpus d’apprenants », Lidil, 58. URL:
Tran T. T. H. & Falaise A. (2018), « Un dictionnaire basé sur corpus pour une aide à la rédaction universitaire », Lidil, 58. URL:
Williams G. (éd.) (2005), La linguistique de corpus, Rennes, Presses Universitaires de Rennes.
Zufferey S. (2020), Introduction à la linguistique de corpus, London, ISTE Editions.
Additional information
Additional information (registration calendar, instructors, location and schedule of classes) may be available in the USOSweb system: