Introduction to Corpus Analysis 3301-JF2665-1ST
Language corpora are increasingly used in linguistic research. They provide access to vast resources of authentic, naturally occurring written and spoken language, and thus facilitate a more accurate and reliable analysis of language at its many levels: phonetic and phonological, morphological, syntactic, lexical, phraseological, semantic, pragmatic and discourse. Corpora have given rise to new methods of linguistic data analysis that emphasise the concepts of frequency, patterning and variation. Corpus linguistics also proposes a new approach to language description based on probability rather than rules.
This course is an introduction to the rapidly growing field of corpus linguistics. Students will be introduced to the basic concepts of the field and will learn about the major corpus projects in the world, especially those related to the English language. They will familiarise themselves with different methods of corpus analysis (such as frequency lists and concordance lines) and with the statistical tests used to interpret the results of such analyses. They will also get acquainted with a range of electronic tools used to process corpus data. An overview of the main applications of corpora in language studies will be provided. Students will also learn how to build their own corpus and how to use it in their own linguistic research.
Course topics:
1. Preliminaries
a. basic concepts related to corpora and corpus linguistics
b. types of language corpora
c. the role of language corpora in linguistic research
d. the most prominent corpus projects in English and Polish
e. the most popular corpus tools
2. Corpus analyses
a. frequency lists
b. concordances
c. extraction of collocations
d. lexical bundles (n-grams)
e. keywords
f. statistical tests
3. Compiling custom corpora
4. Examples of corpus-informed research in (applied) linguistics
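To give a concrete idea of the analyses listed under topic 2, the following minimal Python sketch builds a frequency list, a set of concordance (KWIC) lines and a bigram count for a two-sentence toy text. It is an illustration only and not part of the course materials: the function names, the simple regular-expression tokeniser and the sample text are all assumptions made for this example, and dedicated corpus tools handle tokenisation and statistics far more carefully.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_list(tokens, top=None):
    """Return (word, count) pairs sorted from most to least frequent."""
    return Counter(tokens).most_common(top)


def concordance(tokens, node, window=3):
    """Return (left context, node, right context) for every occurrence of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


def ngrams(tokens, n=2):
    """Count contiguous n-token sequences (candidate lexical bundles)."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


# Hypothetical toy "corpus" used only for this illustration.
sample = "The cat sat on the mat. The dog sat on the rug."
tokens = tokenize(sample)

print(frequency_list(tokens, top=3))
for left, node, right in concordance(tokens, "sat", window=2):
    print(f"{left:>10} [{node}] {right}")
print(ngrams(tokens, 2).most_common(2))
```

Real corpora are much larger, so raw counts like these are normally normalised (for example per million words) and evaluated with statistical tests before any conclusions are drawn.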
Type of course
Mode
Prerequisites (description)
Course coordinators
Learning outcomes
Knowledge:
The student will know and understand
K_W01 the specifics of corpus analysis within the field of humanities.
K_W02 the trends in the development of corpus-based research within English linguistics.
Skills:
The student will be able to
K_U01 use the terminology and conceptual apparatus of corpus linguistics.
K_U03 use the acquired knowledge to describe and solve a problem and carry out a scientific project on a topic within the discipline of linguistics.
K_U04 analyse and synthesise linguistic concepts and phenomena in a social and historical context.
K_U05 recognise differences between alternative methodological approaches used in linguistics, with particular reference to corpus methodology.
K_U08 participate in project work, interact with others as part of team work and lead the work of a team.
K_U09 present acquired knowledge in a coherent, precise and linguistically correct manner.
Social competences:
The student will be ready to
K_K02 engage in lifelong learning, personal and professional development using the knowledge and skills acquired during the course.
K_K03 take responsibility for their own work and respect the work of others, complying with professional ethics, upholding the ethos of the profession and observing ethical principles and standards in science as they apply to corpus linguistics.
K_K04 critically assess their own knowledge and skills in the field of corpus linguistics.
Assessment criteria
1. class attendance (two absences permitted)
2. participation in class discussions
3. completion of all homework assignments
4. completion of a short corpus project, carried out in groups of two or three, on a topic chosen by the participants and agreed with the lecturer. The project is the main component of the final assessment (80%).
Bibliography
Textbooks:
• Biber, D., & Reppen, R. (Eds.). (2015). The Cambridge Handbook of English Corpus Linguistics. Cambridge University Press. https://doi.org/10.1017/CBO9781139764377
• McEnery, T., & Brezina, V. (2022). Fundamental Principles of Corpus Linguistics. Cambridge University Press. https://doi.org/10.1017/9781107110625
• McEnery, T., & Hardie, A. (2011). Corpus Linguistics: Method, Theory and Practice. Cambridge University Press.
• O’Keeffe, A., & McCarthy, M. (Eds.). (2022). The Routledge Handbook of Corpus Linguistics (2nd ed.). Routledge.
• Paquot, M., & Gries, S. T. (Eds.). (2021). A Practical Handbook of Corpus Linguistics. Springer.
Additional reading:
Selected papers which serve as examples of corpus-based studies in different areas of linguistics
Additional information
Additional information (registration calendar, class instructors, location and schedule of classes) may be available in the USOSweb system.