Introduction to Computational Linguistics 3301-JF2705
This course presents an overview of the field of computational linguistics. It covers key concepts and techniques used in a variety of natural language processing applications and presents computational tools employed in linguistic research, lexicography, translation, and language pedagogy. We will also discuss applications of linguistic knowledge in designing intelligent technologies.
Broad topics include:
- Research Methods in Experimental Linguistics (Experiment Design);
- Gamification in second language learning;
- Formal grammars and their role in natural language processing;
- Corpus Linguistics;
- Linguistic Perspectives on Artificial Intelligence (chatbots and AIML).
Learning outcomes
The prospective graduate will have acquired insight into the various computational research methods used in linguistics. Furthermore, they will have developed an analytical skillset which is indispensable in the interdisciplinary study of natural language.
Detailed study outcomes:
K_W08 - the student acquires advanced knowledge about contemporary linguistics; in particular, the student differentiates between various formal theories of language description and is aware of their advantages and limitations;
K_W09 - the student is able to differentiate between deductive, inductive, and abductive reasoning;
K_W09 - the student gains increased insight into research design in linguistics and philology; in particular, the student learns about methods, techniques, and tools that will enable them to conduct innovative research into natural language;
K_U01 - the student is able to apply the methodology that they have learnt; in particular, the student is capable of designing and performing their own experiments relating to the study of natural language;
K_U01 - the student is able to operationalize the phenomena analyzed, propose a null and an alternative hypothesis, and test them with the rigor of the scientific method;
K_U01 - the student can apply deductive, inductive, and abductive reasoning in their research;
K_U04 - the student is capable of producing a complete concept of a linguistic research project;
K_U09 - the student is capable of assessing the viability of various theoretical constructs in their own linguistic and philological research as well as for practical purposes;
K_U10 - the student is capable of critically reflecting on prior research and defining their own future research objectives based on past studies in the field; in particular, the student is capable of operationalizing a research problem and proposing an appropriate method of analyzing it;
K_U14 - the student is capable of taking advantage of modern technology (in particular information technology) in their pursuit of further knowledge in the field of linguistics and for the sake of professional development;
K_W20 - the student is aware of, and adheres to, the highest ethical standards in science, values integrity above self-interest, and aspires to academic excellence for the betterment of their own self and society at large;
K_U01 - the student is proficient in advanced, specialist linguistic terminology in English; this level of proficiency is expected to be no lower than C1, as defined by the CEFR, with the student making an active effort to attain full proficiency (C2) whenever possible.
Assessment criteria
The final grade has two components: course participation and the final test.
Course participation accounts for 70% of the final grade and includes attendance and assignment credits. The final test accounts for the remaining 30%.
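As an illustration only, the 70/30 weighting can be sketched as a short calculation. The function name `final_score` and the 0-100 scale for both components are assumptions made for this example; the syllabus does not specify how component scores are scaled or how the weighted total maps onto a grade.

```python
# Weights taken from the assessment criteria above.
PARTICIPATION_WEIGHT = 0.70
FINAL_TEST_WEIGHT = 0.30


def final_score(participation: float, final_test: float) -> float:
    """Combine two component scores into a weighted total.

    Both scores are assumed (hypothetically) to be on a 0-100 scale.
    """
    for score in (participation, final_test):
        if not 0 <= score <= 100:
            raise ValueError("scores are assumed to lie between 0 and 100")
    return PARTICIPATION_WEIGHT * participation + FINAL_TEST_WEIGHT * final_test


# Example: 90 for participation and 60 on the final test.
print(round(final_score(90, 60), 1))  # 81.0
```

Under these assumptions, a strong participation record carries most of the weight: even a score of 0 on the final test would leave a student with 90 in participation at a weighted total of about 63.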
Practical placement
None
Bibliography
NOTE: All necessary reading material is provided by the teacher on a dedicated Moodle server in electronic form. Enclosed below is a comprehensive list of the primary source materials used (i.e. this is NOT in itself mandatory reading, although it is recommended).
Agresti, A. & Finlay, B. (2008). Statistical Methods for the Social Sciences.
Babbie, E. (2012). The Practice of Social Research, 13th Edition. Wadsworth Publishing.
Biber, D. (2012). Corpus-Based and Corpus-driven Analyses of Language Variation and Use. Oxford Handbooks Online. Retrieved 20 Nov. 2016, from http://www.oxfordhandbooks.com/view/10.1093/oxfordhb/9780199544004.001.0001/oxfordhb-9780199544004-e-008.
Boulton, A., Carter-Thomas, S. & Rowley-Jolivet, E. (eds.) (2012). Corpus-informed Research and Learning in ESP. Amsterdam: John Benjamins.
Broege, N. & Brick, A. (2015). Introduction to Social Science Methods: An Overview of Quantitative and Qualitative Methods
Burke, B. (2016). Gamify: How gamification motivates people to do extraordinary things. Routledge.
Carnie, A. (2002). Syntax. A generative introduction. Malden: Blackwell Publishers.
Chomsky, N. & Schützenberger, M.P. (1963). The algebraic theory of context-free languages. In Braffort, P. & Hirschberg, D. (eds.), Computer Programming and Formal Languages. Amsterdam: North Holland. pp. 118–161.
Chomsky, N. (1957). Syntactic Structures. The Hague/Paris: Mouton.
Chomsky, N. (1959). On certain formal properties of grammars. Information and Control 2 (2): 137–167. doi:10.1016/S0019-9958(59)90362-6.
Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. (1973). Conditions on Transformations. in Anderson and Kiparsky, A Festschrift for Morris Halle, New York: Holt, Rinehart & Winston, pp. 232–286.
Chomsky, N. (1981). Lectures on Government and Binding. Mouton de Gruyter.
Chomsky, N. (1986). Knowledge of Language: Its Nature, Origin and Use. New York: Praeger.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: The MIT Press.
Chou, Y. K. (2019). Actionable gamification: Beyond points, badges, and leaderboards. Packt Publishing Ltd.
Field, A. P. (2013). Discovering statistics using IBM SPSS Statistics: and sex and drugs and rock 'n' roll (fourth edition). London: Sage publications.
Field, A. P., Miles, J. N. V., & Field, Z. C. (2012). Discovering statistics using R: and sex and drugs and rock 'n' roll. London: Sage publications.
Gałkowski, B. (2010). Exploring Formulaicity. Unpublished Course Materials.
Jurafsky, D.; Martin, J.H. (2009). Speech and Language Processing: An Introduction to Natural Language Processing, Speech Recognition, and Computational Linguistics. 2nd edition. Prentice-Hall.
Kapp, K. M. (2012). The gamification of learning and instruction: game-based methods and strategies for training and education. John Wiley & Sons.
Larson-Hall, J. (2015). A Guide to Doing Statistics in Second Language Research Using R. Routledge.
Lu, X. (2014). Computational Methods for Corpus Annotation and Analysis. Dordrecht: Springer. [Reviewed in Language, International Journal of Corpus Linguistics, and Linguist List]
Manning, C.D.; Schütze, H. (1999). Foundations of Statistical Natural Language Processing. The MIT Press: Cambridge, Massachusetts.
Moravcsik, E. (2006). An introduction to syntactic theory. London: Continuum.
National Research Council (2001). Educating Children with Autism. Committee on Educational Interventions for Children with Autism. Catherine Lord and James P. McGee, eds. Division of Behavioral and Social Sciences and Education. Washington, D.C.: National Academy Press.
Newmeyer, F. (1986) Linguistic Theory in America (2nd edition). Academic Press.
Opacki, M. (2017). Reconsidering Early Bilingualism. A Corpus-Based Study of Polish Migrant Children in the United Kingdom. Frankfurt am Main: Peter Lang.
Pawley, A. & Syder, F. (1983). Two puzzles for linguistic theory: Nativelike selection and nativelike fluency. In J.C. Richards and R.W. Schmidt (eds.), Language and Communication, pp. 191–227. London: Longman.
Payne, T. (2006). Exploring language structure. A student's guide. Cambridge: CUP.
Peirce, C. S. (1931–1958). Collected Papers of Charles Sanders Peirce. Vols. 1–6, Hartshorne, C. & Weiss, P. (eds.), 1931–1935; vols. 7–8, Burks, A. W. (ed.), 1958. Cambridge, MA: Harvard University Press.
Pollard, C. & Sag, I.A. (1994). Head-driven Phrase Structure Grammar. Chicago University Press / CSLI Publications, Chicago, IL.
Pollard, C. (1996). The nature of constraint-based grammar. Paper delivered at the Pacific Asia Conference on Language, Information, and Computation, Kyung Hee University, Seoul, Korea, December 20, 1996. Available from: ftp://julius.ling.ohio-state.edu/pub/pollard/anthology/paclic.txt.
Radford, A. (1989). Transformational grammar. A first course. Cambridge: CUP.
Sage Research Methods. URL[11.10.2016]: http://srmo.sagepub.com/
Smith, D. (2003). Five guidelines for research ethics. URL[11.10.2017]: http://www.apa.org/monitor/jan03/principles.aspx
Steinkuehler, C., Squire, K., & Barab, S. (Eds.). (2012). Games, learning, and society: Learning and meaning in the digital age. Cambridge University Press.
Tognini-Bonelli, E. (2001). Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.
Wray, A. (1998). Protolanguage as a holistic system for social interaction. Language & Communication 18:47-67.
Wray, A. (2000). Holistic utterances in protolanguage: the link from primates to humans. In Knight, C., Studdert-Kennedy, M. & Hurford, J. (eds.), The evolutionary emergence of language. Cambridge: Cambridge University Press.
Wray, A. (2002). Formulaic language and the lexicon. Cambridge UK: Cambridge University Press.
Yule, G. (2001). The Study of Language [Third Edition]. Cambridge: Cambridge University Press.