Corpus Linguistics and the Study of Fiction in the Digital Age
Speaker: Professor Michaela Mahlberg
Michaela Mahlberg is Professor of Corpus Linguistics at the University of Birmingham, UK, where she is also the Director of the Centre for Corpus Research and the Director of Research and Knowledge Transfer for the College of Arts and Law. Michaela is the editor of the International Journal of Corpus Linguistics (John Benjamins) and, together with Wolfgang Teubert, she edits the book series Corpus and Discourse (Bloomsbury). One of her main areas of research is Dickens’s fiction and the socio-cultural context of the 19th century. Her publications include Corpus Stylistics and Dickens’s Fiction (Routledge, 2013), English General Nouns: a Corpus Theoretical Approach (John Benjamins, 2005) and Text, Discourse and Corpora: Theory and Analysis (Continuum, 2007, co-authored with Michael Hoey, Michael Stubbs and Wolfgang Teubert). Michaela was the Principal Investigator on the AHRC-funded project CLiC Dickens: Characterisation in the representation of speech and body language from a corpus linguistic perspective, which led to the development of the CLiC web app.
Corpus linguistic methods can be seen as early manifestations of digital humanities. However, corpus research has traditionally focused on non-literary texts. While the application of corpus linguistic methods to literary texts is increasingly referred to under the umbrella term of ‘corpus stylistics’ (cf. e.g. Semino and Short 2004), the study of fiction still requires new tools and methodologies that are tailored to the description of literary qualities. In this talk, I will illustrate key functionalities of the web application CLiC (http://clic.bham.ac.uk/), which has been specifically designed for the corpus linguistic study of narrative fiction. I will, for instance, discuss applications of the KWICgrouper to support the identification of patterns of body language presentation. The examples will be drawn from the CLiC corpora. The CLiC corpora comprise more than 130 books across four subcorpora: the corpus of Dickens’s Novels, the 19th Century Reference Corpus (19C), the Corpus of 19th Century Children’s Literature (ChiLit) and the Corpus of Additional Requested Texts (ArTs). For all CLiC texts, direct speech and specific places around speech have been marked up (Mahlberg et al. 2016) to enable searches across defined textual subsets. The findings that can be generated with CLiC open up possibilities for the comparison of patterns in literary and non-literary language. In this talk, I will not only illustrate the innovative potential of corpus methods for the study of literature but also make links to relevant current debates in the digital humanities.