Learner corpora, emerging constructions, and language teaching
This talk adopts a usage-based perspective on language acquisition to investigate how knowledge of verb-argument constructions (VACs) develops in second language learners across proficiency levels. I will first present findings from an analysis of L1 German and L1 Spanish learners’ use of English VACs, such as the ‘V about n’ construction (e.g., let’s talk about the weather) or the ‘V with n’ construction (e.g., he always agreed with her). I will then discuss what the findings mean for language teaching. The analysis is based on corpora of learner writing at different levels of proficiency, described in further detail below. I was interested in determining (1) how VACs develop in second language (L2) writing as proficiency increases and (2) how the use and emergence of VACs are affected by the learner’s first language.
The talk builds on previous work on learner knowledge of VACs carried out in a usage-based linguistics tradition (Gries & Wulff, 2005; Römer et al., 2014a, 2014b). This work has shown that advanced learners of English have constructional knowledge, that learners’ VAC knowledge differs in systematic ways from that of native speakers, and that learners’ verb-VAC associations differ across L1 groups. What previous studies have not been able to address, mostly due to the unavailability of pertinent data at lower proficiency levels, is how this constructional knowledge unfolds over time (though see Li, Eskildsen, & Cadierno, 2014). Likewise, only a few studies have systematically contrasted learners from different L1 backgrounds to investigate the role of transfer from the first language. The present talk takes steps toward closing both of these gaps.
To gather information on learner VAC use at different proficiency levels, I use subsets of the Education First-Cambridge Open Language Database (EFCAMDAT; Geertzen, Alexopoulou, & Korhonen, 2013), consisting of writing samples by learners of a range of L1s who were placed into 16 different proficiency levels. For this study, I retrieved sets of texts written by German and Spanish learners at Common European Framework of Reference (CEFR) levels A1 through C2. The resulting EFCAMDAT subsets (over 28,000 texts and 2.8 million words from L1 German learners, and over 40,000 texts and 3.2 million words from L1 Spanish learners) constitute a pseudo-longitudinal learner corpus that complements existing corpus resources. From these EFCAMDAT subsets, I exhaustively retrieved instances of 19 different VACs. In addition, to provide further evidence on advanced learner VAC knowledge, data on the same 19 VACs was retrieved from the German and Spanish subcomponents of the International Corpus of Learner English (ICLE) and the Louvain International Database of Spoken English Interlanguage (LINDSEI).