Article

An Empirical Evaluation of Lexical Diversity Indices in L2 Korean Writing Assessment

REFERENCES

  • Alexopoulou, T., Michel, M., Murakami, A., & Meurers, D. (2017). Task effects on linguistic complexity and accuracy: A large‐scale learner corpus analysis employing natural language processing techniques. Language Learning, 67(S1), 180–208. https://doi.org/10.1111/lang.12232
  • Bai, D.-Y. (2012). Hankwuke haksupcauy ssukiey nathanan ehwi tayangto mich ehwi milto yenkwu [A study on the lexical variation and lexical density shown in writing of KFL learners]. Journal of Language Sciences, 19(1), 99–117.
  • Bauer, L., & Nation, P. (1993). Word families. International Journal of Lexicography, 6(4), 253–279. https://doi.org/10.1093/ijl/6.4.253
  • Bulté, B., & Roothooft, H. (2020). Investigating the interrelationship between rated L2 proficiency and linguistic complexity in L2 speech. System, 91, 102246. https://doi.org/10.1016/j.system.2020.102246
  • Carroll, J. B. (1964). Language and thought. Reading Improvement, 2(1), 80.
  • Castañeda-Jiménez, G., & Jarvis, S. (2014). Exploring lexical diversity in second language Spanish. In K. L. Geeslin (Ed.), Handbook of Spanish second language acquisition (pp. 498–513). Wiley. https://doi.org/10.1002/9781118584347.ch28
  • Chapelle, C. A. (1998). Construct definition and validity inquiry in SLA research. In L. F. Bachman & A. D. Cohen (Eds.), Interfaces between second language acquisition and language testing research (pp. 32–70). Cambridge University Press.
  • Chapelle, C. A., & Douglas, D. (2006). Assessing language through computer technology. Cambridge University Press.
  • Chapelle, C. A., Enright, M. K., & Jamieson, J. M. (Eds.). (2011). Building a validity argument for the Test of English as a Foreign Language. Routledge.
  • Chung, T., & Gildea, D. (2009). Unsupervised tokenization for machine translation. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (pp. 718–726). https://doi.org/10.3115/1699571.1699606
  • Cohen, J. (2013). Statistical power analysis for the behavioral sciences. Routledge. https://doi.org/10.4324/9780203771587
  • Covington, M. A., & McFall, J. D. (2010). Cutting the Gordian Knot: The moving-average type–token ratio (MATTR). Journal of Quantitative Linguistics, 17(2), 94–100. https://doi.org/10.1080/09296171003643098
  • Crossley, S. A., Salsbury, T., McNamara, D. S., & Jarvis, S. (2011). Predicting lexical proficiency in language learner texts using computational indices. Language Testing, 28(4), 561–580. https://doi.org/10.1177/0265532210378031
  • Engber, C. A. (1995). The relationship of lexical proficiency to the quality of ESL compositions. Journal of Second Language Writing, 4(2), 139–155. https://doi.org/10.1016/1060-3743(95)90004-7
  • Fergadiotis, G., Wright, H. H., & Green, S. B. (2015). Psychometric evaluation of lexical diversity indices: Assessing length effects. Journal of Speech, Language, & Hearing Research, 58(3), 840–852. https://doi.org/10.1044/2015_JSLHR-L-14-0280
  • Guiraud, P. (1960). Problèmes et méthodes de la statistique linguistique [Problems and methods of linguistic statistics]. Reidel.
  • Harrell, F. E., Jr. (2017). rms: Regression modeling strategies [R package]. https://CRAN.R-project.org/package=rms
  • Haspelmath, M., & Michaelis, S. M. (2017). Analytic and synthetic: Typological change in varieties of European languages. In I. Buchstaller & B. Siebenhaar (Eds.), Language variation – European perspectives VI: Selected papers from the 8th international conference on language variation in Europe (ICLaVE 8), Leipzig 2015 (pp. 3–22). John Benjamins. https://doi.org/10.1075/silv.19.01has
  • Haspelmath, M., & Sims, A. (2013). Understanding morphology. Routledge.
  • Herdan, G. (1960). Type-token mathematics: A textbook for mathematical linguistics. Mouton.
  • Hess, C. W., Sefton, K. M., & Landry, R. G. (1986). Sample size and type-token ratios for oral language of preschool children. Journal of Speech, Language, & Hearing Research, 29(1), 129–134. https://doi.org/10.1044/jshr.2901.129
  • Hur, W.-J., & Lee, M. (2019). Hankwuke haksupcauy swuktaltopyel ehwi sayong yangsang [Study on Korean language learner’s vocabulary usage]. Bilingual Research, 77, 215–239.
  • Jarvis, S. (2002). Short texts, best-fitting curves and new measures of lexical diversity. Language Testing, 19(1), 57–84. https://doi.org/10.1191/0265532202lt220oa
  • Jarvis, S. (2013). Defining and measuring lexical diversity. In S. Jarvis & M. Daller (Eds.), Vocabulary knowledge: Human ratings and automated measures (pp. 13–43). John Benjamins.
  • Jarvis, S. (2017). Grounding lexical diversity in human judgments. Language Testing, 34(4), 537–553. https://doi.org/10.1177/0265532217710632
  • Jarvis, S., & Hashimoto, B. J. (2021). How operationalizations of word types affect measures of lexical diversity. International Journal of Learner Corpus Research, 7(1), 163–194. https://doi.org/10.1075/ijlcr.20004.jar
  • Johnson, W. (1944). Studies in language behavior: I. A program of research. Psychological Monographs, 56(2), 1–15. https://doi.org/10.1037/h0093508
  • Kang, J.-H. (2018). Hankwuke haksupcauy ehwilyek paltalkwa ehwi sayong yangsang yenkwu -ssuki theyksuthuey nathanan ehwi chukcengul cwungsimulo- [A study on the vocabulary development and the lexicon use aspect of Korean learners – focusing on the lexicon development appearing in the writing test]. Bilingual Research, 71, 31–64.
  • Koizumi, R. (2012). Relationships between text length and lexical diversity measures: Can we use short texts of less than 100 tokens? Vocabulary Learning and Instruction, 1(1), 60–69. https://doi.org/10.7820/vli.v01.1.koizumi
  • Koizumi, R., & In’nami, Y. (2012). Effects of text length on lexical diversity measures: Using short texts with less than 200 tokens. System, 40(4), 554–564. https://doi.org/10.1016/j.system.2012.10.012
  • Kwon, H.-C., Kang, M.-Y., & Choi, S.-J. (2004). Stochastic Korean word-spacing with smoothing using Korean spelling checker. International Journal of Computer Processing of Languages, 17(4), 239–252. https://doi.org/10.1142/S0219427904001103
  • Kyle, K. (2019). Measuring lexical richness. In S. Webb (Ed.), The Routledge handbook of vocabulary studies (pp. 454–476). Routledge.
  • Kyle, K., Crossley, S. A., & Jarvis, S. (2021). Assessing the validity of lexical diversity indices using direct judgements. Language Assessment Quarterly, 18(2), 154–170. https://doi.org/10.1080/15434303.2020.1844205
  • Kyle, K., Sung, H., Eguchi, M., & Zenker, F. (2023). Evaluating evidence for the reliability and validity of lexical diversity indices in L2 oral task responses. Studies in Second Language Acquisition, 1–22. Advance online publication. https://doi.org/10.1017/S0272263123000402
  • Lee, S.-M. (2017). Hankwuke haksupcauy malhakiwa ssukiey nathanan ehwi sayonguy congtancek yenkwu [A longitudinal study of vocabulary usage presented in speaking and writing of Korean learners]. The Korean Language and Literature, 74, 183–214.
  • Leech, G. N., Rayson, P., & Wilson, A. (2001). Word frequencies in written and spoken English. Longman.
  • MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk. Vol. 2: The database. Lawrence Erlbaum.
  • Malvern, D., Richards, B., Chipere, N., & Durán, P. (2004). Lexical diversity and language development: Quantification and assessment. Palgrave Macmillan.
  • McCarthy, P. M., & Jarvis, S. (2007). vocd: A theoretical and empirical evaluation. Language Testing, 24(4), 459–488. https://doi.org/10.1177/0265532207080767
  • McCarthy, P. M., & Jarvis, S. (2010). MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment. Behavior Research Methods, 42(2), 381–392. https://doi.org/10.3758/BRM.42.2.381
  • Meara, P. (2005). Designing vocabulary tests for English, Spanish and other languages. In C. Butler, S. Christopher, M. Á. G. González, & S. M. Doval-Suárez (Eds.), The dynamics of language use (pp. 271–285). John Benjamins.
  • Hammond, M. (n.d.). pywin32 [Python package]. https://github.com/mhammond/pywin32
  • Nation, P. (1990). Teaching and learning vocabulary. Newbury House.
  • Park, L. (2018). KoNLPy Python Package Documentation. https://buildmedia.readthedocs.org/media/pdf/konlpy/v0.3.3/konlpy.pdf
  • Qi, P., Zhang, Y., Zhang, Y., Bolton, J., & Manning, C. D. (2020). Stanza: A Python natural language processing toolkit for many human languages. arXiv preprint arXiv:2003.07082. https://doi.org/10.18653/v1/2020.acl-demos.14
  • R Core Team. (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. https://www.R-project.org/
  • Ripley, B., Venables, B., Bates, D. M., Hornik, K., Gebhardt, A., & Firth, D. (2013). MASS [R package]. https://cran.r-project.org/web/packages/MASS/index.html
  • Schmitt, N., Jiang, X., & Grabe, W. (2011). The percentage of words known in a text and reading comprehension. The Modern Language Journal, 95(1), 26–43. https://doi.org/10.1111/j.1540-4781.2011.01146.x
  • Shin, G.-H., & Jung, B. K. (2021). Automatic analysis of passive constructions in Korean: Written production by Mandarin-speaking learners of Korean. International Journal of Learner Corpus Research, 7(1), 53–82. https://doi.org/10.1075/ijlcr.20002.shi
  • Sung, H., & Shin, G.-H. (2023). Towards L2-friendly pipelines for learner corpora: A case of written production by L2-Korean learners. In Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023) (pp. 72–82). https://doi.org/10.18653/v1/2023.bea-1.6
  • Thomson, G. H., & Thompson, J. R. (1915). Outlines of a method of the quantitative analysis of writing vocabularies. British Journal of Psychology, 8(1), 52–69. https://doi.org/10.1111/j.2044-8295.1915.tb00128.x
  • TOPIK. (n.d.). Retrieved July 31, 2022, from https://www.topik.go.kr/HMENU0/HMENU00018.do
  • Treffers-Daller, J. (2013). Measuring lexical diversity among L2 learners of French. In S. Jarvis & M. Daller (Eds.), Vocabulary knowledge: Human ratings and automated measures (pp. 79–104). John Benjamins. https://doi.org/10.1075/sibil.47.05ch3
  • Treffers-Daller, J., Parslow, P., & Williams, S. (2018). Back to basics: How measures of lexical diversity can help discriminate between CEFR levels. Applied Linguistics, 39(3), 302–327. https://doi.org/10.1093/applin/amw009
  • Tweedie, F. J., & Baayen, R. H. (1998). How variable may a constant be? Measures of lexical richness in perspective. Computers and the Humanities, 32(5), 323–352. https://doi.org/10.1023/A:1001749303137
  • Vidal, K., & Jarvis, S. (2020). Effects of English-medium instruction on Spanish students’ proficiency and lexical diversity in English. Language Teaching Research, 24(5), 568–587. https://doi.org/10.1177/1362168818817945
  • Won, Y. (2016). Common European framework of reference for language (CEFR) and Test of Proficiency in Korean (TOPIK). International Journal of Area Studies, 11(1), 39–58. https://doi.org/10.1515/ijas-2016-0003
  • Won, H., Lee, H., & Kang, S. (2020). Multi-prototype morpheme embedding for text classification. In The 9th International Conference on Smart Media and Applications (pp. 295–300). https://doi.org/10.1145/3426020.3426095
  • Zenker, F., & Kyle, K. (2021). Investigating minimum text lengths for lexical diversity indices. Assessing Writing, 47, 100505. https://doi.org/10.1016/j.asw.2020.100505
