Research Article

Computer-Based Listening Test with Full Video, Visual-Limited Video, and Audio: A Comparative Analysis Based on Difficulty, Discrimination Power, and Response Time



  • Batty, A. O. (2015). A comparison of video- and audio-mediated listening tests with many-facet Rasch modeling and differential distractor functioning. Language Testing, 32(1), 3–20. doi:10.1177/0265532214531254
  • Batty, A. O. (2021). An eye-tracking study of attention to visual cues in L2 listening tests. Language Testing, 38(4), 511–535. doi:10.1177/0265532220951504
  • Bejar, I., Douglas, D., Jamieson, J., Nissan, S., & Turner, J. (2000). TOEFL 2000 listening framework: A working paper (TOEFL Monograph Series, Report No. 19). Princeton, NJ: Educational Testing Service. https://www.ets.org/research/policy_research_reports/publications/report/2000/iciu
  • Bryant, W. (2017). Developing a strategy for using technology-enhanced items in large-scale standardized tests. Practical Assessment, Research & Evaluation, 22(1), 1–10.
  • College Board (2023). Digital SAT. https://satsuite.collegeboard.org/digital
  • Coniam, D. (2001). The use of audio or video comprehension as an assessment instrument in the certification of English language teachers: A case study. System, 29(1), 1–14. doi:10.1016/S0346-251X(00)00057-9
  • Educational Testing Service (2022). TOEFL iBT test. https://www.ets.org/toefl/test-takers/ibt/register/at-home-requirements.html
  • Ginther, A. (2002). Context and content visuals and performance on listening comprehension stimuli. Language Testing, 19(2), 133–167. doi:10.1191/0265532202lt225oa
  • Gruba, P. (1997). The role of video media in listening assessment. System, 25(3), 335–345. doi:10.1016/S0346-251X(97)00026-2
  • Haladyna, T. M., & Rodriguez, M. C. (2013). Developing and validating test items. New York, NY: Routledge.
  • Holzknecht, F., McCray, G., Eberharter, K., Kremmel, B., Zehentner, M., Spiby, R., & Dunlea, J. (2021). The effect of response order on candidate viewing behaviour and item difficulty in a multiple-choice listening test. Language Testing, 38(1), 41–61. doi:10.1177/0265532220917316
  • IMS Global Learning Consortium (2022). IMS question & test interoperability assessment test, section and item information model. https://www.imsglobal.org/question/qtiv2p1/imsqti_infov2p1.html
  • Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73. doi:10.1111/jedm.12000
  • Kang, T., Arvizu, M. N. G., Chaipuapae, P., & Lesnov, R. O. (2019). Reviews of academic English listening tests for non-native speakers. International Journal of Listening, 33(1), 1–38. doi:10.1080/10904018.2016.1185210
  • Kim, A. A., Tywoniw, R. L., & Chapman, M. (2022). Technology-enhanced items in grades 1–12 English language proficiency assessments. Language Assessment Quarterly, 19(4), 343–367. doi:10.1080/15434303.2022.2039659
  • Lesnov, R. O. (2017). Using videos in ESL listening achievement tests: Effects on difficulty. Eurasian Journal of Applied Linguistics, 3(1), 67–91. doi:10.32601/ejal.461034
  • Lesnov, R. O. (2022). Furthering the argument for visually inclusive L2 academic listening tests: The role of content-rich videos. Studies in Educational Evaluation, 72, 101087. doi:10.1016/j.stueduc.2021.101087
  • Li, Z., Banerjee, J., & Zumbo, B. D. (2017). Response time data as validity evidence: Has it lived up to its promise and, if not, what would it take to do so. In B. D. Zumbo & A. M. Hubley (Eds.), Understanding and investigating response processes in validation research (pp. 159–177). Springer. doi:10.1007/978-3-319-56129-5_9
  • Loevinger, J. (1957). Objective tests as instruments of psychological theory. Psychological Reports, 3(3), 635–694. doi:10.2466/pr0.1957.3.3.635
  • Ministry of Education, Culture, Sports, Science and Technology. (2018). Koutougakkou Gakusyu Shidou Yoryo Kaisetsu: Gaikokugohen Eigohen [The national curriculum in English for high schools]. https://www.mext.go.jp/content/1407073_09_1_2.pdf [in Japanese]
  • National Center for University Entrance Examinations. (2023). Eigo Listening Ni Tsuite [On English listening tests]. https://www.dnc.ac.jp/kyotsu/listening.html [in Japanese]
  • Ockey, G. J. (2007). Construct implications of including still image or video in computer-based listening tests. Language Testing, 24(4), 517–537. doi:10.1177/0265532207080771
  • Open Assessment Technologies. (2022). TAO testing. https://www.taotesting.com/
  • Parshall, C. G., Harmes, J. C., Davey, T., & Pashley, P. J. (2010). Innovative items for computerized testing. In W. J. van der Linden & C. A. W. Glas (Eds.), Elements of adaptive testing (pp. 215–230). Springer. doi:10.1007/978-0-387-85461-8_11
  • Pusey, K. (2020). Assessing L2 listening at a Japanese university: Effects of input type and response format. Language Education and Assessment, 3(1), 13–35. doi:10.29140/lea.v3n1.193
  • Qian, H., Woo, A., & Kim, D. (2017). Exploring the psychometric properties of innovative items in computerized adaptive testing. In H. Jiao & R. W. Lissitz (Eds.), Technology enhanced innovative assessment: Development, modeling and scoring from an interdisciplinary perspective (pp. 95–116). Charlotte, NC: Information Age Publishing.
  • R Core Team. (2022). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/
  • Robitzsch, A., Kiefer, T., & Wu, M. (2022). TAM: Test analysis modules. R package version 4.0-16. https://CRAN.R-project.org/package=TAM
  • Roy, R. K. (2001). Design of experiments using the Taguchi approach: 16 steps to product and process improvement. New York, NY: A Wiley Interscience Publication.
  • Shin, D. (1998). Using videotaped lectures for testing academic listening proficiency. International Journal of Listening, 12(1), 57–80. doi:10.1080/10904018.1998.10499019
  • Suvorov, R. (2009). Context visuals in L2 listening tests: The effects of photographs and video vs. audio-only format. In C. A. Chapelle, H. G. Jun, & I. Katz (Eds.), Developing and evaluating language learning materials (pp. 53–68). Ames, IA: Iowa State University.
  • Suvorov, R. (2015). The use of eye tracking in research on video-based second language (L2) listening assessment: A comparison of context videos and content videos. Language Testing, 32(4), 463–483. doi:10.1177/0265532214562099
  • Suvorov, R., & He, S. (2022). Visuals in the assessment and testing of second language listening: A methodological synthesis. International Journal of Listening, 36(2), 80–99. doi:10.1080/10904018.2021.1941028
  • Taguchi, G. (1987). System of experimental design: Engineering methods to optimize quality and minimize costs. White Plains, NY: UNIPUB/Kraus International Publications.
  • van der Linden, W. J. (2010). Linear models for optimal test design. New York, NY: Springer.
  • Wagner, E. (2007). Are they watching? Test-taker viewing behavior during an L2 video listening test. Language Learning & Technology, 11(1), 67–86. https://eric.ed.gov/?id=EJ805397
  • Wagner, E. (2010). Test-takers’ interaction with an L2 video listening test. System, 38(2), 280–291. doi:10.1016/j.system.2010.01.003
  • Wagner, E. (2013). An investigation of how the channel of input and access to test questions affect L2 listening test performance. Language Assessment Quarterly, 10(2), 178–195. doi:10.1080/15434303.2013.769552
  • Wagner, E., & Ockey, G. J. (2018). An overview of the use of audio-visual texts on L2 listening tests. In E. Wagner & G. J. Ockey (Eds.), Assessing L2 listening: Moving toward authenticity (pp. 130–144). Amsterdam, Netherlands: John Benjamins Publishing Company.
  • Wools, S., Molenaar, M., & Hopster-den Otter, D. (2019). The validity of technology enhanced assessments: Threats and opportunities. In B. P. Veldkamp & C. Sluijter (Eds.), Theoretical and practical advances in computer-based educational measurement (pp. 3–19). Springer. doi:10.1007/978-3-030-18480-3_1
  • Yan, D., von Davier, A. A., & Lewis, C. (2014). Computerized multistage testing: Theory and applications. Boca Raton, FL: CRC Press.
  • Zenisky, A. L., & Baldwin, P. (2006). Using item response time data in test development and validation: Research with beginning computer users. Paper presented at the annual meeting of the National Council on Measurement in Education, San Francisco, CA, April 8–10, 2006.