A Model-Based Approach to the Disentanglement and Differential Treatment of Engaged and Disengaged Item Omissions

