ABSTRACT
Combining methods from earlier content validity studies and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than either set of methods used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the subsequent content alignment literature. We then demonstrate that setting minimum thresholds for both types of indices in a sequential process can affect conclusions about test content representativeness, which are integral to validity evaluation. We illustrate this outcome using data on matches between test items and target domain objectives, provided by subject-matter expert panelists who reviewed sixteen US state achievement tests. Although the illustration uses educational achievement test data, the process has broader applicability to the development of survey questionnaires, workplace tests, and other instruments.
Acknowledgments
We appreciate the thoughtful responses of eight state education agency assessment leaders who replied to data requests by the first author. We gratefully acknowledge productive criticisms and insights from the editor and three anonymous reviewers of this work. After reading an earlier version, Norman Webb also raised several questions that helped us sharpen our ideas. The ideas expressed herein are our own.
Disclosure statement
The first author is an Associate Professor at Purdue University and studies technical and practical issues in educational measurement. The second author is an educator and educational researcher employed by the Wisconsin Center for Education Products and Services, a nonprofit organization affiliated with the University of Wisconsin–Madison. The second author conducts content alignment analyses for test users and other entities.
Data availability statement
The study dataset is available from the openICPSR repository.