Research Article

The development and validation of a self-audit survey instrument that evaluates preservice teachers’ confidence to use technologies to support student learning

Received 26 Dec 2022, Accepted 13 Mar 2024, Published online: 23 Apr 2024

ABSTRACT

Internationally, university teacher educators are responding to the requirement that preservice teachers (PSTs) need to be confident, operational users of a broad range of technologies, and that they can apply their technological pedagogical and content knowledge (TPACK). TPACK is a framework representing the complex interactions of teachers’ technological content knowledge and technological pedagogical knowledge with their pedagogical content knowledge; the interaction of the three domains is believed to produce effective teaching with technology. The challenge is designing an instrument that can validly and reliably evaluate whether PSTs have this wide range of knowledge, and if not, what their learning needs are. This paper reports on the development and validation of a self-audit survey instrument that evaluated 296 PSTs’ confidence to use technologies to support student learning. Using Rasch modelling techniques, construct validity was assessed for participant and item fit for 100 survey items organized within 10 components representing the construct PST confidence to use technologies to support student learning. Rasch modelling indicated the need to remove 23 ill-fitting items, which resulted in the development of a 77-item survey with strong construct validity and reliability. Technologies educators and researchers are encouraged to use this validated survey as a PST self-audit instrument.

Introduction

The integration of effective technology practices in the classroom is a critical proficiency for teachers internationally (Martin et al. 2020, Crompton and Sykora 2021, Kayaalp et al. 2022, Thohir et al. 2022, Xianhan et al. 2022). Consequently, when preservice teachers (PSTs) graduate from an initial teacher education programme, it is essential that they are confident in their ability to meet the requirements and standards of their profession (AITSL 2018, Blannin et al. 2022). In the Australian context, the Australian Professional Standards for Teachers (Australian Institute for Teaching and School Leadership [AITSL] 2018, pp. 13–17) define these expectations in the graduate teacher standards. Standard 2.6 states that PSTs need to ‘use ICT in teaching to expand curriculum learning opportunities’. Standards 3.4 and 4.5 respectively require PSTs to ‘demonstrate knowledge of ICT resources to engage students’ and to ‘demonstrate understanding of the safe, responsible, and ethical use of ICT’. Although PSTs are expected to be informed, equipped and willing to integrate information, communication, digital and robotics technologies (ICDRT) into their teaching (Blannin et al. 2022), they often report feeling ill-prepared (Redmond and Lock 2019, Martin et al. 2020, Wilson et al. 2023).

Technological pedagogical and content knowledge (TPACK), theorized by Mishra and Koehler (2006), is a conceptual framework that identifies the knowledge that educators need to effectively teach with technology (Tondeur et al. 2017). The systematic literature review on TPACK by Voogt et al. (2013) revealed that ‘there were different understandings of TPACK and that teacher knowledge (TPACK) and their beliefs about pedagogy and technology determined whether or not a teacher might teach with technology’ (Finger et al. 2013, p. 106). A literature review by Fernández-Batanero et al. (2022) reported the need for teacher training programmes to include understanding of TPACK and the technologies used for teaching and learning purposes.

Evaluating PST confidence to use technologies for teaching and student learning purposes is difficult because of TPACK’s multifaceted nature (Willermark 2018, Valtonen et al. 2020); however, descriptive explanations are available for the multifaceted elements teachers are required to demonstrate (Koehler et al. 2013). As a result, the TPACK framework was chosen as a primary source for designing the self-audit survey items presented in this paper. Following the recommendation by Willermark (2018) to combine multiple sources when constructing survey items, additional key sources (see Table 2) were used in the development of the self-audit survey items (Martin et al. 2024). Martin et al. (2024) utilized each of these key sources to identify the breadth of technological, pedagogical and content knowledges required of teachers and to capture this knowledge base for the development of the self-audit survey items. The initial challenge was to identify these underlying items, create coherent statements for participants to respond to in a survey, and then establish that each item validly and reliably forms part of the latent variable. A latent variable is a construct that can only be inferred indirectly by measuring responses to more readily definable and observable underlying items.

This paper describes the process of developing and validating the self-audit survey instrument designed to evaluate PSTs’ confidence to use technologies to support student learning. For details regarding the application of the self-audit survey and the accompanying instructional design elements, we refer readers to Martin et al. (2024).

Materials and methods

Designing the survey items

The self-audit survey developed in this study was initially designed with 10 technologies components, each consisting of 10 survey items. The TPACK framework was chosen as one primary source to inform the design of the survey because the framework is used extensively within educational technology research designs (Voogt et al. 2013). Despite its widespread use, the TPACK framework has been criticized ‘for being vague and too extensive’ (Willermark 2018, p. 315). A review of empirical studies published from 2011 to 2016 (Willermark 2018) found that the main approach to researching teacher TPACK had been self-reporting that focused primarily on general TPACK knowledge. In contrast, the design of our self-audit survey, informed by the key sources in Table 2, identified specific technologies and pedagogies used in education to cover the knowledge base expected of teachers. As a result, and for classification purposes only, each survey item can be aligned to one of the TPACK knowledge components: content knowledge (CK), pedagogical knowledge (PK), technological knowledge (TK), and the three additional constructs that combine these knowledges: pedagogical content knowledge (PCK), technological pedagogical knowledge (TPK) and technological content knowledge (TCK). Some survey items were aligned with the integrated TPACK domains of TCK, TPK and PCK; see examples in Table 1.

Table 1. TPACK component definitions and survey component and item examples. Adapted from Chai et al. (2013).

The self-audit survey followed a four-stage development cycle which included a survey validation procedure using Rasch modelling techniques (Martin and Jamieson-Proctor 2020):

  1. item formation,

  2. content design,

  3. construct validation, and

  4. construct reliability analysis.

Rasch modelling with Winsteps software (Linacre n.d.a) was used in the validation and reliability analysis stages, and the results of these analyses form the Results and Discussion section of the paper. The rationale for selecting Rasch modelling is that it uses a rigorous set of processes which can improve the construction of measurement instruments, monitor their precision in measuring what they are intended to measure, compare responses to survey items, and from those responses create an item difficulty hierarchy. It is a highly effective tool for evaluating Likert-type, multi-item self-audit surveys due to its strong psychometric properties and diagnostic capabilities (Zile-Tamsen 2017).

Rasch modelling has been used extensively to validate self-audit scales that assess teachers’ pedagogical knowledge across various STEM disciplines, for example, science education (Puspitasari et al. 2020), technologies education (Tondeur et al. 2017) and mathematics education (Martin and Jamieson-Proctor 2020). In particular, several studies have shown Rasch modelling to be an effective tool for validating and fine-tuning self-audit scales that assess preservice teachers’ TPACK in technologies education (e.g. Jamieson-Proctor et al. 2013, Ramnarain et al. 2021, Sofyan et al. 2023). A difference between the survey scale we have developed and those in the studies cited above is that the latter contained items that assessed broad TPACK components. Our scale is the first tool to respond to recommendations from studies in this field (Mishra 2019, Brianza et al. 2022, Sofyan et al. 2023) that further research be conducted to develop and evaluate scales for specific contexts, containing items that relate to specific technology skills and tools within the TPACK components (Martin et al. 2024).

Rasch modelling is used to evaluate both survey and test rating scales, so the standard Rasch term ‘person’ is used to represent both participants in surveys and test-takers in tests. Similarly, the term ‘person ability’ is meaningful in test evaluation, but it reads awkwardly in a survey context. For this reason, in our self-audit survey evaluation we use the alternative terms participant ability to affirm an item (in place of person ability) and item affirmation difficulty (in place of item difficulty).

In Rasch terms, if a participant’s ability to affirm an item is greater than the affirmation difficulty of that item (e.g. difficult ICDRT items that are new or less frequently used), the participant is more likely to affirm the item, meaning they self-assess as a more confident user of technology.
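To make this relationship concrete, the following is a standard statement of the Andrich rating scale form of the Rasch model that underlies a Winsteps analysis of Likert-type data, written here in the paper’s terminology (a reference sketch, not reproduced from the original article):

```latex
\ln\!\left(\frac{P_{nik}}{P_{ni(k-1)}}\right) = B_n - D_i - F_k
```

where P_{nik} is the probability that participant n selects response category k for item i, B_n is the participant’s ability to affirm, D_i is the item’s affirmation difficulty, and F_k is the Andrich threshold between adjacent categories k−1 and k. When B_n exceeds D_i, the log-odds favour the higher response category.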

Stage 1: Item formation

Item formation requires an extensive review of the literature (Senocak 2009) so that researchers can have confidence that the developed instrument is constructed in a way that ensures it measures what it is intended to measure (validity), and that it does this consistently (reliability) (Cruickshank et al. 2015). A range of key national and international sources, comprising frameworks, curricula, policy documents and peer-reviewed reports, was reviewed (Table 2). For example, the example activities in the UNESCO ICT Competency Framework (UNESCO 2018) were used as a guide to ensure that the survey items reflected international developments in technology and pedagogy (Martin et al. 2024).

Table 2. Key national and international literature sources used to develop the survey.

Stage 2: Content design

Survey content design involved the selection of survey items that encompassed the range of technological, pedagogical and content knowledges and the specific technologies used in education. One hundred survey scale items were created from the key literature sources identified in Table 2 (see also Martin et al. 2024). The 100 items were divided into 10 components (Table 3), each originally containing 10 scale items (Appendix 3). The components were designed to cover the ICDRT used in education.

Table 3. Survey components.

The item statements covered the range of TPACK elements from the individual domains of PK, CK and TK to the combined knowledges (TCK, TPK, PCK) and technology skills, and these were aligned with specific technologies and pedagogies used in education. It was important to ensure that pedagogical aspects of TPACK were included in the scale items, such as ‘I can explain … ’ and ‘I can use student-centred pedagogy … ’. Survey response options utilized a Likert-type four-point scale:

  1. not confident at all,

  2. slightly confident,

  3. moderately confident, and

  4. very confident.

The choice of an even-numbered 4-point scale rather than a 5-point scale with a mid-point was deliberate. Some literature recommends not using a middle (point 3) category in a 5-point survey scale unless there is a real need to provide an option for neutral/unsure responses (Bradley et al. 2015). A midpoint can make an ordinal scale less ordinal and more categorical in construction, or the mid-category can become a null, throw-away option, potentially losing data or disordering category thresholds. Introducing a response such as ‘neutral’ or ‘not sure’ within the scale, placed between ‘slightly confident’ and ‘moderately confident’, disrupts the presumed ordered sequence of categories.

Stage 3: Construct validation

This stage involved verifying that the scale items measure the intended construct. Rasch modelling techniques were used to examine the fit of the survey instrument’s items and the participants’ responses with the 10 survey components. The premise of Rasch modelling is that the items in a survey are reflections of a single construct that has been made explicit by the researchers’ choice of items to represent that construct and the collective responses to the survey items by the participants (Bond et al. 2021). For survey scale rating, the Rasch model posits that the relationship between the participants’ level of ability to affirm an item, and how difficult that item is to affirm, provides the most accurate predictor of a trait.

An issue with using an ordinal confidence scale, in which the points on the scale do not have equal intervals, is that different survey items intended to collect confidence measures may elicit different degrees of confidence from respondents depending on the characteristics of the item. For example, a 4 (very confident) response to item 8 of a survey should not be assumed to represent the same level of confidence as a 4 (very confident) response to item 10. To illustrate this point in the current survey, it should not be assumed that the item ‘I can use keyboard shortcuts’ is as easy to affirm as very confident as the item ‘I can set up a data projector and external speakers to the computer for a presentation’. In this Rasch analysis we refer to this difference in the characteristics of items as item affirmation difficulty. Likewise, a younger PST may be more able to respond with confidence to items that involve social networking skills than a more mature-aged PST. In this Rasch analysis we refer to this as the participants’ ability to affirm items.

Rasch modelling provides the ability to observe whether participant behaviour is being inconsistently influenced by the level of survey item affirmation difficulty. Further, a strength of Rasch models is that they log-transform participants’ raw score data, such as ordinal (e.g. 4-point Likert) survey responses. The resulting logit scale is measured on a line from negative infinity to positive infinity, which avoids the ceiling and floor effects associated with raw scores and allows the interaction between items and participants to be observed and, where necessary, the instrument to be modified by, for example, removing or adding items or altering the wording of items.

Fit statistics are also provided and are useful in determining the unidimensionality of data and how well items contribute to the construct. Fit indicators use the patterns of [participant] responses to estimate how much misfit is evident, or ‘how likely that misfit is’ (Bond et al. 2021, p. 36). The expected fit value is 1. Infit and Outfit mean square values (MNSQ) from 0.6 to 1.4 are considered productive for rating scale measurement in low-stakes contexts (Wright and Linacre 1994, Bond et al. 2021). That is, survey items with MNSQ values falling between 0.6 and 1.4 are considered to fit the model for low-stakes rating scales, such as the survey scales used in this study, in contrast to tests used for high-stakes, gate-keeping purposes. The ZSTD for the Infit and Outfit is the standardization of the fit scores. Acceptable ZSTD values fall between −2.0 and +2.0.
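For readers who want to see how these indices are derived, the following is a minimal numpy sketch of the conventional infit and outfit mean-square calculations. It assumes that observed ratings, model-expected scores and model variances for each participant-by-item cell have already been obtained from a Rasch estimation run; Winsteps reports the resulting MNSQ values directly, so the sketch is purely illustrative and is not the analysis code used in this study.

```python
import numpy as np

def item_fit_mnsq(observed, expected, variance):
    """Return (infit_mnsq, outfit_mnsq) arrays, one value per item.

    observed, expected, variance: 2-D arrays of shape (participants, items)
    holding the observed rating, the Rasch-model expected score and the model
    variance of each response. All names here are illustrative.
    """
    residual_sq = (observed - expected) ** 2
    z_sq = residual_sq / variance                            # squared standardized residuals
    outfit = z_sq.mean(axis=0)                               # unweighted mean square (outlier-sensitive)
    infit = residual_sq.sum(axis=0) / variance.sum(axis=0)   # information-weighted mean square
    return infit, outfit
```

Items whose Infit or Outfit MNSQ falls outside the 0.6–1.4 band adopted in this study would then be flagged for closer inspection.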

Stage 4: Construct reliability analysis

This stage aims to assess the consistency or reliability of the scale in measuring the construct. Rasch analyses provide measures of item and participant reliability, which indicate the consistency of item affirmation difficulty and participant ability to affirm items across the scale. High reliability indicates a consistent measurement of the construct.

Survey validity and reliability evaluation procedure

The following sequence and types of statistical analyses were conducted using the Winsteps Version 5.2.1.0 Rasch modelling software (Linacre n.d.a) to evaluate the validity and reliability of the survey instrument:

  1. Fit analysis: This involved examining item and participant fit to identify items or individuals that did not conform well to the Rasch model. We first analysed each of the components separately, removed non-fitting items, and re-checked for item fit within each component. We then analysed the fit of the whole model.

  2. Dimensionality analysis: This involved assessment of the underlying structure of the scale to ensure the dimensions align with the intended construct. This was assessed with a Principal Components Analysis (PCA) to analyse the dimensionality of the whole model, then each of the components separately.

  3. Response category threshold analysis: After assessing fit and dimensionality, we evaluated the functioning of response categories across the whole instrument to check the response category thresholds were ordered and clearly separated.

Results and discussion

Fit analyses

After analysing each component’s items for fit using Rasch modelling, 18 items were removed from the 100-item set because their fit statistics showed Infit or Outfit MNSQ values less than 0.6 or greater than 1.4. Five additional items had item fit statistics between 0.6 and 1.4; however, they had ZSTD scores greater than +2.0 or less than −2.0, suggesting that these items may not contribute to the construct. A closer examination of these items revealed that the researchers had initially written the statements in a double-barrelled format. For example, item 7.7 from the Multimedia skills component read, ‘I can use an iPad and am able to download apps’. Consequently, some participants may have responded to one part of the statement and some to the other. As a result, these items were removed and, as suggested by Bond et al. (2021), may be rewritten and used in another survey. All data, materials, source code and results for the Winsteps Rasch analysis can be accessed in the OSF data repository:

https://osf.io/v5wru/?view_only=335c5e541cef4345b029bc0f9e5ad936.
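As a conceptual illustration of the selection rule just described (not the authors’ code), the short sketch below flags items whose Infit or Outfit MNSQ falls outside the 0.6–1.4 band used in this study. The item labels and fit values are invented for the example; in practice they would be read from a Winsteps item-fit export.

```python
# Hypothetical item-fit values keyed by item label: (infit_mnsq, outfit_mnsq).
item_fit = {
    "item_A": (0.72, 0.48),   # outfit below 0.6 -> misfit
    "item_B": (0.78, 0.80),   # within the band  -> retained
    "item_C": (1.62, 1.75),   # above 1.4        -> misfit
}

def flag_misfitting(fit_table, lower=0.6, upper=1.4):
    """Return labels of items whose Infit or Outfit MNSQ lies outside [lower, upper]."""
    return [
        label
        for label, (infit, outfit) in fit_table.items()
        if not (lower <= infit <= upper and lower <= outfit <= upper)
    ]

print(flag_misfitting(item_fit))  # ['item_A', 'item_C'] under these illustrative values
```

As described above, items with acceptable MNSQ values but extreme ZSTD values were handled separately by inspecting their wording rather than by an automatic cut-off.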

Discussing ten sets of results from the Rasch analyses would be lengthy and repetitious, so only examples of results obtained while testing Component 1, General computing skills, for construct validity are described in the following sections. The results for each of the other nine components are summarized and presented in a table, and detailed results for each of them can be viewed in the OSF data repository. A summary of the item fit statistics obtained from the Rasch analyses for the 10 items in Component 1 is shown in Table 4.

Table 4. Item fit statistics for n = 10 items in Component 1, General computing skills (n = 296 respondents). Items marked * were identified as ill-fitting.

As indicated in Table 4, three items were removed. Item 3 and item 4 were removed due to MNSQ Outfit values less than 0.6. Item 9, ‘I can print out documents from the computer’, was removed for being highly affirmed (too easy). After removal of the three ill-fitting items, item 1.7 was identified as having an Infit ZSTD of −2.94 and an Infit MNSQ of .78; its Outfit ZSTD was −2.57 and its Outfit MNSQ was .80. Therefore, although the ZSTDs were outside the range −2.0 to +2.0, the MNSQs fell comfortably within the expected range of 0.6–1.4. If MNSQs are acceptable, the ZSTD statistic can be ignored because mean squares near 1.00 indicate little distortion of the measurement system, regardless of the ZSTD value (Linacre n.d.a). In short, removal of three ill-fitting items resulted in improved MNSQ values closer to the expected value of 1.00 while maintaining item reliability scores for Component 1 (see Tables 5 and 6).

Table 5. Component 1, Item summary fit statistics, Separation and Reliability indices for n = 10 items and n = 296 respondents.

Table 6. Component 1, Item summary fit statistics, Separation and Reliability indices for n = 7 items and n = 296 respondents.

In Tables 5 and 6, Item Reliability and Participant Reliability are indices of the precision and reproducibility of the item and participant measures (Linacre 2023). The separation index gauges the number of statistically different levels of item affirmation difficulty, or of participant ability to affirm an item (Linacre 2023); that is, it quantifies how many distinct strata can be reliably distinguished along the latent trait.
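For reference (these formulas are not reproduced from the paper but are the conventional Winsteps definitions), the reliability and separation indices are related as follows:

```latex
R = \frac{SD_{\text{true}}^2}{SD_{\text{obs}}^2},
\qquad
SD_{\text{true}}^2 = SD_{\text{obs}}^2 - \overline{SE^{\,2}},
\qquad
G = \frac{SD_{\text{true}}}{RMSE},
\qquad
R = \frac{G^2}{1 + G^2},
```

where SD_obs is the observed standard deviation of the item (or participant) measures, the bar term is the mean square measurement error, RMSE is its square root, G is the separation index and R the reliability. A separation of 2 therefore corresponds to a reliability of 0.8, the benchmark referred to later in this section.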

A useful graphic for visualizing the overall functioning of the survey and the fit of its items is a bubble chart. Figures 1 and 2 are bubble chart plots of item affirmation difficulty versus item Infit MNSQ. Each item is represented by a circle whose size is proportional to its standard error. Items should be as close as possible to the modelled value of 1 for Infit MNSQ. In Figure 2, the standard errors show a small range, from 0.09 for items 6, 7 and 10, which were the items with the most precise estimates, to 0.12 for item 1, which was the item with the least precise estimate. These differences largely reflect differences in item targeting to the participant locations. In comparison, in Figure 1, the non-fitting items had large and unacceptable standard errors of 0.2 and 0.21. After non-fitting items are removed, the remaining fitting items should cluster closer to the modelled value of 1, and the size of the remaining items’ standard error circles should be smaller because the standard error quantifies the precision of the measure, which has been improved.

Figure 1. All n = 10 Component 1 items displayed in a bubble chart.

Figure 2. Component 1 items (n = 7) displayed in a bubble chart after the three non-fitting items were removed.

Figure 3. Wright map of Component 1, General computing skills, showing the initial n = 10 items and the n = 7 items that fitted the model.

Construct reliability analyses

The Rasch model allows unidimensional measurement of one attribute of the phenomenon under observation at a time, meaning reliability analyses can be conducted separately for item affirmation difficulty and for the participants’ ability to affirm an item. The analysis of the participants’ ability to affirm the items (after removing the three ill-fitting items for Component 1, General computing skills) is presented in Table 7.

Table 7. Participants’ ability to affirm Component 1, general computing skills (n = 296).

In Rasch measurement, the participants’ ability to affirm an item and the item affirmation difficulty are set on a true interval (logit) scale from negative infinity to positive infinity, which avoids the ceiling and floor effect when using raw scores (Bond and Fox 2007). Also, in Rasch measurement the two parameters of Item and Participant are mutually independent, meaning the participants’ ability to affirm an item remains the same no matter how difficult the items are for the survey respondent to affirm, and the item affirmation difficulty remains invariant no matter how differently the respondents respond to the item. The benefit of using Rasch modelling in this evaluation is that it provides the ability to observe whether the participants’ ability to affirm an item is being inconsistently influenced by the affirmation difficulty of the survey items.

The Rasch model was used to measure estimates of the 10 items within each scale to clarify how far apart these items are in affirmation difficulty, relative to each other and for the 296 PSTs (males = 46, females = 250) who took the survey. The Winsteps programme can produce a Wright map which illustrates the relationship between the participants’ level of ability to affirm an item and how difficult that item is to affirm. Figure 3 illustrates the item affirmation difficulty level in relation to the mean (+M, set at 0) in logit values, alongside the PSTs’ participant ability to affirm levels, with their mean denoted by M, for Component 1, General computing skills. The Wright map assists in targeting the survey to the sample. If large gaps between items are detected, new items of appropriate affirmation difficulty should be added. If too many items of the same affirmation difficulty are found, some of them can be removed. After removing the three non-fitting items (items 1.3, 1.4 and 1.9), M and +M moved closer together and the spread of affirmation difficulty between the items increased, which means better unidimensionality was achieved.

Table 8 shows the Rasch analysis results for Participant and Item Separation, Reliability and model fit for the survey items within each component after removing the n = 23 non-fitting items. The Rasch model Separation measurement assists the researcher to determine whether the survey items were spread sufficiently along the item affirmation difficulty continuum and how the PST survey respondents were spread along the participant ability to affirm continuum.

Table 8. Participant and item separation, reliability and fit for survey items within each component after removing n = 23 non-fitting items.

The participant reliability measures were low for the Word processing (.03) and the Coding and robotics (.38) components. This occurred because the range of participant affirmation measures was narrow, as indicated by the Participant Separation measures of 0.79 for Word processing and 0.66 for Coding and robotics. Similarly, the participant affirmation measures were bunched together along the vertical latent trait line in the Wright maps (Appendices 1 and 2).

Low participant separation (<2, participant reliability <0.8) usually implies that the instrument may not be sensitive enough to distinguish between high and low performers and that more items may be needed (Linacre n.d.a). However, for this cohort of PSTs, this is not the conclusion drawn. These results indicate that the Word processing component comprised skills and knowledge that many participants would have attained during their middle and high school years, or in jobs where these basic digital technology skills are commonly used.

The Wright map for this component (Appendix 1) shows that the cohort was homogeneous in its ability. The mean participant affirmation of the group was greater than the most difficult item for Word processing (item 4.8). Therefore, due to the high level of PST confidence in their ability to apply the Word processing component to their teaching, it is apparent that this set of skills had been attained previously by most of this cohort. In contrast, the Wright map for Component 8, Coding and robotics (Appendix 2), showed that the mean participant affirmation of the group was less than the least difficult item (item 8.7), indicating that coding and robotics skills had not been attained by most of the cohort during their schooling or elsewhere, and that this was clearly a component in which they needed to develop knowledge and skills during their coursework.

Overall fit of all 77 items in the instrument

We ran fit statistics across all 77 items to determine the overall model fit (Table 9). The separation values for both Participants (4.98) and Items (14.60) are indicative of reasonably well-defined measures, and the reliability values are high (0.96 for Participants and 1.00 for Items), which are positive indicators of the model's overall quality. The fit statistics for both Participants and Items are generally within acceptable ranges, and the Reliability and Separation values suggest that the model performs well in capturing the latent trait being measured.

Figure 4. Wright map of all 77 items and 10 components in the instrument.

Table 9. Participant and item fit for the 77 items across all components in the instrument.

The Wright map (Figure 4) represents the relationships between participant affirmation levels and item affirmation difficulty across the ten components related to the construct of PST confidence to use technologies to support student learning. It shows good clustering of components and a good spread of affirmation difficulty across the latent trait between −4 and +4 logits. Only four items sit outside two standard deviations (items 5.1, 8.4, 8.5 and 10.1, above and below ‘T’ on the vertical line).

Dimensionality analysis

To ensure the survey accurately captured the construct of PST confidence to use technologies to support student learning, we examined whether all the components collectively aligned with a single latent trait. First, we analysed the dimensionality of the whole instrument, then each of the components separately. Results for the whole instrument are presented and discussed in this section. Individual component results are summarized collectively here and presented in full in the OSF repository.

To assess the overall dimensionality of the final instrument and determine whether it measures a single latent trait or has multiple dimensions, a Principal Components Analysis (PCA) was performed on the ‘Standardized Residual Variance Principal Components’ output (Table 23.0 in Winsteps). The PCA examines the relationships between all the components and assesses whether they form a coherent and internally consistent scale. Tables 10 and 11 provide information about the eigenvalues and the proportion of variance explained by each principal component.
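As a conceptual illustration (not the authors’ code, which relied on Winsteps’ built-in Table 23.0 output), the following numpy sketch shows the logic of a PCA of standardized residuals: if the data are essentially unidimensional, the residuals left after removing the Rasch measures should be close to random noise, and the first contrast should have a small eigenvalue.

```python
import numpy as np

def residual_pca_eigenvalues(observed, expected, variance):
    """Eigenvalues (largest first) of the correlation matrix of standardized residuals.

    observed, expected, variance: 2-D arrays of shape (participants, items),
    as in the fit sketch above; all names are illustrative.
    """
    z = (observed - expected) / np.sqrt(variance)   # standardized residuals
    residual_corr = np.corrcoef(z, rowvar=False)    # item-by-item residual correlations
    eigenvalues = np.linalg.eigvalsh(residual_corr)[::-1]
    return eigenvalues                              # the first value is the 'first contrast'
```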

Table 10. PCA: standardized residual variance for all 77 items.

Table 11. PCA: standardized residual variance for each individual component.

The proportion of variance explained in the second row of Table 10 (raw variance explained by measures = 61.4%) suggests that most of the variation in the data can be attributed to a single dimension. The term ‘measures’ refers to the components of the latent trait that the survey aims to assess, which are the 10 different aspects of confidence to use technologies to support student learning. The distinction between ‘measures’ and ‘items’ in the table indicates different levels of analysis: ‘items’ refers to the 77 statements in the survey, whereas ‘measures’ represents the higher level of aggregation. The row for ‘measures’ shows the variance explained by these broader constructs, while the row for ‘items’ shows the variance explained by individual survey items.

The decreasing unexplained variance from the first to the fifth contrast indicates that subsequent contrasts beyond the first contrast contribute less to the overall variance, aligning with the concept of unidimensionality; however, ideally, a greater decrease from the first to the second contrast would provide a stronger indication of unidimensionality.

The PCA of the standardized residual variance for each individual component (Table 11) found that the raw variance explained by the measures accounts for a substantial portion of the variance within each component (50.5–65.2%), indicating unidimensionality within each component. The decreasing proportion of unexplained variance indicates that contrasts beyond the first contribute less to the overall variance, aligning with the concept of unidimensionality. As with the PCA for the whole instrument, unexplained variance decreased from the first to the second contrast within the WWW and Spreadsheet components; a greater decrease for these two components from the first to the second contrast would provide a stronger indication of their unidimensionality.

Response category threshold analysis

In psychometric assessments, Andrich thresholds are pivotal points along the latent trait continuum where respondents’ probabilities of choosing one response category over the adjacent one change. These thresholds delineate critical transitions, indicating shifts in respondents’ preferences or probabilities as their trait levels vary. In Rasch analysis, disordered Andrich thresholds are considered problematic and can have implications for the quality and validity of the measurement. Linacre (n.d.b) makes the following point:

Disordered thresholds are no problem for the formulation of polytomous Rasch models, nor for estimating Rasch measures, nor do they cause misfit to the Rasch model. They are only a problem if the Andrich thresholds are conceptualized as the category boundaries on the latent variable.

Therefore, because the response category boundaries are conceptualized as the thresholds on the latent variable in this study, identifying disordered thresholds should be a consideration in the analysis. Identifying whether thresholds are ordered involves inspection of the category structure (Table 12) and a plot of the response categories (Figure 5) along the latent variable continuum (participant ability to affirm relative to item affirmation difficulty). In Table 12, the observed average measures of the participants are strongly ordered and are correlated with the category labels. The Infit and Outfit MNSQ values are all very close to 1.0, so the responses in each category accord with Rasch-model expectations. The Andrich thresholds advance strongly from −.74 to .69, so there is no threshold disordering.
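For reference (not reproduced from the paper), the category probability curves in Figure 5 are generated by the rating scale model’s category probability function, in which the Andrich threshold F_k is the point on the latent variable at which categories k−1 and k are equally probable:

```latex
P_{nik} = \frac{\exp\!\left[\sum_{j=1}^{k}\bigl(B_n - D_i - F_j\bigr)\right]}
{\displaystyle\sum_{h=0}^{m}\exp\!\left[\sum_{j=1}^{h}\bigl(B_n - D_i - F_j\bigr)\right]},
\qquad k = 0, 1, \dots, m,
```

where the empty sum for k = 0 (or h = 0) is defined as zero. For the four-point scale used here there are three Andrich thresholds; ordered thresholds mean each category is the most probable response somewhere along the continuum, which is what Figure 5 is inspected for.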

Figure 5. Category probability curves.

Table 12. Summary of category structure.

In Figure 5, the category probability curves display ordered and clearly separated thresholds. The middle two categories (2 and 3) have relatively high peaks and Andrich thresholds located above the Andrich threshold where categories 1 and 4 intersect.

The overall pattern of the category probability curves in Figure 5 exhibits separated thresholds between the middle and extreme categories, with a general progression indicating that participants with higher latent trait levels are more likely to affirm higher response categories.

Conclusion

This paper discussed the development and validity testing of a 100-item survey instrument with PSTs (n = 296), designed to measure the latent trait of PST confidence to use technologies to support student learning. Rasch analyses returned fit statistics for each of the ten components which showed that 77 of the 100 items can be relied upon to measure the latent trait, provided survey participants are selected from a not-too-dissimilar cohort of PSTs (Bond and Fox 2007).

Further analysis of fit for the whole 77-item survey instrument showed that the Infit and Outfit mean squares for both Participants and Items were very close to 1.0, in accord with Rasch-model expectations, and the Reliability and Separation values suggested that the model performed well in capturing the latent trait being measured. Regarding the dimensionality of the instrument, the Principal Components Analysis of the whole model suggested unidimensionality across the 77 items, with most of the variance explained by the Rasch measures and only a small proportion attributable to the first contrast. Finally, the response category threshold analysis indicated that the Andrich thresholds were clearly separated and well ordered and that, in accord with Rasch model expectations, the probability of endorsing higher response categories increased as the latent trait level increased.

The Rasch model analyses showed that many participants would have attained the knowledge and skills for some of the easier components during their school years or in jobs. For example, Component 4, Word processing skills, had a low participant separation of .79, and the mean ability of participants to affirm the items in the Wright map was greater than the most difficult-to-affirm item, indicating a homogeneous and confident group in terms of word processing skills.

While the items in easier components had been attained previously by most of this cohort, other items, such as those in the Coding and robotics component, were unfamiliar. Including both easier and more difficult components in the survey is not necessarily a deficit in survey design; it is important to have a range of difficulty among components, just as it is desirable to have a range of difficulty among items within components. However, items or components may need to be modified or removed depending on how easy they are for particular cohorts, which is what the Rasch model allows researchers to do with a high level of confidence.

The self-audit survey is an instrument that can be used to evaluate PST confidence to use technologies to support student learning in initial teacher education, pre- and post-implementation of a course, or as a stand-alone measure of PSTs’ readiness for the profession prior to graduation. Further, PSTs can complete the survey at the start of a course and then benefit from receiving a copy of their responses. This could guide their self-directed learning, fostering agency and positively influencing their perception of their teacher self-efficacy. The survey could also be used as a stand-alone self-audit tool with inservice teachers and university teacher educators to identify professional development requirements.

The complete survey scale has been made available in Appendix 3 so that other researchers may use it as a self-audit instrument with PSTs. Other researchers who wish to apply the survey and conduct their own analysis should also refer to Martin et al. (2024). It is important to note that the technologies used in education change rapidly, and the self-audit components and items will need to be regularly updated and analysed to ensure they align with current technologies knowledge expectations for teachers. For example, Artificial Intelligence (AI) in education will be a new component for inclusion in an updated version of the self-audit survey in the future, once the literature can adequately identify its core features.

Ethical approval statement

This research was approved by the University of Sunshine Coast’s Human Research Ethics Committee (HREC A201431) and conducted in accordance with all required ethics protocols.

Supplemental material

Supplemental material for this article is available online (MS Word, 85.4 KB).

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data that support the findings of this study are openly available in OSF at: https://osf.io/v5wru/?view_only=335c5e541cef4345b029bc0f9e5ad936

References

  • Australian Curriculum, Assessment and Reporting Authority (ACARA). 2022. Australian curriculum version 9, understand this learning area, technologies. Available from: https://v9.australiancurriculum.edu.au/teacher-resources/understand-this-learning-area/technologies.
  • Australian Institute for Teaching and School Leadership (AITSL). 2011. ICT elaborations for graduate teachers. Available from: http://technologiesforteaching.weebly.com/uploads/1/6/3/3/16335480/ttf_-_graduate_teacher_standards_-_ict_elaborations_-_200411.pdf.
  • Australian Institute for Teaching and School Leadership (AITSL). 2018. Australian professional standards for teachers. Available from: https://www.aitsl.edu.au/docs/default-source/national-policy-framework/australian-professional-standards-for-teachers.pdf?sfvrsn=5800f33c_64.
  • Blannin, J., et al., 2022. Positioning the technologies curriculum: a snapshot of Australian initial teacher education programs. The Australian educational researcher, 49, 979–999. doi:10.1007/s13384-021-00473-5.
  • Bond, T., and Fox, C., 2007. Applying the Rasch model: fundamental measurement in the human sciences. 2nd ed. Mahwah, New Jersey: Lawrence Erlbaum.
  • Bond, T.G., Yan, Z., and Heene, M., 2021. Applying the Rasch model: fundamental measurement in the human sciences. 4th ed. New York, NY: Routledge.
  • Bradley, K.D., et al., 2015. Rating scales in survey research: using the Rasch model to illustrate the middle category measurement flaw. Survey practice, 8 (1), 1–12. doi:10.29115/SP-2015-0001.
  • Brianza, E., et al., 2022. Situating TPACK: a systematic literature review of context as a domain of knowledge. Contemporary issues in technology and teacher education, 22 (4), 707–753. https://www.learntechlib.org/primary/p/221446/paper_221446.pdf.
  • Chai, C.S., Koh, J.H.L., and Tsai, C.C., 2013. A review of technological pedagogical content knowledge. Educational technology & society, 16 (2), 31–51. https://www.jstor.org/stable/jeductechsoci.16.2.31.
  • Crompton, H., and Sykora, C., 2021. Developing instructional technology standards for educators: a design-based research study. Computers and education open, 2, 100044. doi:10.1016/j.caeo.2021.100044.
  • Cruickshank, V., et al., 2015. Construction and validation of a survey instrument to determine the gender-related challenges faced by pre-service male primary teachers. International journal of research & method in education, 38 (2), 184–199. doi:10.1080/1743727X.2014.914165.
  • Fernández-Batanero, J.M., et al., 2022. Digital competences for teacher professional development. Systematic review. European journal of teacher education, 45 (4), 513–531. doi:10.1080/02619768.2020.1827389.
  • Finger, G., Jamieson-Proctor, R., and Grimbeek, P., 2013. Teaching teachers for the future project: building TPACK confidence and capabilities for elearning. Paper presented at the International Conference on Educational Technologies (ICEduTech), November 2013, Kuala Lumpur, Malaysia.
  • Jamieson-Proctor, R., et al., 2013. Development of the TPACK survey. Australian educational computing, special edition: teaching teachers for the future project, 27 (3), 26–35. Available from: https://www.researchgate.net/publication/259644162_Development_of_the_TTF_TPACK_survey_instrument.
  • Kayaalp, F., et al., 2022. The effect of digital material preparation training on technological pedagogical content knowledge self-confidence of pre-service social studies teachers. Kuramsal Eğitimbilim, 15 (3), 475–503. doi:10.30831/akukeg.1061527.
  • Koehler, M.J., Mishra, P., and Cain, W., 2013. What is technological pedagogical content knowledge (TPACK)? Journal of education, 193 (3), 13–19. doi:10.1177/002205741319300303.
  • Linacre, J.M. 2023. A user’s guide to WINSTEPS MINISTEP Rasch-model computer programs: Program Manual 4.4.7. Available from: https://www.winsteps.com/winman/copyright.htm.
  • Linacre, J.M. n.d.a. Help for Winsteps Rasch Measurement and Rasch Analysis Software. Available from: https://www.winsteps.com/winman/reliability.htm [Accessed 12 Sept 2022].
  • Linacre, J.M. n.d.b. Andrich Thresholds: Disordered rating or partial credit structures. Available from: https://www.winsteps.com/winman/disorder.htm [Accessed 1 Feb 2023].
  • Martin, D.A., Carey, M.D., McMaster, N., and Clarkin, M., 2024. Assessing primary school preservice teachers’ confidence to apply their TPACK in specific categories of technologies using a self-audit survey. The Australian educational researcher. doi:10.1007/s13384-023-00669-x.
  • Martin, D., and Jamieson-Proctor, R., 2020. Development and validation of a survey instrument for measuring pre-service teachers’ pedagogical content knowledge. International journal of research & method in education, 43 (5), 512–525. doi:10.1080/1743727X.2019.1687669.
  • Martin, D., McMaster, N., and Carey, M.D., 2020. Course design features influencing preservice teachers’ self-efficacy beliefs in their ability to support students’ use of ICT. Journal of digital learning in teacher education, 36 (4), 221–236. doi:10.1080/21532974.2020.1781000.
  • Mishra, P., 2019. Considering contextual knowledge: the TPACK diagram gets an upgrade. Journal of digital learning in teacher education, 35 (2), 76–78. doi:10.1080/21532974.2019.1588611.
  • Mishra, P., and Koehler, M.J., 2006. Technological pedagogical content knowledge: a framework for teacher knowledge. Teachers college record: the voice of scholarship in education, 108 (6), 1017–1054. doi:10.1111/j.1467-9620.2006.00684.x.
  • Puspitasari, J., et al., 2020. Validation of TTMC instrument of pre-service chemistry teacher’s TPACK using Rasch model application. Journal of physics: conference series, 1511, 012034. doi:10.1088/1742-6596/1511/1/012034.
  • Ramnarain, U., Pieters, A., and Wu, H.K., 2021. Assessing the technological pedagogical content knowledge of Pre-service science teachers at a South African university. International journal of information and communication technology education, 17 (3), 123–136. doi:10.4018/IJICTE.20210701.oa8.
  • Redmond, P., and Lock, J., 2019. Secondary pre-service teachers’ perceptions of technological pedagogical content knowledge (TPACK): what do they really think? Australasian journal of educational technology, 35 (3), 45–54. doi:10.14742/ajet.4214.
  • Romeo, G., Lloyd, M., and Downes, T., 2012. Teaching teachers for the future (TTF): building the ICT in education capacity of the next generation of teachers in Australia. Australasian journal of educational technology, 28 (6), 949–964. doi:10.14742/ajet.804.
  • Senocak, E., 2009. Development of an instrument for assessing undergraduate science students’ perceptions: the problem-based learning environment inventory. Journal of science education and technology, 18 (6), 560–569. doi:10.1007/s10956-009-9173-3.
  • Sofyan, S., et al., 2023. TPACK–UotI: the validation of an assessment instrument for elementary school teachers. Humanities and social sciences communications, 10 (1), 1–7. doi:10.1057/s41599-023-01533-0.
  • Thohir, M.A., Jumadi, J., and Warsono, W., 2022. Technological pedagogical content knowledge (TPACK) of pre-service science teachers: a Delphi study. Journal of research on technology in education, 54 (1), 127–142. doi:10.1080/15391523.2020.1814908.
  • Tondeur, J., et al., 2017. Understanding the relationship between teachers’ pedagogical beliefs and technology use in education: a systematic review of qualitative evidence. Educational technology research and development, 65, 555–575. doi:10.1007/s11423-016-9481-2.
  • United Nations Educational, Scientific and Cultural Organization. 2018. UNESCO ICT competency framework for teachers. Available from: https://unesdoc.unesco.org/ark:/48223/pf0000265721.locale=en.
  • Valtonen, T., et al., 2020. Fresh perspectives on TPACK: pre-service teachers’ own appraisal of their challenging and confident TPACK areas. Education and information technologies, 25 (4), 2823–2842. doi:10.1007/s10639-019-10092-4.
  • Voogt, J., et al., 2013. Technological pedagogical content knowledge – a review of the literature. Journal of computer assisted learning, 29 (2), 109–121. doi:10.1111/j.1365-2729.2012.00487.x.
  • Western Australian, Department of Education and Training. 2006. Teacher ICT skills: evaluation of the Information and Communication Technology (ICT) knowledge and skill levels of Western Australian government schoolteachers. Available from: https://docplayer.net/10374438-Teacher-ict-skills-evaluation-and-accountability-department-of-education-and-training-western-australia.html.
  • Willermark, S., 2018. Technological pedagogical and content knowledge: a review of empirical studies published from 2011 to 2016. Journal of educational computing research, 56 (3), 315–343. doi:10.1177/0735633117713114.
  • Wilson, M.L., Ritzhaupt, A.D., and Cheng, L., 2023. The impact of technology integration courses on preservice teacher attitudes and beliefs: a meta-analysis of teacher education research from 2007-2017. Journal of research on technology in education, 55 (2), 252–280. doi:10.1080/15391523.2021.1950085.
  • Wright, B., and Linacre, J.M., 1994. Reasonable mean-square fit values. Rasch measurement transactions, 8, 370–371. Available from: https://www.rasch.org/rmt/rmt83b.htm.
  • Xianhan, H., et al., 2022. Associations of different types of informal teacher learning with teachers’ technology integration intention. Computers & Education, 190, 104604. doi:10.1016/j.compedu.2022.104604.
  • Zile-Tamsen, C.V., 2017. Using Rasch analysis to inform rating scale development. Research in higher education, 58, 922–933. doi:10.1007/s11162-017-9448-0.

Appendices

Appendix 1. Component 4, Word Processing Wright map

Appendix 2. Component 8, Coding and Robotics Wright map

Appendix 3. Original 100 survey items. Ill-fitting items are marked *

Component 1. General computing skills

Component 2. Email skills

Component 3. World wide web skills

Component 4. Word processing skills

Component 5. Presentation skills

Component 6. Spreadsheet skills

Component 7. Multimedia skills

Component 8. Coding and robotics skills

Component 9. ICT and technologies pedagogies skills

Component 10. Common technologies skills