HEALTH PSYCHOLOGY

Validation of the new brief 6-item version of the Shirom-Melamed Burnout Measure

Niclas Almén & Billy Jansson
Article: 2258476 | Received 01 Apr 2023, Accepted 06 Sep 2023, Published online: 08 Nov 2023

Abstract

The Shirom-Melamed Burnout Questionnaire/Measure (SMBQ/M) is one of the most commonly used measures of burnout. Using confirmatory factor analyses, the present study aimed to evaluate the model fit, composite reliability, and factorial (i.e. convergent and discriminant) validity of the new brief Swedish version of the scale, labeled SMBM-6. In addition, we used Cronbach’s α as an indicator of the internal consistency of the total scale. The SMBM-6 consists of two subscales: the emotional and physiological exhaustion subscale (three items) and the cognitive weariness subscale (three items). A total of 1251 teachers in Sweden were included in the study. The analyses showed that the Swedish version of the SMBM-6 has an excellent model fit and good convergent validity. The discriminant validity for the cognitive weariness subscale was good, but slightly inadequate for the physiological exhaustion subscale. Composite reliability and Cronbach’s α indicated high internal consistency for the subscales and the total scale, respectively. Multi-group invariance tests for age indicated no violation of invariance. These results are consistent with those of the study by Almén and Jansson (2021), in which the SMBM-6 was developed, and a subsequent psychometric study by Sundström et al. (2022). In conclusion, there is strong support for the Swedish version of the SMBM-6 as a reliable and valid scale for measuring burnout. Testing the scale in languages other than Swedish is warranted.

1. Introduction

Sustained resource depletion as a consequence of stress, manifested by sustained exhaustion and often referred to as burnout (Shirom, 2003), is common in many countries around the world. In Sweden, where the present study was conducted, clinical levels of exhaustion are among the most common reasons for long-term sick leave (Lidwall & Olsson-Bohlin, 2017). Burnout is associated with a wide range of negative responses and events such as digestive problems, skin problems, headaches (Chakravorty & Singh, 2021), anxiety, depression (Koutsimani et al., 2019), allostatic overload, systemic inflammation, metabolic syndrome, cardiovascular disease, mortality (Bayes et al., 2021), and suicidal ideation (Andela, 2021; Wray & Jarrett, 2019).

It is important to have access to valid and user-friendly measurements to assess burnout, for example, to be able to detect (1) people at risk of suffering from the syndrome and associated negative experiences or problems, and (2) workplaces with employees with high levels of burnout. One of the most frequently used measures of burnout is the Shirom-Melamed Burnout Questionnaire/Measure (SMBQ/M; henceforth solely called SMBM; Qiao & Schaufeli, 2011). Because different versions of the SMBM have been inadequately tested psychometrically, Almén and Jansson (2021) recently validated several Swedish versions of the instrument. A four-factor model (SMBM-22, including the factors emotional and physiological exhaustion, cognitive weariness, listlessness, and tension) and a three-factor model (SMBM-18, including physiological exhaustion, cognitive weariness, and listlessness) reached a good model fit after the removal of three items. In addition, two different two-factor models (labeled SMBM-11 and SMBM-12), which solely cover the core dimensions of burnout (emotional, physiological, and cognitive resource depletion) via the physiological exhaustion subscale and the cognitive weariness subscale, reached good model fit without any modifications. All models showed evidence of good composite reliability and convergent validity in terms of how closely the items were associated with their factors. The study raised some concerns regarding discriminant validity, namely the unsatisfactory ability of the physiological exhaustion subscale to differentiate itself from the overall measure. Additionally, recent studies by Michel et al. (2022) and Sundström et al. (2022) support the construct validity of different versions of the SMBM.

A frequently occurring problem in many research fields, not least when self-reports are requested, is low response rates. In particular, the response rate tends to be low in online surveys (Sammut et al., 2021; Wu et al., 2022). One of the largest problems with insufficient response rates is uncertainty regarding whether the obtained sample is representative of the population intended to be studied. Based on a literature review of how to counteract the response rate problem, Sammut et al. (2021) suggested using short surveys. Accordingly, based on items from the SMBM-12, Almén and Jansson (2021) developed and analyzed a new brief version of the SMBM in their psychometric study: a two-factor model (SMBM-6) consisting of three items each from the physiological exhaustion subscale and the cognitive weariness subscale. Items were selected based on face validity. The SMBM-6 demonstrated results very similar to, but slightly better than, those of the two-factor SMBM-12. The model fit was excellent, with satisfactory composite reliability for the factors (.78 for the physiological exhaustion subscale and .93 for the cognitive weariness subscale) and a Cronbach’s α of .90, indicating excellent internal consistency for the total measure. The convergent validity was good for both subscales, while the discriminant validity was good for the cognitive weariness subscale and unsatisfactory for the physiological exhaustion subscale (which was the case for all SMBM versions tested). In addition, the SMBM-6 correlated between .95 and .98 with the other tested SMBM scales. Following the first psychometric study of the SMBM-6 (Almén & Jansson, 2021), Sundström et al. (2022) evaluated the validity of the same measure and also found an excellent model fit for the SMBM-6. Convergent validity was concluded based on subscale intercorrelations and correlations ≥ .50 between the overall scale and its subscales, and between the overall scale and several stress- and ill-health-related scales, such as perceived stress, anxiety, and depression. However, some correlations were not strong, with the weakest correlation (.28) found between the cognitive weariness subscale and self-rated health. Sundström et al. (2022) did not test the reliability or discriminant validity of this scale. No measurement invariance tests were performed.

While the SMBM-6 seems to be a valid and reliable instrument for measuring burnout, this version of the SMBM needs to be further cross-validated to draw firmer conclusions regarding the validity and reliability of the scale. The two studies that have investigated the reliability and validity of the SMBM-6 used general population samples; therefore, an appropriate next step is to test whether the scale is reliable and valid in a more specific population, in particular in occupational groups where high levels of burnout are prevalent, such as nurses (Rudman et al., 2020) or teachers (Mijakoski et al., 2022). School teachers represent one of the largest professional cohorts in Sweden, and there is presently a shortage of qualified teachers in the Swedish labor market (Statistics Sweden, 2019). Burnout, which is associated with stress-related health issues, has emerged as a possible factor contributing to teacher attrition, retirement (Keller et al., 2014), and extended periods of sick leave (Lidwall & Olsson-Bohlin, 2017).

The aim of the present study was to empirically test the fit of the two-factor model of the Shirom-Melamed Burnout Measure (SMBM-6), using a specific population consisting of teachers. This population was appropriate for obtaining a wide range of responses, capturing scores towards both the lower and the higher ends of the scale. In addition, we aimed to evaluate the convergent and discriminant validity of the instrument by comparing estimates of average variance extracted and maximum shared squared variance of factors. Furthermore, we examined whether the scale showed a similar structure between age groups (i.e., measurement invariance across age groups).

2. Method

2.1. Recruitment

We used multiple platforms and methods as a strategy to increase diversity and inclusiveness in the sample. Recruitment was conducted via a link to a web survey published on social media (Facebook, Instagram, and LinkedIn). In addition, principals at 39 primary and secondary schools in Sweden were contacted via email. Five principals accepted the invitation and distributed a link to the web survey via email to teachers at their schools. In addition, acquaintances were asked to distribute the survey to teachers in their surroundings. In order to reach the target population and to enhance the credibility of the survey, contact information, information about the study, and information about the researchers' credentials were included in the invitation to participate in the study. No incentives were provided for participation.

2.2. Participants

The data collection was terminated after 10 days, at which point the number of participants in the study was 1251 (mean age = 43.87 years, SD = 9.68). Of these, 1141 (91.2%) stated that they were women and 101 (8.1%) stated that they were men, whereas 9 (0.7%) did not state any gender. All the participants worked as teachers. Due to a practical error during data collection, the first 333 (26.6%) participants did not have the opportunity to report the type of teacher they were. Of the remaining teachers, the majority (n = 607; 48.5%) were compulsory school teachers (students usually aged 6–16), while 133 (10.6%) worked at an upper secondary school (students usually aged 15–19), 10 (0.8%) at a folk high school (students usually aged 18 or older), and 95 (7.6%) at preschool (children usually aged 1–6). Teachers at a higher level (i.e., university) or training schools were not included in the study.

The study was conducted in accordance with the Declaration of Helsinki. The participants were informed about the research purpose, and issues concerning confidentiality, anonymity, and their rights were emphasized. Informed consent was obtained from all the participants.

2.3. The instrument

The items (see Table 1) included in the SMBM-6 were scored on a 7-point scale ranging from 1 (almost never) to 7 (almost always). Scores on the two subscales and the total score were computed as the mean of the items in the respective scale (i.e., the sum of the item scores divided by the number of items). Respondents were given the following instruction before completing the questions: “Below are a number of conditions that everyone can experience occasionally. Describe the degree to which you experienced these during the past month”.
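The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the assignment of item positions to subscales (items 1–3 to physiological exhaustion, items 4–6 to cognitive weariness) and the function name `score_smbm6` are assumptions made for the example, not taken from the instrument itself.

```python
from statistics import mean

# Hypothetical item order, assumed for illustration only:
# positions 0-2 = physiological exhaustion, positions 3-5 = cognitive weariness.
PHYSIOLOGICAL = (0, 1, 2)
COGNITIVE = (3, 4, 5)

def score_smbm6(responses):
    """Return subscale and total mean scores for one respondent."""
    if len(responses) != 6:
        raise ValueError("SMBM-6 requires exactly six item responses")
    if any(not 1 <= r <= 7 for r in responses):
        raise ValueError("each response must lie on the 1-7 scale")
    return {
        "physiological_exhaustion": mean(responses[i] for i in PHYSIOLOGICAL),
        "cognitive_weariness": mean(responses[i] for i in COGNITIVE),
        "total": mean(responses),
    }

# Made-up responses for a single respondent.
scores = score_smbm6([4, 5, 3, 2, 2, 2])
```

Because every score is an item mean, subscale and total scores stay on the original 1–7 response scale and remain comparable even though the subscales and the total differ in item count.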

Table 1. The items included in SMBM-6 and its factor loadings

2.4. Analytical approach

First, we used Cronbach’s α as an indicator of the internal consistency of the total scale, and ≥ 0.7 was used as the threshold for an acceptable α value (Taber, 2018).

Using confirmatory factor analyses with the Maximum Likelihood estimator, we tested whether the SMBM-6 structure was represented by two correlated first-order factors. There are several measures for evaluating the overall fit of a model (Hu & Bentler, 1999), and the use of multiple measures to interpret model fit is recommended. Using three to four fit indices provides adequate evidence of model fit, and reporting the χ2 value and degrees of freedom, the comparative fit index (CFI) and/or the Tucker-Lewis Index (TLI), and the root mean squared error of approximation (RMSEA) will usually provide adequate information to evaluate a model (Hair et al., 2019). Additionally, the standardized root mean square residual (SRMR) was used (Shi et al., 2019).

Regarding the chi-square statistic, a statistically significant value means that the model is not supported. With respect to RMSEA, values below .06 are considered a good model fit, and values below .08 an adequate fit. SRMR values around .08 or lower indicate a good fit to the data. With respect to the CFI and the TLI, values above .90 suggest an acceptable fit, while values above .95 suggest a close fit. See Hu and Bentler (1999) for guidelines with respect to the cutoff criteria for fit indices.
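The cutoffs above can be expressed as a small lookup, sketched here in Python. The function name `classify_fit` is hypothetical, the input values are invented, and treating "around .08 or lower" as a hard ≤ .08 rule is a simplification made for the example.

```python
def classify_fit(cfi, tli, rmsea, srmr):
    """Label each fit index using the cutoffs quoted in the text."""
    def incremental(value):          # CFI and TLI: higher is better
        if value > 0.95:
            return "close"
        return "acceptable" if value > 0.90 else "poor"

    def rmsea_label(value):          # RMSEA: lower is better
        if value < 0.06:
            return "good"
        return "adequate" if value < 0.08 else "poor"

    return {
        "CFI": incremental(cfi),
        "TLI": incremental(tli),
        "RMSEA": rmsea_label(rmsea),
        # Simplification: "around .08 or lower" treated as a hard cutoff.
        "SRMR": "good" if srmr <= 0.08 else "poor",
    }

# Invented values for illustration; not the study's results.
labels = classify_fit(cfi=0.97, tli=0.93, rmsea=0.07, srmr=0.05)
```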

Composite reliability was used as a measure of the internal consistency of the factors, with ≥ 0.7 as the cut-off value for good reliability (Bacon et al., 1995). Discriminant validity was considered supported when the average variance extracted exceeded the maximum shared squared variance or the average shared squared variance. For convergent validity, the average variance extracted had to be greater than .50 and lower than the composite reliability (i.e., variance explained by the construct should be greater than the measurement error and greater than the cross-loadings). See Hair et al. (2019) for the suggested thresholds of these indices.
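As a rough illustration of how these indices relate to one another, the sketch below computes composite reliability, average variance extracted (AVE), and maximum shared squared variance (MSV) from standardized factor loadings, using the commonly cited formulas and assuming uncorrelated measurement errors. The loadings and the factor correlation are invented and are not the values reported in this study; with only two factors, MSV reduces to the squared correlation between them.

```python
def composite_reliability(loadings):
    """(sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)   # error variance of each item
    return total ** 2 / (total ** 2 + error)

def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# Invented standardized loadings for one three-item factor,
# and an invented correlation with the other factor.
loadings = [0.80, 0.80, 0.80]
factor_correlation = 0.85

cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
msv = factor_correlation ** 2                  # two factors: MSV = r squared

convergent_ok = ave > 0.50 and ave < cr        # criterion described in the text
discriminant_ok = ave > msv                    # AVE must exceed MSV
```

In this invented case the factor passes the convergent criterion but not the discriminant one, because the squared factor correlation is larger than the AVE; a strong correlation between two factors can therefore produce a discriminant-validity flag even when the loadings themselves are high.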

Lastly, measurement invariance tests were conducted across age groups: younger, 19–44 years (n = 633; 50.6%) versus older, 45–88 years (n = 618; 49.4%). A sequential strategy was used, and the invariance was tested at different levels. In the first model, the factor structure was specified identically across groups, with all parameters freely estimated across groups to establish configural invariance (i.e., equivalence in factor structure across the groups). Second, a metric (weak) invariance model was fitted in which the factor loadings were constrained to be equal, and this model fit was compared with the configural (baseline) model. Invariance exists when the fit of the metric invariance model is not substantially poorer than that of the configural model. Third, a scalar (strong) invariance model was fitted, in which factor loadings and item intercepts were constrained to be equal, and this fit was compared against the metric model. Finally, a residual (strict) invariance model was fitted in which factor loadings, intercepts, and residual variances were constrained to be equal, which was compared to the scalar measurement invariance model.

Although a scaled chi-square difference test for nested models can be used to index invariance between models, it suffers from the same dependency on sample size as the minimum fit function statistic; consequently, changes in model fit according to CFI and RMSEA were used. As suggested by Chen (2007), for sample sizes of > 300, a change in CFI of ≥ −.01 (i.e., a decrease of at least .01) accompanied by a change in RMSEA of ≥ .015 (i.e., an increase of at least .015) was taken to indicate a meaningful decrement in fit between models.
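The decision rule for comparing two nested invariance models can be sketched as follows. The function name and the fit values are hypothetical; the rule requires both changes together, following the "in addition to" wording in the text.

```python
def fit_decrement(cfi_base, cfi_constrained, rmsea_base, rmsea_constrained):
    """True when the more constrained model fits meaningfully worse.

    Flags non-invariance when CFI drops by at least .01 AND
    RMSEA rises by at least .015 relative to the less constrained model.
    """
    delta_cfi = cfi_constrained - cfi_base
    delta_rmsea = rmsea_constrained - rmsea_base
    return delta_cfi <= -0.01 and delta_rmsea >= 0.015

# Invented fit values for two model comparisons; not the study's results.
invariance_violated = fit_decrement(0.990, 0.970, 0.040, 0.060)
invariance_holds = not fit_decrement(0.990, 0.988, 0.040, 0.042)
```

Applied sequentially, the same check compares the metric model against the configural model, the scalar against the metric, and the residual against the scalar.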

Analyses were conducted using JASP version 0.13.0 for Mac (JASP Team, 2020).

3. Results

The items for each factor and their corresponding factor loadings are presented in Table 1. The two-factor SMBM-6 demonstrated an excellent model fit with respect to all fit indices (Table 2). Cronbach’s α for the SMBM-6 was .927, indicating very good reliability. Composite reliability indices indicated very good reliability for both factors (both substantially above .70), and indices of convergent validity indicated no validity concerns (the average variance extracted for both factors was lower than the composite reliability and greater than .50; see Table 3). While the discriminant validity for the cognitive weariness subscale was good, the average variance extracted for the physiological exhaustion subscale was lower than the maximum shared squared variance, which indicates slightly inadequate discriminant validity for the physiological exhaustion subscale.

Table 2. Model-fit indices for the analyzed SMBM-6

Table 3. Factor correlations and indicators of composite reliability, convergent validity and discriminant validity of the SMBM-6

With respect to invariance across age groups (see Table 4), the results showed support for configural invariance (indicating a similar factor structure across age groups). There was no substantial decrease in the model fit in the metric model, indicating that full metric invariance was achieved (i.e., similar strength of the relations between the items and constructs across groups). Finally, the change in fit from the scalar to the residual model (fixing item loadings, intercepts, and residual variances to be equal across groups) passed the criteria for invariance.

Table 4. Results of the multi-group tests of invariance regarding age

4. Discussion

The aim of the present study was to further evaluate the Swedish version of the new brief, six-item version of the SMBM in order to draw firmer conclusions regarding reliability and factorial validity of the scale.

The Cronbach’s α value of .93 for the new brief Swedish SMBM-6 is markedly higher than the commonly used threshold (≥ 0.7) for good reliability (i.e., internal consistency), indicating excellent reliability across the entire scale. Composite reliability indicated excellent reliability (i.e., internal consistency) for both the physiological exhaustion subscale (.85) and the cognitive weariness subscale (.95). The results clearly indicate that the two-factor SMBM-6 has excellent reliability (i.e., internal consistency), an excellent model fit, and good convergent validity. Discriminant validity was good for the cognitive weariness subscale and slightly inadequate for the physiological exhaustion subscale (because the indicators for this factor had less unique variance). The multi-group tests of invariance for age showed no decrement in model fit at any level, suggesting that the 6-item model obtained from the confirmatory factor analyses worked equally well for the two age groups. The results obtained in the present study confirm the conclusions of previous studies of the SMBM-6 (Almén & Jansson, 2021; Sundström et al., 2022) that the Swedish SMBM-6 is a reliable and valid measure of burnout.

Given the recommendation by Sammut et al. (2021) to use short surveys to counteract the common problem of low response rates, the SMBM-6 could improve response rates in comparison with the longer versions of the SMBM. Another advantage is the possibility of using the SMBM-6 in studies that involve frequent assessments, for example, diary or intervention studies that analyze change processes. In addition, as clinical levels of burnout can be difficult to treat, researchers recommend investing in preventive interventions (Glise et al., 2020), in which screening may be needed to identify people at risk; quickly administered measures would be advantageous for this purpose. Moreover, because stress and burnout are related to many factors, many factors may need to be studied simultaneously, and this becomes more feasible when brief scales are available.

A limitation of the present study was the use of a non-randomized sample, which may limit the generalizability of the study’s findings. However, the present evaluation, along with the two previous evaluations of the Swedish version of the SMBM-6 (Almén & Jansson, 2021; Sundström et al., 2022), suggests that the results can be generalized to adults in general, as the results have pointed in the same direction for random and non-random samples, and for general population samples and a specific occupational group (teacher) sample. In line with this, the invariance testing in the present study and in the first study of the SMBM-6 (Almén & Jansson, 2021) indicates that the results hold across age groups. The first study of the SMBM-6 demonstrated no violations of gender invariance. Another limitation of the present study was the low proportion of men participating, which did not allow us to examine possible violations of gender invariance; measurement invariance across gender should be considered in future validation studies.

There is strong empirical support for the conclusion that the Swedish version of the SMBM-6 is a reliable and valid scale for measuring burnout. The results warrant further research, in particular, testing of the scale in languages other than Swedish. If the SMBM is to be used for repeated measurements, for example, every week or every day, it is important to test the scale with an alternative instruction (in the instruction used in the three validation studies of the SMBM-6 conducted so far, the person was asked to base her/his estimate on the past month). For such use, in addition to the analyses made in the present study, test-retest reliability may be important to examine.

Open Scholarship

This article has earned the Center for Open Science badges for Open Data, Open Materials and Preregistered. The data and materials are openly accessible at https://doi.org/10.1080/23311908.2023.2258476

Acknowledgements

We would like to express our gratitude to Mikaela Ekroth and Sofie Svensson, psychology students at Mid Sweden University, Sweden, for the collection of the data, and to the participants for participating in the study.

Data availability statement

Data are available from the author upon reasonable request. https://osf.io/qbmyr

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Notes on contributors

Niclas Almén

Niclas Almén is a clinical psychologist, specialized in Cognitive Behavior Therapy. He has a PhD in Psychology from Mid Sweden University, Östersund, Sweden, and his research focuses on assessment and interventions in the field of stress and recovery.

Billy Jansson

Billy Jansson has a PhD from Stockholm University and is currently working at Mid Sweden University, Östersund, Sweden. His research focuses mainly on anxiety and trauma.

References

  • Almén, N., & Jansson, B. (2021). The reliability and factorial validity of different versions of the Shirom-Melamed burnout measure/Questionnaire and normative data for a general Swedish sample. International Journal of Stress Management, 28(4), 314–8. https://doi.org/10.1037/str0000235
  • Andela, M. (2021). Work-related stressors and suicidal ideation: The mediating role of burnout. Journal of Workplace Behavioral Health, 36(2), 125–145.
  • Bacon, D. R., Sauer, P. L., & Young, M. (1995). Composite reliability in structural equations modeling. Educational and Psychological Measurement, 55(3), 394–406.
  • Bayes, A., Tavella, G., & Parker, G. (2021). The biology of burnout: Causes and consequences. The World Journal of Biological Psychiatry, 1–13. https://doi.org/10.1080/15622975.2021.1907713
  • Chakravorty, A., & Singh, P. (2021). Correlates of burnout among Indian primary school teachers. International Journal of Organizational Analysis, 30(2), 589–605. https://doi.org/10.1108/IJOA-09-2020-2420
  • Chen, F. F. (2007). Sensitivity of goodness of fit indexes to lack of measurement invariance. Structural Equation Modeling: A Multidisciplinary Journal, 14(3), 464–504.
  • Glise, K., Wiegner, L., & Jonsdottir, I. H. (2020). Long-term follow-up of residual symptoms in patients treated for stress-related exhaustion. BMC Psychology, 8(1), 26.
  • Hair, J., Black, W., Babin, B., & Anderson, R. (2019). Multivariate data analysis (8th ed.). Prentice-Hall.
  • Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55.
  • Keller, M. M., Chang, M.-L., Becker, E. S., Goetz, T., & Frenzel, A. C. (2014). Teachers’ emotional experiences and exhaustion as predictors of emotional labor in the classroom: An experience sampling study. Frontiers in Psychology, 5, 5.
  • Koutsimani, P., Montgomery, A., & Georganta, K. (2019). The relationship between burnout, depression, and anxiety: A systematic review and meta-analysis. Frontiers in Psychology, 10, 284.
  • Lidwall, U., & Olsson-Bohlin, C. (2017). Psykiatriska diagnoser: lång tid tillbaka till arbete vid sjukskrivning. (Korta analyser 2017:1). Försäkringskassan, avdelningen för analys och prognos. Retrieved November 27, 2022.
  • Michel, J. S., Shifrin, N. V., Postier, L. E., Rotch, M. A., & McGoey, K. M. (2022). A meta-analytic validation study of the Shirom–Melamed burnout measure: Examining variable relationships from a job demands–resources perspective. Journal of Occupational Health Psychology, 27(6), 566–584. Advance online publication. https://doi.org/10.1037/ocp0000334
  • Mijakoski, D., Cheptea, D., Marca, S. C., Shoman, Y., Caglayan, C., Bugge, M. D., Gnesi, M., Godderis, L., Kiran, S., McElvenny, D. M., Mediouni, Z., Mesot, O., Minov, J., Nena, E., Otelea, M., Pranjic, N., Mehlum, I. S., van der Molen, H. F., & Canu, I. G. (2022). Determinants of burnout among teachers: A systematic review of longitudinal studies. International Journal of Environmental Research and Public Health, 19(9), 5776. Article 9. https://doi.org/10.3390/ijerph19095776
  • Qiao, H., & Schaufeli, W. B. (2011). The convergent validity of four burnout measures in a Chinese sample: A confirmatory factor-analytic approach. Applied Psychology: An International Review, 60(1), 87–111. https://doi.org/10.1111/j.1464-0597.2010.00428.x
  • Rudman, A., Arborelius, L., Dahlgren, A., Finnes, A., & Gustavsson, P. (2020). Consequences of early career nurse burnout: A prospective long-term follow-up on cognitive functions, depressive symptoms, and insomnia. EClinicalMedicine, 27, 27. https://doi.org/10.1016/j.eclinm.2020.100565
  • Sammut, R., Griscti, O., & Norman, I. J. (2021). Strategies to improve response rates to web surveys: A literature review. International Journal of Nursing Studies, 123, 104058.
  • Shi, D., Lee, T., & Maydeu-Olivares, A. (2019). Understanding the model size effect on SEM fit indices. Educational and Psychological Measurement, 79(2), 310–334.
  • Shirom, A. (2003). Job-related burnout. In J. C. Quick & L. E. Tetrick (Eds.), Handbook of occupational health psychology (pp. 245–265). American Psychological Association. https://doi.org/10.1037/10474-012
  • Statistics Sweden. (2019). Fortsatt brist på lärare. Retrieved January 24, 2023.
  • Sundström, A., Söderholm, A., Nordin, M., & Nordin, S. (2022). Construct validation and normative data for different versions of the Shirom-Melamed burnout questionnaire/measure in a Swedish population sample. Stress and Health, 39(3), 499–515. https://doi.org/10.1002/smi.3200
  • Taber, K. S. (2018). The use of Cronbach’s alpha when developing and reporting research instruments in science education. Research in Science Education, 48(6), 1273–1296.
  • Wray, C. A., & Jarrett, S. B. (2019). The relationship between burnout and suicidal ideations among Jamaican police officers. International Journal of Police Science & Management, 21(3), 181–189. https://doi.org/10.1177/1461355719856026
  • Wu, M.-J., Zhao, K., & Fils-Aime, F. (2022). Response rates of online surveys in published research: A meta-analysis. Computers in Human Behavior Reports, 7, 100206. https://doi.org/10.1016/j.chbr.2022.100206