
The impact of normative argument quality variations on claim acceptance: empirical evidence from the US and the UK

Received 13 Apr 2023, Accepted 29 Feb 2024, Published online: 12 Mar 2024

Abstract

In the examination of what makes for a strong argument, a stream of research has used normative criteria from argumentation theory to compare high-quality with low-quality arguments. While a large body of research on argument quality and persuasion has involved US and UK samples, no study with these samples has reported on the impact of argument quality manipulated with normative criteria. Therefore, an experiment was conducted in which American (N = 237) and British (N = 231) participants assessed several claims supported by arguments varying in quality. Results show that high-quality arguments led to higher claim acceptance than low-quality arguments for British participants, but not for American participants. The results are discussed in the context of a meta-analysis of similar studies.

The question of how beliefs are formed or changed is almost intrinsically connected to the question of what role arguments play in the process (Carpenter 2015; Hahn 2020; Park et al. 2007; Petty and Cacioppo 1986). Arguments and their characteristics are indeed important for the degree to which people change their minds.

One longstanding research interest has been in which types of arguments bring about which kinds of effects. In communication research, type of argument (where “type” specifies the relationship between the argument and the claim) has mostly been labeled type of evidence (where “type” specifies the kind of information in the argument). Research on evidence types has generated numerous experimental studies (mostly on anecdotal versus statistical evidence; see, for instance, Chinn and Weeks 2021; Hornikx and De Best 2011; Wojcieszak et al. 2017), as well as reviews summarizing such studies (e.g. Allen and Preiss 1997; Zebregs et al. 2015).

A related research interest in the domain of arguments and belief formation is the effect of varying degrees of argument quality, with the general conclusion being that high-quality arguments lead to higher claim acceptance than low-quality arguments (Carpenter 2015). One approach to studying argument quality has become more popular in recent years, namely the approach of manipulating argument quality with respect to normative criteria from argumentation theory (cf. Hoeken, Hornikx, and Linders 2020). Following this approach, high-quality arguments respect norms for sound argumentation, whereas low-quality arguments respect such norms to a lesser degree. Empirical studies with this approach corroborate other argument quality studies in showing that arguments of normatively high quality lead to higher claim acceptance than arguments of normatively lower quality (e.g. Demir and Hornikx 2022; Hoeken, Šorm, and Schellens 2014). While these studies have involved participants from different countries and continents (e.g. France, Germany, India, the Netherlands), they have so far neglected Anglo-Saxon participants. Therefore, this study was conducted to examine the differential impact of normatively strong and normatively weak arguments for participants in the US and the UK.

Argument quality and persuasiveness

In the research paradigm of comparing normatively sound argumentation and actual persuasive success (O’Keefe 2007), a prominent interest has been in examining whether arguments of normatively high quality are more persuasive than arguments of normatively low quality. This interest is motivated by the quest for objective standards to assess the quality of an argument. What, in effect, makes for a persuasive argument? As Hahn (2020, 364) put it, “Argument evaluation requires an independently motivated, general, normative standard that indicates what arguments we should find convincing.” Many standards have been proposed, among them logical validity (see Hahn 2020 for a discussion), argumentation schemes (Walton, Reed, and Macagno 2008), Bayesian probability theory (Hahn and Oaksford 2007), and attribute degree centrality and attribute tie strength (Russell and Reimer 2018).

In empirical research comparing normative criteria with persuasive success, the argumentation scheme perspective has been most widely used (see Lumer 2022 for a critical discussion). This perspective links different types of argument to specific normative critical questions. Walton, Reed, and Macagno (2008) have grouped a large set of types of argument, argumentation schemes, with associated critical questions. For the argument from expert opinion, a critical question is whether the expert has expertise in the field to which the claim pertains – leading to “relevant field of expertise” as a normative criterion for high-quality arguments from expert opinion. Hornikx and Hoeken (2007), for instance, examined the persuasiveness of arguments from expert opinion that respected the field of expertise criterion (high quality) or violated it (low quality). Their results demonstrated that high-quality arguments generated higher claim acceptance than low-quality arguments, but only for Dutch participants and not for French participants. This study has inspired more studies on the relationship between argument quality and persuasiveness in different cultural settings (e.g. Demir and Hornikx 2022; Hornikx and Ter Haar 2013; Karaslaan et al. 2018). While some studies show variation in this relationship, the overall picture resulting from the analysis of Demir and Hornikx (2022, 106) shows “a generalizable effect of argument quality variations on claim acceptance.” The analysis includes data from France, Germany, India, the Netherlands, and Turkey. Although many studies on argument quality and/or persuasion have been conducted in the UK and particularly in the US (see, for instance, the studies included in Carpenter 2015), no study in either country has reported on the persuasiveness of high-quality and low-quality arguments following normative criteria from argumentation theory. Therefore, the current study’s aim was to answer the following research question:

Research question: Is there an impact of argument quality (high versus low) on claim acceptance in the US and the UK?

Method

American and British participants assessed several claims supported by arguments that varied in quality (high-quality, low-quality) and in type (statistical evidence, expert evidence).

Material

First, a pretest was conducted to make sure that the claims to be used in the material would not be too improbable or too probable. When participants have an extreme position towards a claim, they are likely to show motivated reasoning, potentially leading them to assess the quality of arguments in accordance with their acceptance of the claim (cf. Edwards and Smith 1996; Mercier and Sperber 2017). All 50 claims from Hornikx and Hoeken (2007) were translated from Dutch into (American and British) English, and were presented in random orders to two populations on Prolific (reward: $0.90 or £0.75): 50 American residents from the US (age M = 38.04, SD = 14.20; 54% male; 38% secondary school, 38% Bachelor’s degree), and 49 English residents from the UK (age M = 44.04, SD = 15.95; 74% male; 37% secondary school, 35% Bachelor’s degree). All participants were asked how probable they found the 50 claims on a 5-point scale (very improbable – very probable). Across the two samples, eight claims were found not to deviate significantly from the 3.00 midpoint of the scale, and were therefore considered suitable as experimental claims. An example of such a neutral claim is “Playing computer games has a positive impact on people’s sense of direction.” For each target group (US or UK), two additional unique claims with the largest non-significant p-values were added, such as “Listening to classical music helps students to absorb a lot of knowledge in a short period of time” for the US target group, and “Playing slow music in supermarkets raises their turnover” for the UK target group. This resulted in a set of 10 neutral claims for both the US (M = 3.03, SD = 0.18) and the UK (M = 3.01, SD = 0.18).
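As an illustration of this screening step, the sketch below runs one-sample t-tests against the scale midpoint on a hypothetical ratings matrix; the data and variable names are assumptions, not the study’s materials.

```python
import numpy as np
from scipy import stats

# Hypothetical pretest data: rows = participants, columns = 50 claims,
# ratings on a 5-point probability scale (1 = very improbable, 5 = very probable).
rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=(50, 50)).astype(float)

MIDPOINT = 3.0  # neutral point of the 5-point scale

# One-sample t-test per claim against the scale midpoint: claims whose mean
# rating does not deviate significantly from 3.00 are retained as neutral.
t_vals, p_vals = stats.ttest_1samp(ratings, popmean=MIDPOINT, axis=0)
neutral_claims = np.flatnonzero(p_vals > .05)
print(f"{neutral_claims.size} claims retained as neutral: {neutral_claims}")
```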

For each of the selected claims, four different variations of evidence were constructed: statistical evidence of high quality and low quality, and expert evidence of high quality and low quality. Statistical evidence presented the results of an American or a British study that demonstrated the effect presented in the claim. The quality of statistical evidence was manipulated through the number of cases in the study, and through the percentage of people in the study who had experienced the effect (cf. Hornikx and Hoeken 2007; Hornikx and Ter Haar 2013). Different percentages and sample sizes were used. An example of the high(low)-quality statistical evidence for the claim “Playing computer games has a positive impact on people’s sense of direction” is “A study among 314 (46) American/British people shows that playing computer games had a positive impact on their sense of direction for 78% (35%) of them.”

Expert evidence simply consisted of a university professor repeating the exact claim. The quality of this type of evidence was manipulated through the (in)congruence between the field of expertise of the professor on the one hand and the topic of the claim on the other. An example of high-(low-)quality expert evidence was “According to Dr. Ramos, a full professor in game research (linguistics) at the University of Pennsylvania, playing computer games has a positive impact on people’s sense of direction” for the US, and “According to Dr. Powell, a professor in game research (linguistics) at the University of Bath, playing computer games has a positive impact on people’s sense of direction” for the UK. Following earlier manipulations (e.g. Hornikx and Hoeken 2007; Hornikx and Ter Haar 2013), the low-quality evidence variants were low in quality only relative to the high-quality evidence (i.e. they may still be considered quite high in quality in absolute terms given the support of a professor).

In a second pretest, the manipulation of the fields of expertise was checked. For all twelve different claims (eight shared, two unique for the US, two unique for the UK), normatively strong and normatively weak expert evidence was constructed. Across two versions, each participant evaluated five normatively strong and five normatively weak instantiations of expert evidence (reward: $0.90 or £0.75). The 140 participants (UK: N = 70, age M = 39.01, SD = 14.40; 68% female; 53% Bachelor’s degree; US: N = 70, age M = 37.59, SD = 13.93; 53% male; 47% Bachelor’s degree) assessed on a 5-point scale the degree to which the professor possessed sufficient expertise to make the statement in question. Except for the first claim in the American sample, all manipulations were successful: the professors in the normatively strong expert evidence were found to possess a significantly higher level of expertise (M = 3.81, SD = 0.88) than the professors in the normatively weak expert evidence (M = 2.64, SD = 1.10). In the expert evidence that was not successfully manipulated, the strong and weak fields of expertise were modified in preparation for the main experiment.

In addition to the experimental claims with normatively strong or weak statistical or expert evidence, there were six filler claims supported either with anecdotal evidence (three times) or causal evidence (three times) (taken from Hornikx and Hoeken 2007). In the total of 16 claims presented to each participant in the experiment, names, cities, and universities were mentioned. The cross-cultural equivalence of these elements (see Harkness, Van de Vijver, and Mohler 2003) was ensured by using databases to select (1) given names and surnames that take the same relative position in frequency of use in the two countries, (2) cities that have the same relative position in number of inhabitants in the two countries, and (3) universities that have the same relative reputation in the two countries (Note 1). As a result, the British equivalent of the American expert Dr. Howard from the University of Chicago was Dr. Foster from Durham University, and the British equivalent of Randy Brooks from San Diego, used in anecdotal evidence, was Louis Gray from Manchester.

Participants

In response to a call to participate on Prolific (reward: $1.80), 237 American people took part in the study. On average, they were 37.87 years old (SD = 14.28; range: 19–77), and most of them were male (57.0%). Educational level varied between primary school and PhD, with a Bachelor’s degree as the most frequent category (44.3%). Participants were randomly assigned to one of the five versions of the material. Mean age (F(4, 232) < 1), sex distribution (χ²(8) = 6.79, p = .56), and educational level (χ²(16) = 5.28, p = .99) were not found to differ between participants in the five groups. UK citizens on Prolific were also invited to take part in the study (reward: £1.50); 231 people accepted the invitation. On average, the British participants were 38.30 years old (SD = 13.14; range: 18–76), and most of them were female (60.6%). Educational level varied between primary school and PhD, with a Bachelor’s degree as the most frequent category (46.3%). Participants were randomly assigned to one of the five versions of the material. Mean age (F(4, 226) = 1.282, p = .28), sex distribution (χ²(8) = 5.86, p = .66), and educational level (χ²(16) = 20.12, p = .22) were not found to differ between participants in the five groups.

Design

The experiment had a 2 (nationality: American, British) × 2 (quality of evidence: high, low) × 2 (type of evidence: expert, statistical) design, with Nationality as a between-subjects factor, and Quality of evidence and Type of evidence as within-subjects factors. The distribution of Quality of evidence and Type of evidence over the five different versions followed the Latin square design presented in Hornikx and Ter Haar (2013).

A single order of presentation of the ten experimental claims and the six filler claims was employed in all five versions (all claims can be found at https://osf.io/ja4hw). In each version, two experimental claims were supported by high-quality statistical evidence, two claims by low-quality statistical evidence, two claims by high-quality expert evidence, two claims by low-quality expert evidence, and two claims were not supported. Across the five versions, each claim was either supported by one of the four evidence variations, or not supported by evidence. For example, claim 1 was supported by strong statistical evidence in version 1, by strong expert evidence in version 2, by weak statistical evidence in version 3, by weak expert evidence in version 4, and was not supported in version 5. The cases in which a claim was not supported by evidence were crucial for measuring the persuasiveness of evidence. These cases served as a baseline, in which the acceptance of the claim itself was measured. This information was needed to compute the impact of the evidence in the other versions, in which the same claim was supported by evidence. This impact was the difference score between the claim acceptance with evidence on the one hand, and the claim acceptance without support on the other.
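To illustrate the rotation principle, the sketch below generates a cyclic Latin square consistent with the example above; it is an illustration, not necessarily the exact assignment used in Hornikx and Ter Haar (2013).

```python
# Sketch of a cyclic Latin-square rotation of five evidence conditions
# over ten claims and five versions (condition labels are illustrative).
CONDITIONS = ["stat_high", "expert_high", "stat_low", "expert_low", "none"]

def condition(claim: int, version: int) -> str:
    """Evidence condition for a claim (0-9) in a version (0-4)."""
    return CONDITIONS[(claim + version) % len(CONDITIONS)]

for version in range(5):
    assigned = [condition(claim, version) for claim in range(10)]
    # Every version contains each of the five conditions exactly twice ...
    assert all(assigned.count(c) == 2 for c in CONDITIONS)
    print(f"version {version + 1}: {assigned}")
# ... and, across the five versions, each claim meets every condition once.
```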

Instrumentation

To the participants, the experiment was introduced as a survey on social issues. First, the 16 claims with evidence were presented, each time followed by a repetition of the claim and the 5-point semantic differential “I think this claim is” with as endpoints “very improbable” and “very probable.”

After the 16 claim acceptance questions, participants filled in eight 6-point Likert items, selected by Fernandes, Lynch, and Netemeyer (2014) from the original 20 items of the Preference for Numerical Information scale (Viswanathan 1993); this preference might be related to sensitivity to the quality of statistical evidence. Example items of this scale, which was reliable in the American (α = 0.90) and the British (α = 0.92) sample, are “I enjoy work that requires the use of numbers” and “I find it satisfying to solve day-to-day problems involving numbers.” With a similar aim, the Understanding of Generalization scale was borrowed from Hornikx and Ter Haar (2013). Participants had to indicate on 5-point scales which of two examples they would prefer as proof for the generality of the occurrence of an effect: “the effect occurs in 35% of 46 persons – the effect occurs in 78% of 314 persons,” “the effect occurs in 78% of 46 persons – the effect occurs in 78% of 314 persons,” and “the effect occurs in 35% of 46 persons – the effect occurs in 35% of 314 persons.” The latter scale was only marginally reliable (US: α = 0.73; UK: α = 0.63).
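For readers who want to reproduce such a reliability check, here is a minimal sketch of Cronbach’s alpha on a simulated items matrix (the data are hypothetical, not the study’s).

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an items matrix (rows = respondents, cols = items)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()  # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)       # variance of the sum score
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

# Hypothetical example: eight 6-point Likert items from 100 respondents,
# simulated around a common trait so the items correlate.
rng = np.random.default_rng(1)
trait = rng.integers(1, 7, size=(100, 1))
items = np.clip(trait + rng.integers(-1, 2, size=(100, 8)), 1, 6)
print(f"alpha = {cronbach_alpha(items):.2f}")
```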

Next, the manipulation of the relevant and irrelevant fields of expertise was checked with 5-point Likert scales preceded by statements about the four specific experts that participants had encountered in their version of the material, such as “Dr. Powell is a professor in linguistics at the University of Bath. In that capacity, he possesses sufficient expertise to make a statement about the relationship between playing computer games and sense of direction.”

The final questions were about participants’ age, sex, nationality, and current education.

Procedure and statistical tests

Potential American and British participants were invited to respond to a survey presented on Prolific. After participation, they were thanked by the researcher, and reimbursed through Prolific.

The results section below presents the impact of evidence quality on claim acceptance. This impact was computed through the difference scores between a given claim with evidence, and the same claim without evidence. Scores across the five versions were recoded, so that each participant had two difference scores for high-quality statistical evidence, two scores for low-quality statistical evidence, two scores for high-quality expert evidence, and two scores for low-quality expert evidence. For each of the four variants, the two difference scores were averaged.
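A minimal sketch of this difference-score computation, assuming hypothetical long-format data (all column names and values below are illustrative):

```python
import pandas as pd

# Hypothetical long-format data: one row per participant x claim, with the
# evidence condition shown in that participant's version and the 1-5
# acceptance rating.
df = pd.DataFrame({
    "participant": [1, 1, 2, 2, 3, 3],
    "claim":       ["c1", "c2", "c1", "c2", "c1", "c2"],
    "condition":   ["stat_high", "expert_low", "none", "none",
                    "stat_high", "expert_low"],
    "acceptance":  [4, 3, 3, 3, 5, 2],
})

# Baseline: mean acceptance of each claim when presented without evidence.
baseline = df.loc[df["condition"] == "none"].groupby("claim")["acceptance"].mean()

# Impact of evidence = acceptance with evidence minus the claim's baseline,
# averaged per participant within each evidence variant.
supported = df.loc[df["condition"] != "none"].copy()
supported["impact"] = supported["acceptance"] - supported["claim"].map(baseline)
print(supported.groupby(["participant", "condition"])["impact"].mean())
```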

Repeated-measures analyses of variance with nationality (between-subjects), type of evidence (within-subjects), and quality of evidence (within-subjects) were conducted to address the research question.
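As a sketch of such an analysis in Python: pingouin’s mixed_anova handles one within-subject and one between-subject factor, so the example below covers only the key Quality × Nationality test, collapsed over evidence type; the study itself used a full 2 × 2 × 2 model, and the data here are simulated.

```python
import numpy as np
import pandas as pd
import pingouin as pg

# Simulated per-participant difference scores, collapsed over evidence type.
rng = np.random.default_rng(2)
rows = []
for nationality, n, quality_effect in [("US", 237, 0.05), ("UK", 231, 0.25)]:
    for i in range(n):
        pid = f"{nationality}_{i}"
        rows.append({"participant": pid, "nationality": nationality,
                     "quality": "high", "impact": rng.normal(0.35, 0.6)})
        rows.append({"participant": pid, "nationality": nationality,
                     "quality": "low",
                     "impact": rng.normal(0.35 - quality_effect, 0.6)})
df = pd.DataFrame(rows)

# Mixed ANOVA: quality (within subjects) x nationality (between subjects).
aov = pg.mixed_anova(data=df, dv="impact", within="quality",
                     subject="participant", between="nationality")
print(aov.round(3))
```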

Results

Preliminary analyses

It was first checked whether the ten experimental claims without evidence were indeed moderately probable, as pretested. The average claim acceptance of claims without evidence on a 5-point scale was M = 3.18 (SD = 0.83) in the US sample, and M = 3.01 (SD = 0.79) in the UK sample. While this mean acceptance did not significantly differ from the scale midpoint in the UK sample (t(230) = 0.25, p = .80), it did differ in the US sample (t(236) = 3.31, p < .001). With an average of M = 3.18 for the claims without evidence in the US, however, there was still ample room for claim acceptance to increase when evidence was added as support.

Second, the manipulation of high- and low-quality expert evidence was tested. US participants found the experts manipulated as strong (relevant field of expertise) to possess a higher level of expertise (M = 3.68 on a 5-point scale, SD = 0.83) than the experts manipulated as weak (irrelevant field of expertise) (M = 2.78, SD = 1.05), F(1, 236) = 115.92, p < .001, η² = 0.33. Similarly, the UK participants found the experts manipulated as strong to possess a higher level of expertise (M = 3.82, SD = 0.69) than the experts manipulated as weak (M = 2.54, SD = 0.89), F(1, 230) = 329.13, p < .001, η² = 0.59.

Main analyses

In order to address the research question of the impact of argument quality on claim acceptance, first a repeated-measures ANOVA with Nationality, Quality of evidence, and Type of evidence was conducted. The main effect of Quality of evidence (F(1, 466) = 24.43, p < .001, η² = 0.05; high-quality evidence: M = 0.32, SD = 0.58; low-quality evidence: M = 0.15, SD = 0.61) was qualified by a significant interaction between Quality of evidence and Nationality (F(1, 466) = 6.82, p = .009, η² = 0.01). Therefore, the impact of evidence quality on claim acceptance was assessed for the American and British participants separately.

For the American participants, the main effect of Quality of evidence on claim acceptance was not significant (F(1, 236) = 3.12, p = .080), and this was not qualified by Type of evidence, as the interaction between Type of evidence and Quality of evidence was not significant (F(1, 236) < 1). The main effect of Type of evidence was also not significant (F(1, 236) = 2.33, p = .128).

For the British participants, the main effect of Quality of evidence on claim acceptance was significant (F(1, 230) = 25.12, p < .001, η² = 0.10). This effect was qualified by a significant interaction between Quality of evidence and Type of evidence (F(1, 466) = 9.07, p = .003, η² = .04). Analyses for the two types of evidence separately showed that the effect of Quality of evidence was significant both for statistical evidence (F(1, 230) = 30.76, p < .001, η² = 0.12) and for expert evidence (F(1, 230) = 4.49, p = .035, η² = 0.02). The main effect of Type of evidence was not significant (F(1, 230) = 1.94, p = .165). The impact of evidence as a function of quality and type for the American and British samples is shown in Table 1.

Table 1. Persuasiveness of evidence as a function of quality, type, and nationality (significant differences between high- and low-quality evidence within a national sample are indicated with different superscripts).

Additional analyses

Participants’ distinction between the persuasiveness of strong and weak statistical evidence was expressed as a difference score. This score was examined for its potential relationship with the preference for numerical information and the understanding of generalization. On the Preference for Numerical Information scale, participants from both the US (M = 4.25, SD = 0.97) and the UK (M = 4.27, SD = 0.96) scored just above the scale midpoint of 3.5; their scores did not correlate with the difference score (US: r(237) = −0.03, p = .69; UK: r(231) = −0.01, p = .88). On the Understanding of Generalization scale, participants from both the US (M = 3.93, SD = 1.08) and the UK (M = 3.59, SD = 1.15) scored well above the scale midpoint of 3.0; the correlation with the difference score was significant for the British participants (r(229) = 0.216, p < .001), but not for the American participants (r(236) = 0.079, p = .225). For the British participants, a better understanding of generalization was associated with a larger difference between the persuasiveness of strong and weak statistical evidence.

Conclusion and discussion

There has been a longstanding research interest in the role of argument quality in belief formation, belief change, and persuasion. Several studies have taken normative criteria from argumentation theory to manipulate high-quality and low-quality arguments, and to assess their relative impact on claim acceptance (e.g. Hoeken, Šorm, and Schellens 2014; Hornikx and Hoeken 2007). While an impressive number of empirical studies on argument quality and persuasion involved participants from the US and UK (see, for a meta-analysis of studies, Carpenter 2015), no such study used these normative criteria as a basis for comparing variations in argument quality. For this reason, the impact of argument quality on claim acceptance using normative criteria from argumentation theory was investigated in the US and the UK.

Following criteria from the argumentation schemes perspective (cf. Walton, Reed, and Macagno 2008), arguments were constructed that were (not) in line with an important criterion for good argumentation. These arguments were used in support of different claims that a pretest had shown to be neutral to the participants (which the preliminary analyses largely confirmed). Participants from the UK and the US assessed these claims with arguments, and by comparing their acceptance of claims with and without arguments, the impact of argument quality on claim acceptance was computed. Results of the experiment showed that, for the UK, high-quality arguments were associated with higher claim acceptance than low-quality arguments, both in the case of expert evidence and in the case of statistical evidence. For the US, however, this association was not significant.

Two other cases – Hornikx and Hoeken (2007, S2, France) and Hornikx and Ter Haar (2013, S1 and S2, Germany) – have also reported non-significant results. That is, high-quality arguments were not found to lead to higher claim acceptance than low-quality arguments. While no cultural explanation was found in the latter case (Hornikx and Ter Haar 2013), for the former case (Hornikx and Hoeken 2007), another study found people’s cultural-educational background to be a partial explanation in the case of expert evidence (Hornikx 2011). It has proven very difficult to statistically attribute nationally observed differences to the notion of culture (see, e.g. Barrett 2020; Van de Vijver and Leung 1997). In the case of the US data in the present study, there does not seem to be a cultural reason for why the impact of argument quality was absent. In addition, other potential explanations can be discarded: the manipulation of field of expertise (as the criterion for expert evidence) was successful, and participants had a good understanding of the notion of generalization (related to the manipulated criterion for statistical evidence).

The means for high- and low-quality arguments in the US sample are compatible with the general effect, but the difference is simply not significant. Rather than testing for significance in individual studies, meta-analytically compiling effect sizes across studies presents a more generalizable picture of the effect under investigation (Allen 2009). Therefore, the data of similar studies on normative argument quality variations and claim acceptance reported in Demir and Hornikx (2022) were used, and combined with the data of the current paper in a random-effects meta-analytic procedure (JASP Team 2023). The meta-analysis finds strong evidence that claim acceptance is higher with high-quality than with low-quality arguments (mean r = 0.197, p < .001, 95% CI [0.146; 0.248], k = 26, N = 5,052) (Note 2). Adding the current study with American and British participants has not affected the overall effect of argument quality on claim acceptance (Note 3).
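For illustration, here is a self-contained sketch of such a random-effects meta-analysis of correlations (DerSimonian-Laird estimator on Fisher-z transformed values); the study itself used JASP, and the effect sizes below are hypothetical, not the actual data.

```python
import numpy as np
from scipy import stats

def random_effects_meta(r: np.ndarray, n: np.ndarray):
    """DerSimonian-Laird random-effects meta-analysis of correlations,
    pooled on the Fisher-z scale and back-transformed to r."""
    z = np.arctanh(r)        # Fisher z transform of the correlations
    v = 1.0 / (n - 3)        # within-study variance of z
    w = 1.0 / v              # fixed-effect weights

    # Between-study variance (tau^2), DL method-of-moments estimator.
    z_fixed = np.sum(w * z) / np.sum(w)
    q = np.sum(w * (z - z_fixed) ** 2)
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(r) - 1)) / c)

    # Random-effects pooled estimate, 95% CI, and p-value, back on the r scale.
    w_re = 1.0 / (v + tau2)
    z_re = np.sum(w_re * z) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    p = 2 * stats.norm.sf(abs(z_re) / se)
    return np.tanh(z_re), np.tanh([z_re - 1.96 * se, z_re + 1.96 * se]), p

# Hypothetical effect sizes (r) and sample sizes for k = 5 studies.
r = np.array([0.15, 0.22, 0.30, 0.10, 0.25])
n = np.array([120, 230, 190, 150, 310])
mean_r, ci, p = random_effects_meta(r, n)
print(f"mean r = {mean_r:.3f}, 95% CI [{ci[0]:.3f}; {ci[1]:.3f}], p = {p:.2g}")
```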

This study therefore seems to further underline the universality of norms for good argumentation (cf. Demir and Hornikx 2022; Oaksford 2014) by expanding the cultural contexts in which the relationship between normative argument quality variations and claim acceptance has been examined.

Disclosure statement

This project was funded by the Centre for Language Studies (RG2022-4).

Notes

1 The surnames selected took positions 75, 76, 77, 79, 80, 81, and 82 in a US database https://www.thoughtco.com/most-common-us-surnames-1422656, or in a UK database https://en.wiktionary.org/wiki/Appendix:English_surnames_(England_and_Wales). The given names took positions 75, 77, and 78 in the US database https://namecensus.com/first-names/common-male-first-names, or in the UK database https://www.netmums.com/pregnancy/100-traditional-english-names-for-boys. The cities took positions 5, 8, and 11 in the US database https://worldpopulationreview.com/countries/cities/united-states, or in the UK database https://worldpopulationreview.com/countries/cities/united-kingdom. Finally, the universities took positions 6, 8, 14, and 18 in the US database https://www.usnews.com/best-colleges/rankings/national-universities, or in the UK database https://www.thecompleteuniversityguide.co.uk/league-tables/rankings.

2 In this analysis, some effect sizes were dependent, as participants in some circumstances assessed high- and low-quality variations of more than one type of evidence. In an additional meta-analysis, effect sizes were therefore collapsed across comparisons for a given sample of participants. The effect from the main analysis was corroborated, thereby excluding dependency of effect sizes as a potential factor (mean r = 0.211, p < .001, 95% CI [0.146; 0.275], k = 18, N = 3,345).

3 A random-effects meta-analysis with the raw data provided by Demir and Hornikx (2022) shows: mean r = 0.204, p < .001, 95% CI [0.148; 0.260], k = 22, N = 4,116.

References

  • Allen, M. 2009. “Meta-Analysis.” Communication Monographs 76 (4): 398–407. https://doi.org/10.1080/03637750903310386.
  • Allen, M., and R. W. Preiss. 1997. “Comparing the Persuasiveness of Narrative and Statistical Evidence Using Meta-Analysis.” Communication Research Reports 14 (2): 125–131. https://doi.org/10.1080/08824099709388654.
  • Barrett, H. C. 2020. “Towards a Cognitive Science of the Human: Cross-Cultural Approaches and Their Urgency.” Trends in Cognitive Sciences 24 (8): 620–638. https://doi.org/10.1016/j.tics.2020.05.007.
  • Carpenter, C. J. 2015. “A Meta-Analysis of the ELM’s Argument Quality × Processing Type Predictions.” Human Communication Research 41 (4): 501–534. https://doi.org/10.1111/hcre.12054.
  • Chinn, S., and B. E. Weeks. 2021. “Effects of Competing Statistical and Testimonial Evidence in Debates about Science.” Environmental Communication 15 (3): 353–368. https://doi.org/10.1080/17524032.2020.1837900.
  • Demir, Y., and J. Hornikx. 2022. “Sensitivity to Argument Quality: Adding Turkish Data to the Question of Cultural Variability versus Universality.” Communication Research Reports 39 (2): 104–113. https://doi.org/10.1080/08824096.2022.2045930.
  • Edwards, K., and E. E. Smith. 1996. “A Disconfirmation Bias in the Evaluation of Arguments.” Journal of Personality and Social Psychology 71 (1): 5–24. https://doi.org/10.1037/0022-3514.71.1.5.
  • Fernandes, D., J. G. Lynch, Jr., and R. G. Netemeyer. 2014. “Financial Literacy, Financial Education, and Downstream Financial Behaviors.” Management Science 60 (8): 1861–1883. https://doi.org/10.1287/mnsc.2013.1849.
  • Hahn, U. 2020. “Argument Quality in Real World Argumentation.” Trends in Cognitive Sciences 24 (5): 363–374. https://doi.org/10.1016/j.tics.2020.01.004.
  • Hahn, U., and M. Oaksford. 2007. “The Rationality of Informal Argumentation: A Bayesian Approach to Reasoning Fallacies.” Psychological Review 114 (3): 704–732. https://doi.org/10.1037/0033-295X.114.3.704.
  • Harkness, J. A., F. J. R. Van de Vijver, and P. P. Mohler. 2003. Cross-Cultural Survey Methods. Hoboken, NJ: John Wiley and Sons.
  • Hoeken, H., J. Hornikx, and Y. Linders. 2020. “The Importance and Use of Normative Criteria to Manipulate Argument Quality.” Journal of Advertising 49 (2): 195–201. https://doi.org/10.1080/00913367.2019.1663317.
  • Hoeken, H., E. Šorm, and P. J. Schellens. 2014. “Arguing about the Likelihood of Consequences: Laypeople’s Criteria to Distinguish Strong Arguments from Weak Ones.” Thinking & Reasoning 20 (1): 77–98. https://doi.org/10.1080/13546783.2013.807303.
  • Hornikx, J. 2011. “Epistemic Authority of Professors and Researchers: Differential Perceptions by Students from Two Cultural-Educational Systems.” Social Psychology of Education 14 (2): 169–183. https://doi.org/10.1007/s11218-010-9139-6.
  • Hornikx, J., and J. De Best. 2011. “Persuasive Evidence in India: An Investigation of the Impact of Evidence Types and Evidence Quality.” Argumentation and Advocacy 47 (4): 246–257. https://doi.org/10.1080/00028533.2011.11821750.
  • Hornikx, J., and M. Ter Haar. 2013. “Evidence Quality and Persuasiveness: Germans Are Not Sensitive to the Quality of Statistical Evidence.” Journal of Cognition and Culture 13 (5): 483–501. https://doi.org/10.1163/15685373-12342105.
  • Hornikx, J., and H. Hoeken. 2007. “Cultural Differences in the Persuasiveness of Evidence Types and Evidence Quality.” Communication Monographs 74 (4): 443–463. https://doi.org/10.1080/03637750701716578.
  • JASP Team. 2023. “JASP (V. 0.17.1).” https://jasp-stats.org.
  • Karaslaan, H., A. Hohenberger, H. Demir, S. Hall, and M. Oaksford. 2018. “Cross-Cultural Differences in Informal Argumentation: Norms, Inductive Biases and Evidentiality.” Journal of Cognition and Culture 18 (3–4): 358–389. https://doi.org/10.1163/15685373-12340035.
  • Lumer, C. 2022. “An Epistemological Appraisal of Walton’s Argument Schemes.” Informal Logic 42 (1): 203–290. https://doi.org/10.22329/il.v42i1.7224.
  • Mercier, H., and D. Sperber. 2017. The Enigma of Reason. Cambridge, MA: Harvard University Press.
  • Oaksford, M. 2014. “Normativity, Interpretation, and Bayesian Models.” Frontiers in Psychology 5: 332. https://doi.org/10.3389/fpsyg.2014.00332.
  • O’Keefe, D. J. 2007. “Potential Conflicts between Normatively-Responsible Advocacy and Successful Social Influence: Evidence from Persuasion Effects Research.” Argumentation 21 (2): 151–163. https://doi.org/10.1007/s10503-007-9046-y.
  • Park, H. S., T. R. Levine, C. Y. Kingsley Westerman, T. Orfgen, and S. Foregger. 2007. “The Effects of Argument Quality and Involvement Type on Attitude Formation and Attitude Change: A Test of Dual-Process and Social Judgment Predictions.” Human Communication Research 33 (1): 81–102. https://doi.org/10.1111/j.1468-2958.2007.00290.x.
  • Petty, R. E., and J. T. Cacioppo. 1986. Communication and Persuasion: Central and Peripheral Routes to Attitude Change. New York: Springer.
  • Russell, T., and T. Reimer. 2018. “Using Semantic Networks to Define the Quality of Arguments.” Communication Theory 28 (1): 46–68. https://doi.org/10.1093/ct/qty003.
  • Van de Vijver, F. J. R., and K. Leung. 1997. Methods and Data Analysis for Cross-Cultural Research. Thousand Oaks, CA: Sage.
  • Viswanathan, M. 1993. “Measurement of Individual Differences in Preference for Numerical Information.” Journal of Applied Psychology 78 (5): 741–752. https://doi.org/10.1037/0021-9010.78.5.741.
  • Walton, D. N., C. Reed, and F. Macagno. 2008. Argumentation Schemes. Cambridge: Cambridge University Press.
  • Wojcieszak, M., R. Azrout, H. Boomgaarden, A. P. Alencar, and P. Sheets. 2017. “Integrating Muslim Immigrant Minorities: The Effects of Narrative and Statistical Messages.” Communication Research 44 (4): 582–607. https://doi.org/10.1177/0093650215600490.
  • Zebregs, S., B. Van den Putte, P. Neijens, and A. De Graaf. 2015. “The Differential Impact of Statistical and Narrative Evidence on Beliefs, Attitude, and Intention: A Meta-Analysis.” Health Communication 30 (3): 282–289. https://doi.org/10.1080/10410236.2013.842528.