Community-level ethnic diversity and community-level socio-economic development: evidence from 20 African countries

Pages 1-19 | Received 23 Jan 2023, Accepted 14 Oct 2023, Published online: 18 Nov 2023

ABSTRACT

There is evidence for a strong negative association between country-level socio-economic development and country-level ethnic diversity. One explanation for this association is that diversity is associated with less social capital, hampering co-operative economic activity and good governance. However, evidence at lower levels of geographical aggregation is mixed, with some evidence that community-level development is positively associated with community-level diversity. One explanation for this difference is that repeated inter-group contact mitigates the negative consequences of diversity and promotes the adoption of capacity-enhancing innovative practices. This paper uses household survey data from 20 African countries to explore the association of community-level development with both community-level diversity and diversity at a higher level of geographical aggregation. Within this single dataset, we find strong evidence that the first association is positive and the second is negative. We also provide some evidence that the positive association is at least partly explained by an innovation channel.

1. Introduction

Following Easterly and Levine (1997), there is a large literature demonstrating a negative association between national socio-economic outcomes and national-level ethno-linguistic fractionalization (ELF). ELF is measured as the probability that two randomly selected individuals will have different mother tongues. Possible causal mechanisms include a negative effect of ELF on social capital and the quality of public goods and a positive effect of ELF on the incidence of civil conflict. However, the relationship between community-level ELF and community-level development could be quite different from the national-level relationship. In social psychology, inter-group contact theory predicts that under certain conditions, repeated interaction between individuals from different ethnic groups will reduce the prevalence of prejudice towards out-groups. Moreover, this interaction can erode resistance to innovations that enhance labour productivity and community-level development. This paper uses household-level data from the Demographic Health Surveys to estimate the association of community-level socio-economic outcomes with both community-level ELF and ELF at a higher level of geographical aggregation. We find strong evidence that the first association is positive and the second is negative. There is some evidence that these associations largely reflect a causal effect of ELF on the outcomes, and that the positive community-level effect arises at least partly because diversity is associated with less resistance to innovation.

2. Literature review

Easterly and Levine’s finding of a negative association between ELF and economic growth has been confirmed using a variety of statistical models and datasets: see for example Alesina and Ferrara (2005) and Campos, Saleh, and Kuzeyev (2011). Other studies have explored the channels through which there might be a (negative) causal effect of ELF on economic growth, including public goods provision, social capital, and the incidence of civil conflict.

Habyarimana, Humphreys, Posner, and Weinstein (2007) outline the different ways in which ELF could affect public goods provision. Firstly, greater ethnic diversity could be associated with greater heterogeneity in preferences. For example, cultural differences could create variation across ethnic groups in the relative value placed on certain types of public good. Disagreement about which goods to provide could lead to a non-co-operative equilibrium with under-provision of all types of good. When individual groups are concentrated in specific parts of a country or province, the non-co-operative equilibrium could also arise from disagreement about the geographical distribution of public expenditure. Secondly, there may be less reciprocal trust or altruism across groups than there is within groups, and this may mean less support for public good provision in ethnically diverse polities. Thirdly, a shared culture and language may facilitate strategies for promoting collective action, and shared membership of a social network based on ethnic identity may facilitate the punishment of non-co-operators. Cross-country studies examining the association of ELF with public goods provision, trust and social capital include Alesina, Baqir, and Easterly (1999), Alesina and Ferrara (2000), and Mavridis (2015). [Footnote 1]

A closely related literature explores the association between ethnic diversity and civil conflict. Montalvo and Reynal-Querol (2005) show that there is a positive cross-country association between polarization and the prevalence of conflict, where polarization is a measure of how close a country is to comprising two equally sized ethnic groups, while Anyanwu (2014) reports a positive association of ELF with conflict prevalence in Africa. Desmet, Ortuño-Ortín, and Wacziarg (2012) show that conflict prevalence is higher when there is a high level of fractionalization based on language groups rather than individual languages. One interpretation of this result is that societies speaking very different languages (i.e. societies whose common ancestral language was spoken in the very distant past) have very different preferences.

In the literature surveyed so far, the focus is largely on the association between ELF and socio-economic outcomes at the national level, but there are also some studies of ELF at lower levels of geographical aggregation. (The two ELF concepts are very different: a country comprised entirely of villages with a 50–50 mix of Groups A and B has the same national ELF as a country where half of the villages contain only Group A and half contain only Group B.) When the unit of aggregation is a region or province within a country, the results are mixed. For example, Gerring, Thacker, Lu, and Huang (2015) analyze data for around 270 regions in 36 developing countries, finding a positive association between regional development and regional ELF. On the other hand, Gershman and Rivera (2018) analyze data for around 400 provinces in 36 African countries, finding a negative association. However, consistent with Desmet et al. (2012), this negative association is much stronger when ELF is measured using language groups rather than individual languages.
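The parenthetical comparison of the two ELF concepts can be made concrete with a short worked calculation. It is only a sketch under two assumptions that are not stated at this point in the text: all villages are the same size, and ELF is computed as one minus the sum of squared group shares.

```latex
\begin{align*}
\text{National ELF (both countries)} &= 1 - \left(\tfrac{1}{2}\right)^2 - \left(\tfrac{1}{2}\right)^2 = 0.5 \\
\text{Village ELF (every village mixed 50--50)} &= 1 - \left(\tfrac{1}{2}\right)^2 - \left(\tfrac{1}{2}\right)^2 = 0.5 \\
\text{Village ELF (every village homogeneous)} &= 1 - 1^2 - 0^2 = 0
\end{align*}
```

Under these assumptions the two countries are indistinguishable at the national level, while their village-level values sit at opposite ends of the range attainable with two equally sized groups.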

Results using community-level data are also mixed, but a pattern emerges. In a number of countries, there is evidence for a negative association of community-level ELF with local public goods provision: see for example Khwaja (2009) and Okten and Osili (2004). However, there is evidence from other countries for a positive association of community-level ethnic diversity with other outcomes, including women’s empowerment and maternal and child health: see for example Cloward (2015), Dinku and Regasa (2021), Fielding and Lepine (2017), and Lépine and Strobl (2013). Glennerster, Miguel, and Rothenberg (2013) report a precisely estimated zero. [Footnote 2] Gisselquist, Leiderer, and Nino-Zarazua (2016) find a negative association of community-level ELF with public goods provision in Zambia, but nevertheless find a positive association with health outcomes.

Why might there be a positive association of community-level ELF with development outcomes, despite a negative association with the quality of local public goods? First, the theory of inter-group contact originating with Allport (1954) suggests that under certain conditions, repeated contact between members of different groups will reduce the intensity of inter-group prejudice, so that the negative mechanisms discussed above in relation to country-level ELF are weaker. [Footnote 3] Inter-group contact theory suggests that in a country where each town and village is ethnically diverse, there may be lower levels of ethnic prejudice than in a country with the same national level of ELF but with ethnically homogeneous towns and villages. In the first case, the lower levels of prejudice will tend to mitigate the mechanisms associated with the negative association of ELF with socio-economic development.

Second, repeated personal interaction between heterogeneous individuals might facilitate higher rates of labour productivity and economic development. Here, the literature on urbanisation, social norms and economic development is relevant. Theories about the relationship between urbanisation and social norms date back to Wirth (1938) and have been further developed by Fischer (1975), who makes the following argument.

The more urban a place, the greater its subcultural variety … The more urban a place, the more numerous the sources of diffusion and the greater the diffusion into a subculture. Diffusion refers to the adoption by members of one subculture of beliefs or behaviors of another … [R]ates of unconventionality will be increased in larger communities by the process of diffusion into the mainstream culture of behaviors and beliefs from the periphery.

In this way, the cultural heterogeneity of the city tends to erode conservative social norms. These norms may have evolved because they promote social cohesion and co-operative solutions to collective action problems, but they may also inhibit economic development during periods of rapid technological innovation. Such innovation could provide new employment opportunities (e.g. skilled jobs for women) or new products (e.g. vaccines) that more conservative communities are slower to adopt. A large number of country-specific studies have found circumstantial evidence for Fischer’s proposition in the form of a negative association between population density and socially conservative beliefs. Using World Values Survey data, both Huggins and Debies-Carl (2015) and Luca, Terrero-Davila, Stein, and Lee (2023) present international evidence for such an association; several of their measures of social conservatism are related to intolerance towards minority groups. At the same time, higher levels of tolerance may be associated with the adoption of innovations that raise labour productivity, either because they remove barriers to the communication of new ideas, or because they erode suspicion towards these ideas. Authors reporting an association of local levels of tolerance with local innovation include Bernard, Collion, De Janvry, Rondot, and Sadoulet (2008) for Burkina Faso, Schmutzler and Lorenz (2018) for Latin America, and Wang, Wei, and Lin (2019) for China. Using national-level data, Ruck, Bentley, and Lawson (2020) find that measures of cosmopolitanism predict future increases in GDP per capita and school enrolment rates.
Evans (2018, 2019) focuses on the gender dimension of this association, showing that in both Cambodia and Zambia, women are more likely to perform non-traditional social and economic roles in urban areas than they are in rural ones, and that this influences the attitudes of migrants to urban areas.

However, Meili and Shearmur (2019) point out that although on average cities are more culturally diverse than rural areas, diversity is not a uniquely urban phenomenon: some rural areas exhibit both high diversity and high rates of innovation. The processes outlined by Fischer (1975) could also pertain to culturally diverse rural areas. Across rural areas, greater diversity could be associated with the erosion of socially conservative norms and less resistance to innovation. In this paper, we show that across rural communities in 20 African countries, higher community-level ELF is indeed associated with better community-level development outcomes, on average, even though ELF at a higher level of geographical aggregation (the province) is negatively associated with these outcomes. We believe that ours is the first paper to compare the effects of ELF at different levels of geographical aggregation within the same dataset. We also present evidence of a positive association of ELF with the adoption of new practices, and evidence that these positive associations are smaller among communities with deeply rooted conservative beliefs, consistent with the conjecture that ELF erodes cultural barriers to innovation only in communities where social norms are somewhat plastic. The next section describes the data that we use. [Footnote 4]

3. Data

3.1. Overview

Our analysis employs data from waves 6–7 of the USAID Demographic Health Surveys (DHS) for 20 countries in Sub-Saharan Africa (USAID, 2021). Our variables are measured at the DHS enumeration area level; these enumeration areas are usually based on population census enumeration areas, one area typically comprising a single large village or a group of small villages, or a suburb of a town or city (USAID, 2012). The stratified sampling design distinguishes between rural and urban areas at the outset, so that the rural sample in a given province is representative of the province’s rural population and the urban sample is representative of the province’s urban population. The probability that a rural (urban) enumeration area is selected into the sample is proportional to its population in the most recent census, relative to the total rural (urban) population of the province. The households to be surveyed are selected at random from within an enumeration area.

Later on, we will address questions about the endogeneity of ELF in our sample. The potential for endogeneity is much greater in urban areas, where rates of immigration from other parts of the country (and possibly from abroad) are typically very much higher than in rural areas. For this reason, our analysis is confined to the rural enumeration areas in the DHS. Our analysis employs four alternative socio-economic outcome variables drawn from the DHS, and one measure of ethnic diversity. We restrict our attention to data based on the DHS so that observations at the enumeration area level are comparable across countries. From now on, DHS enumeration areas are called ‘clusters’; we use the term ‘community-level’ when discussing a theoretical relationship between variables and the term ‘cluster-level’ when discussing the corresponding empirical measures.

3.2. The socio-economic outcomes

Although there is some variation across countries in the set of questions in the DHS, all surveys are based on a common model questionnaire, and the four socio-economic outcome variables that we use are based on questions common to all surveys. These variables are drawn from different components of the DHS for each country. First, for each woman aged between 15 and 49, the individual-level component of the survey contains an indicator variable for whether she can read basic sentences in her mother tongue. The fraction of women in a cluster who can read is denoted literacy: a histogram for this variable appears in Figure 1, with summary statistics in Table 1. Clusters with very low literacy rates are more frequent than clusters with very high rates, but a substantial proportion of clusters have literacy rates above 75%. The main text focuses on the results for this outcome, with results for the other three outcomes appearing in the Supplementary Materials. The other three outcomes are (i) a cluster-level average of household wealth based on information in the household-level component of the DHS, (ii) a cluster-level child mortality rate based on information in the individual-level component of the DHS, and (iii) a measure of average night-time luminosity in the cluster based on information in the DHS geospatial covariates dataset. [Footnote 5]

Figure 1. Histogram for literacy.

Table 1. Descriptive statistics.

3.3. Cluster-level fractionalization

Our measure of cluster-level fractionalization is based on information in the individual-level component of the survey. The survey records the self-reported ethnicity of each respondent in the sample. Denoting the fraction of individuals in a cluster identifying with ethnic group n as $f_n$, the cluster-level measure of fractionalization is computed as $1 - \sum_n f_n^2$. Existing evidence from microeconomic studies suggests that the positive effects of inter-group contact are evident at a low level of aggregation, i.e. among groups defined in terms of individual languages rather than language groups; this is consistent with the evidence in Gershman and Rivera (2018) that negative provincial-level associations are strongest at a high level of language aggregation. We therefore focus on cluster-level fractionalization defined in terms of individual ethnicity, denoted ethfrac, without any aggregation of ethnic groups. The availability of ethfrac restricts our sample to 20 countries: in some countries, the DHS omits the question about ethnicity. The Supplementary Materials contain a list of the countries in waves 6–7 of the DHS that are included in our sample, along with the number of clusters in each country. In total, there are 6782 rural clusters across the 20 countries. [Footnote 6] (As explained below, some of our results are based on a sample of 6175 clusters in 19 countries, excluding the Democratic Republic of Congo.)
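The fractionalization formula above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function name `fractionalization` and the example clusters are hypothetical, and respondents are given equal weight.

```python
from collections import Counter


def fractionalization(ethnicities):
    """Return 1 - sum_n f_n^2 for a list of self-reported ethnicity labels.

    f_n is the share of respondents in the cluster reporting group n.
    Assumption: every respondent carries equal weight.
    """
    if not ethnicities:
        raise ValueError("cluster has no respondents")
    total = len(ethnicities)
    counts = Counter(ethnicities)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


# Hypothetical clusters of 20 respondents each.
homogeneous = fractionalization(["A"] * 20)
even_split = fractionalization(["A"] * 10 + ["B"] * 10)
three_way = fractionalization(["A"] * 10 + ["B"] * 6 + ["C"] * 4)
```

A fully homogeneous cluster scores zero, and the score rises as respondents are spread more evenly across more groups.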

Figures 2–3 and Table 1 provide information about the distribution of ethfrac. Figure 2 shows that some countries have relatively high average values of cluster-level fractionalization (for example, Burkina Faso), and others have relatively low average values (for example, Ethiopia). Nevertheless, in each country, there are individual clusters with fractionalization values close to zero and others with fractionalization values close to one. In other words, there is substantial variation within countries and substantial variation between countries. Similarly, Table 1 shows that across all 20 countries, the within-province standard deviation of ethfrac is similar in size to the between-province standard deviation. (Here, ‘province’ refers to the highest level of sub-national administration.) The histogram in Figure 3 shows that the modal value of ethfrac is zero, and about one third of all clusters are completely ethnically homogeneous. Nevertheless, 19 percent of clusters have a value of ethfrac ≥ 0.5, i.e. the probability that two randomly selected individuals will be from different ethnic groups is at least as high as the probability that they will be from the same group.

Figure 2. Values of ethfrac. Each point on the map corresponds to a rural cluster in the DHS. Dark blue points correspond to the lowest values of ethfrac, and dark red points correspond to the highest values.

Figure 3. Histogram for ethfrac.

Figure 4 shows a scatterplot of cluster-level literacy rates (relative to the province-level mean) against ethfrac (relative to the province-level mean). There is a great deal of variation in literacy rates, and cluster-level fractionalization explains only a small proportion of this variation. Nevertheless, the least-squares regression line (shown in green in the figure) is positively sloped and implies that on average, a maximally diverse cluster has a literacy rate that is about five percentage points higher than does a minimally diverse cluster.
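The regression line described above relates literacy to ethfrac after both have been expressed relative to their province-level means. The following Python sketch shows one way such a slope could be computed; the function name `within_province_slope` and the four data points are hypothetical and are not drawn from the DHS.

```python
from collections import defaultdict


def within_province_slope(rows):
    """Least-squares slope of demeaned literacy on demeaned ethfrac.

    rows: iterable of (province, ethfrac, literacy) tuples.
    Both variables are demeaned within each province before the slope
    is computed, so only within-province variation is used.
    """
    groups = defaultdict(list)
    for province, x, y in rows:
        groups[province].append((x, y))

    sum_xy = 0.0
    sum_xx = 0.0
    for observations in groups.values():
        mean_x = sum(x for x, _ in observations) / len(observations)
        mean_y = sum(y for _, y in observations) / len(observations)
        for x, y in observations:
            sum_xy += (x - mean_x) * (y - mean_y)
            sum_xx += (x - mean_x) ** 2

    if sum_xx == 0.0:
        raise ValueError("no within-province variation in ethfrac")
    return sum_xy / sum_xx


# Hypothetical data: two provinces with two clusters each.
toy_rows = [
    ("P1", 0.0, 0.40), ("P1", 0.5, 0.45),
    ("P2", 0.2, 0.70), ("P2", 0.6, 0.72),
]
slope = within_province_slope(toy_rows)
```

Because ethfrac lies between zero and one, a slope of this kind can be read as the difference in the literacy rate between a maximally and a minimally diverse cluster within the same province.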

Figure 4. Scatterplot of literacy and ethfrac (with the least-squares regression line in green).

Figure 5. Unconditional correlations of ethfrac-p with G&R’s ELF measures.

Figure 6. Unconditional correlations of ethfrac with G&R’s ELF measures and with ethfrac-p.

3.4. Province-level ELF

It is possible to construct a province-level analogue of ethfrac: denoting the fraction of individuals in all of the clusters of a province belonging to group n as $g_n$, the province-level fractionalization measure is computed as $1 - \sum_n g_n^2$. The level of fractionalization across the province in which a cluster is located is denoted ethfrac-p. Figures 5–6 show the strength of the association between (i) ethfrac, (ii) ethfrac-p, and (iii) the alternative measures of ELF for the province in Gershman and Rivera (2018). (Gershman and Rivera do not include data from the Democratic Republic of Congo, so these figures are based on the sample of 6175 clusters.) The alternative measures are based on differing degrees of aggregation across languages, and in Figure 5, ELF(k) denotes a province-level ELF measure using the kth level of language aggregation, where a higher value of k indicates a lower level of aggregation, i.e. a lower taxon in the taxonomy of languages. Gershman and Rivera use province-level data on mother tongues from both the DHS and other sources. This means that ethfrac-p is not identical to Gershman and Rivera’s measure of ELF with low language aggregation, denoted ELF(12) in their paper. Nevertheless, as shown in Figure 5, the coefficient of correlation between ethfrac-p and ELF(12) is equal to about 0.7, so there is some degree of consistency between the two measures. Correlations of ethfrac-p with other measures in Gershman and Rivera are much smaller: for ELF(1), which employs the highest degree of language aggregation, the coefficient of correlation is equal to about 0.1. Figure 6 shows coefficients of correlation between ethfrac and the different province-level fractionalization measures. These correlation coefficients are generally much smaller than the ones in Figure 5, but they are all positive.
Given the existing evidence that the negative association of development outcomes with ELF at higher levels of geographical aggregation is stronger when the ELF measure employs a higher degree of language aggregation, our main statistical model will use ELF(2) as the measure of province-level fractionalization. Descriptive statistics for ELF(2) appear in Table 1. Results using other measures are available on request; note that there is very little variation in ELF(1) across provinces.
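The relationship between ethfrac and ethfrac-p can be sketched by computing both from the same respondent records. The code below is a hypothetical illustration: the function names and the toy province are invented, and respondents are pooled across clusters without survey weights, which is an assumption rather than a documented feature of the paper's calculation.

```python
from collections import Counter, defaultdict


def fractionalization(labels):
    """Return one minus the sum of squared group shares."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def cluster_and_province_values(respondents):
    """Compute cluster-level and province-level fractionalization.

    respondents: iterable of (province, cluster, ethnicity) tuples.
    Returns two dicts: one keyed by (province, cluster), one by province.
    """
    by_cluster = defaultdict(list)
    by_province = defaultdict(list)
    for province, cluster, ethnicity in respondents:
        by_cluster[(province, cluster)].append(ethnicity)
        by_province[province].append(ethnicity)
    ethfrac = {key: fractionalization(v) for key, v in by_cluster.items()}
    ethfrac_p = {key: fractionalization(v) for key, v in by_province.items()}
    return ethfrac, ethfrac_p


# Hypothetical province with one mixed cluster and one homogeneous cluster.
respondents = (
    [("North", 1, "A")] * 8 + [("North", 1, "B")] * 2 + [("North", 2, "B")] * 10
)
ethfrac, ethfrac_p = cluster_and_province_values(respondents)
```

In this toy province the pooled value (0.48) exceeds both cluster values (0.32 and 0), which illustrates why the two measures need not be strongly correlated.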

4. Quantifying the association between literacy, cluster-level fractionalization, and province-level fractionalization

It is possible that both cluster-level fractionalization and socio-economic outcomes such as literacy are associated with other cluster-level characteristics. We are not directly interested in these associations, so we focus on the association of literacy with cluster-level fractionalization (ethfrac) and province-level fractionalization (ELF(2)) conditional on other observable cluster-level characteristics. Since literacy is a fractional variable bounded in the interval [0,1], we fit a fractional Probit regression equation:

$$\text{literacy}_i = \Phi\left(\sum_k \delta_k x_{ik} + \sum_c \gamma_c + \beta\,\text{ethfrac}_i + \theta\,\text{ELF(2)}_p + \varepsilon_i\right) \tag{1}$$

Here, $\Phi(\cdot)$ is the cumulative distribution function of the standard Normal distribution; the subscripts i, p and c denote a particular cluster i located in province p in country c; the term $x_{ik}$ stands for the kth cluster-level characteristic; $\gamma_c$ is a country-level fixed effect; $\varepsilon_i$ is a regression residual, and other Greek letters stand for parameters to be estimated. [Footnote 7] These characteristics include mean annual precipitation, temperature, and evapotranspiration within a ten-kilometre radius of the geographic centre of the cluster; the distance to the nearest international border; the travel time to a town of at least 50,000 people; and the distance to the nearest lake or coastline. Also included is a measure of the total population within a ten-kilometre radius. The Supplementary Materials contain further details about these variables, including descriptive statistics and explanations of why each of them might be correlated with both literacy and ethfrac; all of the control variables are taken from the DHS geospatial covariates datasets. Note that the fixed effects in equation (1) capture all heterogeneity at the country level, including variation in country-level fractionalization.

It is possible to estimate the parameters in equation (1) using an equal weight for each observation, and results employing such an approach are reported below. However, there are two reasons why these results may be misleading. First, the aggregate sample (with data from all of the countries) is not representative of the total rural population of this part of Africa, because the size of the DHS is not proportional to the size of a country’s total rural population. Larger countries tend to have larger surveys, but they are not proportionately larger, so a rural community in a small country is more likely to be selected into the sample than is a rural community in a large country. One way to deal with this problem is to weight each observation by the ratio of the province’s rural population to the number of rural sample clusters in the province. The results from this weighted regression will be appropriate for hypotheses about the population of all rural communities across all of the countries in our sample, and we report a second set of results using this approach. However, these results are heavily influenced by observations in the two largest countries (Nigeria and Ethiopia), which constitute over one third of the sample by total population. We therefore report a third set of results with weights equal to the inverse of the number of clusters in the country, i.e. a regression in which each country has equal weight. [Footnote 8]
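The three weighting schemes described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `regression_weights`, the country and province labels, and the population figures are all hypothetical.

```python
from collections import Counter


def regression_weights(clusters, province_rural_population):
    """Return the three alternative weights for each sampled cluster.

    clusters: list of (country, province) tuples, one per rural cluster.
    province_rural_population: dict mapping (country, province) to the
        province's rural population (hypothetical figures here).
    """
    clusters_per_province = Counter(clusters)
    clusters_per_country = Counter(country for country, _ in clusters)

    weights = []
    for country, province in clusters:
        key = (country, province)
        weights.append({
            # Scheme 1: every observation counts equally.
            "uniform": 1.0,
            # Scheme 2: province rural population per sampled rural cluster.
            "population": province_rural_population[key] / clusters_per_province[key],
            # Scheme 3: inverse of the number of clusters in the country.
            "equal_country": 1.0 / clusters_per_country[country],
        })
    return weights


# Hypothetical sample: three clusters in country X, one in country Y.
sample = [("X", "X1"), ("X", "X1"), ("X", "X2"), ("Y", "Y1")]
rural_population = {("X", "X1"): 1000, ("X", "X2"): 300, ("Y", "Y1"): 5000}
weights = regression_weights(sample, rural_population)
```

Under the third scheme the weights within each country sum to one, so each country contributes equally to the estimates regardless of how many clusters were sampled there.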

The second potential problem is that the residual $\varepsilon_i$ might be spatially correlated. For this reason, we report a fourth set of results based on a linear regression equation for which we can compute spatially robust standard errors using the method of Conley (1999) [Footnote 9]:

$$\text{literacy}_i = \sum_k \delta_k x_{ik} + \sum_c \gamma_c + \beta\,\text{ethfrac}_i + \theta \times \text{ELF(2)}_p + \varepsilon_i \tag{2}$$

Estimates of $\hat{\delta}_k$ appear in the Supplementary Materials but are not reported in the main text. Here we focus on the estimates of $\hat{\beta}$ and $\hat{\theta}$, and, when considering estimates using equation (1), the corresponding marginal effects $\Phi' \cdot \hat{\beta}$ and $\Phi' \cdot \hat{\theta}$.

Table 2 shows our four alternative estimates of $\hat{\beta}$ and $\hat{\theta}$, along with the corresponding standard errors and marginal effects evaluated at the mean value of literacy. In all cases, $\hat{\beta} > 0 > \hat{\theta}$, and the estimates are significantly different from zero at the five percent level. Estimates of $\Phi' \cdot \hat{\beta}$ in the fractional Probit model are between 0.03 and 0.05, and in the linear model, $\hat{\beta} \approx 0.05$. This implies that the literacy rate in a maximally diverse cluster (ethfrac = 1) can be expected to be between three and five percentage points higher than in a minimally diverse cluster (ethfrac = 0). In other words, the results from our model are very similar to the simple scatterplot in Figure 4. Estimates of $\Phi' \cdot \hat{\theta}$ in the fractional Probit model are between –0.10 and –0.11, and in the linear model, $\hat{\theta} \approx -0.12$. This implies that the literacy rate in a cluster with maximal province-level diversity (ELF(2) = 1) can be expected to be between ten and twelve percentage points lower than in a cluster with minimal province-level diversity (ELF(2) = 0). The standard deviations of the two diversity variables are similar in size, implying that the negative consequences of province-level diversity are somewhat larger than the positive consequences of cluster-level diversity, on average. Tables in the Supplementary Materials show similar results for wealth and night-time light intensity, except that the estimates of $\hat{\beta}$ are larger in absolute value than the estimates of $\hat{\theta}$. In the results for child mortality, only the estimates of $\hat{\theta}$ are significantly different from zero.
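The step from a Probit coefficient to a marginal effect can be sketched with the Python standard library. The sketch assumes that "evaluated at the mean value of literacy" means evaluating the Normal density at the index z for which Φ(z) equals mean literacy; that reading, the function name `probit_marginal_effect`, and the input values are assumptions for illustration and are not estimates from the paper.

```python
from statistics import NormalDist


def probit_marginal_effect(coefficient, mean_outcome):
    """Return Phi'(z) * coefficient, where Phi(z) equals mean_outcome.

    Phi' is the standard Normal density. mean_outcome must lie strictly
    between 0 and 1 so that the inverse CDF is defined.
    """
    standard_normal = NormalDist()
    index = standard_normal.inv_cdf(mean_outcome)
    return standard_normal.pdf(index) * coefficient


# Hypothetical inputs: a coefficient of 0.10 and a mean literacy rate of 0.5.
effect = probit_marginal_effect(0.10, 0.5)
```

Since the standard Normal density never exceeds about 0.4, a marginal effect computed this way is always well below the underlying Probit coefficient in absolute value.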

Table 2. Results from the models of literacy with country fixed effects (N = 6175).

5. Exploring the endogeneity of fractionalization

The strong association of literacy with ethfrac and ELF(2) does not necessarily entail a causal effect of diversity on literacy rates: even when considering rural areas, where immigration is unlikely to be a major factor, it is still possible that the association is a consequence of unobserved heterogeneity, and that the diversity variables are endogenous regressors. We explore the likely extent of unobserved heterogeneity in two ways. First, in relation to the potential endogeneity of ethfrac, it is possible to replace the country fixed effects with province fixed effects, fitting the following fractional Probit model:

$$\text{literacy}_i = \Phi\left(\sum_k \delta_k x_{ik} + \sum_p \varphi_p + \beta\,\text{ethfrac}_i + \varepsilon_i\right) \tag{3}$$

and the corresponding linear model:

$$\text{literacy}_i = \sum_k \delta_k x_{ik} + \sum_p \varphi_p + \beta\,\text{ethfrac}_i + \varepsilon_i \tag{4}$$

Note that ELF(2) is excluded from these regression equations because province-level diversity is perfectly correlated with the province fixed effects $\varphi_p$. Equations (3) and (4) are robust to all province-level unobserved heterogeneity. Table 3, which is presented in the same format as Table 2, presents the estimates of $\hat{\beta}$ and $\Phi' \cdot \hat{\beta}$ conditional on province fixed effects. These estimates are significantly different from zero at the five percent level and almost identical in size to the corresponding estimates in Table 2. In other words, if the estimates of $\hat{\beta}$ are influenced by unobserved heterogeneity, this must be idiosyncratic cluster-level heterogeneity. The Supplementary Materials show similar results for the other three development outcomes.

Table 3. Results from the models of literacy with province fixed effects (N = 6782).

Second, we explore the question of unobserved heterogeneity by adopting the approach of Altonji, Elder, and Taber (2005), Nunn and Wantchekon (2011), and Oster (2019). This involves comparing the estimate of $\theta$ in equation (1) and the estimate of $\beta$ in equation (3) with estimates in equations that omit all of the control variables $x_k$:

$$\text{literacy}_i = \Phi\left(\sum_c \gamma_c + \theta\,\text{ELF(2)}_p + \varepsilon_i\right) \tag{1a}$$

$$\text{literacy}_i = \Phi\left(\sum_p \varphi_p + \beta\,\text{ethfrac}_i + \varepsilon_i\right) \tag{3a}$$

The estimate of $\theta$ in equation (1) is denoted $\hat{\theta}_1$ and the estimate of $\theta$ in equation (1a) is denoted $\hat{\theta}_0$. Similarly, the estimate of $\beta$ in equation (3) is denoted $\hat{\beta}_1$ and the estimate of $\beta$ in equation (3a) is denoted $\hat{\beta}_0$. The following explanation is phrased in terms of $\beta$ and ethfrac but also applies to $\theta$ and ELF(2). If $(\hat{\beta}_1 - \tau)/(\hat{\beta}_0 - \hat{\beta}_1) = m$, then for the causal effect of ethfrac on literacy to be equal to $\tau$ (or smaller), the bias due to the omission of the unobserved cluster-level effects would have to be (at least) m times as large as the bias due to the omission of the control variables. [Footnote 10] If m is very large in absolute value, then the likelihood that the causal effect is no greater than $\tau$ is correspondingly small. Considering the case of a least-squares regression equation, Oster notes that this logic depends on the assumption that the variance of the unobserved effects is equal to the variance of the observed effects captured by $x_k$. If the variance of the observed effects is smaller, then m will overestimate how large the bias due to the unobserved effects needs to be for the causal effect of ethfrac to be equal to $\tau$. Oster therefore proposes a correction using the $R^2$ statistics in the two regression equations: $[(R^2_{\max} - R^2_1)/(R^2_1 - R^2_0)] \cdot (\hat{\beta}_1 - \tau)/(\hat{\beta}_0 - \hat{\beta}_1)$, where $R^2_1$ is the $R^2$ statistic corresponding to the estimate of $\hat{\beta}_1$, $R^2_0$ is the $R^2$ statistic corresponding to the estimate of $\hat{\beta}_0$, and $R^2_{\max}$ is the value of the $R^2$ statistic in a hypothetical regression equation including variables that capture all of the bias-inducing heterogeneity. Oster suggests that a conservative estimate of $R^2_{\max}$ would be $R^2_{\max} = 1.3R^2_1$, but the maximum theoretically possible value is $R^2_{\max} = 1$. While it is not possible to apply this adjustment to estimates from a fractional Probit model (for which there is no $R^2$ statistic), it is possible to apply the adjustment using the linear model in equation (4).
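The unadjusted ratio m defined above is simple to compute once the estimates with and without controls are available. The sketch below covers only that unadjusted ratio and leaves the R2 correction aside; the function name `selection_ratio` and the coefficient values are hypothetical and are not taken from the paper's tables.

```python
def selection_ratio(beta_with_controls, beta_without_controls, tau=0.0):
    """Return m = (beta_1 - tau) / (beta_0 - beta_1).

    beta_with_controls: estimate from the model including the controls (beta_1).
    beta_without_controls: estimate from the model omitting them (beta_0).
    tau: the hypothesised true effect against which beta_1 is compared.
    """
    movement = beta_without_controls - beta_with_controls
    if movement == 0.0:
        raise ZeroDivisionError("the estimate does not move when controls are added")
    return (beta_with_controls - tau) / movement


# Hypothetical estimates: 0.06 without controls and 0.05 with controls.
m_zero = selection_ratio(0.05, 0.06, tau=0.0)
m_small = selection_ratio(0.05, 0.06, tau=0.01)
```

With these illustrative inputs, selection on unobservables would need to be about five times as strong as selection on observables to explain away the whole estimate, and about four times as strong to reduce the true effect to 0.01.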

The upper left-hand part of Table 4 reports estimates of $\hat{\beta}_1$, $(\hat{\beta}_1 - \tau)/(\hat{\beta}_0 - \hat{\beta}_1)$, and $[(R^2_{\max} - R^2_1)/(R^2_1 - R^2_0)] \cdot (\hat{\beta}_1 - \tau)/(\hat{\beta}_0 - \hat{\beta}_1)$ for $\tau = 0$ and two values of $R^2_{\max}$: $R^2_{\max} = 1.3R^2_1$ and $R^2_{\max} = 1$. Six sets of estimates are shown: the fractional Probit model with the three alternative sets of weights discussed at the beginning of section 4, and the linear model with the three alternative sets of weights. The upper right-hand part of Table 4 reports corresponding estimates for $\tau = 0.04$ (with the fractional Probit model) and $\tau = 0.01$ (with the linear model). Note that an estimate of 0.04 in the fractional Probit model corresponds to a marginal effect of around 0.01, i.e. a one percentage point difference in literacy rates between the most diverse and least diverse clusters. The lower part of Table 4 reports results for $\theta$ instead of $\beta$.
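
The correspondence between a Probit coefficient of 0.04 and a marginal effect of about 0.01 reflects the usual Probit scaling, in which the marginal effect is the coefficient multiplied by the standard normal density at the fitted index. A rough Python sketch follows, assuming the effect is evaluated at a single index value. That index value is hypothetical: it is picked so that the density is close to 0.25, and the paper does not report it.

```python
import math


def normal_pdf(z):
    """Standard normal density, i.e. the derivative of Phi."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_marginal_effect(beta, index):
    """Marginal effect of a regressor with coefficient beta at a given index."""
    return normal_pdf(index) * beta


# Hypothetical index value (an assumption, not a figure from the paper).
index = 0.97
me = probit_marginal_effect(0.04, index)

# Upper bound on the marginal effect: the density peaks at index = 0.
me_max = probit_marginal_effect(0.04, 0.0)
print(round(me, 4), round(me_max, 4))
```

Because the density never exceeds about 0.4, a coefficient of 0.04 cannot correspond to a marginal effect above roughly 0.016, whatever the index value.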

Table 4. Unobserved heterogeneity statistics for the models of literacy.

The upper part of Table 4 shows that for the true marginal effect of ethfrac on literacy to be as small as 0.01, the bias due to the omission of the unobserved cluster-level effects would have to be very much larger than the bias due to the omission of the control variables $x_k$. This is true whichever model we use, and if we apply an $R^2$ adjustment, our results imply that the required bias from unobserved effects would have to be over ten times as large. We conclude that it is very unlikely that the true (causal) marginal effect is as small as 0.01. The results for $\theta$ and ELF(2) in the lower half of the table are less clear. Nevertheless, if we apply an $R^2$ adjustment and restrict our attention to the estimates with uniform weights or with an equal weight on each country (cases 1 and 3), our results imply that the required bias from unobserved effects would have to be at least twice as large as the bias due to the omission of the control variables for the true marginal effect to be as small as 0.01. The Supplementary Materials report similar results for the other three development outcomes.

6. Evidence relating to a Fischer-style transmission channel

In section 2, we drew on the work of Fischer (1975) to suggest one mechanism through which community-level fractionalization might be positively associated with social and economic development: diversity erodes conservative social norms that inhibit the adoption of new technologies. It is difficult to test this explanation directly, because social norms are likely to be endogenous to development. Nevertheless, we can explore circumstantial evidence for the mechanism. If our explanation is correct, then we should observe (i) a positive association between community-level fractionalization and observed correlates of liberal social norms, and (ii) a weaker association with fractionalization across communities in which conservative social norms are deeply entrenched and less susceptible to erosion. We acknowledge that these are necessary rather than sufficient conditions for the explanation to be correct, because there are multiple possible explanations for the observation of (i) and (ii). For example, condition (i) would be observed if social norms were endogenous to socio-economic development and if fractionalization influenced development through some other channel. Nevertheless, if the two conditions hold in our dataset then the existence of a Fischer-style transmission channel is at least possible.

Table 5 presents evidence relating to condition (i). This evidence is based on results from the following regression equations, which take the same form as equations (3) and (4):

$$z_i = \Phi\Big(\sum_k \delta_k x_{ik} + \sum_p \varphi_p + \beta\,\mathit{ethfrac}_i + \varepsilon_i\Big) \qquad \text{(5)}$$

$$z_i = \sum_k \delta_k x_{ik} + \sum_p \varphi_p + \beta\,\mathit{ethfrac}_i + \varepsilon_i \qquad \text{(6)}$$

Here, $z_i$ represents one of two correlates of the prevalence of liberal social norms in cluster $i$: $\mathit{working}_i$, which is the proportion of women in the cluster who have worked outside the home in the past year (excluding seasonal agricultural labour), and $\mathit{planning}_i$, which is the proportion of women in the cluster who have ever practised any kind of birth control (including natural family planning). Both variables are based on responses in the individual-level DHS survey; descriptive statistics appear in Table . The upper half of Table 5 includes results relating to working and the lower half includes results relating to planning. Results in both halves are presented in the same form as in Table , showing three different estimates of $\hat{\beta}$ and $\Phi' \cdot \hat{\beta}$ using equation (5) with the three alternative sets of weights discussed at the beginning of section 4, and one estimate of $\hat{\beta}$ using equation (6) with standard errors that allow for spatial clustering. The table shows that a maximally diverse cluster can be expected to have a female employment rate that is 8–10 percentage points higher than a minimally diverse one and a birth control rate that is 2–5 percentage points higher. Seven out of the eight estimates of $\hat{\beta}$ are significantly different from zero at the five percent level. This is evidence for condition (i).

Table 5. Ethfrac coefficients in models of working and planning with province fixed effects.

In order to explore evidence relating to condition (ii), we must find an observable correlate of deep entrenchment. One possible correlate is belief in witches, which is still widespread across Africa (Gershman, 2016, 2022) and which embodies a deeply rooted social norm enforcing conformity within the community, because non-conformists (especially successful ones) risk being accused of witchcraft (Bernard, De Janvry, & Sadoulet, 2010; Brown & Hutt, 2018; Golooba-Mutebi, 2005). Figure 7 shows a histogram of the proportion of respondents in each province who believe that HIV infection is sometimes caused by witchcraft or other supernatural practices; the data come from responses in the individual-level DHS. The mean value of this province-level variable is 0.27 and the median is 0.20; over ten percent of provinces have a value of zero and over ten percent have a value greater than 0.6 (see Note 11). Denoting this variable as $\mathit{witch}_p$, we fit the following regression equation:

$$\mathit{literacy}_i = \Phi\Big(\sum_k \delta_k x_{ik} + \sum_p \varphi_p + [\beta + \pi\,\mathit{witch}_p] \times \mathit{ethfrac}_i + \varepsilon_i\Big) \qquad \text{(7)}$$

Note that there is no linearly separable term in $\mathit{witch}_p$ in equation (7), because this variable is collinear with the province fixed effects $\varphi_p$: only the interaction term is included. The cluster-level prevalence of belief in witchcraft may be endogenous to cluster-level literacy, but equation (7) includes province fixed effects and therefore models within-province variation in literacy, and the province-level prevalence of belief in witchcraft will be exogenous to this variation. Condition (ii) implies that $\pi < 0$, i.e. that the association between literacy and fractionalization is smaller in provinces where more people believe in witchcraft and conservative social norms are more deeply entrenched. Figure 8 shows estimates of $\hat{\beta} + \hat{\pi} \cdot \mathit{witch}_p$ for different values of $\mathit{witch}_p$, along with the corresponding 95 percent confidence intervals. There are three sets of estimates, each based on a different set of weights, as discussed at the beginning of section 4. The estimated value of $\hat{\beta}$ in equation (7) (i.e. the association between literacy and fractionalization in provinces where no-one believes in witchcraft) is between 0.3 and 0.4: in other words, between two and three times as large as the estimate of $\hat{\beta}$ in Table  (i.e. the average size of the association across all provinces). In all cases, $\hat{\pi}$ is significantly less than zero, and in provinces where at least half of the respondents believe in witchcraft, there is no significant association between literacy and fractionalization. This is evidence for condition (ii).
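
The conditional association β + π·witch and its confidence interval are functions of the two coefficient estimates and their sampling covariance. The Python sketch below shows one standard way to compute such an interval for a linear combination of coefficients. It is not the authors' procedure, and the coefficient values, variances and covariance are hypothetical numbers chosen only so that the pattern resembles the one described in the text.

```python
import math


def conditional_association(beta, pi, witch):
    """Point estimate of beta + pi * witch."""
    return beta + pi * witch


def conditional_interval(beta, pi, var_beta, var_pi, cov_beta_pi, witch, z=1.96):
    """Normal-approximation interval for beta + pi * witch.

    Var(beta + w * pi) = Var(beta) + w^2 * Var(pi) + 2 * w * Cov(beta, pi).
    """
    estimate = conditional_association(beta, pi, witch)
    variance = var_beta + witch ** 2 * var_pi + 2.0 * witch * cov_beta_pi
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


# Hypothetical estimates and (co)variances, NOT taken from the paper.
beta_hat, pi_hat = 0.35, -0.60
var_b, var_p, cov_bp = 0.0025, 0.0100, -0.0030

for w in (0.0, 0.25, 0.5):
    low, high = conditional_interval(beta_hat, pi_hat, var_b, var_p, cov_bp, w)
    print(f"witch={w:.2f}: [{low:.3f}, {high:.3f}]")
```

With these illustrative inputs the interval excludes zero when witch is 0 and includes zero when witch is 0.5, which is the qualitative pattern the paragraph above describes.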

Figure 7. Histogram for belief in witchcraft.

Figure 8. The association between literacy and ethfrac at different values of witch (95 percent confidence intervals are shown in grey).

7. Conclusion

Using household survey data from over 6,000 rural communities in 20 African countries, we find a strong positive association between community-level socio-economic development and community-level ethnic fractionalization, conditional on other community-level characteristics. In the same dataset, we find a strong negative association between community-level socio-economic development and province-level fractionalization. These results are consistent with the existing literature, in which positive associations tend to be found at lower levels of geographical aggregation and negative associations at higher levels, but we believe that ours is the first study to find both results in a single dataset, showing that the difference is not a consequence of idiosyncrasies of individual datasets. Our results also suggest that much of the association with community-level fractionalization is likely to result from a causal effect on the outcomes. We also report circumstantial evidence suggesting that one channel for this effect is that community-level diversity tends to erode conservative social norms.

Our data suggest that ethnic fractionalization within communities will mitigate the negative effects of national-level or province-level ethnic fractionalization. If a country or province is ethnically diverse, it is better for members of the different ethnic groups to be living next door to each other, not in separate villages. To the extent that governments can influence internal migration patterns by providing incentives to individual households to relocate (or not to relocate) within provinces, promoting community-level diversity is a reasonable policy goal. However, we acknowledge that in many countries, policymakers are unlikely to be able to do much to influence community-level diversity. Nevertheless, further research into the channels through which community-level diversity promotes development may be able to guide policies that replicate these channels. If, as suggested by our results, an important part of the process is the erosion of social norms which inhibit the adoption of new opportunities and practices, then policymakers may consider interventions to change norms. Such interventions must always be carefully considered, because they have the potential to undermine community autonomy and identity, but this may be a price worth paying to promote the autonomy and identity of disempowered individuals within the community in ways that enhance overall community outcomes, for example by increasing female labour force participation and reducing fertility rates. One starting point may be to understand more about the drivers of belief in witchcraft, in order to design policies to encourage critical reflection on this belief.

The results presented in this paper focus on average effects, and there could be considerable variation around the mean. In social psychology, for example, inter-group contact theory suggests that personal contact with members of an out-group will erode prejudice only under certain conditions, and there could be substantial geographical variation in the extent to which these conditions pertain. For instance, there may be variation in the level of inter-group economic inequality or in the length of inter-group contact. It may take several generations for the beneficial effects of an increase in community-level diversity to be established, and these effects may be weakened by persistent economic inequality. Hodler, Srisuma, Vesperoni, and Zurlinden (2020) show that overall, there is a negative association between community-level ELF and community-level interpersonal trust in Africa, and that this association is rooted in economic inequality between ethnic groups. This suggests that the overall positive associations that we uncover exist despite, rather than because of, the relationship between ELF and trust. Understanding the conditions under which community-level ethnic diversity promotes development is an important area for future research.

Supplemental material

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 There is also evidence for a negative association between ELF and the efficiency of public good provision. For example, Anyanwu and Erhijakpor (2009) find that conditional on health expenditure, health outcomes are worse in countries with higher ELF.

2 The results in these country-level studies are consistent with the results in Montalvo and Reynal-Querol (2021), who find that across Africa, night-time light intensity is higher in points on the boundary of two or more ethnic homelands.

3 These conditions include a low level of economic competition between the groups and an equal socio-economic status: see Hewstone and Swart (2011) for a review of this literature.

4 One limitation of our data (shared with the data used in many other papers on ELF) is that we have no measure of second-language fluency. Ceteris paribus, socio-economic outcomes may be better in communities where members of the minority group are fluent in the majority language or in which both groups are fluent in a lingua franca. One mechanism by which community-level ELF might promote socio-economic development is through encouraging second-language literacy, and we have no way of identifying this specific channel in our data.

5 See Henderson, Storeygard, and Weil (2012) for a discussion of luminosity as an alternative measure of local economic development.

6 Data on the covariates discussed in the next section are missing for a small number of clusters, and these clusters are excluded from the sample. The sample also excludes 183 clusters coded as “rural” but with a reported population density of over 1,000 people per square kilometre.

7 Equation (1) assumes that literacy is a monotonic function of diversity. In the Supplementary Materials, we explore the possibility that the relationship is non-monotonic. There is some (weak) evidence that literacy is highest when ethfrac is between 0.7 and 0.8, and that literacy declines sharply thereafter. However, clusters where ethfrac > 0.8 make up only four percent of the sample, and excluding them makes no substantial difference to our results.

8 Another approach is to give an equal weight to each country overall, but within each country, to give more weight to clusters in provinces where there is a higher ratio of the provincial population to the total number of clusters. The results using such weights are very similar to those using the third method.

9 All estimates were produced using Stata 15.0. Conley standard errors are computed using a 50 km cut-off.

10 It is possible for m to be negative, implying that the bias from unobserved effects would have to work in the opposite direction to the bias from observed effects.

11 This measure gives equal weight to each cluster in the province: i.e. we calculate the proportion for each cluster and then the average of these proportions across the province. Results are very similar if we give equal weight to each DHS respondent in the province. The proportion includes people who respond “don’t know” to the question about whether HIV infection is sometimes caused by witchcraft. Excluding the “don’t know” group reduces the sample mean to 0.18 and the median to 0.11, but using this alternative measure makes no substantial difference to our regression results.

References

  • Alesina, A., Baqir, R., & Easterly, W. (1999). Public goods and ethnic divisions. The Quarterly Journal of Economics, 114(4), 1243–1284.
  • Alesina, A., & La Ferrara, E. (2005). Ethnic diversity and economic performance. Journal of Economic Literature, 43(3), 762–800.
  • Alesina, A., & La Ferrara, E. (2000). Participation in heterogeneous communities. The Quarterly Journal of Economics, 115(3), 847–904.
  • Allport, G. W. (1954). The nature of prejudice. Reading, MA: Addison-Wesley.
  • Altonji, J. G., Elder, T. E., & Taber, C. R. (2005). Selection on observed and unobserved variables: Assessing the effectiveness of Catholic schools. Journal of Political Economy, 113(1), 151–184.
  • Anyanwu, J. C. (2014). Oil wealth, ethno-religious-linguistic fractionalization and civil wars in Africa: Cross-country evidence. African Development Review, 26(2), 209–236.
  • Anyanwu, J. C., & Erhijakpor, A. E. O. (2009). Health expenditures and health outcomes in Africa. African Development Review, 21(2), 400–433.
  • Bernard, T., Collion, M. H., De Janvry, A., Rondot, P., & Sadoulet, E. (2008). Do village organizations make a difference in African rural development? A study for Senegal and Burkina Faso. World Development, 36(11), 2188–2204.
  • Bernard, T., De Janvry, A., & Sadoulet, E. (2010). When does community conservatism constrain village organizations? Economic Development and Cultural Change, 58(4), 609–641.
  • Brown, G. G., & Hutt, A. M. D. B. (2018). Anthropology in action. London, UK: Routledge.
  • Campos, N. F., Saleh, A., & Kuzeyev, V. (2011). Dynamic ethnic fractionalization and economic growth. The Journal of International Trade & Economic Development, 20(2), 129–152.
  • Cloward, K. (2015). Elites, exit options and social barriers to norm change: The complex case of female genital mutilation. Studies in Comparative International Development, 50(3), 378–407.
  • Conley, T. G. (1999). GMM estimation with cross sectional dependence. Journal of Econometrics, 92(1), 1–45.
  • Desmet, K., Ortuño-Ortín, I., & Wacziarg, R. (2012). The political economy of linguistic cleavages. Journal of Development Economics, 97(2), 322–338.
  • Dinku, Y., & Regasa, R. (2021). Ethnic diversity and local economies. South African Journal of Economics, 89(3), 348–367.
  • Easterly, W., & Levine, R. (1997). Africa’s growth tragedy: Policies and ethnic divisions. The Quarterly Journal of Economics, 112(4), 1203–1250.
  • Evans, A. (2018). Cities as catalysts of gendered social change? Reflections from Zambia. Annals of the American Association of Geographers, 108(4), 1096–1114.
  • Evans, A. (2019). How cities erode gender inequality: A new theory and evidence from Cambodia. Gender & Society, 33(6), 961–984.
  • Fielding, D., & Lepine, A. (2017). Women’s empowerment and wellbeing: Evidence from Africa. The Journal of Development Studies, 53(6), 826–840.
  • Fischer, C. S. (1975). Toward a subcultural theory of urbanism. American Journal of Sociology, 80(6), 1319–1341.
  • Gerring, J., Thacker, S. C., Lu, Y., & Huang, W. (2015). Does diversity impair human development? A multi-level test of the diversity debit hypothesis. World Development, 66, 166–188.
  • Gershman, B. (2016). Witchcraft beliefs and the erosion of social capital: Evidence from Sub-Saharan Africa and beyond. Journal of Development Economics, 120, 182–208.
  • Gershman, B. (2022). Witchcraft beliefs around the world: An exploratory analysis. PLOS ONE, 17(11), e0276872.
  • Gershman, B., & Rivera, D. (2018). Subnational diversity in Sub-Saharan Africa: Insights from a new dataset. Journal of Development Economics, 133, 231–263.
  • Gisselquist, R. M., Leiderer, S., & Niño-Zarazúa, M. (2016). Ethnic heterogeneity and public goods provision in Zambia: Evidence of a subnational “diversity dividend”. World Development, 78, 308–323.
  • Glennerster, R., Miguel, E., & Rothenberg, A. D. (2013). Collective action in diverse Sierra Leone communities. The Economic Journal, 123(568), 285–316.
  • Golooba-Mutebi, F. (2005). Witchcraft, social cohesion and participation in a South African village. Development and Change, 36(5), 937–958.
  • Habyarimana, J., Humphreys, M., Posner, D. N., & Weinstein, J. M. (2007). Why does ethnic diversity undermine public goods provision? American Political Science Review, 101(4), 709–725.
  • Henderson, J. V., Storeygard, A., & Weil, D. N. (2012). Measuring economic growth from outer space. American Economic Review, 102(2), 994–1028.
  • Hewstone, M., & Swart, H. (2011). Fifty-odd years of inter-group contact: From hypothesis to integrated theory. British Journal of Social Psychology, 50(3), 374–386.
  • Hodler, R., Srisuma, S., Vesperoni, A., & Zurlinden, N. (2020). Measuring ethnic stratification and its effect on trust in Africa. Journal of Development Economics, 146, 102475.
  • Huggins, C. M., & Debies-Carl, J. S. (2015). Tolerance in the city: The multilevel effects of urban environments on permissive attitudes. Journal of Urban Affairs, 37(3), 255–269.
  • Khwaja, A. I. (2009). Can good projects succeed in bad communities? Journal of Public Economics, 93(7-8), 899–916.
  • Lépine, A., & Strobl, E. (2013). The effect of women’s bargaining power on child nutrition in rural Senegal. World Development, 45(1), 17–30.
  • Luca, D., Terrero-Davila, J., Stein, J., & Lee, N. (2023). Progressive cities: Urban-rural polarisation of social values and economic development around the world. Urban Studies, 60(12), 2329–2350.
  • Mavridis, D. (2015). Ethnic diversity and social capital in Indonesia. World Development, 67, 376–395.
  • Meili, R., & Shearmur, R. (2019). Diverse diversities—Open innovation in small towns and rural areas. Growth and Change, 50(2), 492–514.
  • Montalvo, J. G., & Reynal-Querol, M. (2005). Ethnic polarization, potential conflict and civil wars. American Economic Review, 95(3), 796–816.
  • Montalvo, J. G., & Reynal-Querol, M. (2021). Ethnic diversity and growth: Revisiting the evidence. The Review of Economics and Statistics, 103(3), 521–532.
  • Nunn, N., & Wantchekon, L. (2011). The slave trade and the origins of mistrust in Africa. American Economic Review, 101(7), 3221–3252.
  • Okten, C., & Osili, U. O. (2004). Contributions in heterogeneous communities: Evidence from Indonesia. Journal of Population Economics, 17(4), 603–626.
  • Oster, E. (2019). Unobservable selection and coefficient stability: Theory and evidence. Journal of Business & Economic Statistics, 37(2), 187–204.
  • Ruck, D. J., Bentley, R. A., & Lawson, D. J. (2020). Cultural prerequisites of socioeconomic development. Royal Society Open Science, 7(2), 190725.
  • Schmutzler, J., & Lorenz, E. (2018). Tolerance, agglomeration, and enterprise innovation performance: A multilevel analysis of Latin American regions. Industrial and Corporate Change, 27(2), 243–268.
  • USAID. (2012). Sampling and household listing manual. Calverton, MD: ICF International.
  • USAID. (2021, March). Available datasets. USAID. https://dhsprogram.com/data/available-datasets.cfm
  • Wang, J., Wei, Y. D., & Lin, B. (2019). How does tolerance affect urban innovative capacities in China? Growth and Change, 50(4), 1242–1259.
  • Wirth, L. (1938). Urbanism as a way of life. American Journal of Sociology, 44(1), 1–24.