Confounded Local Inference: Extending Local Moran Statistics to Handle Confounding

Received 18 Oct 2022, Accepted 09 Dec 2023, Published online: 12 Apr 2024
 

Abstract

Local statistical analysis has long been of interest to social and environmental scientists who analyze geographic data. Research into local spatial statistics experienced a step-change in the mid-1990s, which provided a large class of local statistical methods and models. The local Moran statistic is one commonly used local indicator of spatial association, able to detect both areas of similarity and observations that are very dissimilar from their surroundings. From this, many further local statistics have been developed to characterize spatial clusters and outliers. These statistics have seen limited adoption because they do not sufficiently model the relationships involved in confounded spatial data, where the analyst seeks to understand the local spatial structure of a given outcome variable that is influenced by one or more additional factors. Recent innovations used to do joint multivariate local analysis also do not model this kind of conditional local structure in data. This article provides tools to rigorously characterize confounded local inference and a new and different class of multivariate conditional local Moran statistics that can account for confounding. To do this, we return to the Moran scatterplot as the critical tool for local Moran-style covariance statistics. Extending this concept, a new method is available directly from a “Moran-form” multiple regression. We show the empirical and theoretical properties of this statistic, show how some existing heuristic approaches arise naturally from this framework, and show how the use of conditional inference can change interpretations in an empirical analysis of rent and housing stock in a rapidly changing neighborhood.

Acknowledgments

Thank you to Professors Sergio Rey, Luc Anselin, and Alex Marsh for helping me develop the idea beyond the initial proof of concept and giving feedback on previous drafts.

Disclosure Statement

No potential conflict of interest was reported by the author.

Supplemental Material

Supplemental data for this article can be accessed on the publisher’s site at: https://doi.org/10.1080/24694452.2024.2326541

Notes

1 For the following discussion, we only consider row-standardized weights matrices for simplicity.

2 Other relationships besides covariance that are used in other statistics (e.g., distances in the local Geary’s $C_i$ statistic or sums in the Getis–Ord $G_i$ statistic) provide different interpretations.

3 See, for example, LeSage and Pace’s (2014) argument about the distinction between β and the marginal effect in spatial models.

4 W is generally not symmetric when it is row standardized, so $x'Wy \neq y'Wx$.

5 Derivation of this statistic is provided in the Supplemental Material for this article.

6 This ignores the simultaneity between y and Wy just like the univariate Moran-form regression.

7 Swapping DWy for DWy would create an $N \times P$ matrix sitting in the midst of a $P \times P$ covariance matrix ($[D'D]^{-1}$) and an $N \times 1$ spatial lag of $y$. Tiling removes this shape mismatch.

8 This is derived in the Supplemental Material for a bivariate example and for three variables.

9 Each pathway in the full Ixy statistic is shown separately in the Supplemental Material, for clarity.

10 A full treatment of this is provided in the Supplemental Material.

11 Closed-form estimators might, however, become available following the strategies outlined in Sauer et al. (2021). The exact procedure for local permutation testing on conditional Moran statistics is shown in the Supplemental Material. Implementations of the estimators will be made available in production-ready R (Bivand 2022) and Python (Rey et al. 2022) packages after submission. For peer review, estimators are provided alongside the submission.

12 This includes the government commission exploring “living rent” policies that supports this work.

13 These data are provided on request from the Urban Big Data Center.

14 The substantive results we show are consistent with sparse k-nearest neighbor graphs as well ($k \in \{1, 2, 10\}$). Kernel functions were also explored and exhibit the same qualitative behavior but tended to oversmooth the Wy values, so the Delaunay triangulation was chosen to make the examples clear.
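
Notes 1 and 4 can be illustrated with a small numerical sketch. The example below is not the article's implementation and does not compute the conditional statistic; it assumes a hypothetical four-observation path graph and made-up values for x and y. It shows that row standardizing a symmetric adjacency matrix yields a W that is no longer symmetric, so the two cross-products differ, and that the slope of a univariate Moran-form regression of Wz on z equals the mean of the local Moran values when W is row standardized.

```python
import numpy as np

# Hypothetical toy data: four observations on a path graph 0-1-2-3.
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Row standardization (Note 1): each row of W sums to one.
W = A / A.sum(axis=1, keepdims=True)

# Made-up outcome and confounder values, for illustration only.
x = np.array([1.0, 2.0, 4.0, 3.0])
y = np.array([2.0, 1.0, 3.0, 5.0])


def standardize(v):
    """Center v and scale it to unit (population) variance."""
    return (v - v.mean()) / v.std()


zx, zy = standardize(x), standardize(y)

# Note 4: A is symmetric, but the row-standardized W is not,
# so the two bivariate cross-products generally differ.
w_is_symmetric = np.allclose(W, W.T)
xWy = zx @ W @ zy
yWx = zy @ W @ zx

# Univariate local Moran values: each z_i times its spatial lag.
lag_x = W @ zx
local_moran = zx * lag_x

# Moran scatterplot slope: least-squares fit of the lag on zx.
slope = np.polyfit(zx, lag_x, 1)[0]

print("W symmetric:", w_is_symmetric)
print("x'Wy:", xWy, "y'Wx:", yWx)
print("mean local Moran:", local_moran.mean(), "scatterplot slope:", slope)
```

The identity between the slope and the mean local value relies on the row standardization assumed in Note 1 and on z being standardized with the population standard deviation; other weighting schemes would need an additional scaling factor.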

Additional information

Funding

This work is part of an ESRC-funded Impact Acceleration project through the University of Bristol, Bristol City Council Living Rent Commission: Understanding the Local Structure of Affordable Housing, and is further supported by PolicyBristol from the Research England QR Policy Support Fund (QR PSF) 2022-24.

Notes on contributors

Levi John Wolf

LEVI JOHN WOLF is an Associate Professor of Spatial Analysis in the School of Geographical Sciences, University of Bristol, Clifton BS8 1SS, UK. His research interests include fundamental innovations in spatial analysis using formal mathematical modeling methods (e.g., probabilistic programming and mathematical optimization) with applications in elections, economic growth, labor and industrial strategy, regional inequality, and spatial distribution modeling.
