Sparse Learning

Modeling Massive Highly Multivariate Nonstationary Spatial Data with the Basis Graphical Lasso

Pages 1472-1487 | Received 03 Dec 2021, Accepted 08 Jan 2023, Published online: 30 May 2023

References

  • Alegria, A., Porcu, E., Furrer, R., and Mateu, J. (2019), “Covariance Functions for Multivariate Gaussian Fields Evolving Temporally over Planet Earth,” Stochastic Environmental Research and Risk Assessment, 33, 1593–1608. DOI: 10.1007/s00477-019-01707-w.
  • Apanasovich, T., and Genton, M. G. (2010), “Cross-Covariance Functions for Multivariate Random Fields based on Latent Dimensions,” Biometrika, 97, 15–30. DOI: 10.1093/biomet/asp078.
  • Apanasovich, T. V., Genton, M. G., and Sun, Y. (2012), “A Valid Matérn Class of Cross-Covariance Functions for Multivariate Random Fields with any Number of Components,” Journal of the American Statistical Association, 107, 180–193. DOI: 10.1080/01621459.2011.643197.
  • Baker, A. H., Hammerling, D. M., Levy, M. N., Xu, H., Dennis, J. M., Eaton, B. E., Edwards, J., Hannay, C., Mickelson, S. A., Neale, R. B., Nychka, D., Shollenberger, J., Tribbia, J., Vertenstein, M., and Williamson, D. (2015), “A New Ensemble-based Consistency Test for the Community Earth System Model (pyCECT v1.0),” Geoscientific Model Development, 8, 2829–2840. https://gmd.copernicus.org/articles/8/2829/2015/. DOI: 10.5194/gmd-8-2829-2015.
  • Bradley, J. R., Holan, S. H., and Wikle, C. K. (2015), “Multivariate Spatio-Temporal Models for High-Dimensional Areal Data with Application to Longitudinal Employer-Household Dynamics,” The Annals of Applied Statistics, 9, 1761–1791. DOI: 10.1214/15-AOAS862.
  • Bradley, J. R., Holan, S. H., and Wikle, C. K. (2018), “Computationally Efficient Multivariate Spatio-Temporal Models for High-Dimensional Count-Valued Data,” (with Discussion), Bayesian Analysis, 13, 253–310. DOI: 10.1214/17-BA1069. Includes comments and discussions by ten discussants and a rejoinder by the authors.
  • Bruinsma, W. P., Perim, E., Tebbutt, W., Hosking, J. S., Solin, A., and Turner, R. E. (2020), “Scalable Exact Inference in Multi-Output Gaussian Processes,” in Proceedings of the 37th International Conference on Machine Learning, ICML’20. JMLR.org.
  • Calder, C. A. (2007), “Dynamic Factor Process Convolution Models for Multivariate Space-Time Data with Application to Air Quality Assessment,” Environmental and Ecological Statistics, 14, 229–247. DOI: 10.1007/s10651-007-0019-y.
  • Chen, W., Genton, M. G., and Sun, Y. (2021), “Space-Time Covariance Structures and Models,” Annual Review of Statistics and Its Application, 8, 191–215. DOI: 10.1146/annurev-statistics-042720-115603.
  • Cressie, N., and Zammit-Mangion, A. (2016), “Multivariate Spatial Covariance Models: A Conditional Approach,” Biometrika, 103, 915–935. DOI: 10.1093/biomet/asw045.
  • Danaher, P., Wang, P., and Witten, D. M. (2014), “The Joint Graphical Lasso for Inverse Covariance Estimation Across Multiple Classes,” Journal of the Royal Statistical Society, Series B, 76, 373–397. DOI: 10.1111/rssb.12033.
  • Datta, A., Banerjee, S., Finley, A. O., and Gelfand, A. E. (2016), “Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets,” Journal of the American Statistical Association, 111, 800–812. DOI: 10.1080/01621459.2015.1044091.
  • Dey, D., Datta, A., and Banerjee, S. (2021), “Graphical Gaussian Process Models for Highly Multivariate Spatial Data,” Biometrika, 109, 993–1014. DOI: 10.1093/biomet/asab061.
  • Dey, D., Datta, A., and Banerjee, S. (2022), “On the Relationship between Graphical Gaussian Processes and Functional Gaussian Graphical Models,” https://arxiv.org/abs/2209.06294.
  • Ekanayaka, A., Kang, E., Kalmus, P., and Braverman, A. (2022), “Statistical Downscaling of Model Projections with Multivariate Basis Graphical Lasso,” https://arxiv.org/abs/2201.13111.
  • Eyring, V., Bony, S., Meehl, G. A., Senior, C. A., Stevens, B., Stouffer, R. J., and Taylor, K. E. (2016), “Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization,” Geoscientific Model Development, 9, 1937–1958. https://gmd.copernicus.org/articles/9/1937/2016/. DOI: 10.5194/gmd-9-1937-2016.
  • Fattahi, S., Zhang, R. Y., and Sojoudi, S. (2019), “Linear-Time Algorithm for Learning Large-Scale Sparse Graphical Models,” IEEE Access, 7, 12658–12672. DOI: 10.1109/ACCESS.2018.2890583.
  • Fontanella, L., Fontanella, S., Ignaccolo, R., Ippoliti, L., and Valentini, P. (2020), “G-Lasso Network Analysis for Functional Data,” in Functional and High-Dimensional Statistics and Related Fields, eds. G. Aneiros, I. Horová, M. Hušková, and P. Vieu, pp. 91–98, Cham: Springer.
  • Friedman, J., Hastie, T., and Tibshirani, R. (2008), “Sparse Inverse Covariance Estimation with the Graphical Lasso,” Biostatistics, 9, 432–441. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3019769/. DOI: 10.1093/biostatistics/kxm045.
  • Furrer, R., and Genton, M. G. (2011), “Aggregation-Cokriging for Highly Multivariate Spatial Data,” Biometrika, 98, 615–631. DOI: 10.1093/biomet/asr029.
  • Gaspari, G., and Cohn, S. E. (1999), “Construction of Correlation Functions in Two and Three Dimensions,” Quarterly Journal of the Royal Meteorological Society, 125, 723–757. DOI: 10.1002/qj.49712555417.
  • Gelfand, A. E., Schmidt, A. M., Banerjee, S., and Sirmans, C. F. (2004), “Nonstationary Multivariate Process Modeling through Spatially Varying Coregionalization,” Test, 13, 263–312. DOI: 10.1007/BF02595775.
  • Genton, M. G., and Kleiber, W. (2015), “Cross-Covariance Functions for Multivariate Geostatistics,” Statistical Science, 30, 147–163. DOI: 10.1214/14-STS487.
  • Gneiting, T., Kleiber, W., and Schlather, M. (2010), “Matérn Cross-Covariance Functions for Multivariate Random Fields,” Journal of the American Statistical Association, 105, 1167–1177. DOI: 10.1198/jasa.2010.tm09420.
  • Goulard, M., and Voltz, M. (1992), “Linear Coregionalization Model: Tools for Estimation and Choice of Cross-Variogram Matrix,” Mathematical Geology, 24, 269–286. DOI: 10.1007/BF00893750.
  • Guinness, J. (2022), “Nonparametric Spectral Methods for Multivariate Spatial and Spatial–Temporal Data,” Journal of Multivariate Analysis, 187, 104823. https://www.sciencedirect.com/science/article/pii/S0047259X21001019. DOI: 10.1016/j.jmva.2021.104823.
  • Harville, D. A. (1997), Matrix Algebra from a Statistician’s Perspective, New York: Springer-Verlag. DOI: 10.1007/b98818.
  • Hsieh, C.-J., Dhillon, I. S., Ravikumar, P. K., Becker, S., and Olsen, P. A. (2014a), “QUIC & DIRTY: A Quadratic Approximation Approach for Dirty Statistical Models,” in Advances in Neural Information Processing Systems 27, pp. 2006–2014.
  • Hsieh, C.-J., Sustik, M. A., Dhillon, I. S., and Ravikumar, P. (2014b), “QUIC: Quadratic Approximation for Sparse Inverse Covariance Estimation,” Journal of Machine Learning Research, 15, 2911–2947.
  • Ippoliti, L., Valentini, P., and Gamerman, D. (2012), “Space-Time Modelling of Coupled Spatiotemporal Environmental Variables,” Journal of the Royal Statistical Society, Series C, 61, 175–200. DOI: 10.1111/j.1467-9876.2011.01011.x.
  • Jun, M. (2011), “Non-Stationary Cross-Covariance Models for Multivariate Processes on a Globe,” Scandinavian Journal of Statistics, 38, 726–747. DOI: 10.1111/j.1467-9469.2011.00751.x.
  • Kleiber, W. (2017), “Coherence for Multivariate Random Fields,” Statistica Sinica, 27, 1675–1697. DOI: 10.5705/ss.202015.0309.
  • Kleiber, W., and Genton, M. G. (2013), “Spatially Varying Cross-Correlation Coefficients in the Presence of Nugget Effects,” Biometrika, 100, 213–220. DOI: 10.1093/biomet/ass057.
  • Kleiber, W., and Nychka, D. (2012), “Nonstationary Modeling for Multivariate Spatial Processes,” Journal of Multivariate Analysis, 112, 76–91. DOI: 10.1016/j.jmva.2012.05.011.
  • Kleiber, W., Nychka, D., and Bandyopadhyay, S. (2019), “A Model for Large Multivariate Spatial Data Sets,” Statistica Sinica, 29, 1085–1104. DOI: 10.5705/ss.202017.0365.
  • Kleiber, W., and Porcu, E. (2014), “Nonstationary Matrix Covariances: Compact Support, Long Range Dependence and Quasi-Arithmetic Constructions,” Stochastic Environmental Research and Risk Assessment, 29, 193–204. DOI: 10.1007/s00477-014-0867-6.
  • Krock, M., Kleiber, W., and Becker, S. (2021), “Nonstationary Modeling with Sparsity for Spatial Data via the Basis Graphical Lasso,” Journal of Computational and Graphical Statistics, 30, 375–389. DOI: 10.1080/10618600.2020.1811103.
  • Le, N. D., and Zidek, J. V. (2006), Statistical Analysis of Environmental Space-Time Processes, Springer Series in Statistics, New York: Springer.
  • Liu, H., Ding, J., Xie, X., Jiang, X., Zhao, Y., and Wang, X. (2021), “Scalable Multi-Task Gaussian Processes with Neural Embedding of Coregionalization,” https://arxiv.org/abs/2109.09261.
  • Majumdar, A., and Gelfand, A. E. (2007), “Multivariate Spatial Modeling for Geostatistical Data Using Convolved Covariance Functions,” Mathematical Geosciences, 39, 225–245. DOI: 10.1007/s11004-006-9072-6.
  • Majumdar, A., Paul, D., and Bautista, D. (2010), “A Generalized Convolution Model for Multivariate Nonstationary Spatial Processes,” Statistica Sinica, 20, 675–695.
  • Meng, R., Lee, H., and Bouchard, K. (2021a), “Stochastic Collapsed Variational Inference for Structured Gaussian Process Regression Network,” arXiv: Machine Learning.
  • Meng, R., Soper, B., Lee, H. K., Liu, V. X., Greene, J. D., and Ray, P. (2021b), “Nonstationary Multivariate Gaussian Processes for Electronic Health Records,” Journal of Biomedical Informatics, 117, 103698. https://www.sciencedirect.com/science/article/pii/S1532046421000277. DOI: 10.1016/j.jbi.2021.103698.
  • Nychka, D., Bandyopadhyay, S., Hammerling, D., Lindgren, F., and Sain, S. (2015), “A Multiresolution Gaussian Process Model for the Analysis of Large Spatial Datasets,” Journal of Computational and Graphical Statistics, 24, 579–599. DOI: 10.1080/10618600.2014.914946.
  • Pollice, A., and Jona Lasinio, G. (2010), “A Multivariate Approach to the Analysis of Air Quality in a High Environmental Risk Area,” Environmetrics, 21, 741–754. DOI: 10.1002/env.1059.
  • Porcu, E., Furrer, R., and Nychka, D. (2020), “30 Years of Space-Time Covariance Functions,” Wiley Interdisciplinary Reviews: Computational Statistics, 13, e1512.
  • Qadir, G. A., Euán, C., and Sun, Y. (2021), “Flexible Modeling of Variable Asymmetries in Cross-Covariance Functions for Multivariate Random Fields,” Journal of Agricultural, Biological and Environmental Statistics, 26, 1–22. DOI: 10.1007/s13253-020-00414-2.
  • Qadir, G. A., and Sun, Y. (2020), “Semiparametric Estimation of Cross-Covariance Functions for Multivariate Random Fields,” Biometrics, 77, 547–560. DOI: 10.1111/biom.13323.
  • Qiao, X., Guo, S., and James, G. M. (2019), “Functional Graphical Models,” Journal of the American Statistical Association, 114, 211–222. DOI: 10.1080/01621459.2017.1390466.
  • Rue, H., and Held, L. (2005), Gaussian Markov Random Fields: Theory and Applications (Monographs on Statistics and Applied Probability), New York: Chapman & Hall/CRC.
  • Salvaña, M., Abdulah, S., Huang, H., Ltaief, H., Sun, Y., Genton, M. G., and Keyes, D. (2021), “High Performance Multivariate Geospatial Statistics on Manycore Systems,” IEEE Transactions on Parallel and Distributed Systems, 32, 2719–2733. DOI: 10.1109/TPDS.2021.3071423.
  • Salvaña, M. L. O., and Genton, M. G. (2020), “Nonstationary Cross-Covariance Functions for Multivariate Spatio-Temporal Random Fields,” Spatial Statistics, 37, 100411. https://www.sciencedirect.com/science/article/pii/S2211675320300051. Frontiers in Spatial and Spatio-temporal Research. DOI: 10.1016/j.spasta.2020.100411.
  • Schmidt, A. M., and Gelfand, A. E. (2003), “A Bayesian Coregionalization Approach for Multivariate Pollutant Data,” Journal of Geophysical Research: Atmospheres, 108. DOI: 10.1029/2002JD002905.
  • Shaddick, G., and Wakefield, J. (2002), “Modelling Daily Multivariate Pollutant Data at Multiple Sites,” Journal of the Royal Statistical Society, Series C, 51, 351–372. DOI: 10.1111/1467-9876.00273.
  • Stein, M. L. (1999), Interpolation of Spatial Data: Some Theory for Kriging, New York: Springer-Verlag.
  • Stein, M. L. (2014), “Limitations on Low Rank Approximations for Covariance Matrices of Spatial Data,” Spatial Statistics, 8, 1–19. DOI: 10.1016/j.spasta.2013.06.003.
  • Taylor-Rodriguez, D., Finley, A., Datta, A., Babcock, C., Andersen, H., Cook, B., Morton, D., and Banerjee, S. (2019), “Spatial Factor Models for High-Dimensional and Large Spatial Data: An Application in Forest Variable Mapping,” Statistica Sinica, 29, 1155–1180. DOI: 10.5705/ss.202018.0005.
  • Teh, Y., Seeger, M. W., and Jordan, M. I. (2005), “Semiparametric Latent Factor Models,” in AISTATS.
  • Titsias, M. (2009), “Variational Learning of Inducing Variables in Sparse Gaussian Processes,” in Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, Vol. 5 of Proceedings of Machine Learning Research, pp. 567–574. PMLR. http://proceedings.mlr.press/v5/titsias09a.html.
  • Titsias, M., and Lawrence, N. D. (2010), “Bayesian Gaussian Process Latent Variable Model,” in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, eds. Y. W. Teh and M. Titterington, Vol. 9 of Proceedings of Machine Learning Research, pp. 844–851. Chia Laguna Resort, Sardinia, Italy: PMLR. http://proceedings.mlr.press/v9/titsias10a.html.
  • Ver Hoef, J. M., and Barry, R. P. (1998), “Constructing and Fitting Models for Cokriging and Multivariable Spatial Prediction,” Journal of Statistical Planning and Inference, 69, 275–294. DOI: 10.1016/S0378-3758(97)00162-6.
  • Vu, Q., Zammit-Mangion, A., and Cressie, N. (2020), “Modeling Nonstationary and Asymmetric Multivariate Spatial Covariances via Deformations,” arXiv: Statistics.
  • Wackernagel, H. (2003), Multivariate Geostatistics: An Introduction with Applications, Berlin Heidelberg: Springer. https://books.google.com/books?id=Rhr7bgLWxx4C.
  • Wikle, C. K. (2010), “Low Rank Representations for Spatial Processes,” in Handbook of Spatial Statistics, pp. 107–118, Boca Raton: Chapman & Hall/CRC.
  • Yang, S., Lu, Z., Shen, X., Wonka, P., and Ye, J. (2015), “Fused Multiple Graphical Lasso,” SIAM Journal on Optimization, 25, 916–943. DOI: 10.1137/130936397.
  • Zapata, J., Oh, S. Y., and Petersen, A. (2021), “Partial Separability and Functional Graphical Models for Multivariate Gaussian Processes,” Biometrika, 109, 665–681. DOI: 10.1093/biomet/asab046.
  • Zhang, L., and Banerjee, S. (2021), “Spatial Factor Modeling: A Bayesian Matrix-Normal Approach for Misaligned Data,” Biometrics, 78, 560–573. DOI: 10.1111/biom.13452.
  • Zhang, L., Banerjee, S., and Finley, A. O. (2021), “High-Dimensional Multivariate Geostatistics: A Bayesian Matrix-Normal Approach,” Environmetrics, 32, e2675. DOI: 10.1002/env.2675.
