Scalable Marginalization of Correlated Latent Variables with Applications to Learning Particle Interaction Kernels

Gu, Mengyang; Liu, Xubo; Fang, Xinyi; Tang, Sui

doi:10.51387/22-NEJSDS13

The New England Journal of Statistics in Data Science

Volume 1, Issue 2 (2023), pp. 172–186

https://doi.org/10.51387/22-NEJSDS13

Pub. online: 18 October 2022
Type: Methodology Article

Open Access

Area: Statistical Methodology

Accepted: 29 September 2022

Published: 18 October 2022

Abstract

Marginalization of latent variables or nuisance parameters is a fundamental aspect of Bayesian inference and uncertainty quantification. In this work, we focus on scalable marginalization of latent variables in modeling correlated data, such as spatio-temporal or functional observations. We first introduce Gaussian processes (GPs) for modeling correlated data and highlight the computational challenge: the computational cost grows cubically with the number of observations. We then review the connection between the state space model and GPs with Matérn covariance for temporal inputs. The Kalman filter and Rauch-Tung-Striebel smoother are introduced as a scalable marginalization technique for computing the likelihood and making predictions of GPs without approximation. We introduce recent efforts on extending the scalable marginalization idea to the linear model of coregionalization for multivariate correlated output and spatio-temporal observations. In the final part of this work, we introduce a novel marginalization technique to estimate interaction kernels and forecast particle trajectories. The computational progress lies in a sparse representation of the inverse covariance matrix of the latent variables, combined with the conjugate gradient method, which improves predictive accuracy with large data sets. The computational advances achieved in this work point to a wide range of applications in molecular dynamics simulation, cellular migration, and agent-based models.
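As an illustration of the state space connection mentioned in the abstract, the sketch below is a minimal example and not the authors' implementation. It assumes the simplest Matérn case (roughness parameter 1/2, i.e. the exponential kernel), a zero mean, and independent Gaussian noise. The function names `kalman_loglik` and `dense_loglik` and the parameters `gamma`, `sigma2`, and `noise2` are hypothetical names chosen for this example. Under these assumptions the latent process is Markov, so a Kalman filter can accumulate the exact log-likelihood in time linear in the number of observations, and the result should agree with the dense Cholesky-based computation whose cost grows cubically.

```python
import math


def kalman_loglik(t, y, gamma, sigma2, noise2):
    """Exact GP log-likelihood in O(n) for the exponential kernel.

    Assumed model: y_i = f(t_i) + e_i, with
    cov(f(s), f(t)) = sigma2 * exp(-|s - t| / gamma), e_i ~ N(0, noise2),
    and t sorted in increasing order.
    """
    loglik = 0.0
    m, p = 0.0, sigma2  # stationary prior mean and variance of the latent state
    for i in range(len(t)):
        if i > 0:
            # Markov transition between consecutive time points
            rho = math.exp(-(t[i] - t[i - 1]) / gamma)
            m = rho * m
            p = rho * rho * p + sigma2 * (1.0 - rho * rho)
        s = p + noise2   # one-step-ahead predictive variance of y_i
        r = y[i] - m     # innovation
        loglik += -0.5 * (math.log(2.0 * math.pi * s) + r * r / s)
        k = p / s        # Kalman gain
        m = m + k * r    # filtered mean
        p = (1.0 - k) * p  # filtered variance
    return loglik


def dense_loglik(t, y, gamma, sigma2, noise2):
    """Same log-likelihood via a dense Cholesky factorization, O(n^3)."""
    n = len(t)
    K = [[sigma2 * math.exp(-abs(t[i] - t[j]) / gamma)
          + (noise2 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Cholesky factor L with K = L L^T
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    # Forward substitution: solve L z = y
    z = [0.0] * n
    for i in range(n):
        z[i] = (y[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2.0 * math.pi) + logdet + sum(v * v for v in z))


if __name__ == "__main__":
    # Made-up, irregularly spaced inputs used only to compare the two routes
    t = [0.0, 0.3, 0.7, 1.5, 1.6, 2.4]
    y = [0.2, -0.1, 0.4, 1.0, 0.9, -0.3]
    print(kalman_loglik(t, y, 0.8, 1.5, 0.1))
    print(dense_loglik(t, y, 0.8, 1.5, 0.1))
```

The two functions are expected to agree up to floating-point error. Smoother Matérn kernels with half-integer roughness would need a multi-dimensional latent state, which this sketch does not cover.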

Open access article under the CC BY license.

Keywords

Marginalization; Bayesian inference; Scalable computation; Gaussian process; Kalman filter; Particle interaction

Funding

This work is partially supported by the National Institutes of Health under Award No. R01DK130067. Gu and Liu acknowledge partial support from the National Science Foundation (NSF) under Award No. DMS-2053423. Fang acknowledges support from the UCSB Academic Senate Faculty Research Grants Program. Tang is partially supported by the Regents Junior Faculty Fellowship, the Faculty Early Career Acceleration grant, and the Hellman Family Faculty Fellowship sponsored by UCSB, and by the NSF under Award No. DMS-2111303.
