Scalable Marginalization of Correlated Latent Variables with Applications to Learning Particle Interaction Kernels
Volume 1, Issue 2 (2023), pp. 172–186
Pub. online: 18 October 2022
Type: Methodology Article
Open Access
Area: Statistical Methodology
Accepted: 29 September 2022
Published: 18 October 2022
Abstract
Marginalization of latent variables or nuisance parameters is a fundamental aspect of Bayesian inference and uncertainty quantification. In this work, we focus on scalable marginalization of latent variables in modeling correlated data, such as spatio-temporal or functional observations. We first introduce Gaussian processes (GPs) for modeling correlated data and highlight the computational challenge: the computational complexity increases cubically with the number of observations. We then review the connection between the state space model and GPs with Matérn covariance for temporal inputs. The Kalman filter and Rauch-Tung-Striebel smoother are introduced as a scalable marginalization technique for computing the likelihood and making predictions of GPs without approximation. We then discuss recent efforts to extend the scalable marginalization idea to the linear model of coregionalization for multivariate correlated output and spatio-temporal observations. In the final part of this work, we introduce a novel marginalization technique to estimate interaction kernels and forecast particle trajectories. The computational progress lies in a sparse representation of the inverse covariance matrix of the latent variables, combined with the conjugate gradient algorithm, which improves predictive accuracy with large data sets. The computational advances achieved in this work point to a wide range of applications in molecular dynamics simulation, cellular migration, and agent-based models.
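To make the state space connection concrete, the following minimal sketch compares the direct GP log-likelihood, which requires a Cholesky factorization of the full covariance matrix, with a Kalman filter recursion that costs linear time in the number of observations. It is an illustration under stated assumptions rather than the implementation used in the article: it takes the simplest Matérn case (roughness parameter 1/2, i.e. the exponential kernel), assumes sorted scalar time inputs and independent Gaussian noise, and the function and parameter names (`direct_loglik`, `kalman_loglik`, `sigma2`, `gamma`, `noise_var`) are hypothetical.

```python
import numpy as np


def direct_loglik(t, y, sigma2, gamma, noise_var):
    """Log-likelihood of a zero-mean GP via a dense Cholesky factorization (cubic cost)."""
    n = len(t)
    # Exponential (Matern 1/2) covariance: sigma2 * exp(-|t_i - t_j| / gamma).
    K = sigma2 * np.exp(-np.abs(t[:, None] - t[None, :]) / gamma)
    C = K + noise_var * np.eye(n)
    L = np.linalg.cholesky(C)
    alpha = np.linalg.solve(L, y)  # alpha = L^{-1} y, so alpha @ alpha = y^T C^{-1} y
    return (-0.5 * alpha @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * n * np.log(2.0 * np.pi))


def kalman_loglik(t, y, sigma2, gamma, noise_var):
    """Same log-likelihood from a scalar Kalman filter (linear cost).

    Assumes t is sorted in increasing order. The latent process follows
    f(t_i) = rho_i * f(t_{i-1}) + w_i with rho_i = exp(-(t_i - t_{i-1}) / gamma)
    and Var(w_i) = sigma2 * (1 - rho_i**2), which reproduces the exponential kernel.
    """
    loglik = 0.0
    m, P = 0.0, sigma2  # prior mean and variance of the first latent state
    for i in range(len(t)):
        if i > 0:
            # Prediction step: propagate the latent state to time t_i.
            rho = np.exp(-(t[i] - t[i - 1]) / gamma)
            m = rho * m
            P = rho**2 * P + sigma2 * (1.0 - rho**2)
        # One-step-ahead predictive distribution of y_i is N(m, S).
        S = P + noise_var
        r = y[i] - m
        loglik += -0.5 * (np.log(2.0 * np.pi * S) + r * r / S)
        # Update step: condition the latent state on y_i.
        gain = P / S
        m = m + gain * r
        P = (1.0 - gain) * P
    return loglik


rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 10.0, size=200))
y = rng.normal(size=200)
print(np.isclose(direct_loglik(t, y, 1.3, 0.7, 0.1),
                 kalman_loglik(t, y, 1.3, 0.7, 0.1)))
```

The two functions agree up to floating-point error because the exponential kernel corresponds to a first-order Markov process, so the filter marginalizes the latent states exactly. Smoother Matérn kernels would need a higher-dimensional state vector, which this sketch does not cover.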