Effect of Model Space Priors on Statistical Inference with Model Uncertainty

Porwal, Anupreet; Raftery, Adrian E.

doi:10.51387/22-NEJSDS14

The New England Journal of Statistics in Data Science

Volume 1, Issue 2 (2023), pp. 149–158

Pub. online: 16 November 2022 Type: Methodology Article

Open Access

Area: Statistical Methodology

Accepted: 18 October 2022

Published: 16 November 2022

Abstract

Bayesian model averaging (BMA) provides a coherent way to account for model uncertainty in statistical inference tasks. BMA requires specification of model space priors and parameter space priors. In this article we focus on comparing different model space priors in the presence of model uncertainty. We consider eight reference model space priors used in the literature and three adaptive parameter priors recommended by Porwal and Raftery [37]. We assess the performance of these combinations of prior specifications for variable selection in linear regression models on the statistical tasks of parameter estimation, interval estimation, inference, and point and interval prediction. We carry out an extensive simulation study based on 14 real datasets representing a range of situations encountered in practice. We found that beta-binomial model space priors specified in terms of the prior probability of model size performed best on average across the statistical tasks and datasets, outperforming priors that were uniform across models. Recently proposed complexity priors performed relatively poorly.
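To make the contrast between the two best-known families concrete, the sketch below computes the prior on model size implied by a prior that is uniform across models and by a beta-binomial prior. It is an illustration only, not the authors' code: the function names, the choice of p = 10 candidate variables, and the Beta(1, 1) hyperparameters are assumptions made here, and the other priors compared in the article are not reproduced.

```python
from math import comb, exp, lgamma


def log_beta(x, y):
    """Logarithm of the Beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def size_prior_uniform(p):
    """Prior on model size when each of the 2**p models is equally likely.

    The implied size distribution is Binomial(p, 1/2).
    """
    return [comb(p, k) / 2**p for k in range(p + 1)]


def size_prior_beta_binomial(p, a=1.0, b=1.0):
    """Prior on model size when the common inclusion probability
    has a Beta(a, b) hyperprior (assumed hyperparameters)."""
    log_norm = log_beta(a, b)
    return [
        comb(p, k) * exp(log_beta(a + k, b + p - k) - log_norm)
        for k in range(p + 1)
    ]


p = 10  # hypothetical number of candidate variables
uniform = size_prior_uniform(p)
beta_binom = size_prior_beta_binomial(p)

for k in range(p + 1):
    print(f"size {k:2d}: uniform-over-models {uniform[k]:.4f}, "
          f"beta-binomial(1,1) {beta_binom[k]:.4f}")
```

Under these assumptions the uniform-over-models prior concentrates its mass on mid-sized models, whereas the beta-binomial prior with a = b = 1 spreads its mass evenly over model sizes.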

Supplementary material

The supplementary material contains detailed summary results for each metric and dataset used in the study. It also contains a summary of data-generating models for each of the datasets.

References

[1]

Bartlett, M. S. (1957). A Comment on D. V. Lindley’s Statistical Paradox. Biometrika 44 533–534. https://doi.org/10.1093/biomet/52.3-4.507. MR0207142

[2]

Brock, W., Durlauf, S. N. and West, K. D. (2003). Policy evaluation in uncertain economic environments. National Bureau of Economic Research Cambridge, Mass., USA.

[3]

Castillo, I., Schmidt-Hieber, J. and Van der Vaart, A. (2015). Bayesian linear regression with sparse priors. The Annals of Statistics 43(5) 1986–2018. https://doi.org/10.1214/15-AOS1334. MR3375874

[4]

Celeux, G., El Anbari, M., Marin, J.-M. and Robert, C. P. (2012). Regularization in Regression: Comparing Bayesian and Frequentist Methods in a Poorly Informative Situation. Bayesian Analysis 7 477–502. https://doi.org/10.1214/12-BA716. MR2934959

[5]

Clyde, M. (2020). BAS: Bayesian Variable Selection and Model Averaging using Bayesian Adaptive Sampling. R package version 1.5.5.

[6]

Clyde, M. and George, E. I. (2000). Flexible empirical Bayes estimation for wavelets. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 62(4) 681–698. https://doi.org/10.1111/1467-9868.00257. MR1796285

[7]

Clyde, M. and George, E. I. (2004). Model uncertainty. Statistical Science 19(1) 81–94. https://doi.org/10.1214/088342304000000035. MR2082148

[8]

Deckers, T. and Hanck, C. (2014). Variable Selection in Cross-Section Regressions: Comparisons and Extensions. Oxford Bulletin of Economics and Statistics 76(6) 841–873.

[9]

Dellaportas, P., Forster, J. J. and Ntzoufras, I. (2012). Joint specification of model space and parameter space prior distributions. Statistical Science 27(2) 232–246. https://doi.org/10.1214/11-STS369. MR2963994

[10]

Durlauf, S. N., Kourtellos, A. and Tan, C. M. (2008). Are any growth theories robust? The Economic Journal 118(527) 329–346.

[11]

Eicher, T. S., Papageorgiou, C. and Raftery, A. E. (2011). Default priors and predictive performance in Bayesian model averaging, with application to growth determinants. Journal of Applied Econometrics 26(1) 30–55. https://doi.org/10.1002/jae.1112. MR2759908

[12]

Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 70(5) 849–911. https://doi.org/10.1111/j.1467-9868.2008.00674.x. MR2530322

[13]

Fernández, C., Ley, E. and Steel, M. F. J. (2001). Benchmark priors for Bayesian model averaging. Journal of Econometrics 100(2) 381–427. https://doi.org/10.1016/S0304-4076(00)00076-2. MR1820410

[14]

Filzmoser, P. and Varmuza, K. (2017). chemometrics: Multivariate Statistical Analysis in Chemometrics. R package version 1.4.2. https://CRAN.R-project.org/package=chemometrics.

[15]

Forte, A., Garcia-Donato, G. and Steel, M. F. J. (2018). Methods and tools for Bayesian variable selection and model averaging in normal linear regression. International Statistical Review 86(2) 237–258. https://doi.org/10.1111/insr.12249. MR3852410

[16]

Foster, D. P. and George, E. I. (1994). The risk inflation criterion for multiple regression. The Annals of Statistics 22(4) 1947–1975. https://doi.org/10.1214/aos/1176325766. MR1329177

[17]

George, E. (1999). Discussion of “Model averaging and model search strategies” by M. Clyde. In Bayesian Statistics 6–Proceedings of the Sixth Valencia International Meeting. MR1723497

[18]

George, E. I. (2010). Dilution priors: Compensating for model space redundancy. In Borrowing Strength: Theory Powering Applications–A Festschrift for Lawrence D. Brown 158–165 Institute of Mathematical Statistics. MR2798517

[19]

George, E. I. and Foster, D. P. (2000). Calibration and empirical Bayes variable selection. Biometrika 87(4) 731–747. https://doi.org/10.1093/biomet/87.4.731. MR1813972

[20]

George, E. I. and McCulloch, R. E. (1993). Variable selection via Gibbs sampling. Journal of the American Statistical Association 88(423) 881–889.

[21]

Gneiting, T. and Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association 102(477) 359–378. https://doi.org/10.1198/016214506000001437. MR2345548

[22]

Gu, C. (2014). Smoothing Spline ANOVA Models: R Package gss. Journal of Statistical Software 58(5) 1–25.

[23]

Hansen, M. H. and Yu, B. (2003). Minimum description length model selection criteria for generalized linear models. Lecture Notes-Monograph Series 40 145–163. https://doi.org/10.1214/lnms/1215091140. MR2004337

[24]

Hoeting, J. A., Raftery, A. E. and Madigan, D. (2002). Bayesian variable and transformation selection in linear regression. Journal of Computational and Graphical Statistics 11(3) 485–507. https://doi.org/10.1198/106186002501. MR1938444

[25]

Hoeting, J. A., Madigan, D., Raftery, A. E. and Volinsky, C. T. (1999). Bayesian model averaging: a tutorial. Statistical Science 14 382–417. https://doi.org/10.1214/ss/1009212519. MR1765176

[26]

Ishwaran, H., Rao, J. S. and Kogalur, U. B. (2013). spikeslab: Prediction and variable selection using spike and slab regression. R package version 1.1.5. http://cran.r-project.org/web/packages/spikeslab/.

[27]

James, G., Witten, D., Hastie, T. and Tibshirani, R. (2017). ISLR: Data for an Introduction to Statistical Learning with Applications in R. R package version 1.2. https://CRAN.R-project.org/package=ISLR. https://doi.org/10.1007/978-1-0716-1418-1. MR4309209

[28]

Kass, R. E. and Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association 90(430) 773–795. https://doi.org/10.1080/01621459.1995.10476572. MR3363402

[29]

Leamer, E. E. (1978). Specification Searches: Ad hoc Inference with Nonexperimental Data 53. Wiley. MR0471118

[30]

Levine, R. and Renelt, D. (1992). A sensitivity analysis of cross-country growth regressions. The American Economic Review 942–963.

[31]

Ley, E. and Steel, M. F. J. (2009). On the effect of prior assumptions in Bayesian model averaging with applications to growth regression. Journal of Applied Econometrics 24(4) 651–674. https://doi.org/10.1002/jae.1057. MR2675199

[32]

Liang, F., Paulo, R., Molina, G., Clyde, M. A. and Berger, J. O. (2008). Mixtures of g priors for Bayesian variable selection. Journal of the American Statistical Association 103 410–423. https://doi.org/10.1198/016214507000001337. MR2420243

[33]

Lumley, T. (2020). leaps: Regression Subset Selection. R package version 3.1. https://CRAN.R-project.org/package=leaps.

[34]

Madigan, D. and Raftery, A. E. (1994). Model selection and accounting for model uncertainty in graphical models using Occam’s window. Journal of the American Statistical Association 89(428) 1535–1546.

[35]

Narisetty, N. N. and He, X. (2014). Bayesian variable selection with shrinking and diffusing priors. The Annals of Statistics 42(2) 789–817. https://doi.org/10.1214/14-AOS1207. MR3210987

[36]

Newman, D. J., Hettich, S., Blake, C. L. and Merz, C. J. (1998). UCI Repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html.

[37]

Porwal, A. and Raftery, A. E. (2022). Comparing methods for statistical inference with model uncertainty. Proceedings of the National Academy of Sciences 119(16) 2120737119.

[38]

Raftery, A. E. (1988). Approximate Bayes factors for generalized linear models. Technical Report No. 121, Department of Statistics, University of Washington. https://stat.uw.edu/sites/default/files/files/reports/1988/tr121.pdf.

[39]

Raftery, A. E. and Zheng, Y. (2003). Discussion: Performance of Bayesian model averaging. Journal of the American Statistical Association 98(464) 931–938.

[40]

Raftery, A. E., Madigan, D. and Hoeting, J. A. (1997). Bayesian model averaging for linear regression models. Journal of the American Statistical Association 92(437) 179–191. https://doi.org/10.2307/2291462. MR1436107

[41]

Rohart, F., Gautier, B., Singh, A. and Lê Cao, K.-A. (2017). mixOmics: An R package for ’omics feature selection and multiple data integration. PLoS Computational Biology 13(11) 1005752.

[42]

Rossell, D. (2021). Concentration of posterior model probabilities and normalized l0 criteria. Bayesian Analysis 1(1) 1–27. https://doi.org/10.1214/21-ba1262. MR4483231

[43]

Rossell, D. and Rubio, F. J. (2018). Tractable Bayesian Variable Selection: Beyond Normality. Journal of the American Statistical Association. https://doi.org/10.1080/01621459.2017.1371025. MR3902243

[44]

Rossell, D., Abril, O. and Bhattacharya, A. (2021). Approximate Laplace approximations for scalable model selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 83(4) 853–879. MR4320004

[45]

Sala-i-Martin, X. (1997). I just ran four million regressions. National Bureau of Economic Research Cambridge, Mass., USA.

[46]

Sala-i-Martin, X., Doppelhofer, G. and Miller, R. I. (2004). Determinants of long-term growth: A Bayesian averaging of classical estimates (BACE) approach. American Economic Review 94(4) 813–835.

[47]

Scott, J. G. and Berger, J. O. (2010). Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem. The Annals of Statistics 2587–2619. https://doi.org/10.1214/10-AOS792. MR2722450

[48]

van Zwet, E. (2019). A default prior for regression coefficients. Statistical Methods in Medical Research 28(12) 3799–3807. https://doi.org/10.1177/0962280218817792. MR4003623

[49]

Villa, C. and Walker, S. (2015). An objective Bayesian criterion to determine model prior probabilities. Scandinavian Journal of Statistics 42(4) 947–966. https://doi.org/10.1111/sjos.12145. MR3426304

[50]

Wasserman, L. (2000). Bayesian model selection and model averaging. Journal of Mathematical Psychology 44(1) 92–107. https://doi.org/10.1006/jmps.1999.1278. MR1770003

[51]

Yang, Y., Wainwright, M. J. and Jordan, M. I. (2016). On the computational complexity of high-dimensional Bayesian variable selection. The Annals of Statistics 44(6) 2497–2532. https://doi.org/10.1214/15-AOS1417. MR3576552

[52]

Young, W. C., Raftery, A. E. and Yeung, K. Y. (2014). Fast Bayesian inference for gene regulatory networks using ScanBMA. BMC Systems Biology 8(1) 47.

[53]

Zellner, A. (1986). On assessing prior distributions and Bayesian regression analysis with g-prior distributions. In Bayesian Inference and Decision Techniques 6. MR0881437

[54]

Zellner, A. and Siow, A. (1980). Posterior odds ratios for selected regression hypotheses. Trabajos de Estadística y de Investigación Operativa 31(1) 585–603.

Open access article under the CC BY license.

Keywords

Bayesian model averaging; Zellner’s g-prior; Model space prior; Beta-Binomial prior; Complexity prior; Model selection; Prediction

Funding

This research was supported by NICHD grant R01 HD-070936, and by the Boeing International Professorship at the University of Washington.
