Highest Posterior Model Computation and Variable Selection via Simulated Annealing

Maity, Arnab Kumar; Basu, Sanjib

doi:10.51387/23-NEJSDS40

The New England Journal of Statistics in Data Science


Volume 1, Issue 2 (2023), pp. 200–207



Pub. online: 26 June 2023

Type: Methodology Article

Open Access

Area: Statistical Methodology

Accepted: 30 May 2023

Published: 26 June 2023

Abstract

Variable selection is widely used in all application areas of data analytics, ranging from optimal selection of genes in large-scale microarray studies, to optimal selection of biomarkers for targeted therapy in cancer genomics, to selection of optimal predictors in business analytics. A formal way to perform this selection under the Bayesian approach is to select the model with the highest posterior probability. The problem may be thought of as an optimization problem over the model space, where the objective function is the posterior probability of the model. We propose to carry out this optimization using simulated annealing, and we illustrate its feasibility in high-dimensional problems. Various simulation studies show this new approach to be efficient. Theoretical justifications are provided, and applications to high-dimensional datasets are discussed. The proposed method is implemented in the R package sahpm for general use and is available on CRAN.
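The idea of treating model choice as optimization over inclusion vectors can be illustrated with a minimal sketch. This is not the authors' SA-HPM implementation or the sahpm API: the single-flip proposal, the geometric cooling schedule, the function name `anneal_model_search`, and the toy score `toy_log_score` are all assumptions made for illustration. In practice the score would be an (unnormalized) log posterior model probability.

```python
import math
import random


def anneal_model_search(log_score, p, n_iter=5000, t0=1.0, cooling=0.999, seed=0):
    """Search binary inclusion vectors of length p for the highest log_score.

    log_score is a hypothetical callable that maps a tuple of 0/1 flags
    to an unnormalized log posterior model probability.
    """
    rng = random.Random(seed)
    current = tuple(rng.randint(0, 1) for _ in range(p))  # random starting model
    current_score = log_score(current)
    best, best_score = current, current_score
    temp = t0
    for _ in range(n_iter):
        # Propose a neighbouring model by flipping one inclusion flag.
        j = rng.randrange(p)
        proposal = current[:j] + (1 - current[j],) + current[j + 1:]
        proposal_score = log_score(proposal)
        # Always accept uphill moves; accept downhill moves with
        # probability exp(delta / temp), which shrinks as temp cools.
        delta = proposal_score - current_score
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = proposal, proposal_score
            if current_score > best_score:
                best, best_score = current, current_score
        temp *= cooling  # geometric cooling (an assumed schedule)
    return best, best_score


# Toy stand-in for a log posterior: each variable contributes a fixed gain,
# so the best model keeps exactly the variables with positive gain.
GAINS = [2.0, -1.0, 0.5, -3.0, 1.5, -0.2]


def toy_log_score(model):
    return sum(g for g, included in zip(GAINS, model) if included)


best_model, best_value = anneal_model_search(toy_log_score, p=len(GAINS))
```

Early on, the high temperature lets the search accept worse models and move freely around the model space. As the temperature falls, the search behaves more and more like a greedy ascent. Tracking the best model visited, rather than returning the final state, is a common safeguard and is likewise an assumption here.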

Supplementary material


The R package sahpm, which implements the SA-HPM method, is available on CRAN. Further mathematical discussion of the convergence of this method is given in separate supplementary material.



Open access article under the CC BY license.

Keywords

Bayes factor; Highest posterior model; Simulated annealing; Variable selection

Funding

Sanjib Basu’s research was partially supported by award R01-ES028790 from the National Institute of Environmental Health Sciences.
