The New England Journal of Statistics in Data Science

Consistent and Scalable Variable Selection with Robust Link Functions
Eric Odoom, Xia Wang

https://doi.org/10.51387/26-NEJSDS102
Pub. online: 9 March 2026      Type: Methodology Article      Open Access
Area: Statistical Methodology

Accepted
3 February 2026
Published
9 March 2026

Abstract

This study explores the application of the t-link model in high-dimensional variable selection for binary regression. The t-link model provides flexibility in binary modeling and offers robust inference in the presence of outliers, making it a preferable alternative to the commonly used probit and logit links. To address the computational challenges posed by a large number of covariates, the skinny Gibbs algorithm is employed, and the consistency of variable selection under this approximate algorithm is established. These advances on both the computational and theoretical fronts make the t-link model more practical and easier to implement. The performance of the t-link model, with specified degrees of freedom, is compared to the logit and probit links through simulation studies and an application to PCR data. The results demonstrate the robustness and computational efficiency of the proposed method.
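The robustness claim rests on the tails of the link: the t-link replaces the normal CDF of the probit with a Student-t CDF, whose polynomially decaying tails assign non-negligible probability to extreme linear predictors, so an outlying observation exerts less leverage on the fit. A minimal stdlib-Python sketch of this tail comparison follows; the degrees-of-freedom value ν = 8 is an arbitrary illustration, not a value prescribed by the article.

```python
import math

NU = 8.0  # assumed degrees of freedom for the t-link (illustrative only)

def t_pdf(x, nu=NU):
    """Student-t density with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def t_tail(eta, nu=NU, upper=60.0, n=20000):
    """P(T > eta) via trapezoidal integration of the t density on [eta, upper]."""
    h = (upper - eta) / n
    s = 0.5 * (t_pdf(eta, nu) + t_pdf(upper, nu))
    for i in range(1, n):
        s += t_pdf(eta + i * h, nu)
    return s * h

def probit_tail(eta):
    """P(Z > eta) under the standard normal CDF (probit link)."""
    return 0.5 * math.erfc(eta / math.sqrt(2))

def logit_tail(eta):
    """P(L > eta) under the standard logistic CDF (logit link)."""
    return 1.0 / (1.0 + math.exp(eta))

# For a large linear predictor eta, the probit tail is essentially zero,
# while the t-link tail remains orders of magnitude larger -- so a single
# extreme covariate value cannot dominate the likelihood under the t-link.
eta = 5.0
print(f"probit tail at eta=5: {probit_tail(eta):.3e}")
print(f"logit  tail at eta=5: {logit_tail(eta):.3e}")
print(f"t-link tail at eta=5: {t_tail(eta):.3e}")
```

The ordering probit < logit < t-link in tail mass mirrors the ordering of their sensitivity to outliers, which is the property the article exploits.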

Supplementary material

The supplementary materials include the R code used in this study.

Conflicts of Interest

The authors declare that they have no conflict of interest.


Copyright
© 2026 New England Statistical Society
Open access article under the CC BY license.

Keywords
Binary regression; Link functions; Robustness; Skinny Gibbs; Spike-and-slab prior

Funding
No funding was received for this project.


ISSN: 2693-7166