Contrastive Inverse Regression for Dimension Reduction
Pub. online: 19 November 2024
Type: Statistical Methodology
Open Access
Accepted: 3 October 2024
Published: 19 November 2024
Abstract
Supervised dimension reduction (SDR) has been a topic of growing interest in data science, as it enables the reduction of high-dimensional covariates while preserving their functional relationship with a response variable of interest. However, existing SDR methods are not suitable for analyzing datasets collected from case-control studies, where the goal is to learn and exploit the low-dimensional structure unique to, or enriched in, the case group, also known as the foreground group. While unsupervised techniques such as the contrastive latent variable model and its variants have been developed for this purpose, they fail to preserve the functional relationship between the dimension-reduced covariates and the response variable. In this paper, we propose a supervised dimension reduction method called contrastive inverse regression (CIR), designed specifically for the contrastive setting. CIR introduces an optimization problem defined on the Stiefel manifold with a non-standard loss function. We prove that a gradient descent-based algorithm converges to a local optimum of CIR, and our numerical studies empirically demonstrate its improved performance over competing methods on high-dimensional data.
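To make the geometric setting concrete, below is a minimal sketch of gradient descent on the Stiefel manifold St(p, d) = {V in R^(p x d) : V'V = I_d}, the constraint set on which CIR's optimization problem is defined. The loss here is a placeholder quadratic whose minimizer is a leading eigenspace, standing in for the paper's non-standard CIR loss; all function names and step-size choices are illustrative assumptions, not the paper's algorithm.

import numpy as np

def retract(V):
    # Map an arbitrary p x d matrix back onto the Stiefel manifold via QR,
    # with a sign fix so the factorization is unique and continuous.
    Q, R = np.linalg.qr(V)
    signs = np.sign(np.sign(np.diag(R)) + 0.5)  # maps 0 -> +1
    return Q * signs

def riemannian_grad(V, G):
    # Project the Euclidean gradient G onto the tangent space of the
    # Stiefel manifold at V (embedded metric).
    sym = (V.T @ G + G.T @ V) / 2
    return G - V @ sym

def stiefel_gd(grad_f, V0, lr=1e-2, n_iter=1000):
    # Plain Riemannian gradient descent with QR retraction; for a smooth
    # loss this converges to a stationary (locally optimal) point.
    V = V0
    for _ in range(n_iter):
        V = retract(V - lr * riemannian_grad(V, grad_f(V)))
    return V

# Placeholder loss f(V) = -trace(V' A V): minimizing it over St(p, d)
# recovers the top-d eigenspace of A. It is NOT the CIR loss.
rng = np.random.default_rng(0)
p, d = 20, 2
M = rng.standard_normal((p, p))
A = M @ M.T  # symmetric positive semidefinite
grad_f = lambda V: -2.0 * A @ V
V_hat = stiefel_gd(grad_f, retract(rng.standard_normal((p, d))))
assert np.allclose(V_hat.T @ V_hat, np.eye(d), atol=1e-8)  # iterate stays on St(p, d)

The QR retraction is a cheap first-order surrogate for an exact geodesic step; the paper's actual algorithm, loss, and step-size rule may differ.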
Supplementary material
Additional experimental details are included in the supplementary material.