Construction of Supersaturated Designs with Small Coherence for Variable Selection
Volume 1, Issue 3 (2023), pp. 323–333
Pub. online: 5 June 2023
Type: Machine Learning And Data Mining
Open Access
Accepted: 19 April 2023
Published: 5 June 2023
Abstract
The supersaturated design is often used to discover important factors in an experiment with a large number of factors and a small number of runs. We propose a method for constructing supersaturated designs with small coherence. Such designs are useful for variable selection methods such as the Lasso. Examples are provided to illustrate the proposed method.
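The abstract does not define coherence. As a rough illustration, the sketch below assumes the usual compressed-sensing definition: the largest absolute cosine between two distinct columns of the design matrix. The function name `mutual_coherence` and the 6-run, 7-factor `toy_design` are hypothetical and are not taken from the paper; the design is not produced by the proposed construction.

```python
import math


def mutual_coherence(design):
    """Largest absolute cosine between two distinct columns.

    `design` is a list of rows (runs); each row lists the factor levels.
    Assumption: this is the standard mutual-coherence definition, which
    may differ in detail from the one used in the paper.
    """
    columns = list(zip(*design))
    norms = [math.sqrt(sum(v * v for v in col)) for col in columns]
    if any(n == 0 for n in norms):
        raise ValueError("every column needs a nonzero norm")
    worst = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            dot = sum(a * b for a, b in zip(columns[i], columns[j]))
            worst = max(worst, abs(dot) / (norms[i] * norms[j]))
    return worst


# Hypothetical toy supersaturated design: 6 runs, 7 balanced two-level factors.
toy_design = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, -1, -1, -1],
    [1, -1, -1, -1, 1, 1, 1],
    [-1, 1, -1, -1, 1, -1, -1],
    [-1, -1, 1, -1, -1, 1, -1],
    [-1, -1, -1, 1, -1, -1, 1],
]

print(mutual_coherence(toy_design))  # about 0.333
```

In this toy design every pair of columns has an inner product of 2 in absolute value, so the coherence is 2/6. A smaller value means the factor columns are closer to orthogonal, which is the property that Lasso-type selection benefits from.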