The supersaturated design is often used to discover important factors in an experiment with a large number of factors and a small number of runs. We propose a method for constructing supersaturated designs with small coherence. Such designs are useful for variable selection methods such as the Lasso. Examples are provided to illustrate the proposed method.

Modern experiments can involve a large number of factors. Assuming the effect sparsity principle [

It is increasingly common to use modern variable selection methods to analyze data from a supersaturated design and identify important factors [

Motivated by the importance of controlling the worst case column correlation, we propose a new method to construct supersaturated designs according to the

Our construction method allows some columns of the design matrix to be unbalanced. Factor balance in a supersaturated design guarantees an accurate estimate of the intercept, but allowing imbalance provides other significant benefits when constructing designs. Imbalance gives flexibility in controlling the correlation between every pair of main-effect columns, which makes a lower average column correlation achievable [

The remainder of the article will unfold as follows. In Section

Throughout, “#Balance” denotes the number of balanced columns of a design. Let

Our construction method expands a

Step 1: Use two copies of

Step 2: Expand

Step 3: Expand

If

Since the design

The only requirement for

We now provide several examples for the proposed method with different choices of

Let

Designs Constructed from a

Size | #Balance | |

1/3 | 24 | |

1/3 | 168 | |

1/3 | 840 | |

1/3 | 3720 | |

1/3 | 15624 |

Let

where

Designs Constructed from a

Size | #Balance | |

1/3 | 2968 | |

1/3 | 18616 | |

1/3 | 99064 |

Let

Designs Constructed from a

Size | #Balance | |

1/3 | 101 | |

1/3 | 550 | |

1/3 | 2864 | |

1/3 | 13156 | |

1/3 | 56396 |

When necessary, one can use only a subset of the columns of a constructed design. Selecting a subset of columns either leaves the coherence unchanged or decreases it, but never increases it. For example, for a given
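The effect of column selection on coherence can be checked numerically. The following is a minimal sketch, assuming that coherence is the largest absolute normalized inner product between two distinct columns. The function name `coherence`, the toy matrix, and the random two-level matrix are hypothetical illustrations and are not designs produced by the proposed method.

```python
import numpy as np


def coherence(X):
    """Largest absolute normalized inner product between distinct columns.

    Assumption: this is the coherence measure meant in the text; columns
    are assumed to be nonzero (true for any two-level +/-1 design).
    """
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=0)
    gram = (X.T @ X) / np.outer(norms, norms)
    np.fill_diagonal(gram, 0.0)  # ignore each column paired with itself
    return float(np.abs(gram).max())


# Toy 6-run example: the two columns have dot product 2, so coherence is 2/6.
toy = np.array([[1, 1], [1, 1], [1, 1], [1, 1], [1, -1], [1, -1]])
toy_coherence = coherence(toy)

# Hypothetical random two-level matrix, used only to illustrate subsetting.
rng = np.random.default_rng(0)
X = rng.choice([-1.0, 1.0], size=(12, 30))
full_coherence = coherence(X)
subset_coherence = coherence(X[:, :10])  # keep only the first 10 columns
```

Because the maximum over a subset of column pairs cannot exceed the maximum over all pairs, `subset_coherence` is never larger than `full_coherence`, whichever columns are kept.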

We now generalize the proposed method to obtain a supersaturated design with

The generalization indicates the possibility of expanding any two-level supersaturated design with

Let

Designs Constructed from a

Size | #Balance | |

0.2 | 76 | |

0.2 | 112 | |

0.2 | 184 | |

0.2 | 328 | |

0.2 | 616 |

In this section, we compare the proposed designs with four popular classes of supersaturated designs: Lin’s designs [

Lin’s designs are constructed from the Plackett and Burman design [

We simulate data from the linear model in (

We conduct simulations using the eight active coefficients settings in Table

Active Coefficients Settings.

Case | #Active | Active Coefficients and Their Values |

1 | 2 | |

2 | 2 | |

3 | 4 | |

4 | 4 | |

5 | 4 | |

6 | 4 | |

7 | 6 | |

8 | 6 |

Comparison with Lin’s Design.

Case | #Active | Design | Size | #Balance | AFDR | AMR | MSE | EME | ||

1 | 2 | Proposed | 0.33 | 9.70 | 24 | |||||

LIN | 0.43 | 7.84 | 26 | 0.59 | 0.00 | 0.93 | 9.18 | |||

2 | 2 | Proposed | 0.33 | 9.70 | 24 | |||||

LIN | 0.43 | 7.84 | 26 | 0.59 | 0.08 | 1.13 | 12.51 | |||

3 | 4 | Proposed | 0.33 | 9.70 | 24 | |||||

LIN | 0.43 | 7.84 | 26 | 0.40 | 0.28 | 230.66 | 794.73 | |||

4 | 4 | Proposed | 0.33 | 9.70 | 24 | |||||

LIN | 0.43 | 7.84 | 26 | 0.64 | 0.53 | 4.19 | 19.79 |

For each design in our simulations, we will show its size, coherence, and number of balanced columns. Although

Average False Discovery Rate (AFDR)

Average Miss Rate (AMR)

Mean Squared Error (MSE)

Expected Model Error (EME)
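As an illustration of the first two criteria, the sketch below computes an average false discovery rate and an average miss rate over simulation replications. It assumes the usual definitions: the false discovery rate is the fraction of selected factors that are not active, and the miss rate is the fraction of active factors that are not selected. The function names and the convention that an empty selection has false discovery rate 0 are assumptions made for this example.

```python
def false_discovery_rate(selected, active):
    """Fraction of selected factors that are not truly active (assumed definition)."""
    selected, active = set(selected), set(active)
    if not selected:
        return 0.0  # assumed convention when nothing is selected
    return len(selected - active) / len(selected)


def miss_rate(selected, active):
    """Fraction of truly active factors that were not selected (assumed definition)."""
    selected, active = set(selected), set(active)
    return len(active - selected) / len(active)


def average_rates(selections, active):
    """Average both rates over replications; `selections` holds one selected set per replication."""
    fdrs = [false_discovery_rate(s, active) for s in selections]
    mrs = [miss_rate(s, active) for s in selections]
    return sum(fdrs) / len(fdrs), sum(mrs) / len(mrs)


# Two hypothetical replications with active factors 0 and 1.
afdr, amr = average_rates([[0, 1, 2], [0]], active=[0, 1])
```

In the first replication one of three selected factors is inactive and no active factor is missed; in the second, nothing inactive is selected but one of the two active factors is missed.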

We obtain the

To verify that this result is not due to some better property of the selected active factors in one design versus another, we also conducted a follow-up comparison with randomly selected active factors. More specifically, for a given setting in Table

Comparison with Lin’s Design with Randomly Selected Active Factors.

Case | #Active | Design | Size | #Balance | AFDR | AMR | MSE | EME | ||

1 | 2 | Proposed | 0.33 | 9.70 | 24 | 5.01 | ||||

LIN | 0.43 | 7.84 | 26 | 0.61 | 0.00 | 11.58 | ||||

2 | 2 | Proposed | 0.33 | 9.70 | 24 | |||||

LIN | 0.43 | 7.84 | 26 | 0.59 | 0.14 | 1.36 | 12.93 | |||

3 | 4 | Proposed | 0.33 | 9.70 | 24 | 0.47 | 51.62 | |||

LIN | 0.43 | 7.84 | 26 | 0.04 | 135.42 | |||||

4 | 4 | Proposed | 0.33 | 9.70 | 24 | 0.55 | ||||

LIN | 0.43 | 7.84 | 26 | 0.27 | 2.13 | 18.56 |

Comparison with Wu’s Designs.

Case | #Active | Design | Size | #Balance | AFDR | AMR | MSE | EME | ||

1 | 2 | Proposed | 0.33 | 9.90 | 24 | 0.65 | 0.95 | |||

WU | 0.33 | 11.06 | 63 | 0.00 | 9.03 | |||||

2 | 2 | Proposed | 0.33 | 9.90 | 24 | 0.66 | 0.05 | 1.20 | 12.46 | |

WU | 0.33 | 11.06 | 63 | |||||||

3 | 4 | Proposed | 0.33 | 21.84 | 168 | |||||

WU | 0.33 | 23.00 | 255 | 0.65 | 0.00 | 2.08 | 23.25 | |||

4 | 4 | Proposed | 0.33 | 21.84 | 168 | |||||

WU | 0.33 | 23.00 | 255 | 0.68 | 0.14 | 1.76 | 24.90 | |||

7 | 6 | Proposed | 0.33 | 93.05 | 299 | |||||

WU | 0.33 | 41.99 | 299 | 0.75 | 0.00 | 2.52 | 37.94 | |||

8 | 6 | Proposed | 0.33 | 93.05 | 299 | |||||

WU | 0.33 | 41.99 | 299 | 0.77 | 0.01 | 2.58 | 44.09 |

We use the

We use the ^{ő} software [

Comparison with the Bayesian D-Optimal Supersaturated Designs.

Case | #Active | Design | Size | #Balance | AFDR | AMR | MSE | EME | ||

1 | 2 | Proposed | 0.33 | 9.90 | 24 | |||||

BAYES | 0.67 | 11.04 | 63 | 0.76 | 0.06 | 45.02 | 76.31 | |||

2 | 2 | Proposed | 0.33 | 9.90 | 24 | |||||

BAYES | 0.67 | 11.04 | 63 | 0.78 | 0.45 | 3.20 | 16.31 | |||

3 | 4 | Proposed | 0.33 | 9.90 | 24 | |||||

BAYES | 0.67 | 11.04 | 63 | 0.66 | 0.16 | 240.96 | 535.21 | |||

4 | 4 | Proposed | 0.33 | 9.90 | 24 | 0.65 | ||||

BAYES | 0.67 | 11.04 | 63 | 0.46 | 4.61 | 30.40 | ||||

5 | 4 | Proposed | 0.33 | 21.84 | 168 | |||||

BAYES | 0.67 | 22.91 | 252 | 0.67 | 0.00 | 3.62 | 27.99 | |||

6 | 4 | Proposed | 0.33 | 21.84 | 168 | 0.70 | ||||

BAYES | 0.67 | 22.91 | 252 | 0.28 | 3.36 | 50.61 | ||||

7 | 6 | Proposed | 0.33 | 21.84 | 168 | |||||

BAYES | 0.67 | 22.91 | 252 | 0.67 | 0.28 | 162.70 | 934.85 | |||

8 | 6 | Proposed | 0.33 | 21.84 | 168 | 0.72 | ||||

BAYES | 0.67 | 22.91 | 252 | 0.44 | 16.57 | 158.04 |

Comparison with the

Case | #Active | Design | Size | #Balance | MAFDR | MAMR | MMSE | MEME | ||

1 | 2 | Proposed | 0.33 | 9.90 | 24 | 0.67 | ||||

JM | 0.83 | 9.90 | 16 | 0.00 | 2.53 | 11.02 | ||||

2 | 2 | Proposed | 0.33 | 9.90 | 24 | 0.67 | ||||

JM | 0.83 | 9.90 | 16 | 0.26 | 2.35 | 15.26 | ||||

3 | 4 | Proposed | 0.33 | 9.90 | 24 | |||||

JM | 0.83 | 9.90 | 16 | 0.62 | 0.45 | 351.16 | 956.38 | |||

4 | 4 | Proposed | 0.33 | 9.90 | 24 | |||||

JM | 0.83 | 9.90 | 16 | 0.66 | 0.55 | 4.47 | 24.95 | |||

5 | 4 | Proposed | 0.33 | 21.84 | 168 | |||||

JM | 0.58 | 21.84 | 44 | 0.69 | 0.00 | 66.86 | 61.37 | |||

6 | 4 | Proposed | 0.33 | 21.84 | 168 | |||||

JM | 0.58 | 21.84 | 44 | 0.73 | 0.31 | 3.73 | 41.41 | |||

7 | 6 | Proposed | 0.33 | 21.84 | 168 | |||||

JM | 0.58 | 21.84 | 44 | 0.72 | 0.32 | 294.96 | 1336.26 | |||

8 | 6 | Proposed | 0.33 | 21.84 | 168 | |||||

JM | 0.58 | 21.84 | 44 | 0.72 | 0.43 | 18.54 | 147.44 |

According to Jones and Majumdar [

We use the

Comparison between the

To show how the proposed design performs against competing designs of the same size under the same active coefficient setting, we collect all such comparisons in Table

Comparison Between Designs with the Same Size and Active Coefficients.

Case | #Active | Design | Size | #Balance | AFDR | AMR | MSE | EME | ||

1 | 2 | Proposed | 0.33 | 9.90 | 24 | 0.65 | 0.95 | |||

WU | 0.33 | 11.06 | 63 | 0.00 | 9.03 | |||||

BAYES | 0.67 | 11.04 | 63 | 0.76 | 0.06 | 45.02 | 76.31 | |||

JM | 0.83 | 9.90 | 16 | 0.61 | 0.00 | 2.53 | 11.02 | |||

2 | 2 | Proposed | 0.33 | 9.90 | 24 | 0.66 | 0.05 | 1.20 | 12.46 | |

WU | 0.33 | 11.06 | 63 | |||||||

BAYES | 0.67 | 11.04 | 63 | 0.78 | 0.45 | 3.20 | 16.31 | |||

JM | 0.83 | 9.90 | 16 | 0.66 | 0.26 | 2.35 | 15.26 |

According to Table

In summary, the proposed design is generally comparable to Wu’s design because the two have the same coherence. For Lin’s design, the Bayesian D-optimal design and the

We have proposed a method for constructing supersaturated designs with small coherence. The constructed designs are allowed to be unbalanced to achieve more flexible sample sizes. The proposed method uses direct constructions and it entails no intensive computing even for large

Here are possible directions for future work. First, the proposed method can expand a

Throughout, for two columns

Let

where

Pick two arbitrary columns of

Suppose that both of them are from

Suppose that one of the two columns is the

Since the absolute dot products of the two columns in these blocks are ≤ 2, the absolute dot product of the two columns in

Let

where

Pick two arbitrary columns of

Suppose that both of them are from

Suppose that one of the two columns is the

Suppose that one of the two columns is the

Suppose that one of the two columns is the

Suppose that one of the two columns is the

To prove this theorem, we use the same notation from the proof of Lemma

Suppose that both of the two columns are from

Suppose that one of the two columns is from

Zhao and Yu [