In online experimentation, appropriate metrics (e.g., purchases) provide strong evidence to support hypotheses and enhance the decision-making process. However, incomplete metrics occur frequently in online experimentation, leaving much less usable data than the online experiment (e.g., an A/B test) was planned to collect. In this work, we introduce the concept of dropout buyers and categorize users with incomplete metric values into two groups: visitors and dropout buyers. For the analysis of incomplete metrics, we propose a clustering-based imputation method using k-nearest neighbors. The proposed method considers both experiment-specific features and users’ activities along their shopping paths, allowing different imputation values for different users. To make imputation efficient on the large-scale data sets typical of online experimentation, the method combines stratification with clustering. Its performance is compared with that of several conventional methods in both simulation studies and a real online experiment at eBay.
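As a rough illustration of combining stratification with k-nearest-neighbor imputation, the following minimal sketch fills each missing metric value with the mean of the k closest complete users in the same stratum. It is not the procedure evaluated in the paper: the function name `stratified_knn_impute`, the Euclidean distance, the mean aggregation, and the toy data are all assumptions made for illustration, and the clustering step is omitted.

```python
import numpy as np


def stratified_knn_impute(features, metric, strata, k=3):
    """Hypothetical helper: impute NaN metric values stratum by stratum.

    Each missing value is replaced by the mean metric of the k nearest
    users (Euclidean distance on `features`) with an observed metric
    in the same stratum.
    """
    features = np.asarray(features, dtype=float)
    metric = np.asarray(metric, dtype=float)
    strata = np.asarray(strata)
    imputed = metric.copy()

    for s in np.unique(strata):
        in_stratum = strata == s
        donors = np.flatnonzero(in_stratum & ~np.isnan(metric))
        targets = np.flatnonzero(in_stratum & np.isnan(metric))
        if donors.size == 0 or targets.size == 0:
            continue  # nothing to impute, or no complete users to borrow from

        # Pairwise distances, shape (n_targets, n_donors).
        diffs = features[targets][:, None, :] - features[donors][None, :, :]
        dists = np.sqrt((diffs ** 2).sum(axis=2))

        # Use at most as many neighbors as there are donors.
        k_eff = min(k, donors.size)
        nearest = np.argsort(dists, axis=1)[:, :k_eff]
        imputed[targets] = metric[donors][nearest].mean(axis=1)

    return imputed


# Toy data (assumed): one feature per user, two strata.
features = [[0.0], [1.0], [10.0], [0.5], [0.0], [1.0]]
metric = [1.0, 3.0, 100.0, np.nan, 5.0, np.nan]
strata = [0, 0, 0, 0, 1, 1]
result = stratified_knn_impute(features, metric, strata, k=2)
```

Restricting the neighbor search to a stratum keeps each distance matrix small, which is one plausible reason to stratify before imputing on large data sets. Because neighbors differ from user to user, the imputed values differ as well.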
In the interest of business innovation, social network companies often carry out experiments to test product changes and new ideas. In such experiments, users are typically assigned to one of two experimental conditions, and some outcome of interest is observed and compared. In this setting, the outcome of one user may be influenced not only by the condition to which they are assigned but also by the conditions of other users via their network connections. This challenges classical experimental design and analysis methodologies and requires specialized methods. We introduce the general additive network effect (GANE) model, which encompasses many existing outcome models in the literature under a unified model-based framework. The model is both interpretable and flexible in modeling the treatment effect as well as the network influence. We show that (quasi) maximum likelihood estimators are consistent and asymptotically normal for a family of model specifications. Quantities of interest such as the global treatment effect are defined and expressed as functions of the GANE model parameters, and hence inference can be carried out using likelihood theory. We further propose the “power-degree” (POW-DEG) specification of the GANE model. The performance of POW-DEG and other specifications of the GANE model is investigated via simulations. Under model misspecification, the POW-DEG specification appears to work well. Finally, we study the characteristics of good experimental designs for the POW-DEG specification. We find that graph-cluster randomization and balanced designs are not necessarily optimal for precise estimation of the global treatment effect, indicating the need for alternative design strategies.