General Additive Network Effect Models
Volume 1, Issue 3 (2023), pp. 342–360
Pub. online: 27 April 2023
Type: Statistical Methodology
Open Access
Accepted: 6 April 2023
Published: 27 April 2023
Abstract
In the interest of business innovation, social network companies often carry out experiments to test product changes and new ideas. In such experiments, users are typically assigned to one of two experimental conditions, and some outcome of interest is observed and compared across conditions. In this setting, the outcome of one user may be influenced not only by the condition to which they are assigned but also by the conditions assigned to other users through their network connections. This challenges classical experimental design and analysis methodologies and requires specialized methods. We introduce the general additive network effect (GANE) model, which encompasses many existing outcome models in the literature under a unified model-based framework. The model is both interpretable and flexible in modeling the treatment effect as well as the network influence. We show that (quasi) maximum likelihood estimators are consistent and asymptotically normal for a family of model specifications. Quantities of interest such as the global treatment effect are defined and expressed as functions of the GANE model parameters, and hence inference can be carried out using likelihood theory. We further propose the “power-degree” (POW-DEG) specification of the GANE model. The performance of POW-DEG and other specifications of the GANE model is investigated via simulations. Under model misspecification, the POW-DEG specification appears to work well. Finally, we study the characteristics of good experimental designs for the POW-DEG specification. We find that graph-cluster randomization and balanced designs are not necessarily optimal for precise estimation of the global treatment effect, indicating the need for alternative design strategies.
Supplementary material
The Supplementary Material to “General Additive Network Effect Models” contains additional simulation results for Section 4.