The New England Journal of Statistics in Data Science

General Additive Network Effect Models
Volume 1, Issue 3 (2023), pp. 342–360
Trang Bui, Stefan H. Steiner, Nathaniel T. Stevens

https://doi.org/10.51387/23-NEJSDS29
Pub. online: 27 April 2023      Type: Methodology Article      Open Access
Area: Statistical Methodology

Accepted: 6 April 2023
Published: 27 April 2023

Abstract

In the interest of business innovation, social network companies often carry out experiments to test product changes and new ideas. In such experiments, users are typically assigned to one of two experimental conditions, and some outcome of interest is observed and compared. In this setting, the outcome of one user may be influenced not only by the condition to which they are assigned but also by the conditions of other users via their network connections. This challenges classical experimental design and analysis methodologies and requires specialized methods. We introduce the general additive network effect (GANE) model, which encompasses many existing outcome models in the literature under a unified model-based framework. The model is both interpretable and flexible in modeling the treatment effect as well as the network influence. We show that (quasi) maximum likelihood estimators are consistent and asymptotically normal for a family of model specifications. Quantities of interest such as the global treatment effect are defined and expressed as functions of the GANE model parameters, and hence inference can be carried out using likelihood theory. We further propose the “power-degree” (POW-DEG) specification of the GANE model. The performance of POW-DEG and other specifications of the GANE model is investigated via simulations. Under model misspecification, the POW-DEG specification appears to work well. Finally, we study the characteristics of good experimental designs for the POW-DEG specification. We find that graph-cluster randomization and balanced designs are not necessarily optimal for precise estimation of the global treatment effect, indicating the need for alternative design strategies.
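
As a rough illustration of why network influence complicates a classical two-condition comparison, the sketch below simulates a toy outcome model on a ring network. It is not the GANE model or the POW-DEG specification from the article. The linear form (baseline plus a direct effect plus a term proportional to the fraction of treated neighbours), the ring network, the function names, and the parameter values `alpha`, `beta`, `gamma` are all assumptions chosen for illustration only.

```python
import numpy as np

def ring_adjacency(n):
    """Adjacency matrix of a ring: node i is linked to i-1 and i+1."""
    adj = np.zeros((n, n), dtype=float)
    idx = np.arange(n)
    adj[idx, (idx + 1) % n] = 1.0
    adj[(idx + 1) % n, idx] = 1.0
    return adj

def outcomes(adj, z, alpha=1.0, beta=0.5, gamma=0.3, noise=None):
    """Toy outcome (assumed form): baseline + direct effect
    + gamma * fraction of treated neighbours."""
    deg = adj.sum(axis=1)
    frac = (adj @ z) / deg  # every ring node has degree 2, so no division by zero
    y = alpha + beta * z + gamma * frac
    return y if noise is None else y + noise

n = 2000
rng = np.random.default_rng(0)
adj = ring_adjacency(n)

# Global treatment effect: everyone treated versus everyone in control.
gte = outcomes(adj, np.ones(n)).mean() - outcomes(adj, np.zeros(n)).mean()

# Naive A/B analysis: Bernoulli(0.5) assignment, then a difference in means.
z = rng.integers(0, 2, size=n).astype(float)
y = outcomes(adj, z, noise=rng.normal(0.0, 0.1, size=n))
naive = y[z == 1].mean() - y[z == 0].mean()

print(f"global treatment effect: {gte:.3f}")
print(f"naive difference in means: {naive:.3f}")
```

Under this assumed toy model the global treatment effect equals `beta + gamma`, whereas the naive difference in means mainly recovers the direct effect `beta`, because treated and control users see similar fractions of treated neighbours under independent assignment.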

Supplementary material

The Supplementary Material to “General Additive Network Effect Models” contains additional simulation results for Section 4.


Copyright
© 2023 New England Statistical Society
Open access article under the CC BY license.

Keywords
Design of experiments, A/B testing, Social networks, SUTVA, Network effect modeling

Funding
This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) by way of Grants RGPIN-2019-04212 and RGPIN-2023-03245.

  • ISSN: 2693-7166