The New England Journal of Statistics in Data Science


Search results 108


Bayesian D-Optimal Design of Experiments with Quantitative and Qualitative Responses

Lulu Kang Xinwei Deng Ran Jin

https://doi.org/10.51387/23-NEJSDS30

Pub. online: 21 Apr 2023 Type: Methodology Article

Open Access

Area: Statistical Methodology

Journal: The New England Journal of Statistics in Data Science Volume 1, Issue 3 (2023), pp. 371–385

Abstract

Systems with both quantitative and qualitative responses are encountered in many applications. Design-of-experiments methods are needed when experiments are conducted to study such systems. Classic experimental design methods are unsuitable here because they often focus on one type of response. In this paper, we develop a Bayesian D-optimal design method for experiments with one continuous and one binary response. Both noninformative and conjugate informative prior distributions on the unknown parameters are considered. The proposed design criterion has meaningful interpretations regarding the D-optimality for the models for both types of responses. An efficient point-exchange search algorithm is developed to construct the local D-optimal designs for given parameter values. Global D-optimal designs are obtained by accumulating the frequencies of the design points in local D-optimal designs, where the parameters are sampled from the prior distributions. The performance of the proposed methods is evaluated through two examples.

Inaugural Editorial. Can We Achieve Our Mission: Fast, Accessible, Cutting-edge, and Top-quality?

Colin O. Wu Ming-Hui Chen Min-ge Xie All authors (5)

https://doi.org/10.51387/23-NEJSDS11EDI

Pub. online: 12 Apr 2023 Type: Editorial

Open Access

Journal: The New England Journal of Statistics in Data Science Volume 1, Issue 1 (2023), pp. 1–3

Abstract

We are pleased to launch the first issue of the New England Journal of Statistics in Data Science (NEJSDS). NEJSDS is the official journal of the New England Statistical Society (NESS), under the leadership of the Vice President for Journal and Publication, and is sponsored by the College of Liberal Arts and Sciences, University of Connecticut. The aims of the journal are to serve as an interface between statistics and other disciplines in data science, to encourage researchers to exchange innovative ideas, and to promote data science methods to the general scientific community. The journal publishes high-quality original research, novel applications, and timely review articles in all aspects of data science, including all areas of statistical methodology, methods of machine learning and artificial intelligence, novel algorithms, computational methods, data management and manipulation, and applications of data science methods, among others. We encourage authors to submit collaborative work that is driven by real-life problems posed by researchers, administrators, educators, or other stakeholders, and that requires original and innovative solutions from data scientists.

Seamless Clinical Trials with Doubly Adaptive Biased Coin Designs

Hongjian Zhu Jun Yu Dejian Lai All authors (4)

https://doi.org/10.51387/23-NEJSDS25

Pub. online: 1 Mar 2023 Type: Methodology Article

Open Access

Area: Biomedical Research

Journal: The New England Journal of Statistics in Data Science Volume 1, Issue 3 (2023), pp. 314–322

Abstract

In addition to scientific questions, clinical trialists often explore or require other design features, such as increasing the power while controlling the type I error rate, minimizing unnecessary exposure to inferior treatments, and comparing multiple treatments in one clinical trial. We propose implementing adaptive seamless design (ASD) with response adaptive randomization (RAR) to satisfy the design objectives of various clinical trials. However, the combination of ASD and RAR poses a challenge in controlling the type I error rate. In this paper, we investigate how to exploit the advantages of the two adaptive methods while controlling the type I error rate. We provide the theoretical foundation for this procedure. Numerical studies demonstrate that our methods can achieve efficiency and ethical objectives while controlling the type I error rate.

Evaluating Designs for Hyperparameter Tuning in Deep Neural Networks

Chenlu Shi Ashley Kathleen Chiu Hongquan Xu

https://doi.org/10.51387/23-NEJSDS26

Pub. online: 24 Feb 2023 Type: Methodology Article

Open Access

Area: Machine Learning and Data Mining

Journal: The New England Journal of Statistics in Data Science Volume 1, Issue 3 (2023), pp. 334–341

Abstract

The performance of a learning technique relies heavily on hyperparameter settings. Deep learning techniques therefore call for hyperparameter tuning, which may be too computationally expensive for sophisticated techniques. As such, it is desirable to explore expeditiously the relationship between the hyperparameters and the performance of the learning technique they control, which entails considering design strategies that collect informative data efficiently. Various designs can be considered for this purpose, and the question of which design to use naturally arises. In this paper, we examine the use of different types of designs in efficiently collecting informative data to study the surface of test accuracy, a measure of the performance of a learning technique, over hyperparameters. Under the settings we considered, we find that the strong orthogonal array outperforms all other comparable designs.

Bayesian Simultaneous Partial Envelope Model with Application to an Imaging Genetics Analysis

Yanbo Shen Yeonhee Park Saptarshi Chakraborty All authors (4)

https://doi.org/10.51387/23-NEJSDS23

Pub. online: 2 Feb 2023 Type: Methodology Article

Open Access

Area: Statistical Methodology

Journal: The New England Journal of Statistics in Data Science Volume 1, Issue 2 (2023), pp. 237–269

Abstract

As a prominent dimension reduction method for multivariate linear regression, the envelope model has received increased attention over the past decade due to its modeling flexibility and success in enhancing estimation and prediction efficiencies. Several enveloping approaches have been proposed in the literature; among these, the partial response envelope model [57], which focuses only on enveloping the coefficients for predictors of interest, and the simultaneous envelope model [14], which combines the predictor and the response envelope models within a unified modeling framework, are noteworthy. In this article, we incorporate these two approaches within a Bayesian framework and propose a novel Bayesian simultaneous partial envelope model that generalizes the two approaches and addresses some of their limitations. Our method offers the flexibility of incorporating prior information if available, and aids coherent quantification of all modeling uncertainty through the posterior distribution of model parameters. A block Metropolis-within-Gibbs algorithm for Markov chain Monte Carlo (MCMC) sampling from the posterior is developed. The utility of our model is corroborated by theoretical results, comprehensive simulations, and a real imaging genetics data application for the Alzheimer’s Disease Neuroimaging Initiative (ADNI) study.

On Bayesian Sequential Clinical Trial Designs

Tianjian Zhou Yuan Ji

https://doi.org/10.51387/23-NEJSDS24

Pub. online: 31 Jan 2023 Type: Methodology Article

Open Access

Area: Cancer Research

Journal: The New England Journal of Statistics in Data Science Volume 2, Issue 1 (2024), pp. 136–151

Abstract

Clinical trials usually involve sequential patient entry. When designing a clinical trial, it is often desirable to include a provision for interim analyses of accumulating data with the potential for stopping the trial early. We review Bayesian sequential clinical trial designs based on posterior probabilities, posterior predictive probabilities, and decision-theoretic frameworks. A pertinent question is whether Bayesian sequential designs need to be adjusted for the planning of interim analyses. We answer this question from three perspectives: a frequentist-oriented perspective, a calibrated Bayesian perspective, and a subjective Bayesian perspective. We also provide new insights into the likelihood principle, which is commonly tied to statistical inference and decision making in sequential clinical trials. Some theoretical results are derived, and numerical studies are conducted to illustrate and assess these designs.

Optimal Design of Controlled Experiments for Personalized Decision Making in the Presence of Observational Covariates

Yezhuo Li Qiong Zhang Amin Khademi All authors (4)

https://doi.org/10.51387/23-NEJSDS22

Pub. online: 26 Jan 2023 Type: Methodology Article

Open Access

Area: Statistical Methodology

Journal: The New England Journal of Statistics in Data Science Volume 1, Issue 3 (2023), pp. 386–393

Abstract

Controlled experiments are widely applied in many areas, such as clinical trials or user behavior studies in IT companies. Recently, it has become popular to study experimental design problems to facilitate personalized decision making. In this paper, we investigate the problem of optimal design of multiple treatment allocation for personalized decision making in the presence of observational covariates associated with experimental units (often, patients or users). We assume that the response of a subject assigned to a treatment follows a linear model that includes the interaction between covariates and treatments to facilitate precision decision making. We define the design objective as the maximum variance of estimated personalized treatment effects over different treatments and different covariate values. The optimal design is obtained by minimizing this objective. Under a semi-definite program reformulation of the original optimization problem, we use a YALMIP- and MOSEK-based optimization solver to provide the optimal design. Numerical studies are provided to assess the quality of the optimal design.

Comments on Xiao-Li Meng’s Double Your Variance, Dirtify Your Bayes, Devour Your Pufferfish, and Draw Your Kidstogram

Dennis K.J. Lin

https://doi.org/10.51387/23-NEJSDS6E

Pub. online: 20 Jan 2023 Type: Commentary And/or Historical Perspective

Open Access

Area: Statistical Methodology

Journal: The New England Journal of Statistics in Data Science Volume 1, Issue 1 (2023), pp. 31–34

Detection of Anomalies in Traffic Flows with Large Amounts of Missing Data

Qing He Charles W. Harrison Hsin-Hsiung Huang

https://doi.org/10.51387/23-NEJSDS20

Pub. online: 11 Jan 2023 Type: Methodology Article

Open Access

Area: Statistical Methodology

Journal: The New England Journal of Statistics in Data Science Volume 1, Issue 1 (2023), pp. 84–94

Abstract

Anomaly detection plays an important role in traffic operations and control. Missingness in spatial-temporal datasets prohibits anomaly detection algorithms from learning characteristic rules and patterns due to the lack of large amounts of data. This paper proposes an anomaly detection scheme for the 2021 Algorithms for Threat Detection (ATD) challenge. The scheme is based on Gaussian process models that generate features used in a logistic regression model, which leads to high prediction accuracy for sparse traffic flow data with a large proportion of missingness. The dataset is provided by the National Science Foundation (NSF) in conjunction with the National Geospatial-Intelligence Agency (NGA), and it consists of thousands of labeled traffic flow records for 400 sensors from 2011 to 2020. Each sensor is purposely downsampled by NSF and NGA in order to simulate data missing completely at random, and the missing rates are 99%, 98%, 95%, and 90%. Hence, it is challenging to detect anomalies from the sparse traffic flow data. The proposed scheme makes use of traffic patterns at different times of day and on different days of the week to recover the complete data. The proposed anomaly detection scheme is computationally efficient because it allows parallel computation on different sensors. The proposed method is one of the two top-performing algorithms in the 2021 ATD challenge.

A Not-so-radical Rejoinder: Habituate Systems Thinking and Data (Science) Confession for Quality Enhancement

Xiao-Li Meng

https://doi.org/10.51387/22-NEJSDS6REJ

Pub. online: 6 Jan 2023 Type: Commentary And/or Historical Perspective

Open Access

Area: Statistical Methodology

Journal: The New England Journal of Statistics in Data Science Volume 1, Issue 1 (2023), pp. 39–45

ISSN: 2693-7166
