<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">NEJSDS</journal-id>
<journal-title-group><journal-title>The New England Journal of Statistics in Data Science</journal-title></journal-title-group>
<issn pub-type="ppub">2693-7166</issn><issn-l>2693-7166</issn-l>
<publisher>
<publisher-name>New England Statistical Society</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">NEJSDS52</article-id>
<article-id pub-id-type="doi">10.51387/23-NEJSDS52</article-id>
<article-categories>
<subj-group subj-group-type="heading"><subject>Methodology Article</subject></subj-group>
<subj-group subj-group-type="area"><subject>NextGen</subject></subj-group>
</article-categories>
<title-group>
<article-title>U.S. Mental Health Dashboard</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Arvelo</surname><given-names>Isabel</given-names></name><email xlink:href="mailto:ica1@williams.edu">ica1@williams.edu</email><xref ref-type="aff" rid="j_nejsds52_aff_001"/><xref ref-type="aff" rid="j_nejsds52_aff_002"/><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Plantinga</surname><given-names>Anna</given-names></name><email xlink:href="mailto:amp9@williams.edu">amp9@williams.edu</email><xref ref-type="aff" rid="j_nejsds52_aff_002"/>
</contrib>
<aff id="j_nejsds52_aff_001">Data Science Institute, <institution>Vanderbilt University</institution>, <country>United States</country>. E-mail address: <email xlink:href="mailto:ica1@williams.edu">ica1@williams.edu</email></aff>
<aff id="j_nejsds52_aff_002">Department of Mathematics and Statistics, <institution>Williams College</institution>, <country>United States</country>. E-mail address: <email xlink:href="mailto:amp9@williams.edu">amp9@williams.edu</email></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2024</year></pub-date><pub-date pub-type="epub"><day>1</day><month>12</month><year>2023</year></pub-date><volume>2</volume><issue>3</issue><fpage>323</fpage><lpage>329</lpage><history><date date-type="accepted"><day>25</day><month>10</month><year>2023</year></date></history>
<permissions><copyright-statement>© 2024 New England Statistical Society</copyright-statement><copyright-year>2024</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>In this paper, we present the U.S. Mental Health Dashboard, an R Shiny web application that facilitates exploratory data analysis of U.S. mental health data collected through national surveys. Mental health affects almost every aspect of people’s lives including their social relationships, substance use, academic success, professional productivity, and physical wellness. Even so, mental illnesses are often perceived as less legitimate or serious than physical diseases, and as a result of this stigmatization, many people suffer in silence without access to proper treatment. To address the lack of accessible healthcare information related to mental illness, the U.S. Mental Health Dashboard presents dynamic visualizations, tables, and choropleth maps of the prevalence and geographic distribution of key mental health metrics based on data from the National Survey on Drug Use and Health (NSDUH) and Behavioral Risk Factor Surveillance System (BRFSS). National and state-level estimates are provided for the civilian, non-institutionalized adult population of the United States as well as within relevant demographic subpopulations. By demonstrating the pervasiveness of mental illness and stark health inequities between demographic groups, this application aims to raise mental health awareness and reduce self-blame and stigmatization, especially for individuals who may inherently be at high risk. The U.S. Mental Health Dashboard has a wide variety of potential use cases: to illustrate to individuals suffering from mental illness and those close to them that they are not alone, to identify subpopulations with the greatest need for mental health care, and to help epidemiologists planning studies identify the target population for specific mental illness symptoms.</p>
</abstract>
<kwd-group>
<label>Keywords and phrases</label>
<kwd>Mental health</kwd>
<kwd>Data visualization</kwd>
<kwd>Health inequities</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="j_nejsds52_s_001">
<label>1</label>
<title>Introduction</title>
<p>It is fairly commonplace for people to go to the doctor or confide in their friends and family when they are feeling physically unwell. However, mental illnesses are much less likely to be discussed or treated. According to the Mental Health Million Project [<xref ref-type="bibr" rid="j_nejsds52_ref_012">12</xref>], 45% of individuals with clinical-level mental health challenges in the United States do not seek professional help. Stigma and lack of affordable treatment options are major obstacles that preclude people from discussing and seeking treatment for mental illness [<xref ref-type="bibr" rid="j_nejsds52_ref_006">6</xref>].</p>
<p>Existing web applications that visualize healthcare data rely on only one survey or are focused on physical illnesses such as cancer and heart disease [<xref ref-type="bibr" rid="j_nejsds52_ref_002">2</xref>]. Other more comprehensive platforms aimed at mental health are cumbersome to navigate or answer a small set of specific questions instead of providing a high-level view of the mental health landscape [<xref ref-type="bibr" rid="j_nejsds52_ref_011">11</xref>, <xref ref-type="bibr" rid="j_nejsds52_ref_002">2</xref>, <xref ref-type="bibr" rid="j_nejsds52_ref_010">10</xref>]. There is currently no easy-to-navigate, broad-scale data visualization web application for those who want to learn more about the prevalence of, and health disparities associated with, common symptoms of mental illness. Such an app would be useful in a wide range of scenarios: for example, it could help people suffering from mental illness and those who care for them know they are not alone, and it could provide an overview of the scale and distribution of mental illness for public health practitioners and epidemiologists.</p>
<p>To address this gap, we present the U.S. Mental Health Dashboard, an interactive web application for exploratory data analysis that aggregates mental health statistics from two national surveys. We use data from surveys run by the Department of Health and Human Services (DHHS) [<xref ref-type="bibr" rid="j_nejsds52_ref_013">13</xref>] and the Centers for Disease Control and Prevention (CDC) [<xref ref-type="bibr" rid="j_nejsds52_ref_003">3</xref>] to visualize various mental illnesses and key mental health metrics for adults across the United States. The data are integrated into an interactive web app that allows users to select response variables of interest, produce dynamic visualizations, tables, and choropleths for them, and compare results across different subpopulations.</p>
</sec>
<sec id="j_nejsds52_s_002" sec-type="methods">
<label>2</label>
<title>Methods</title>
<sec id="j_nejsds52_s_003">
<label>2.1</label>
<title>Contributing Datasets</title>
<sec id="j_nejsds52_s_004">
<label>2.1.1</label>
<title>National Survey on Drug Use and Health (2020)</title>
<p>The National Survey on Drug Use and Health (NSDUH) is a national study run by the DHHS that collects information on several health-related issues in the United States, including tobacco, alcohol, and drug use as well as mental health. The population of interest is the civilian, non-institutionalized population aged 12 or older at the time of the survey, i.e., excluding active military personnel, people living in institutional group quarters, and homeless individuals. Our analysis only uses data on adults aged 18 and older from the 2020 survey (27,170 observations).</p>
<p>Participants are selected for the survey using an independent, multistage area probability sample within each state and the District of Columbia. Because geographic identifiers such as state are not included in the 2020 public use file, it is only possible to make national estimates. The weights in this file are approximately the inverse probability of selection for each record.</p>
<p>The response variables of interest from this survey include level of psychological distress over the past 30 days, worst psychological distress over the past 30 days, predicted probability of serious mental illness, and mental illness category, as well as indicators for serious psychological distress in the past month, serious suicidal thoughts, suicidal plans, suicide attempt, lifetime major depressive episode, and major depressive episode in the past year. Available demographic and social characteristics include gender, educational status, marital status, age, race, employment status, income, and poverty level.</p>
<p>Due to the COVID-19 pandemic, Quarter 4 of 2020 was the first time that the NSDUH utilized web-based interviewing. However, a concerningly high number of adults provided usable information on their substance use but did not complete the mental health or later questions (i.e., “break-offs”). The DHHS created additional analysis weights to analyze the unimputed outcomes starting from the mental health and subsequent sections of the questionnaire to account for respondents who broke off the interview before completing these sections. Unfortunately, the public use file does not contain these “break-off analysis weights,” so estimates from 2020 should not be compared longitudinally to other years of survey data.</p>
</sec>
<sec id="j_nejsds52_s_005">
<label>2.1.2</label>
<title>BRFSS: Behavioral Risk Factor Surveillance System (2021)</title>
<p>The Behavioral Risk Factor Surveillance System (BRFSS) is a nationwide system of health-related telephone surveys, run by the CDC, that provides state-level information about health-related risk behaviors, chronic health conditions, and the use of preventive services among U.S. residents. The population of interest is the noninstitutionalized adult population (18 years or older) residing in private residences or college housing in the United States or participating areas who have a working cellular telephone. Participants are selected through an overlapping, dual-frame landline and cell phone sample. The BRFSS samples high- and medium-density strata to obtain a probability sample of all households with telephones. State health departments may directly collect data from their state residents, or they may use a contractor. Person-level analysis weights involve two components: design-based weights and weight adjustment factors. Design weights reflect probabilities of selection at the sample stage. Weight adjustments are performed using the generalized exponential model (GEM) developed by Folsom and Singh [<xref ref-type="bibr" rid="j_nejsds52_ref_007">7</xref>], which calibrates the design-based weights to reduce non-response bias, poststratifies to known population control totals, and controls for extreme weights when necessary.</p>
<p>The response variables of interest from this survey include an indicator for whether individuals were ever told they had a depressive disorder and a self-assessment of general health. Relevant demographic and social variables include health insurance status, age, education, and income.</p>
</sec>
</sec>
<sec id="j_nejsds52_s_006">
<label>2.2</label>
<title>R Packages Used</title>
<sec id="j_nejsds52_s_007">
<label>2.2.1</label>
<title>survey</title>
<p>The main package we used to specify complex survey designs and produce design-based summary statistics was <italic>survey: Analysis of Complex Survey Samples</italic> [<xref ref-type="bibr" rid="j_nejsds52_ref_009">9</xref>], written by Thomas Lumley. This package creates <italic>svydesign</italic> objects that bundle the data with the design information, so that the design information cannot be separated from the data or used with the wrong data.</p>
<p>The fundamental concept that underlies design-based inference is that an individual sampled with sampling probability <inline-formula id="j_nejsds52_ineq_001"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\pi _{i}}$]]></tex-math></alternatives></inline-formula> represents <inline-formula id="j_nejsds52_ineq_002"><alternatives><mml:math><mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mstyle></mml:math><tex-math><![CDATA[$\frac{1}{{\pi _{i}}}$]]></tex-math></alternatives></inline-formula> individuals in the population. <inline-formula id="j_nejsds52_ineq_003"><alternatives><mml:math><mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mstyle></mml:math><tex-math><![CDATA[$\frac{1}{{\pi _{i}}}$]]></tex-math></alternatives></inline-formula> is referred to as the sampling weight. All of the analyses run with this package use the Horvitz-Thompson estimator to estimate the population total [<xref ref-type="bibr" rid="j_nejsds52_ref_008">8</xref>]. For a sample of size <italic>n</italic>, the Horvitz-Thompson estimator <inline-formula id="j_nejsds52_ineq_004"><alternatives><mml:math><mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">X</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover></mml:math><tex-math><![CDATA[$\hat{{T_{X}}}$]]></tex-math></alternatives></inline-formula> for the population total of <italic>X</italic> is 
<disp-formula id="j_nejsds52_eq_001">
<label>(2.1)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">X</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
<mml:mo>=</mml:mo>
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
</mml:munderover><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
</mml:munderover><mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ \hat{{T_{X}}}={\sum \limits_{i=1}^{n}}\frac{1}{{\pi _{i}}}{X_{i}}={\sum \limits_{i=1}^{n}}\hat{{X_{i}}}.\]]]></tex-math></alternatives>
</disp-formula> 
The variance estimate is 
<disp-formula id="j_nejsds52_eq_002">
<label>(2.2)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:mover accent="false">
<mml:mrow>
<mml:mi mathvariant="italic">v</mml:mi>
<mml:mi mathvariant="italic">a</mml:mi>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo><mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">X</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
<mml:mo fence="true" stretchy="false">]</mml:mo>
</mml:mrow>
<mml:mo stretchy="true">ˆ</mml:mo></mml:mover>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:munder><mml:mo fence="true" stretchy="true">(</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo>−</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mstyle><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo fence="true" stretchy="true">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ \widehat{var[\hat{{T_{X}}}]}=\sum \limits_{i,j}\left(\frac{{X_{i}}{X_{j}}}{{\pi _{ij}}}-\frac{{X_{i}}}{{\pi _{i}}}\frac{{X_{j}}}{{\pi _{j}}}\right),\]]]></tex-math></alternatives>
</disp-formula> 
which, for a simple random sample of size <italic>n</italic>, simplifies to 
<disp-formula id="j_nejsds52_eq_003">
<label>(2.3)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:mover accent="false">
<mml:mrow>
<mml:mi mathvariant="italic">v</mml:mi>
<mml:mi mathvariant="italic">a</mml:mi>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo><mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">X</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
<mml:mo fence="true" stretchy="false">]</mml:mo>
</mml:mrow>
<mml:mo stretchy="true">ˆ</mml:mo></mml:mover>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo>×</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>×</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">v</mml:mi>
<mml:mi mathvariant="italic">a</mml:mi>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ \widehat{var[\hat{{T_{X}}}]}=\frac{N-n}{N}\times {N^{2}}\times \frac{var[X]}{n},\]]]></tex-math></alternatives>
</disp-formula> 
for a population of size <italic>N</italic>. The svymean() function used to produce mean estimates in the app estimates the population mean by dividing the estimated total by the population size <italic>N</italic>. The variance of the estimated mean is obtained by dividing the variance estimate for the total by <inline-formula id="j_nejsds52_ineq_005"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${N^{2}}$]]></tex-math></alternatives></inline-formula>. The Horvitz-Thompson estimator of the population size is 
<disp-formula id="j_nejsds52_eq_004">
<label>(2.4)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
</mml:munderover><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ {\sum \limits_{i=1}^{n}}\frac{1}{{\pi _{i}}}.\]]]></tex-math></alternatives>
</disp-formula>
</p>
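<p>As an illustration of Equations (<xref rid="j_nejsds52_eq_001">2.1</xref>), (<xref rid="j_nejsds52_eq_003">2.3</xref>), and (<xref rid="j_nejsds52_eq_004">2.4</xref>), the following minimal Python sketch computes the estimators by hand. It is not the code used in the app, which relies on the <italic>survey</italic> package in R. The function names and toy data are hypothetical, and the variance function assumes a simple random sample without replacement, the setting in which the closed form in Equation (2.3) applies.</p>
<preformat preformat-type="code"><![CDATA[
```python
from statistics import variance


def ht_total(values, probs):
    """Horvitz-Thompson estimate of a population total (Equation 2.1)."""
    return sum(x / p for x, p in zip(values, probs))


def ht_population_size(probs):
    """Estimated population size: the sum of the sampling weights 1/pi_i (Equation 2.4)."""
    return sum(1 / p for p in probs)


def srs_total_variance(values, N):
    """Variance of the estimated total under simple random sampling (Equation 2.3)."""
    n = len(values)
    return (N - n) / N * N**2 * variance(values) / n


# Hypothetical toy data: a simple random sample of n = 4 from a population of N = 16,
# so every unit has sampling probability pi_i = n / N = 0.25 (weight 4).
N = 16
values = [2.0, 4.0, 6.0, 8.0]
probs = [len(values) / N] * len(values)

total = ht_total(values, probs)          # estimated population total
size = ht_population_size(probs)         # estimated population size
mean = total / size                      # estimated population mean
var_total = srs_total_variance(values, N)
var_mean = var_total / N**2              # variance of the estimated mean

print(total, size, mean, var_total, var_mean)
```
]]></preformat>
<p>In this toy example each sampled unit stands for four population units, so the estimated population size recovers <italic>N</italic> and the estimated mean equals the ordinary sample mean. With unequal sampling probabilities the two would generally differ.</p>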
<p>For estimates in a subpopulation, the survey package handles the computational details of domain estimation and sets the sampling weights to 0 for observations outside of the subpopulation.</p>
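<p>The zero-weight idea can be sketched in a few lines of Python. This is only an illustration of the point estimate under assumed toy data, not the internal implementation of the <italic>survey</italic> package; the function name, the scores, the weights, and the employment indicator are all hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def domain_mean(values, weights, in_domain):
    """Weighted mean within a subpopulation, keeping every record in the sample."""
    # Records outside the domain get weight 0 instead of being dropped.
    domain_weights = [w if keep else 0.0 for w, keep in zip(weights, in_domain)]
    total = sum(w * x for w, x in zip(domain_weights, values))
    size = sum(domain_weights)  # estimated number of people in the domain
    return total / size, size


# Hypothetical toy data: four respondents with sampling weights
scores = [3.0, 9.0, 5.0, 7.0]
weights = [100.0, 300.0, 200.0, 400.0]
employed = [True, False, True, True]

mean_employed, n_employed = domain_mean(scores, weights, employed)
print(mean_employed, n_employed)
```
]]></preformat>
<p>Keeping the out-of-domain records with zero weight, rather than deleting them, leaves the sample design intact, which matters when standard errors are computed for the subpopulation.</p>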
</sec>
<sec id="j_nejsds52_s_008">
<label>2.2.2</label>
<title>ggsurvey</title>
<p>The visualizations on the National tab are implemented with the <italic>ggsurvey</italic> package [<xref ref-type="bibr" rid="j_nejsds52_ref_001">1</xref>], which simplifies ggplot2 functions for svydesign objects. The package’s functions call ggplot2 to make bar charts, histograms, boxplots, and hexplots of survey objects that accurately represent the weighted sample distributions.</p>
</sec>
<sec id="j_nejsds52_s_009">
<label>2.2.3</label>
<title>R shiny</title>
<p>The interactive app is implemented using the <italic>shiny</italic> package [<xref ref-type="bibr" rid="j_nejsds52_ref_004">4</xref>] and deployed on shinyapps.io, where it runs in its own protected environment with SSL-encrypted access.</p>
<fig id="j_nejsds52_fig_001">
<label>Figure 1</label>
<caption>
<p>National Dashboard Visualization: Showing distribution of psychological distress within the past month stratified by employment status based on NSDUH survey.</p>
</caption>
<graphic xlink:href="nejsds52_g001.jpg"/>
</fig>
</sec>
</sec>
</sec>
<sec id="j_nejsds52_s_010">
<label>3</label>
<title>Results</title>
<p>The U.S. Mental Health Dashboard, found at <uri>https://50lulw-isabel-arvelo.shinyapps.io/USMHD/</uri>, has two major tabs: the <italic>National</italic> tab with national-level estimates from the NSDUH survey, and the <italic>State</italic> tab with state-level estimates from the BRFSS survey. Each tab includes a visualization of the response variable of interest as well as a table view to examine specific estimates with more precision.</p>
<sec id="j_nejsds52_s_011">
<label>3.1</label>
<title>Major Functions in Shiny App</title>
<sec id="j_nejsds52_s_012">
<label>3.1.1</label>
<title>National Level Boxplot, Bar Chart, and Histogram</title>
<p>We begin by visualizing national-level statistics from the NSDUH data set through boxplots and histograms for continuous response variables and bar charts for categorical responses. The visualizations pipe the current input values into the ggsurvey functions to dynamically render the plots in response to the user’s selections. When the <italic>log transform Y</italic> radio button is selected, the values remain on the original scale for interpretability, but the breaks on the y-axis of the boxplot and the x-axis of the histogram are spaced on a log scale. This allows the user to analyze and compare the centers and spreads of the skewed distributions with greater precision.</p>
<p>Figure <xref rid="j_nejsds52_fig_001">1</xref> shows the distribution of psychological distress within the past month without the log transformation.</p>
</sec>
<sec id="j_nejsds52_s_013">
<label>3.1.2</label>
<title>National Level Summary Table</title>
<p>The table of national-level summaries appears on the National tab and includes the estimated population mean, standard error, and confidence interval for the population distributions represented in the visualizations. We first compute the estimated survey mean, standard error, and confidence interval for the response using the Horvitz-Thompson estimators (see Equations (<xref rid="j_nejsds52_eq_001">2.1</xref>) and (<xref rid="j_nejsds52_eq_002">2.2</xref>)) as calculated in the <italic>survey</italic> package. The total number of individuals represented in the estimated population for each subgroup is calculated using Equation (<xref rid="j_nejsds52_eq_004">2.4</xref>), i.e., by adding the sampling weights of the individuals in the survey who fall into each category. The formatted data table may be sorted by any of its columns, allowing users to order the estimates.</p>
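<p>The following Python sketch shows how rows of such a table could be assembled from an estimate and its standard error. It assumes a normal-approximation (Wald) interval, which may or may not match the interval options used in the app, and the group labels, estimates, standard errors, and population counts are hypothetical values chosen only for illustration.</p>
<preformat preformat-type="code"><![CDATA[
```python
from statistics import NormalDist


def wald_interval(estimate, std_error, level=0.95):
    """Normal-approximation confidence interval: estimate +/- z * standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * std_error, estimate + z * std_error


# Hypothetical summary rows: one per subgroup
rows = [
    {"group": "A", "mean": 5.2, "se": 0.30, "population": 1200.0},
    {"group": "B", "mean": 6.8, "se": 0.50, "population": 800.0},
    {"group": "C", "mean": 4.1, "se": 0.20, "population": 2000.0},
]

# Attach a 95% interval to every row
for row in rows:
    row["ci_low"], row["ci_high"] = wald_interval(row["mean"], row["se"])

# Sorting by a column, as a user might do in the table view
rows_sorted = sorted(rows, key=lambda r: r["mean"], reverse=True)

for row in rows_sorted:
    print(row["group"], row["mean"], round(row["ci_low"], 3), round(row["ci_high"], 3))
```
]]></preformat>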
<p>Figure <xref rid="j_nejsds52_fig_002">2</xref> shows the table view of the national distribution of the categorical mental illness variable by gender.</p>
<fig id="j_nejsds52_fig_002">
<label>Figure 2</label>
<caption>
<p>National Dashboard Table: Showing distribution of mental illness category stratified by marital status based on NSDUH survey.</p>
</caption>
<graphic xlink:href="nejsds52_g002.jpg"/>
</fig>
</sec>
<sec id="j_nejsds52_s_014">
<label>3.1.3</label>
<title>State-Level Choropleths</title>
<p>The state-level choropleths visually represent the geographic distribution of response variables of interest in the BRFSS across levels of social factors. To speed up rendering, we preprocessed the BRFSS survey object to produce lists of tibbles that represent the distribution of the General Health and Depressive Diagnosis variables across health insurance status, education, age, and race (only for General Health). These tibbles are then merged with a simplified shapefile from the <italic>tigris</italic> [<xref ref-type="bibr" rid="j_nejsds52_ref_014">14</xref>] package that covers the primary governmental divisions of the United States: the 50 states, the District of Columbia, Puerto Rico, American Samoa, the Commonwealth of the Northern Mariana Islands, Guam, and the U.S. Virgin Islands.</p>
<p>Figure <xref rid="j_nejsds52_fig_003">3</xref> shows the geographic distribution of depressive disorder prevalence by health insurance status.</p>
<fig id="j_nejsds52_fig_003">
<label>Figure 3</label>
<caption>
<p>State-level Choropleth: Showing geographic distribution of depressive diagnosis indicator stratified by health insurance status based on BRFSS survey.</p>
</caption>
<graphic xlink:href="nejsds52_g003.jpg"/>
</fig>
<p>After the variables are specified, a function is applied over the corresponding tibble to render a choropleth with <italic>leaflet</italic> [<xref ref-type="bibr" rid="j_nejsds52_ref_005">5</xref>], an R interface to the Leaflet JavaScript library for building interactive web maps. Each state on the map has hover text that includes the state name, the estimated mean of the selected response variable, and the estimated standard error.</p>
</sec>
<sec id="j_nejsds52_s_015">
<label>3.1.4</label>
<title>Table of State-Level Summaries</title>
<p>The state table organizes the data visualized in the choropleths into rows, one per state, that give the estimated mean of the selected response variable at each level of the selected demographic variable (in columns). We combined the reactive list of tibbles produced by the user’s input selections, merging them by variable name with a binary function, and then spread the rows by the demographic variable so that the table shows the state-level mean estimate of the response variable for every state at every level of the demographic variable.</p>
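<p>A loose Python analogue of this combine-then-spread step is sketched below. The app itself is written in R, so this is not the authors’ code: the function names and records are hypothetical, and the binary function used here (list concatenation) is an assumption about the kind of function meant.</p>
<preformat preformat-type="code"><![CDATA[
```python
from functools import reduce


def stack(left, right):
    """Binary function that combines two long-format tables into one."""
    return left + right


def spread(rows):
    """Pivot long rows into one entry per state with one column per level."""
    wide = {}
    for row in rows:
        wide.setdefault(row["state"], {})[row["level"]] = row["mean"]
    return wide


# Hypothetical per-level tables of state means (long format)
insured = [
    {"state": "State A", "level": "Insured", "mean": 0.20},
    {"state": "State B", "level": "Insured", "mean": 0.25},
]
uninsured = [
    {"state": "State A", "level": "Uninsured", "mean": 0.30},
    {"state": "State B", "level": "Uninsured", "mean": 0.35},
]

# Fold the list of tables together pairwise, then pivot to wide format
long_rows = reduce(stack, [insured, uninsured])
wide = spread(long_rows)

for state, levels in wide.items():
    print(state, levels)
```
]]></preformat>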
<p>Figure <xref rid="j_nejsds52_fig_004">4</xref> shows the table view of the state-level estimates of the prevalence of depressive diagnoses as well as the marginal distribution by health insurance status.</p>
<fig id="j_nejsds52_fig_004">
<label>Figure 4</label>
<caption>
<p>State-level Table: Showing table view of the distribution of depressive diagnosis indicator stratified by health insurance status based on BRFSS survey.</p>
</caption>
<graphic xlink:href="nejsds52_g004.jpg"/>
</fig>
</sec>
</sec>
</sec>
<sec id="j_nejsds52_s_016">
<label>4</label>
<title>Discussion</title>
<p>In this paper, we have presented an interactive web app for exploratory analysis of public data on the prevalence of mental health disorders and symptoms in the U.S. The mental health crisis adversely affects the quality of life in communities across the United States. The infrastructure and capacity to support individuals struggling with mental illness are lacking, especially in minoritized communities with higher mental health burdens due to systemically unjust policies and practices. Identifying and visualizing which communities and marginalized identities have the highest need for mental health care can help public health professionals allocate resources appropriately and promote solidarity within those groups.</p>
<p>A web app is a useful tool in this scenario because raising general mental health awareness requires reaching a wide audience with varied backgrounds and levels of comfort with quantitative information. Large national surveys such as the NSDUH and the BRFSS include a vast quantity of mental health, social, and demographic data that could be used to answer many different questions, but they require complex survey weighting and analysis approaches that are not accessible to the general public. Instead of trying to anticipate what other stakeholders are interested in learning about, the flexibility of the app allows users to explore specific questions and focus on response variables or demographic factors that are relevant to them. Data visualization gives individuals a clear idea of how to make sense of the information by providing a visual context that makes it easier for audiences to identify and understand trends and patterns.</p>
<p>One of the main aims of this app is to raise awareness and reduce the stigma surrounding mental illness. These figures and visualizations can show those suffering that they are not alone and may ease the feelings of self-blame associated with something that often feels uncomfortable or shameful to talk about. By illustrating the pervasiveness of psychological distress, particularly for minoritized populations, we hope to encourage those suffering to seek help and find a support network. A second aim is for this app to be among the tools referenced by public health practitioners and epidemiological researchers as they design studies or interventions. High-quality mental health services are unevenly accessible in the United States. The supply of psychiatric residential facility beds, inpatient and outpatient services, and mental health providers is not adequate to meet the demand, especially in the communities that need it most. The U.S. Mental Health Dashboard quantifies and visualizes significant differences in prevalence between populations to help public health officials prioritize where and how they invest resources.</p>
<p>In the future, the scope of the U.S. Mental Health Dashboard could be extended by introducing new datasets, such as the DHHS Mental Health Client-Level Data and the CDC National Health Interview Survey, for a more comprehensive and robust representation of the prevalence of mental health disorders in the United States. These would allow for the estimation of specific diagnoses among people in mental health treatment facilities and provide more metrics to capture and triangulate the prevalence of anxiety and depression. The app could also be expanded to include available treatment options and healthcare coverage, using the National Mental Health Services Survey and Medicare/Medicaid payment and access data, to identify the greatest need-to-care gaps so that public health professionals and government officials can prioritize investing resources in these communities. To investigate how structural racism manifests in contemporary health inequities typically assumed to have primarily biological or cultural causes, we could compare the geographic distribution of mental illness prevalence across different subpopulations to historical geospatial data on redlining.</p>
</sec>
<sec id="j_nejsds52_s_017">
<label>5</label>
<title>Conclusion</title>
<p>Common misconceptions have caused diseases such as major depressive disorder to be perceived as an indication of personal weakness or fault, and deep-rooted stigmatization and a lack of accessible treatments deter individuals from seeking help. However, the ubiquitous feelings of fear and uncertainty during and since the COVID-19 pandemic have left an impression on everyone’s mental well-being. The shared experiences of isolation and trauma associated with a global pandemic have promoted a unique form of solidarity that public health professionals can capitalize on to change how society cares for people suffering from mental health challenges and disorders. Promoting dialogue about these issues is an important start, but awareness must be accompanied by action to turn this opportunity into real change. By providing usable data to policymakers, healthcare systems, and healthcare providers, we hope to help guide efforts to improve the mental health and well-being of people and populations.</p>
</sec>
</body>
<back>
<ack id="j_nejsds52_ack_001">
<title>Acknowledgements</title>
<p>The data and code for the app can be found at <uri>https://github.com/isabelarvelo/U.S.-Mental-Health-Dashboard</uri>.</p></ack>
<ref-list id="j_nejsds52_reflist_001">
<title>References</title>
<ref id="j_nejsds52_ref_001">
<label>[1]</label><mixed-citation publication-type="other"><string-name><surname>Alexander</surname>, <given-names>B.</given-names></string-name> (2022). ggsurvey: Simplifying ‘ggplot2’ for Survey Data. R package version 1.0.0. <uri>https://CRAN.R-project.org/package=ggsurvey</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds52_ref_002">
<label>[2]</label><mixed-citation publication-type="other"><sc>CDC-GIS Centers for Disease Control and Prevention</sc> (2018). Online Public Health Maps. <uri>https://www.cdc.gov/gis/public-health-maps.htm</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds52_ref_003">
<label>[3]</label><mixed-citation publication-type="other"><sc>Centers for Disease Control and Prevention (CDC)</sc> (2021). <italic>Behavioral Risk Factor Surveillance System Survey Data</italic>.</mixed-citation>
</ref>
<ref id="j_nejsds52_ref_004">
<label>[4]</label><mixed-citation publication-type="other"><string-name><surname>Chang</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Cheng</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Allaire</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Sievert</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Schloerke</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>Xie</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Allen</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>McPherson</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Dipert</surname>, <given-names>A.</given-names></string-name> and <string-name><surname>Borges</surname>, <given-names>B.</given-names></string-name> (2022). shiny: Web Application Framework for R. R package version 1.7.3. <uri>https://CRAN.R-project.org/package=shiny</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds52_ref_005">
<label>[5]</label><mixed-citation publication-type="other"><string-name><surname>Cheng</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Karambelkar</surname>, <given-names>B.</given-names></string-name> and <string-name><surname>Xie</surname>, <given-names>Y.</given-names></string-name> (2022). leaflet: Create Interactive Web Maps with the JavaScript ‘Leaflet’ Library. R package version 2.1.1. <uri>https://CRAN.R-project.org/package=leaflet</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds52_ref_006">
<label>[6]</label><mixed-citation publication-type="journal"><string-name><surname>Coombs</surname>, <given-names>N. C.</given-names></string-name>, <string-name><surname>Meriwether</surname>, <given-names>W. E.</given-names></string-name>, <string-name><surname>Caringi</surname>, <given-names>J.</given-names></string-name> and <string-name><surname>Newcomer</surname>, <given-names>S. R.</given-names></string-name> (<year>2021</year>). <article-title>Barriers to healthcare access among US adults with mental health challenges: A population-based study</article-title>. <source>SSM-Population Health</source> <volume>15</volume> <fpage>100847</fpage>.</mixed-citation>
</ref>
<ref id="j_nejsds52_ref_007">
<label>[7]</label><mixed-citation publication-type="chapter"><string-name><surname>Folsom</surname>, <given-names>R. E.</given-names></string-name> and <string-name><surname>Singh</surname>, <given-names>A. C.</given-names></string-name> (<year>2000</year>). <chapter-title>The generalized exponential model for sampling weight calibration for extreme values, nonresponse, and poststratification</chapter-title>. In <source>Proceedings of the American Statistical Association, Survey Research Methods Section</source> <fpage>598</fpage>–<lpage>603</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds52_ref_008">
<label>[8]</label><mixed-citation publication-type="journal"><string-name><surname>Lumley</surname>, <given-names>T.</given-names></string-name> (<year>2004</year>). <article-title>Analysis of complex survey samples</article-title>. <source>Journal of Statistical Software</source> <volume>9</volume> <fpage>1</fpage>–<lpage>19</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds52_ref_009">
<label>[9]</label><mixed-citation publication-type="other"><string-name><surname>Lumley</surname>, <given-names>T.</given-names></string-name> (2020). <italic>survey: analysis of complex survey samples</italic>. R package version 4.0.</mixed-citation>
</ref>
<ref id="j_nejsds52_ref_010">
<label>[10]</label><mixed-citation publication-type="other"><sc>Mental Health America</sc> (2022). <italic>County and State Data Map: Defining Mental Health Across Communities</italic>. Accessed: 2022-09-29.</mixed-citation>
</ref>
<ref id="j_nejsds52_ref_011">
<label>[11]</label><mixed-citation publication-type="other"><sc>MHTTC</sc> (2022). Southeast MHTTC Data Visualization Resources. Accessed: 2022-09-27.</mixed-citation>
</ref>
<ref id="j_nejsds52_ref_012">
<label>[12]</label><mixed-citation publication-type="other"><string-name><surname>Newson</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Sukhoi</surname>, <given-names>O.</given-names></string-name>, <string-name><surname>Taylor</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Topalo</surname>, <given-names>O.</given-names></string-name> and <string-name><surname>Thiagarajan</surname>, <given-names>T.</given-names></string-name> (March 2021). Mental State of the World 2021. Technical Report, Sapien Labs.</mixed-citation>
</ref>
<ref id="j_nejsds52_ref_013">
<label>[13]</label><mixed-citation publication-type="other"><sc>Substance Abuse and Mental Health Services Administration (SAMHSA)</sc> (2020). National Survey on Drug Use and Health 2020.</mixed-citation>
</ref>
<ref id="j_nejsds52_ref_014">
<label>[14]</label><mixed-citation publication-type="other"><string-name><surname>Walker</surname>, <given-names>K.</given-names></string-name> (2022). tigris: Load Census TIGER/Line Shapefiles. R package version 1.6.1. <uri>https://CRAN.R-project.org/package=tigris</uri>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
