<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">NEJSDS</journal-id>
<journal-title-group><journal-title>The New England Journal of Statistics in Data Science</journal-title></journal-title-group>
<issn pub-type="ppub">2693-7166</issn><issn-l>2693-7166</issn-l>
<publisher>
<publisher-name>New England Statistical Society</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">NEJSDS54</article-id>
<article-id pub-id-type="doi">10.51387/23-NEJSDS54</article-id>
<article-categories>
<subj-group subj-group-type="heading"><subject>Methodology Article</subject></subj-group>
<subj-group subj-group-type="area"><subject>Statistical Methodology</subject></subj-group>
</article-categories>
<title-group>
<article-title>AUGUST: An Interpretable, Resolution-based Two-sample Test</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Brown</surname><given-names>Benjamin</given-names></name><email xlink:href="mailto:brownb1@live.unc.edu">brownb1@live.unc.edu</email><xref ref-type="aff" rid="j_nejsds54_aff_001"/><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Zhang</surname><given-names>Kai</given-names></name><email xlink:href="mailto:zhangk@email.unc.edu">zhangk@email.unc.edu</email><xref ref-type="aff" rid="j_nejsds54_aff_002"/>
</contrib>
<aff id="j_nejsds54_aff_001">Chapel Hill, North Carolina, Department of Statistics and Operations Research, <institution>University of North Carolina at Chapel Hill</institution>, <country>USA</country>. E-mail address: <email xlink:href="mailto:brownb1@live.unc.edu">brownb1@live.unc.edu</email></aff>
<aff id="j_nejsds54_aff_002">Chapel Hill, North Carolina, Department of Statistics and Operations Research, <institution>University of North Carolina at Chapel Hill</institution>, <country>USA</country>. E-mail address: <email xlink:href="mailto:zhangk@email.unc.edu">zhangk@email.unc.edu</email></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2024</year></pub-date><pub-date pub-type="epub"><day>15</day><month>12</month><year>2023</year></pub-date><volume>2</volume><issue>3</issue><fpage>357</fpage><lpage>367</lpage><supplementary-material id="S1" content-type="document" xlink:href="nejsds54_s001.pdf" mimetype="application" mime-subtype="pdf">
<caption>
<title>Supplementary Material</title>
<p>Supplementary material for AUGUST.</p>
</caption>
</supplementary-material><history><date date-type="accepted"><day>4</day><month>9</month><year>2023</year></date></history>
<permissions><copyright-statement>© 2024 New England Statistical Society</copyright-statement><copyright-year>2024</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Two-sample testing is a fundamental problem in statistics. While many powerful nonparametric methods exist for both the univariate and multivariate contexts, it is comparatively uncommon to find a framework for determining which data features lead to rejection of the null. In this paper, we propose a new nonparametric two-sample test named AUGUST, which incorporates a framework for interpretation while maintaining power comparable to existing methods. AUGUST tests for inequality in distribution up to a predetermined resolution using symmetry statistics from binary expansion. Designed for univariate and low to moderate-dimensional multivariate data, this construction allows us to understand distributional differences as a combination of fundamental orthogonal signals. Asymptotic theory for the test statistic facilitates p-value computation and power analysis, and an efficient algorithm enables computation on large data sets. In empirical studies, we show that our test has power comparable to that of popular existing methods, as well as greater power in some circumstances. We illustrate the interpretability of our method using NBA shooting data.</p>
</abstract>
<kwd-group>
<label>Keywords and phrases</label>
<kwd>Distributional difference</kwd>
<kwd>Interpretability</kwd>
<kwd>Power</kwd>
<kwd>Symmetry</kwd>
<kwd>Visualization</kwd>
</kwd-group>
<funding-group><award-group><funding-source xlink:href="https://doi.org/10.13039/100000001">NSF</funding-source><award-id>DMS-1613112</award-id><award-id>IIS-1633212</award-id><award-id>DMS-1916237</award-id><award-id>DMS-2152289</award-id></award-group><funding-statement>This research is partially supported by NSF grants DMS-1613112, IIS-1633212, DMS-1916237, and DMS-2152289.</funding-statement></funding-group>
</article-meta>
</front>
<body>
<sec id="j_nejsds54_s_001">
<label>1</label>
<title>Introduction</title>
<sec id="j_nejsds54_s_002">
<label>1.1</label>
<title>Addressing the Two-sample Testing Problem</title>
<p>Two-sample tests are among the most frequently used methods for statistical inference. While rooted in classical statistics, the two-sample problem is relevant to numerous cutting-edge applications, including high-energy physics [<xref ref-type="bibr" rid="j_nejsds54_ref_011">11</xref>], computer vision [<xref ref-type="bibr" rid="j_nejsds54_ref_014">14</xref>], and genome-wide expression analysis [<xref ref-type="bibr" rid="j_nejsds54_ref_036">36</xref>].</p>
<p>We begin with two samples <inline-formula id="j_nejsds54_ineq_001"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_002"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>, which may be either univariate or multivariate. In the nonparametric setting, we make minimal assumptions regarding the distributions <italic>F</italic> and <italic>G</italic> used to generate <inline-formula id="j_nejsds54_ineq_003"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_004"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>, as we test the null hypothesis <inline-formula id="j_nejsds54_ineq_005"><alternatives><mml:math>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi></mml:math><tex-math><![CDATA[$F=G$]]></tex-math></alternatives></inline-formula>. In Section <xref rid="j_nejsds54_s_003">1.2</xref>, we discuss the landscape of existing methods.</p>
<p>While we face no shortage of effective two-sample tests, certain factors may hinder their real-world applicability. For one, we find that some nonparametric tests have less even power than others across the range of potential alternatives. For example, a method may excel at detecting location and scale shifts but struggle to catch bimodality when mean and variance are held constant. We explore this phenomenon in Section <xref rid="j_nejsds54_s_013">5</xref>, showing that the relative performance of well-known univariate tests at detecting a location shift can be reversed by a suitable choice of distribution family. This is unintuitive, as one might expect power against location alternatives to be nearly independent of family.</p>
<p>Furthermore, many tests offer non-transparent rejections of the null hypothesis. While data visualizations and summary statistics offer some degree of explanation for a test result, such analyses do not quantify the contribution of various data features to the rejection. For multivariate tests, this problem is compounded, as distributional alternatives may easily exceed human intuition in higher dimensions.</p>
<p>Here, we formulate a new nonparametric two-sample test called AUGUST, an abbreviation of Augmented CDF for Uniform Statistic Transformation. Our method explicitly tests for multiple orthogonal sources of distributional inequality up to a predetermined resolution <italic>d</italic>, giving it power against a wide range of alternatives. Upon rejection of the null, both resolution control and decomposition into orthogonal signals allow for interpretation of how equality in distribution between <inline-formula id="j_nejsds54_ineq_006"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_007"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula> has failed. To promote ease of use, we provide asymptotic theory as well as algorithmic optimizations.</p>
</sec>
<sec id="j_nejsds54_s_003">
<label>1.2</label>
<title>Relatives and Further Reading</title>
<p>Some well-known rank-based tests are designed for the univariate context, including [<xref ref-type="bibr" rid="j_nejsds54_ref_013">13</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_027">27</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_033">33</xref>]. Other approaches explicitly refer to a distance between the empirical cumulative distribution functions of <inline-formula id="j_nejsds54_ineq_008"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_009"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>. For instance, [<xref ref-type="bibr" rid="j_nejsds54_ref_001">1</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_012">12</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_015">15</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_026">26</xref>] are all widely known. The recent test of [<xref ref-type="bibr" rid="j_nejsds54_ref_016">16</xref>] somewhat combines [<xref ref-type="bibr" rid="j_nejsds54_ref_001">1</xref>] and [<xref ref-type="bibr" rid="j_nejsds54_ref_015">15</xref>].</p>
<p>For nonparametric multivariate methods, the range of approaches is quite broad. Tests based on geometric graphs, including [<xref ref-type="bibr" rid="j_nejsds54_ref_009">9</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_010">10</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_018">18</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_039">39</xref>], have had considerable success [<xref ref-type="bibr" rid="j_nejsds54_ref_005">5</xref>]. Ball divergence [<xref ref-type="bibr" rid="j_nejsds54_ref_003">3</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_036">36</xref>] and energy distance [<xref ref-type="bibr" rid="j_nejsds54_ref_002">2</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_042">42</xref>] are also popular names. Among the family of kernel-based tests are [<xref ref-type="bibr" rid="j_nejsds54_ref_011">11</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_020">20</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_021">21</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_022">22</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_025">25</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_041">41</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_044">44</xref>], while [<xref ref-type="bibr" rid="j_nejsds54_ref_004">4</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_024">24</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_035">35</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_040">40</xref>] use generalized ranks. Additional recent work includes [<xref ref-type="bibr" rid="j_nejsds54_ref_006">6</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_007">7</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_028">28</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_031">31</xref>].</p>
<p>As for interpretable methods, one line of work [<xref ref-type="bibr" rid="j_nejsds54_ref_025">25</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_044">44</xref>] proposes feature selection in the framework of maximum mean discrepancy [<xref ref-type="bibr" rid="j_nejsds54_ref_021">21</xref>]. Similar in principle is [<xref ref-type="bibr" rid="j_nejsds54_ref_034">34</xref>]. The test of [<xref ref-type="bibr" rid="j_nejsds54_ref_031">31</xref>] inherits the interpretability of the classifier on which the test is based. These methods provide a global type of interpretability, compared to the study of local significant differences [<xref ref-type="bibr" rid="j_nejsds54_ref_017">17</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_023">23</xref>]. To put AUGUST in context, we propose that our method is more geometrical than feature-selecting tests, but more global than local significant difference methods.</p>
<p>Regarding methodological relatives, the use of subsampling has been considered for the two-sample location and scale problems [<xref ref-type="bibr" rid="j_nejsds54_ref_032">32</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_037">37</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_038">38</xref>]. In addition, our method builds on the binary expansion framework [<xref ref-type="bibr" rid="j_nejsds54_ref_045">45</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_046">46</xref>], which has applications to resolution-based nonparametric models of dependency [<xref ref-type="bibr" rid="j_nejsds54_ref_008">8</xref>]. Using an underlying binary expansion framework, we furnish substantial methodological and algorithmic innovations to yield a practical test in the two-sample context.</p>
</sec>
</sec>
<sec id="j_nejsds54_s_004">
<label>2</label>
<title>Derivation of a Statistic</title>
<sec id="j_nejsds54_s_005">
<label>2.1</label>
<title>Motivation from the Probability Integral Transformation</title>
<p>We begin by introducing our procedure in the context of univariate data. Given independent samples <inline-formula id="j_nejsds54_ineq_010"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\{{\boldsymbol{X}_{i}}\}_{i=1}^{m}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_011"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\{{\boldsymbol{Y}_{i}}\}_{i=1}^{n}}$]]></tex-math></alternatives></inline-formula>, where <inline-formula id="j_nejsds54_ineq_012"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∼</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi></mml:math><tex-math><![CDATA[${\boldsymbol{X}_{i}}\sim G$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_013"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∼</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi></mml:math><tex-math><![CDATA[${\boldsymbol{Y}_{i}}\sim F$]]></tex-math></alternatives></inline-formula>, we are interested in testing 
<disp-formula id="j_nejsds54_eq_001">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mspace width="2.5pt"/>
<mml:mtext>vs.</mml:mtext>
<mml:mspace width="2.5pt"/>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">a</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo stretchy="false">≠</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ {H_{0}}:F=G\hspace{2.5pt}\text{vs.}\hspace{2.5pt}{H_{a}}:F\ne G.\]]]></tex-math></alternatives>
</disp-formula> 
For our purposes, we will assume that <italic>F</italic> and <italic>G</italic> are absolutely continuous functions. We adopt boldface type in <inline-formula id="j_nejsds54_ineq_014"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\{{\boldsymbol{X}_{i}}\}_{i=1}^{m}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_015"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\{{\boldsymbol{Y}_{i}}\}_{i=1}^{n}}$]]></tex-math></alternatives></inline-formula> to indicate that these are collections of random quantities. In addition, we use blackboard bold <inline-formula id="j_nejsds54_ineq_016"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">A</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\mathbb{P}(A)$]]></tex-math></alternatives></inline-formula> to refer to the probability of an event <italic>A</italic>.</p>
<p>To begin, imagine a one-sample setting where we test <inline-formula id="j_nejsds54_ineq_017"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>:</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∼</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi></mml:math><tex-math><![CDATA[${H_{0}}:{\boldsymbol{X}_{i}}\sim F$]]></tex-math></alternatives></inline-formula>, with <italic>F</italic> known. It is a classical result that <inline-formula id="j_nejsds54_ineq_018"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∼</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi></mml:math><tex-math><![CDATA[${\boldsymbol{X}_{i}}\sim F$]]></tex-math></alternatives></inline-formula> if and only if the transformed variables <inline-formula id="j_nejsds54_ineq_019"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{F({\boldsymbol{X}_{i}}):i\in [m]\}$]]></tex-math></alternatives></inline-formula> follow a Uniform<inline-formula id="j_nejsds54_ineq_020"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(0,1)$]]></tex-math></alternatives></inline-formula> distribution. Given this fact, we can test the goodness-of-fit of <italic>F</italic> by testing the uniformity of the collection <inline-formula id="j_nejsds54_ineq_021"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{F({\boldsymbol{X}_{i}}):i\in [m]\}$]]></tex-math></alternatives></inline-formula>. Moreover, examining how <inline-formula id="j_nejsds54_ineq_022"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{F({\boldsymbol{X}_{i}}):i\in [m]\}$]]></tex-math></alternatives></inline-formula> fails to be uniform indicates why <italic>F</italic> does not fit the distribution of <inline-formula id="j_nejsds54_ineq_023"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula>.</p>
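<p>The one-sample idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the AUGUST implementation: the exponential form of the hypothesized <italic>F</italic>, the function names <monospace>pit_exponential</monospace> and <monospace>dyadic_cell_counts</monospace>, and the sample values are all hypothetical. The sketch applies the probability integral transformation and counts how many transformed values fall in each of 2<sup><italic>d</italic></sup> equal-width cells of the unit interval, so that departures from uniformity show up as unbalanced counts.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def pit_exponential(xs, rate=1.0):
    """Apply the CDF of a hypothesized Exponential(rate) law to each value."""
    return [1.0 - math.exp(-rate * x) for x in xs]


def dyadic_cell_counts(us, depth):
    """Count values falling in each cell ((i-1)/2^depth, i/2^depth]."""
    n_cells = 2 ** depth
    counts = [0] * n_cells
    for u in us:
        # ceil(u * 2^depth) is the 1-based index of the cell containing u;
        # the clamp places u = 0 in the first cell.
        idx = min(max(math.ceil(u * n_cells), 1), n_cells)
        counts[idx - 1] += 1
    return counts


# Hypothetical data: exact quantiles of Exponential(1), chosen so that the
# correctly specified transformation spreads them evenly over (0, 1).
m = 8
sample = [-math.log(1.0 - (k - 0.5) / m) for k in range(1, m + 1)]

# Correctly specified F: the cell counts are balanced.
counts = dyadic_cell_counts(pit_exponential(sample, rate=1.0), depth=2)
print(counts)  # [2, 2, 2, 2]

# Misspecified F (rate 3 instead of 1): mass piles up in the last cell.
counts_wrong = dyadic_cell_counts(pit_exponential(sample, rate=3.0), depth=2)
print(counts_wrong)  # [1, 1, 1, 5]
```
]]></preformat>
<p>In this toy example, the direction of the imbalance is itself informative: the excess of values near 1 under the misspecified rate indicates that the hypothesized distribution places too much mass at small values relative to the data.</p>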
<p>Returning to the two-sample setting, the same intuition holds true: we might construct transformed variables that are nearly uniform in <inline-formula id="j_nejsds54_ineq_024"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$[0,1]$]]></tex-math></alternatives></inline-formula> when <inline-formula id="j_nejsds54_ineq_025"><alternatives><mml:math>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi></mml:math><tex-math><![CDATA[$F=G$]]></tex-math></alternatives></inline-formula>, and that are not uniform otherwise. When the distributions of the two samples are different, the way that uniformity fails should be informative.</p>
<p>Given the fact that the transformed variables <inline-formula id="j_nejsds54_ineq_026"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{G({\boldsymbol{X}_{i}}):i\in [m]\}$]]></tex-math></alternatives></inline-formula> follow a uniform distribution, an intuitive choice would be <inline-formula id="j_nejsds54_ineq_027"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{{\hat{F}_{\boldsymbol{Y}}}({\boldsymbol{X}_{i}}):i\in [m]\}$]]></tex-math></alternatives></inline-formula>, where <inline-formula id="j_nejsds54_ineq_028"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\hat{F}_{\boldsymbol{Y}}}$]]></tex-math></alternatives></inline-formula> is the empirical cumulative distribution function of <inline-formula id="j_nejsds54_ineq_029"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>: 
<disp-formula id="j_nejsds54_eq_002">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:mi mathvariant="italic">I</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ {\hat{F}_{\boldsymbol{Y}}}(t)=\frac{1}{n}{\sum \limits_{i=1}^{n}}I({\boldsymbol{Y}_{i}}\le t).\]]]></tex-math></alternatives>
</disp-formula> 
The binary expansion testing framework introduced in [<xref ref-type="bibr" rid="j_nejsds54_ref_045">45</xref>] provides a way to test <inline-formula id="j_nejsds54_ineq_030"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{{\hat{F}_{\boldsymbol{Y}}}({\boldsymbol{X}_{i}}):i\in [m]\}$]]></tex-math></alternatives></inline-formula> for uniformity up to a given binary depth <italic>d</italic>, which is equivalent to testing multinomial uniformity over dyadic fractions <inline-formula id="j_nejsds54_ineq_031"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{1/{2^{d}},\dots ,1\}$]]></tex-math></alternatives></inline-formula>. In particular, we define the random vector <inline-formula id="j_nejsds54_ineq_032"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">P</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{P}$]]></tex-math></alternatives></inline-formula> of length <inline-formula id="j_nejsds54_ineq_033"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${2^{d}}$]]></tex-math></alternatives></inline-formula> such that, for <inline-formula id="j_nejsds54_ineq_034"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo stretchy="false">≤</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$1\le i\le {2^{d}}$]]></tex-math></alternatives></inline-formula>, 
<disp-formula id="j_nejsds54_eq_003">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="normal">#</mml:mi>
<mml:mo maxsize="2.03em" minsize="2.03em" fence="true">{</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>:</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mfenced separators="" open="[" close=")">
<mml:mrow>
<mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo mathvariant="normal">,</mml:mo><mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
</mml:mrow>
</mml:mfenced>
<mml:mo maxsize="2.03em" minsize="2.03em" fence="true">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ {\boldsymbol{P}_{i}}=\frac{\mathrm{\# }\bigg\{k:{\hat{F}_{\boldsymbol{Y}}}({\boldsymbol{X}_{k}})\in \left[\frac{i-1}{{2^{d}}},\frac{i}{{2^{d}}}\right)\bigg\}}{m}.\]]]></tex-math></alternatives>
</disp-formula> 
That is, <inline-formula id="j_nejsds54_ineq_035"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">P</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{P}$]]></tex-math></alternatives></inline-formula> records the proportion of transformed observations falling in each dyadic interval of width <inline-formula id="j_nejsds54_ineq_036"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$1/{2^{d}}$]]></tex-math></alternatives></inline-formula>. The associated vector <inline-formula id="j_nejsds54_ineq_037"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">S</mml:mi>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:mi mathvariant="bold-italic">P</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{S}={\mathbf{H}_{{2^{d}}}}\boldsymbol{P}$]]></tex-math></alternatives></inline-formula> is said to contain <italic>symmetry statistics</italic>, where <inline-formula id="j_nejsds54_ineq_038"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\mathbf{H}_{{2^{d}}}}$]]></tex-math></alternatives></inline-formula> is the Hadamard matrix of size <inline-formula id="j_nejsds54_ineq_039"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${2^{d}}$]]></tex-math></alternatives></inline-formula> according to Sylvester’s construction. As the top row of <inline-formula id="j_nejsds54_ineq_040"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\mathbf{H}_{{2^{d}}}}$]]></tex-math></alternatives></inline-formula> contains only ones, the first coordinate of <inline-formula id="j_nejsds54_ineq_041"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">S</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{S}$]]></tex-math></alternatives></inline-formula> is always equal to <inline-formula id="j_nejsds54_ineq_042"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[${\textstyle\sum _{i=1}^{{2^{d}}}}{\boldsymbol{P}_{i}}=1$]]></tex-math></alternatives></inline-formula>, and we may as well restrict our attention to <inline-formula id="j_nejsds54_ineq_043"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{-1}}$]]></tex-math></alternatives></inline-formula>, dropping the first component. As shown in [<xref ref-type="bibr" rid="j_nejsds54_ref_045">45</xref>], <inline-formula id="j_nejsds54_ineq_044"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{-1}}$]]></tex-math></alternatives></inline-formula> is a sufficient statistic for uniformity in the one-sample setting, and the binary expansion test based on <inline-formula id="j_nejsds54_ineq_045"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{-1}}$]]></tex-math></alternatives></inline-formula> achieves the minimax rate in sample size required for power against a wide variety of alternatives.</p>
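<p>As a minimal sketch of the construction above, and not the authors' implementation, the following Python code bins a set of transformed values into the 2<sup><italic>d</italic></sup> dyadic cells, forms the vector of cell proportions, and multiplies it by a Sylvester Hadamard matrix built from Kronecker products. The function names <monospace>sylvester_hadamard</monospace>, <monospace>cell_proportions</monospace>, and <monospace>symmetry_statistics</monospace> are hypothetical, and the example values are invented for illustration. Placing a transformed value exactly equal to 1 in the last cell is an assumption of this sketch, since the half-open intervals in the definition above do not cover that point.</p>
<preformat><![CDATA[
```python
import numpy as np


def sylvester_hadamard(d):
    """Return the 2^d x 2^d Hadamard matrix from Sylvester's construction."""
    H = np.array([[1]])
    H2 = np.array([[1, 1], [1, -1]])
    for _ in range(d):
        H = np.kron(H2, H)
    return H


def cell_proportions(u, d):
    """Proportion of the values u (in [0, 1]) in each dyadic cell of depth d."""
    u = np.asarray(u, dtype=float)
    cells = np.floor(u * 2 ** d).astype(int)
    # Assumption of this sketch: a value exactly equal to 1 goes in the last cell.
    cells = np.minimum(cells, 2 ** d - 1)
    return np.bincount(cells, minlength=2 ** d) / u.size


def symmetry_statistics(u, d):
    """Return S = H P; the first coordinate is the sum of P."""
    return sylvester_hadamard(d) @ cell_proportions(u, d)


# Illustrative transformed values, two per dyadic cell at depth d = 2.
u = np.array([0.05, 0.10, 0.30, 0.45, 0.55, 0.60, 0.80, 0.95])
S = symmetry_statistics(u, d=2)
```
]]></preformat>
<p>For these eight values every cell receives proportion 1/4, so the first coordinate of the resulting vector equals 1 and the remaining symmetry statistics are all zero.</p>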
<p>We can think of <inline-formula id="j_nejsds54_ineq_046"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{-1}}$]]></tex-math></alternatives></inline-formula> in a signal-processing context: the Hadamard transform maps the vector of cell probabilities <inline-formula id="j_nejsds54_ineq_047"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">P</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{P}$]]></tex-math></alternatives></inline-formula> in the physical domain to the vector of symmetries <inline-formula id="j_nejsds54_ineq_048"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{-1}}$]]></tex-math></alternatives></inline-formula> in the frequency domain. This transformation is advantageous because, in the one-sample setting, the entries of <inline-formula id="j_nejsds54_ineq_049"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{-1}}$]]></tex-math></alternatives></inline-formula> have mean zero and are pairwise uncorrelated under the null. As a result, fluctuations of <inline-formula id="j_nejsds54_ineq_050"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{-1}}$]]></tex-math></alternatives></inline-formula> away from <inline-formula id="j_nejsds54_ineq_051"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mn mathvariant="bold">0</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\mathbf{0}_{{2^{d}}-1}}$]]></tex-math></alternatives></inline-formula> unambiguously support the alternative, and the coordinates of <inline-formula id="j_nejsds54_ineq_052"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{-1}}$]]></tex-math></alternatives></inline-formula> are interpretable as orthogonal signals of nonuniformity. Moreover, the vector <inline-formula id="j_nejsds54_ineq_053"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">P</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{P}$]]></tex-math></alternatives></inline-formula> always satisfies <inline-formula id="j_nejsds54_ineq_054"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[${\textstyle\sum _{i=1}^{{2^{d}}}}{\boldsymbol{P}_{i}}=1$]]></tex-math></alternatives></inline-formula>, meaning that the mass of <inline-formula id="j_nejsds54_ineq_055"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">P</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{P}$]]></tex-math></alternatives></inline-formula> is constrained to a <inline-formula id="j_nejsds54_ineq_056"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$({2^{d}}-1)$]]></tex-math></alternatives></inline-formula>-dimensional hyperplane in <inline-formula id="j_nejsds54_ineq_057"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\mathbb{R}^{{2^{d}}}}$]]></tex-math></alternatives></inline-formula>. In contrast, the vector <inline-formula id="j_nejsds54_ineq_058"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{-1}}$]]></tex-math></alternatives></inline-formula> is non-degenerate and summarizes the same information about nonuniformity more efficiently, in one fewer coordinate. We elaborate on the interpretability of <inline-formula id="j_nejsds54_ineq_059"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{-1}}$]]></tex-math></alternatives></inline-formula> in Section <xref rid="j_nejsds54_s_008">2.4</xref>.</p>
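<p>The mean-zero and uncorrelatedness properties can be checked numerically under an idealized one-sample model, which is an assumption of this sketch: the <italic>m</italic> transformed values are taken to be i.i.d. uniform, so that the cell counts are multinomial with equal cell probabilities 1/2<sup><italic>d</italic></sup>. The code below computes the exact mean and covariance of the Hadamard-transformed proportions in that model. The function names are hypothetical.</p>
<preformat><![CDATA[
```python
import numpy as np


def sylvester_hadamard(d):
    """Return the 2^d x 2^d Hadamard matrix from Sylvester's construction."""
    H = np.array([[1]])
    for _ in range(d):
        H = np.kron(np.array([[1, 1], [1, -1]]), H)
    return H


def null_moments_of_S(d, m):
    """Exact mean and covariance of S = H P when m * P is multinomial
    with m trials and equal cell probabilities 1 / 2^d."""
    cells = 2 ** d
    p = np.full(cells, 1.0 / cells)
    H = sylvester_hadamard(d)
    # Covariance of multinomial proportions: (diag(p) - p p^T) / m.
    cov_P = (np.diag(p) - np.outer(p, p)) / m
    return H @ p, H @ cov_P @ H.T


mean_S, cov_S = null_moments_of_S(d=3, m=50)
```
]]></preformat>
<p>In this idealized model the mean of the symmetry statistics after dropping the first coordinate is the zero vector, and their covariance matrix is the identity divided by <italic>m</italic>, which is consistent with the entries being pairwise uncorrelated under the null.</p>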
<p>One possible choice of test statistic is the quantity <inline-formula id="j_nejsds54_ineq_060"><alternatives><mml:math>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo stretchy="false">‖</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:msubsup>
<mml:mrow>
<mml:mo stretchy="false">‖</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[$S=\| {\boldsymbol{S}_{-1}}{\| _{2}^{2}}$]]></tex-math></alternatives></inline-formula>. A test based on <italic>S</italic> is essentially a <inline-formula id="j_nejsds54_ineq_061"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">χ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\chi ^{2}}$]]></tex-math></alternatives></inline-formula> test and has reasonable power to detect <inline-formula id="j_nejsds54_ineq_062"><alternatives><mml:math>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo stretchy="false">≠</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi></mml:math><tex-math><![CDATA[$F\ne G$]]></tex-math></alternatives></inline-formula>. However, we can substantially improve the power by modifying our construction of <inline-formula id="j_nejsds54_ineq_063"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">P</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{P}$]]></tex-math></alternatives></inline-formula>.</p>
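<p>One way to see the connection with the chi-squared test is an algebraic identity: because the Hadamard matrix satisfies H<sup>T</sup>H = 2<sup><italic>d</italic></sup>I and the first symmetry statistic equals 1, <italic>m</italic> times the squared norm above coincides with Pearson's chi-squared statistic for equal expected cell counts. The sketch below checks this identity numerically. The function names are hypothetical and the cell counts are invented for illustration. The identity is purely algebraic and says nothing about the null distribution in the two-sample setting.</p>
<preformat><![CDATA[
```python
import numpy as np


def sylvester_hadamard(d):
    """Return the 2^d x 2^d Hadamard matrix from Sylvester's construction."""
    H = np.array([[1]])
    for _ in range(d):
        H = np.kron(np.array([[1, 1], [1, -1]]), H)
    return H


def norm_statistic(P):
    """Squared Euclidean norm of S = H P after dropping its first coordinate."""
    P = np.asarray(P, dtype=float)
    d = int(round(np.log2(P.size)))
    S = sylvester_hadamard(d) @ P
    return float(np.sum(S[1:] ** 2))


def pearson_statistic(counts):
    """Pearson chi-squared statistic against equal expected cell counts."""
    counts = np.asarray(counts, dtype=float)
    expected = counts.sum() / counts.size
    return float(np.sum((counts - expected) ** 2 / expected))


# Illustrative cell counts at depth d = 2 (m = 20 observations).
counts = np.array([7, 3, 6, 4])
m = counts.sum()
S = norm_statistic(counts / m)
chi2 = pearson_statistic(counts)
```
]]></preformat>
<p>For these counts the squared norm is 0.1 and Pearson's statistic is 2.0, which equals <italic>m</italic> = 20 times the squared norm.</p>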
</sec>
<sec id="j_nejsds54_s_006">
<label>2.2</label>
<title>An Augmented Cumulative Distribution Function</title>
<p>For our testing purposes, recall that we are only interested in the uniformity of <inline-formula id="j_nejsds54_ineq_064"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{{\hat{F}_{\boldsymbol{Y}}}({\boldsymbol{X}_{i}}),i\in [m]\}$]]></tex-math></alternatives></inline-formula> up to binary depth <italic>d</italic>. The range of <inline-formula id="j_nejsds54_ineq_065"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\hat{F}_{\boldsymbol{Y}}}(x)$]]></tex-math></alternatives></inline-formula> as a function of <italic>x</italic> comprises <inline-formula id="j_nejsds54_ineq_066"><alternatives><mml:math>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$n+1$]]></tex-math></alternatives></inline-formula> possible values, namely, <inline-formula id="j_nejsds54_ineq_067"><alternatives><mml:math>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$0,1/n,\dots ,1$]]></tex-math></alternatives></inline-formula>. However, in our construction of the cell proportions <inline-formula id="j_nejsds54_ineq_068"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">P</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{P}$]]></tex-math></alternatives></inline-formula>, the collection <inline-formula id="j_nejsds54_ineq_069"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{{\hat{F}_{\boldsymbol{Y}}}({\boldsymbol{X}_{i}}),i\in [m]\}$]]></tex-math></alternatives></inline-formula> is binned across <inline-formula id="j_nejsds54_ineq_070"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${2^{d}}$]]></tex-math></alternatives></inline-formula>-many dyadic intervals of depth <italic>d</italic>. Whenever <inline-formula id="j_nejsds54_ineq_071"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">&lt;</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi></mml:math><tex-math><![CDATA[${2^{d}}\lt n$]]></tex-math></alternatives></inline-formula>, some distinct values in the range of <inline-formula id="j_nejsds54_ineq_072"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mo>·</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\hat{F}_{\boldsymbol{Y}}}(\cdot )$]]></tex-math></alternatives></inline-formula> correspond to the same dyadic interval by the pigeonhole principle, which indicates that a coarser transformation than <inline-formula id="j_nejsds54_ineq_073"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mo>·</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\hat{F}_{\boldsymbol{Y}}}(\cdot )$]]></tex-math></alternatives></inline-formula> should work at least as well, and possibly better. Our approach is to consider transformed variables <inline-formula id="j_nejsds54_ineq_074"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\hat{F}_{{\boldsymbol{Y}^{\ast }}}}({\boldsymbol{X}_{i}})$]]></tex-math></alternatives></inline-formula> based on a small, random subsample <inline-formula id="j_nejsds54_ineq_075"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\boldsymbol{Y}^{\ast }}$]]></tex-math></alternatives></inline-formula> of some size <italic>r</italic> from <inline-formula id="j_nejsds54_ineq_076"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>. The following discussion makes this alternate process explicit. In addition, we comment on the success of this approach in Section <xref rid="j_nejsds54_s_012">4</xref>, and we include empirical power comparisons against the non-subsampled transformation <inline-formula id="j_nejsds54_ineq_077"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mo>·</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\hat{F}_{\boldsymbol{Y}}}(\cdot )$]]></tex-math></alternatives></inline-formula> in the supplementary materials.</p>
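<p>The pigeonhole argument can be illustrated by counting how many of the <italic>n</italic> + 1 values 0, 1/<italic>n</italic>, …, 1 land in each dyadic interval of depth <italic>d</italic>. The following is a small sketch with a hypothetical function name. It uses integer arithmetic to avoid rounding at cell boundaries, and it assumes that the value 1 is placed in the last cell.</p>
<preformat><![CDATA[
```python
def ecdf_values_per_cell(n, d):
    """Count how many of the values k / n, k = 0, ..., n, fall in each
    of the 2^d dyadic cells of depth d."""
    cells = 2 ** d
    counts = [0] * cells
    for k in range(n + 1):
        # k / n lies in cell number (k * 2^d) // n, counting cells from zero.
        # Assumption of this sketch: the value 1 (k = n) goes in the last cell.
        index = min((k * cells) // n, cells - 1)
        counts[index] += 1
    return counts


counts = ecdf_values_per_cell(n=10, d=2)
```
]]></preformat>
<p>With <italic>n</italic> = 10 and <italic>d</italic> = 2 the counts are 3, 2, 3, and 3, so every cell contains at least two distinct values that the binning cannot tell apart.</p>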
<p>Let <inline-formula id="j_nejsds54_ineq_078"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\boldsymbol{Y}^{\ast }}$]]></tex-math></alternatives></inline-formula> be a random subsample from <inline-formula id="j_nejsds54_ineq_079"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula> of size <inline-formula id="j_nejsds54_ineq_080"><alternatives><mml:math>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$r={2^{d+1}}-1$]]></tex-math></alternatives></inline-formula>. We explain this choice of <italic>r</italic> momentarily. For any <inline-formula id="j_nejsds54_ineq_081"><alternatives><mml:math>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="double-struck">R</mml:mi></mml:math><tex-math><![CDATA[$x\in \mathbb{R}$]]></tex-math></alternatives></inline-formula> and integer <inline-formula id="j_nejsds54_ineq_082"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo stretchy="false">≤</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$1\le k\le {2^{d}}$]]></tex-math></alternatives></inline-formula>, let <inline-formula id="j_nejsds54_ineq_083"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\boldsymbol{p}_{k}}(x)$]]></tex-math></alternatives></inline-formula> be the probability, conditional on <inline-formula id="j_nejsds54_ineq_084"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>, that either <inline-formula id="j_nejsds54_ineq_085"><alternatives><mml:math>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[$2k-2$]]></tex-math></alternatives></inline-formula> or <inline-formula id="j_nejsds54_ineq_086"><alternatives><mml:math>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$2k-1$]]></tex-math></alternatives></inline-formula> elements of <inline-formula id="j_nejsds54_ineq_087"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\boldsymbol{Y}^{\ast }}$]]></tex-math></alternatives></inline-formula> are less than or equal to <italic>x</italic>. The probabilities <inline-formula id="j_nejsds54_ineq_088"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\boldsymbol{p}_{k}}(x)$]]></tex-math></alternatives></inline-formula> are essentially hypergeometric and simple to compute: 
<disp-formula id="j_nejsds54_eq_004">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mtd>
<mml:mtd class="align-even">
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mfenced separators="" open="(" close=")">
<mml:mfrac linethickness="0.0pt">
<mml:mrow>
<mml:mi mathvariant="normal">#</mml:mi>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>:</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
<mml:mfenced separators="" open="(" close=")">
<mml:mfrac linethickness="0.0pt">
<mml:mrow>
<mml:mi mathvariant="normal">#</mml:mi>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>:</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">&gt;</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mfenced separators="" open="(" close=")">
<mml:mfrac linethickness="0.0pt">
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="align-odd"/>
<mml:mtd class="align-even">
<mml:mspace width="1em"/>
<mml:mo>+</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mfenced separators="" open="(" close=")">
<mml:mfrac linethickness="0.0pt">
<mml:mrow>
<mml:mi mathvariant="normal">#</mml:mi>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>:</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
<mml:mfenced separators="" open="(" close=")">
<mml:mfrac linethickness="0.0pt">
<mml:mrow>
<mml:mi mathvariant="normal">#</mml:mi>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>:</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">&gt;</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mfenced separators="" open="(" close=")">
<mml:mfrac linethickness="0.0pt">
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[\begin{aligned}{}{\boldsymbol{p}_{k}}(x)& =\frac{\left(\genfrac{}{}{0.0pt}{}{\mathrm{\# }\{i:{\boldsymbol{Y}_{i}}\le x\}}{2k-2}\right)\left(\genfrac{}{}{0.0pt}{}{\mathrm{\# }\{i:{\boldsymbol{Y}_{i}}\gt x\}}{{2^{d+1}}-1-(2k-2)}\right)}{\left(\genfrac{}{}{0.0pt}{}{n}{{2^{d+1}}-1}\right)}\\ {} & \hspace{1em}+\frac{\left(\genfrac{}{}{0.0pt}{}{\mathrm{\# }\{i:{\boldsymbol{Y}_{i}}\le x\}}{2k-1}\right)\left(\genfrac{}{}{0.0pt}{}{\mathrm{\# }\{i:{\boldsymbol{Y}_{i}}\gt x\}}{{2^{d+1}}-1-(2k-1)}\right)}{\left(\genfrac{}{}{0.0pt}{}{n}{{2^{d+1}}-1}\right)}.\end{aligned}\]]]></tex-math></alternatives>
</disp-formula> 
Using the scalar function <inline-formula id="j_nejsds54_ineq_089"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mo>·</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\boldsymbol{p}_{k}}(\cdot )$]]></tex-math></alternatives></inline-formula>, we define the vector-valued function <inline-formula id="j_nejsds54_ineq_090"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">P</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="double-struck">R</mml:mi>
<mml:mo stretchy="false">→</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$\boldsymbol{P}:\mathbb{R}\to {\mathbb{R}^{{2^{d}}}}$]]></tex-math></alternatives></inline-formula> such that, for each coordinate <italic>k</italic>, 
<disp-formula id="j_nejsds54_eq_005">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mspace width="2.5pt"/>
<mml:mtext>for</mml:mtext>
<mml:mspace width="2.5pt"/>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo stretchy="false">≤</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ {\boldsymbol{P}_{k}}(x)={\boldsymbol{p}_{k}}(x),\hspace{2.5pt}\text{for}\hspace{2.5pt}1\le k\le {2^{d}}.\]]]></tex-math></alternatives>
</disp-formula> 
It holds that <inline-formula id="j_nejsds54_ineq_091"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mfenced separators="" open="[" close="]">
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:math><tex-math><![CDATA[${\hat{F}_{{\boldsymbol{Y}^{\ast }}}}(x)\in \left[(k-1)/{2^{d}},k/{2^{d}}\right]$]]></tex-math></alternatives></inline-formula> when exactly <inline-formula id="j_nejsds54_ineq_092"><alternatives><mml:math>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[$2k-2$]]></tex-math></alternatives></inline-formula> or <inline-formula id="j_nejsds54_ineq_093"><alternatives><mml:math>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$2k-1$]]></tex-math></alternatives></inline-formula> subsampled elements in <inline-formula id="j_nejsds54_ineq_094"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\boldsymbol{Y}^{\ast }}$]]></tex-math></alternatives></inline-formula> are less than or equal to <italic>x</italic>. Therefore, we could equally say that for <inline-formula id="j_nejsds54_ineq_095"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo stretchy="false">≤</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$1\le k\le {2^{d}}$]]></tex-math></alternatives></inline-formula>, 
<disp-formula id="j_nejsds54_eq_006">
<label>(2.1)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mtd>
<mml:mtd class="align-even">
<mml:mo>=</mml:mo>
<mml:mi mathvariant="double-struck">P</mml:mi>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mfenced separators="" open="[" close="]">
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo mathvariant="normal">,</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
</mml:mrow>
</mml:mfenced>
<mml:mo fence="true" maxsize="2.03em" minsize="2.03em" stretchy="true">|</mml:mo>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[\begin{aligned}{}{\boldsymbol{P}_{k}}(x)& =\mathbb{P}\left({\hat{F}_{{\boldsymbol{Y}^{\ast }}}}(x)\in \left[\frac{k-1}{{2^{d}}},\frac{k}{{2^{d}}}\right]\bigg|\boldsymbol{Y}\right).\end{aligned}\]]]></tex-math></alternatives>
</disp-formula> 
It is in precisely this sense that <inline-formula id="j_nejsds54_ineq_096"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\boldsymbol{P}(x)$]]></tex-math></alternatives></inline-formula> can be considered an <italic>augmented cumulative distribution function</italic>: instead of mapping <italic>x</italic> to a single value in the unit interval, <inline-formula id="j_nejsds54_ineq_097"><alternatives><mml:math>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo stretchy="false">↦</mml:mo>
<mml:mi mathvariant="bold-italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$x\mapsto \boldsymbol{P}(x)$]]></tex-math></alternatives></inline-formula> maps <italic>x</italic> to a distribution. Moreover, this characterization explains the choice of subsample size <inline-formula id="j_nejsds54_ineq_098"><alternatives><mml:math>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$r={2^{d+1}}-1$]]></tex-math></alternatives></inline-formula>. Any <italic>r</italic> satisfying <inline-formula id="j_nejsds54_ineq_099"><alternatives><mml:math>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">q</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$r={2^{q}}-1$]]></tex-math></alternatives></inline-formula>, <inline-formula id="j_nejsds54_ineq_100"><alternatives><mml:math>
<mml:mi mathvariant="italic">q</mml:mi>
<mml:mo stretchy="false">≥</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi></mml:math><tex-math><![CDATA[$q\ge d$]]></tex-math></alternatives></inline-formula>, guarantees that the discrete random variable <inline-formula id="j_nejsds54_ineq_101"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\hat{F}_{{\boldsymbol{Y}^{\ast }}}}(x)$]]></tex-math></alternatives></inline-formula> has the same number of point masses inside every interval of the form <inline-formula id="j_nejsds54_ineq_102"><alternatives><mml:math>
<mml:mfenced separators="" open="[" close="]">
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:math><tex-math><![CDATA[$\left[(k-1)/{2^{d}},k/{2^{d}}\right]$]]></tex-math></alternatives></inline-formula>. In Section <xref rid="j_nejsds54_s_012">4</xref>, we give further intuition behind the meaning of <italic>q</italic>, and in the supplementary materials, we assess our default choice of <inline-formula id="j_nejsds54_ineq_103"><alternatives><mml:math>
<mml:mi mathvariant="italic">q</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$q=d+1$]]></tex-math></alternatives></inline-formula> empirically.</p>
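<p>As an illustrative sketch only, the displayed formula for the cell probabilities can be evaluated directly with exact rational arithmetic. The Python function below is hypothetical: the name <monospace>augmented_cdf</monospace> and the arguments <monospace>ys</monospace>, <monospace>x</monospace> and <monospace>d</monospace> are our own choices for illustration and are not taken from any released implementation. It assumes the default subsample size 2<sup><italic>d</italic>+1</sup> − 1 and a sample containing at least that many observations.</p>
<preformat preformat-type="code"><![CDATA[
```python
from fractions import Fraction
from math import comb


def augmented_cdf(ys, x, d):
    """Return P(x) as a list of 2**d exact hypergeometric cell probabilities.

    Hypothetical helper for illustration; ys is the Y sample, x a real
    number, d the depth. Assumes subsample size r = 2**(d + 1) - 1.
    """
    n = len(ys)
    r = 2 ** (d + 1) - 1
    if n < r:
        raise ValueError("need at least 2**(d + 1) - 1 observations in ys")

    n_le = sum(1 for y in ys if y <= x)  # #{i : Y_i <= x}
    n_gt = n - n_le                      # #{i : Y_i > x}
    total = comb(n, r)                   # number of subsamples of size r

    def pmf(j):
        # Probability that exactly j of the r subsampled points are <= x.
        # math.comb returns 0 when the lower index exceeds the upper one.
        return Fraction(comb(n_le, j) * comb(n_gt, r - j), total)

    # Cell k collects the two subsample counts j = 2k - 2 and j = 2k - 1.
    return [pmf(2 * k - 2) + pmf(2 * k - 1) for k in range(1, 2 ** d + 1)]
```
]]></preformat>
<p>For instance, with the sample 1, 2, …, 8, <italic>d</italic> = 1 and <italic>x</italic> = 4.5, four observations lie on each side of <italic>x</italic>, and the sketch returns the two cell probabilities 1/2 and 1/2. In every case the coordinates sum to one, in line with the interpretation of the output as a distribution.</p>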
<p>To collect information about every <inline-formula id="j_nejsds54_ineq_104"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{X}_{i}}$]]></tex-math></alternatives></inline-formula>, we define the vector <inline-formula id="j_nejsds54_ineq_105"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{P}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> to be the average of all <inline-formula id="j_nejsds54_ineq_106"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\boldsymbol{P}({\boldsymbol{X}_{i}})$]]></tex-math></alternatives></inline-formula>: 
<disp-formula id="j_nejsds54_eq_007">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:mi mathvariant="bold-italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ {\boldsymbol{P}_{\boldsymbol{X}}}=\frac{1}{m}{\sum \limits_{i=1}^{m}}\boldsymbol{P}({\boldsymbol{X}_{i}}).\]]]></tex-math></alternatives>
</disp-formula> 
Given that the formula for <inline-formula id="j_nejsds54_ineq_107"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\boldsymbol{p}_{k}}(x)$]]></tex-math></alternatives></inline-formula> is computed from hypergeometric probabilities, we refer to the coordinates of <inline-formula id="j_nejsds54_ineq_108"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{P}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> as <italic>hypergeometric cell probabilities</italic>. Just as we expect the distribution of the transformed variables <inline-formula id="j_nejsds54_ineq_109"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\{{\hat{F}_{\boldsymbol{Y}}}({\boldsymbol{X}_{i}}):i\in [m]\}$]]></tex-math></alternatives></inline-formula> to be uniform under the null, we expect the mass of <inline-formula id="j_nejsds54_ineq_110"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{P}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> to be nearly uniform over its coordinates. The vector of symmetry statistics <inline-formula id="j_nejsds54_ineq_111"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{X}}}={({\mathbf{H}_{{2^{d}}}}{\boldsymbol{P}_{\boldsymbol{X}}})_{-1}}$]]></tex-math></alternatives></inline-formula> quantifies non-uniformity in <inline-formula id="j_nejsds54_ineq_112"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{P}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula>.</p>
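<p>The averaging step and the Hadamard transform can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: all function names are hypothetical, the Hadamard matrix is assumed to follow Sylvester's construction in its natural row order, and the subscript −1 is assumed to mean that the first coordinate of the transformed vector (which always equals the total mass, one) is dropped.</p>
<preformat preformat-type="code"><![CDATA[
```python
from fractions import Fraction
from math import comb


def cell_probabilities(ys, x, d):
    """Exact P(x): 2**d hypergeometric cell probabilities (hypothetical helper)."""
    n, r = len(ys), 2 ** (d + 1) - 1
    n_le = sum(1 for y in ys if y <= x)
    n_gt = n - n_le
    total = comb(n, r)

    def pmf(j):
        # Probability that exactly j of the r subsampled points are <= x.
        return Fraction(comb(n_le, j) * comb(n_gt, r - j), total)

    return [pmf(2 * k - 2) + pmf(2 * k - 1) for k in range(1, 2 ** d + 1)]


def sylvester_hadamard(size):
    """Hadamard matrix of order size (a power of two), Sylvester's construction."""
    h = [[1]]
    while len(h) < size:
        # Block form [[H, H], [H, -H]] doubles the order at each step.
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def symmetry_statistics(xs, ys, d):
    """Return (P_X, S_X) for samples xs and ys at depth d."""
    m = len(xs)
    per_point = [cell_probabilities(ys, x, d) for x in xs]
    # P_X: coordinate-wise average of P(X_i) over the X sample.
    p_x = [sum(col) / m for col in zip(*per_point)]
    h = sylvester_hadamard(2 ** d)
    transformed = [sum(a * p for a, p in zip(row, p_x)) for row in h]
    # Assumption: the subscript -1 drops the first coordinate.
    return p_x, transformed[1:]
```
]]></preformat>
<p>As a check on the sketch, take the <italic>Y</italic> sample 1, 2, …, 8 with <italic>d</italic> = 1. A single <italic>X</italic> observation at 4.5 gives cell probabilities (1/2, 1/2) and a symmetry statistic of 0, whereas a single <italic>X</italic> observation below every <italic>Y</italic> value gives (1, 0) and a symmetry statistic of 1, reflecting maximal imbalance toward the first cell.</p>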
<p>Importantly, the cell probabilities in <inline-formula id="j_nejsds54_ineq_113"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\boldsymbol{P}(x)$]]></tex-math></alternatives></inline-formula> are computed with reference to a subsampling procedure, but without actually subsampling. As the discussion above suggests, these probabilities could indeed be approximated by a bootstrap procedure: take many subsamples <inline-formula id="j_nejsds54_ineq_114"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\boldsymbol{Y}^{\ast }}$]]></tex-math></alternatives></inline-formula> of size <inline-formula id="j_nejsds54_ineq_115"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[${2^{d+1}}-1$]]></tex-math></alternatives></inline-formula> from <inline-formula id="j_nejsds54_ineq_116"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>, compute <inline-formula id="j_nejsds54_ineq_117"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\hat{F}_{{\boldsymbol{Y}^{\ast }}}}(x)$]]></tex-math></alternatives></inline-formula> each time, and bin the results as cell counts at intervals of <inline-formula id="j_nejsds54_ineq_118"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$1/{2^{d}}$]]></tex-math></alternatives></inline-formula>. The exact cell probabilities <inline-formula id="j_nejsds54_ineq_119"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\boldsymbol{P}(x)$]]></tex-math></alternatives></inline-formula> derived above are the limiting values of this bootstrap procedure as the number of subsamples tends to infinity. The following theorem makes this result explicit.</p><statement id="j_nejsds54_stat_001"><label>Theorem 1.</label>
<p><italic>Let Y be a fixed vector of length at least</italic> <inline-formula id="j_nejsds54_ineq_120"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${2^{d+1}}$]]></tex-math></alternatives></inline-formula><italic>, and let</italic> <inline-formula id="j_nejsds54_ineq_121"><alternatives><mml:math>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mi mathvariant="double-struck">R</mml:mi></mml:math><tex-math><![CDATA[$x\in \mathbb{R}$]]></tex-math></alternatives></inline-formula><italic>. Consider the following bootstrap method for computing a vector</italic> <inline-formula id="j_nejsds54_ineq_122"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\boldsymbol{P}^{\ast }}(x)$]]></tex-math></alternatives></inline-formula> <italic>using K subsamples from Y.</italic> 
<list>
<list-item id="j_nejsds54_li_001">
<label>1.</label>
<p><italic>Take bootstrap subsamples</italic> <inline-formula id="j_nejsds54_ineq_123"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\boldsymbol{Y}_{k}^{\ast }}$]]></tex-math></alternatives></inline-formula> <italic>of size</italic> <inline-formula id="j_nejsds54_ineq_124"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[${2^{d+1}}-1$]]></tex-math></alternatives></inline-formula> <italic>from Y without replacement, for subsamples</italic> <inline-formula id="j_nejsds54_ineq_125"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">K</mml:mi></mml:math><tex-math><![CDATA[$1\le k\le K$]]></tex-math></alternatives></inline-formula><italic>.</italic></p>
</list-item>
<list-item id="j_nejsds54_li_002">
<label>2.</label>
<p><italic>Compute</italic> <inline-formula id="j_nejsds54_ineq_126"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\hat{F}_{{\boldsymbol{Y}_{k}^{\ast }}}}(x)$]]></tex-math></alternatives></inline-formula><italic>, for subsamples</italic> <inline-formula id="j_nejsds54_ineq_127"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">K</mml:mi></mml:math><tex-math><![CDATA[$1\le k\le K$]]></tex-math></alternatives></inline-formula><italic>.</italic></p>
</list-item>
<list-item id="j_nejsds54_li_003">
<label>3.</label>
<p><italic>Set</italic> <inline-formula id="j_nejsds54_ineq_128"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="normal">#</mml:mi>
<mml:mfenced separators="" open="{" close="}">
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>:</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mfenced separators="" open="[" close="]">
<mml:mrow>
<mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo mathvariant="normal">,</mml:mo><mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:mi mathvariant="italic">K</mml:mi></mml:math><tex-math><![CDATA[${\boldsymbol{P}_{i}^{\ast }}(x)=\mathrm{\# }\left\{k:{\hat{F}_{{\boldsymbol{Y}_{k}^{\ast }}}}(x)\in \left[\frac{i-1}{{2^{d}}},\frac{i}{{2^{d}}}\right]\right\}/K$]]></tex-math></alternatives></inline-formula><italic>, for coordinates</italic> <inline-formula id="j_nejsds54_ineq_129"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo stretchy="false">≤</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$1\le i\le {2^{d}}$]]></tex-math></alternatives></inline-formula><italic>.</italic></p>
<p><italic>It follows that</italic> 
<disp-formula id="j_nejsds54_eq_008">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:mi mathvariant="double-struck">P</mml:mi>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:munder>
<mml:mrow>
<mml:mo movablelimits="false">lim</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">K</mml:mi>
<mml:mo stretchy="false">→</mml:mo>
<mml:mi>∞</mml:mi>
</mml:mrow>
</mml:munder>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="bold-italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfenced>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ \mathbb{P}\left(\underset{K\to \infty }{\lim }{\boldsymbol{P}^{\ast }}(x)=\boldsymbol{P}(x)\right)=1,\]]]></tex-math></alternatives>
</disp-formula> 
<italic>where the probability is taken over the randomness of the subsampling, and</italic> <inline-formula id="j_nejsds54_ineq_130"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\boldsymbol{P}(x)$]]></tex-math></alternatives></inline-formula> <italic>is the augmented cumulative distribution function based on Y.</italic></p>
</list-item>
</list>
</p></statement>
<p>Theorem <xref rid="j_nejsds54_stat_001">1</xref> shows that the hypergeometric cell probabilities coincide with the limiting values of a subsampling-based bootstrap procedure. In other words, explicit subsampling is a valid way to approximate <inline-formula id="j_nejsds54_ineq_131"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{P}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula>. In practice, it is much faster to directly compute the limiting hypergeometric probabilities. While the procedure described in this subsection is more complicated than the original approach from Section <xref rid="j_nejsds54_s_005">2.1</xref>, we achieve superior power using the augmented cumulative distribution function introduced here, which we illustrate empirically in the supplementary materials. In Section <xref rid="j_nejsds54_s_013">5</xref>, we provide comparisons of empirical power against well-known nonparametric tests.</p>
</sec>
<sec id="j_nejsds54_s_007">
<label>2.3</label>
<title>Distributional Difference as a Scalar Quantity</title>
<p>To combine information on all forms of asymmetry, we propose the statistic <inline-formula id="j_nejsds54_ineq_132"><alternatives><mml:math>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo>−</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[$S=-{\boldsymbol{S}_{\boldsymbol{X}}^{T}}{\boldsymbol{S}_{\boldsymbol{Y}}}$]]></tex-math></alternatives></inline-formula>, with <inline-formula id="j_nejsds54_ineq_133"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{Y}}}$]]></tex-math></alternatives></inline-formula> defined analogously to <inline-formula id="j_nejsds54_ineq_134"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> by reversing the roles of the two samples. First, this choice of statistic has the advantage of treating the <inline-formula id="j_nejsds54_ineq_135"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_136"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula> samples symmetrically. This is desirable because it would be counterintuitive for the value of <italic>S</italic> to change when the roles of <inline-formula id="j_nejsds54_ineq_137"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_138"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula> are switched. In addition, this statistic is a continuous function of the concatenated vector <inline-formula id="j_nejsds54_ineq_139"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${({\boldsymbol{S}_{\boldsymbol{X}}^{T}},{\boldsymbol{S}_{\boldsymbol{Y}}^{T}})^{T}}$]]></tex-math></alternatives></inline-formula>, and in Theorem <xref rid="j_nejsds54_stat_004">3</xref>, we state the asymptotic distribution of <inline-formula id="j_nejsds54_ineq_140"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${({\boldsymbol{S}_{\boldsymbol{X}}^{T}},{\boldsymbol{S}_{\boldsymbol{Y}}^{T}})^{T}}$]]></tex-math></alternatives></inline-formula> in the case of univariate data. For the multivariate AUGUST test, which is described in Section <xref rid="j_nejsds54_s_011">3.2</xref>, we use permutation for <italic>p</italic>-value calculation.</p>
<p>The negative sign in <inline-formula id="j_nejsds54_ineq_141"><alternatives><mml:math>
<mml:mo>−</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[$-{\boldsymbol{S}_{\boldsymbol{X}}^{T}}{\boldsymbol{S}_{\boldsymbol{Y}}}$]]></tex-math></alternatives></inline-formula> comes from the fact that <inline-formula id="j_nejsds54_ineq_142"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_143"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{Y}}}$]]></tex-math></alternatives></inline-formula> typically have opposite signs in the case of distributional difference, and we wish the critical values of <italic>S</italic> to be positive. The proposition below gives intuition for this phenomenon in the context of a location shift.</p><statement id="j_nejsds54_stat_002"><label>Proposition 1.</label>
<p><italic>Let</italic> <inline-formula id="j_nejsds54_ineq_144"><alternatives><mml:math>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo stretchy="false">≥</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$m,n\ge {2^{d+1}}$]]></tex-math></alternatives></inline-formula><italic>, and suppose</italic> <inline-formula id="j_nejsds54_ineq_145"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\{{\boldsymbol{X}_{i}}\}_{i=1}^{m}}$]]></tex-math></alternatives></inline-formula> <italic>and</italic> <inline-formula id="j_nejsds54_ineq_146"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\{{\boldsymbol{Y}_{j}}\}_{j=1}^{n}}$]]></tex-math></alternatives></inline-formula> <italic>satisfy</italic> <inline-formula id="j_nejsds54_ineq_147"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mo movablelimits="false">max</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo>
<mml:mo mathvariant="normal">&lt;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mo movablelimits="false">min</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[${\max _{i}}\{{\boldsymbol{X}_{i}}\}\lt {\min _{j}}\{{\boldsymbol{Y}_{j}}\}$]]></tex-math></alternatives></inline-formula><italic>. Then</italic> <inline-formula id="j_nejsds54_ineq_148"><alternatives><mml:math>
<mml:mo movablelimits="false">cos</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">θ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo>−</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$\cos (\theta )=-{({2^{d}}-1)^{-1}}$]]></tex-math></alternatives></inline-formula><italic>, where θ is the angle between</italic> <inline-formula id="j_nejsds54_ineq_149"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> <italic>and</italic> <inline-formula id="j_nejsds54_ineq_150"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{Y}}}$]]></tex-math></alternatives></inline-formula> <italic>as vectors in</italic> <inline-formula id="j_nejsds54_ineq_151"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\mathbb{R}^{{2^{d}}-1}}$]]></tex-math></alternatives></inline-formula><italic>.</italic></p></statement>
<p>Informally, we could say the following: if <inline-formula id="j_nejsds54_ineq_152"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> is to the left of <inline-formula id="j_nejsds54_ineq_153"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>, then <inline-formula id="j_nejsds54_ineq_154"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula> is to the right of <inline-formula id="j_nejsds54_ineq_155"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula>, and the symmetry statistic detecting left/right imbalance will be positive in <inline-formula id="j_nejsds54_ineq_156"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> and negative in <inline-formula id="j_nejsds54_ineq_157"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{Y}}}$]]></tex-math></alternatives></inline-formula>. As shown in Section <xref rid="j_nejsds54_s_013">5</xref>, the negative inner product <inline-formula id="j_nejsds54_ineq_158"><alternatives><mml:math>
<mml:mo>−</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[$-{\boldsymbol{S}_{\boldsymbol{X}}^{T}}{\boldsymbol{S}_{\boldsymbol{Y}}}$]]></tex-math></alternatives></inline-formula> gives good power against a wide range of distributional alternatives. In addition, see Section <xref rid="j_nejsds54_s_012">4</xref> for exploration of the asymptotic properties of <inline-formula id="j_nejsds54_ineq_159"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${({\boldsymbol{S}_{\boldsymbol{X}}^{T}},{\boldsymbol{S}_{\boldsymbol{Y}}^{T}})^{T}}$]]></tex-math></alternatives></inline-formula>.</p>
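<p>As a sanity check of Proposition 1, consider <inline-formula><tex-math><![CDATA[$d=2$]]></tex-math></inline-formula> under two simplifying assumptions that are made here for illustration only and are not part of the statement: complete separation places all of the mass of <inline-formula><tex-math><![CDATA[${\boldsymbol{P}_{\boldsymbol{X}}}$]]></tex-math></inline-formula> in the first cell and all of the mass of <inline-formula><tex-math><![CDATA[${\boldsymbol{P}_{\boldsymbol{Y}}}$]]></tex-math></inline-formula> in the last cell, and each vector of symmetry statistics is the unscaled product of the Sylvester Hadamard matrix of size 4, with its first row of ones removed, with the corresponding cell probabilities. The remaining rows are <inline-formula><tex-math><![CDATA[$(1,-1,1,-1)$]]></tex-math></inline-formula>, <inline-formula><tex-math><![CDATA[$(1,1,-1,-1)$]]></tex-math></inline-formula> and <inline-formula><tex-math><![CDATA[$(1,-1,-1,1)$]]></tex-math></inline-formula>, so that
<disp-formula>
<tex-math><![CDATA[\[ {\boldsymbol{S}_{\boldsymbol{X}}}={(1,1,1)^{T}},\hspace{1em}{\boldsymbol{S}_{\boldsymbol{Y}}}={(-1,-1,1)^{T}},\hspace{1em}\cos (\theta )=\frac{-1-1+1}{\sqrt{3}\cdot \sqrt{3}}=-\frac{1}{3}=-{({2^{2}}-1)^{-1}},\]]]></tex-math>
</disp-formula>
which agrees with the proposition. Under the same assumptions, <inline-formula><tex-math><![CDATA[$S=-{\boldsymbol{S}_{\boldsymbol{X}}^{T}}{\boldsymbol{S}_{\boldsymbol{Y}}}=1$]]></tex-math></inline-formula>, which is positive, as intended by the sign convention.</p>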
</sec>
<sec id="j_nejsds54_s_008">
<label>2.4</label>
<title>Interpretation of the Results</title>
<p>One may use entries of <inline-formula id="j_nejsds54_ineq_160"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_161"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{Y}}}$]]></tex-math></alternatives></inline-formula> to interpret the outcome of the AUGUST test based on <italic>S</italic>. With respect to <inline-formula id="j_nejsds54_ineq_162"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula>, the sample <inline-formula id="j_nejsds54_ineq_163"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula> serves as the <italic>reference sample</italic>, meaning that information from <inline-formula id="j_nejsds54_ineq_164"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> allows us to make statements about how points of <inline-formula id="j_nejsds54_ineq_165"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> fall relative to the distribution of <inline-formula id="j_nejsds54_ineq_166"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>. Each entry in the vector <inline-formula id="j_nejsds54_ineq_167"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> describes the non-uniformity of <inline-formula id="j_nejsds54_ineq_168"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{P}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> with respect to a row of <inline-formula id="j_nejsds54_ineq_169"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\mathbf{H}_{{2^{d}}}}$]]></tex-math></alternatives></inline-formula>. In particular, the largest entries of <inline-formula id="j_nejsds54_ineq_170"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> in absolute value tell us the sources of greatest asymmetry in <inline-formula id="j_nejsds54_ineq_171"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{P}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula>.</p>
<p>Before performing the test based on <italic>S</italic>, we must first choose some resolution <italic>d</italic>, which determines the scale on which the test will be sensitive. For convenience, let <inline-formula id="j_nejsds54_ineq_172"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold">H</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\tilde{\mathbf{H}}_{{2^{d}}}}$]]></tex-math></alternatives></inline-formula> denote the Hadamard matrix of size <inline-formula id="j_nejsds54_ineq_173"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${2^{d}}$]]></tex-math></alternatives></inline-formula> according to Sylvester’s construction, without the first row, which is a row of all ones. Now, the depth <inline-formula id="j_nejsds54_ineq_174"><alternatives><mml:math>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$d=1$]]></tex-math></alternatives></inline-formula> is sensitive primarily to left/right imbalance. When <inline-formula id="j_nejsds54_ineq_175"><alternatives><mml:math>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$d=1$]]></tex-math></alternatives></inline-formula>, the <inline-formula id="j_nejsds54_ineq_176"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[${2^{1}}=2$]]></tex-math></alternatives></inline-formula> entries of <inline-formula id="j_nejsds54_ineq_177"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{P}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> roughly correspond to the fraction of the <inline-formula id="j_nejsds54_ineq_178"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> sample falling above or below the median of <inline-formula id="j_nejsds54_ineq_179"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>. In this case, the only symmetry statistic is the product of <inline-formula id="j_nejsds54_ineq_180"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold">H</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mtable columnspacing="10.0pt" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" columnalign="center center">
<mml:mtr>
<mml:mtd class="array">
<mml:mn>1</mml:mn>
</mml:mtd>
<mml:mtd class="array">
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:math><tex-math><![CDATA[${\tilde{\mathbf{H}}_{2}}=\left(\begin{array}{c@{\hskip10.0pt}c}1& -1\end{array}\right)$]]></tex-math></alternatives></inline-formula> with <inline-formula id="j_nejsds54_ineq_181"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{P}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula>, namely the difference between the two components of <inline-formula id="j_nejsds54_ineq_182"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{P}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula>.</p>
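<p>For a concrete illustration with hypothetical values (chosen for exposition, not taken from any data in this paper), suppose <inline-formula><tex-math><![CDATA[$d=1$]]></tex-math></inline-formula> and <inline-formula><tex-math><![CDATA[${\boldsymbol{P}_{\boldsymbol{X}}}={(0.7,0.3)^{T}}$]]></tex-math></inline-formula>, so that roughly 70% of the <inline-formula><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></inline-formula> sample falls below the median of <inline-formula><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></inline-formula>. Assuming the symmetry statistic is the unscaled product described above, we obtain
<disp-formula>
<tex-math><![CDATA[\[ {\tilde{\mathbf{H}}_{2}}{\boldsymbol{P}_{\boldsymbol{X}}}=0.7-0.3=0.4\gt 0,\]]]></tex-math>
</disp-formula>
which suggests that <inline-formula><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></inline-formula> tends to lie to the left of <inline-formula><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></inline-formula>. A value near zero would instead indicate no left/right imbalance at this depth.</p>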
<p>For <inline-formula id="j_nejsds54_ineq_183"><alternatives><mml:math>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[$d=2$]]></tex-math></alternatives></inline-formula>, both <inline-formula id="j_nejsds54_ineq_184"><alternatives><mml:math>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mtable columnspacing="10.0pt 10.0pt 10.0pt" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" columnalign="center center center center">
<mml:mtr>
<mml:mtd class="array">
<mml:mn>1</mml:mn>
</mml:mtd>
<mml:mtd class="array">
<mml:mn>1</mml:mn>
</mml:mtd>
<mml:mtd class="array">
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mtd>
<mml:mtd class="array">
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:math><tex-math><![CDATA[$\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}1& 1& -1& -1\end{array}\right)$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_185"><alternatives><mml:math>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mtable columnspacing="10.0pt 10.0pt 10.0pt" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" columnalign="center center center center">
<mml:mtr>
<mml:mtd class="array">
<mml:mn>1</mml:mn>
</mml:mtd>
<mml:mtd class="array">
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mtd>
<mml:mtd class="array">
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mtd>
<mml:mtd class="array">
<mml:mn>1</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:math><tex-math><![CDATA[$\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}1& -1& -1& 1\end{array}\right)$]]></tex-math></alternatives></inline-formula> are (necessarily orthogonal) rows of <inline-formula id="j_nejsds54_ineq_186"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold">H</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mn>4</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\tilde{\mathbf{H}}_{4}}$]]></tex-math></alternatives></inline-formula>. When multiplied by <inline-formula id="j_nejsds54_ineq_187"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{P}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula>, the first of these rows produces a statistic for left/right imbalance, similar to the <inline-formula id="j_nejsds54_ineq_188"><alternatives><mml:math>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$d=1$]]></tex-math></alternatives></inline-formula> case, while the latter row detects differences in scale. Larger values of <italic>d</italic> detect more granular varieties of imbalance. We use a depth of <inline-formula id="j_nejsds54_ineq_189"><alternatives><mml:math>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[$d=2$]]></tex-math></alternatives></inline-formula> in our real data example, and <inline-formula id="j_nejsds54_ineq_190"><alternatives><mml:math>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>3</mml:mn></mml:math><tex-math><![CDATA[$d=3$]]></tex-math></alternatives></inline-formula> in simulated power comparisons. (See the supplementary materials for an empirical comparison across depths.) In [<xref ref-type="bibr" rid="j_nejsds54_ref_046">46</xref>], it is shown that a depth of <inline-formula id="j_nejsds54_ineq_191"><alternatives><mml:math>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>3</mml:mn></mml:math><tex-math><![CDATA[$d=3$]]></tex-math></alternatives></inline-formula> is sufficient for a symmetry statistic-based test of independence to outperform both distance correlation and the <italic>F</italic>-test, which are known to be optimal, in detecting correlation in bivariate normal distributions.</p>
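<p>As a minimal sketch of how the two depth-2 rows quoted above respond to different kinds of imbalance, the following Python snippet applies them to hypothetical length-4 vectors of cell proportions. The vectors, the function name <monospace>symmetry_scores</monospace>, and the use of plain proportions in place of the paper's exact definition of the cell vector are assumptions made only for illustration.</p>
<preformat><![CDATA[
```python
import numpy as np

# The two depth-2 rows quoted in the text.
row_location = np.array([1, 1, -1, -1])   # left half versus right half
row_scale = np.array([1, -1, -1, 1])      # outer cells versus inner cells

# Hypothetical cell-proportion vectors (illustrative values, not from the paper).
p_balanced = np.array([0.25, 0.25, 0.25, 0.25])
p_shifted = np.array([0.40, 0.30, 0.20, 0.10])   # mass pushed to the left
p_spread = np.array([0.40, 0.10, 0.10, 0.40])    # mass pushed to the tails


def symmetry_scores(p):
    """Return (location score, scale score) for a length-4 proportion vector."""
    p = np.asarray(p, dtype=float)
    return float(row_location @ p), float(row_scale @ p)


for name, p in [("balanced", p_balanced),
                ("shifted", p_shifted),
                ("spread", p_spread)]:
    print(name, symmetry_scores(p))
```
]]></preformat>
<p>Under these assumptions, the balanced vector scores zero on both rows, the shifted vector registers only on the left/right row, and the spread vector registers only on the scale row.</p>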
<p>Higher depths <inline-formula id="j_nejsds54_ineq_192"><alternatives><mml:math>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo mathvariant="normal">&gt;</mml:mo>
<mml:mn>3</mml:mn></mml:math><tex-math><![CDATA[$d\gt 3$]]></tex-math></alternatives></inline-formula> can be useful for alternatives that are extremely close in the Kolmogorov–Smirnov metric but have densities that are bounded apart in the uniform norm. As one example, we may have <inline-formula id="j_nejsds54_ineq_193"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> sampled from Uniform<inline-formula id="j_nejsds54_ineq_194"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(0,1)$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_195"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula> sampled from a high frequency square wave distribution with the same support. In Section <xref rid="j_nejsds54_s_016">6</xref>, we use symmetry statistics in visualizations of NBA shooting data. As an additional example, a step-by-step interpretation on simulated data is provided in the supplementary materials.</p>
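<p>The square wave example can be made concrete with a small numerical sketch. The code below assumes one particular form of the density, which the text does not specify: it equals 1 + <monospace>amp</monospace> on the first half of each period of length 1/<monospace>freq</monospace> and 1 − <monospace>amp</monospace> on the second half. It also uses population cell probabilities on a fixed dyadic grid rather than the data-driven cells of the test itself, so it should be read as an illustration of the resolution effect only.</p>
<preformat><![CDATA[
```python
import numpy as np


def square_wave_cdf(x, freq=8, amp=0.5):
    """CDF on [0, 1] of an assumed square wave density.

    The density is 1 + amp on the first half of each period of length
    1 / freq and 1 - amp on the second half (a hypothetical form).
    """
    x = np.asarray(x, dtype=float)
    period = 1.0 / freq
    k = np.floor(x / period)          # complete periods to the left of x
    r = x - k * period                # position inside the current period
    half = period / 2.0
    # Excess mass over Uniform(0, 1): rises on the first half, falls back after.
    excess = amp * np.where(r <= half, r, period - r)
    return x + excess


def cell_probabilities(cdf, d):
    """Probabilities of the 2**d equal-width cells of [0, 1] under cdf."""
    edges = np.linspace(0.0, 1.0, 2**d + 1)
    return np.diff(cdf(edges))


def ks_distance_to_uniform(cdf, grid_size=1601):
    """Grid approximation of sup |cdf(x) - x| over [0, 1]."""
    grid = np.linspace(0.0, 1.0, grid_size)
    return float(np.max(np.abs(cdf(grid) - grid)))


print("KS distance:", ks_distance_to_uniform(square_wave_cdf))
print("d = 3 cells:", cell_probabilities(square_wave_cdf, 3))
print("d = 4 cells:", cell_probabilities(square_wave_cdf, 4))
```
]]></preformat>
<p>With the default values, each cell at depth 3 spans one full period, so the cell probabilities coincide with those of the uniform distribution, whereas the cells at depth 4 align with the half periods and alternate between high and low mass. The Kolmogorov–Smirnov distance equals <monospace>amp</monospace>/(2 <monospace>freq</monospace>) and shrinks as the frequency grows, while the densities stay <monospace>amp</monospace> apart in the uniform norm.</p>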
</sec>
</sec>
<sec id="j_nejsds54_s_009">
<label>3</label>
<title>Computational Considerations</title>
<sec id="j_nejsds54_s_010">
<label>3.1</label>
<title>Algorithms for the Univariate Statistic</title>
<p>Below, Algorithms <xref rid="j_nejsds54_fig_001">1</xref> and <xref rid="j_nejsds54_fig_002">2</xref> formalize the steps for calculating the AUGUST statistic outlined in earlier sections. In terms of prior notation, Algorithm <xref rid="j_nejsds54_fig_001">1</xref> computes the vector <inline-formula id="j_nejsds54_ineq_196"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\boldsymbol{P}(x)$]]></tex-math></alternatives></inline-formula>, and Algorithm <xref rid="j_nejsds54_fig_002">2</xref> calculates the overall test statistic <inline-formula id="j_nejsds54_ineq_197"><alternatives><mml:math>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo>−</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[$S=-{\boldsymbol{S}_{\boldsymbol{X}}^{T}}{\boldsymbol{S}_{\boldsymbol{Y}}}$]]></tex-math></alternatives></inline-formula>. Recall that we use <inline-formula id="j_nejsds54_ineq_198"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\mathbf{H}_{{2^{d}}}}$]]></tex-math></alternatives></inline-formula> to refer to the Hadamard matrix of size <inline-formula id="j_nejsds54_ineq_199"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${2^{d}}$]]></tex-math></alternatives></inline-formula> according to Sylvester’s construction, and for a matrix <bold>M</bold>, we use <inline-formula id="j_nejsds54_ineq_200"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold">M</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${(\mathbf{M})_{-1}}$]]></tex-math></alternatives></inline-formula> to refer to <bold>M</bold> without its first row.</p>
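<p>For readers who wish to experiment with this notation, Sylvester's construction and the row-dropping operation can be sketched in a few lines of Python. The function names <monospace>sylvester_hadamard</monospace> and <monospace>drop_first_row</monospace> are hypothetical and are not taken from the authors' implementation.</p>
<preformat><![CDATA[
```python
import numpy as np


def sylvester_hadamard(d):
    """Hadamard matrix of size 2**d built by Sylvester's recursion."""
    H = np.array([[1]])
    H2 = np.array([[1, 1], [1, -1]])
    for _ in range(d):
        # Each step maps H to the block matrix [[H, H], [H, -H]].
        H = np.kron(H2, H)
    return H


def drop_first_row(M):
    """Return M without its first row, i.e. (M)_{-1} in the text's notation."""
    return np.asarray(M)[1:, :]


H4 = sylvester_hadamard(2)
print(H4)
print(drop_first_row(H4))
```
]]></preformat>
<p>The first row of a Sylvester Hadamard matrix is constant, so removing it leaves only rows that contrast one group of cells against another.</p>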
<fig id="j_nejsds54_fig_001">
<label>Algorithm 1</label>
<caption>
<p>Augmented CDF<inline-formula id="j_nejsds54_ineq_201"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">V</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(V,d,x)$]]></tex-math></alternatives></inline-formula>.</p>
</caption>
<graphic xlink:href="nejsds54_g001.jpg"/>
</fig>
<fig id="j_nejsds54_fig_002">
<label>Algorithm 2</label>
<caption>
<p>AUGUST<inline-formula id="j_nejsds54_ineq_202"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">Y</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(X,Y,d)$]]></tex-math></alternatives></inline-formula>.</p>
</caption>
<graphic xlink:href="nejsds54_g002.jpg"/>
</fig>
<p>The two samples <inline-formula id="j_nejsds54_ineq_203"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_204"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula> have sizes <italic>m</italic> and <italic>n</italic>, respectively. Treating <italic>d</italic> as a constant, Algorithm <xref rid="j_nejsds54_fig_002">2</xref> requires <inline-formula id="j_nejsds54_ineq_205"><alternatives><mml:math>
<mml:mi mathvariant="italic">O</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$O(mn)$]]></tex-math></alternatives></inline-formula> elementary operations. This is due to the calculation of <inline-formula id="j_nejsds54_ineq_206"><alternatives><mml:math>
<mml:mi mathvariant="italic">K</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="normal">#</mml:mi>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>:</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">V</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$K=\mathrm{\# }\{i:{\boldsymbol{V}_{i}}\le x\}$]]></tex-math></alternatives></inline-formula> in Algorithm <xref rid="j_nejsds54_fig_001">1</xref>, which necessitates iterating over all entries of <inline-formula id="j_nejsds54_ineq_207"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">V</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{V}$]]></tex-math></alternatives></inline-formula> each time that Algorithm <xref rid="j_nejsds54_fig_001">1</xref><inline-formula id="j_nejsds54_ineq_208"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">V</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(\boldsymbol{V},d,x)$]]></tex-math></alternatives></inline-formula> is called.</p>
<p>However, by first sorting the concatenated <inline-formula id="j_nejsds54_ineq_209"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_210"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula> samples, it is possible to reduce the running time to <inline-formula id="j_nejsds54_ineq_211"><alternatives><mml:math>
<mml:mi mathvariant="italic">O</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo movablelimits="false">log</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$O((m+n)\log (m+n))$]]></tex-math></alternatives></inline-formula> operations.</p><statement id="j_nejsds54_stat_003"><label>Theorem 2.</label>
<p><italic>There exists an algorithm for calculating the exact test statistic S that requires</italic> <inline-formula id="j_nejsds54_ineq_212"><alternatives><mml:math>
<mml:mi mathvariant="italic">O</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo movablelimits="false">log</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$O((m+n)\log (m+n))$]]></tex-math></alternatives></inline-formula> <italic>elementary operations.</italic></p></statement>
<p>This improved algorithm, named AUGUST+, is stated explicitly in the supplementary materials. The time complexity is asymptotically equivalent to that of an efficient sorting algorithm applied to the concatenated data. The constant factor in this comparison depends on the resolution <italic>d</italic>, which is assumed constant in Theorem <xref rid="j_nejsds54_stat_003">2</xref>. In terms of storage, the AUGUST+ algorithm defines only one array whose length depends on <italic>m</italic> and <italic>n</italic>. This array has dimension <inline-formula id="j_nejsds54_ineq_213"><alternatives><mml:math>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$2\times (m+n)$]]></tex-math></alternatives></inline-formula>, meaning the space requirement is linear in the combined sample size. In Section <xref rid="j_nejsds54_s_016">6</xref>, we use this algorithm to perform our two-sample test on a large data set containing on the order of <inline-formula id="j_nejsds54_ineq_214"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mn>10</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>6</mml:mn>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${10^{6}}$]]></tex-math></alternatives></inline-formula> observations.</p>
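<p>The effect of sorting on the counting step can be illustrated with a short Python sketch. This is not the AUGUST+ algorithm, which is given in the supplementary materials; it only shows, under our own choice of function names, how a single sort of the pooled values yields every count of the form #{<italic>i</italic> : <italic>V</italic><sub><italic>i</italic></sub> ≤ <italic>x</italic>} without rescanning the sample for each query point.</p>
<preformat><![CDATA[
```python
import numpy as np


def counts_naive(V, xs):
    """K(x) = #{i : V_i <= x} by a full scan of V for every x in xs."""
    return np.array([sum(1 for v in V if v <= x) for x in xs], dtype=int)


def counts_by_sorting(V, xs):
    """The same counts from one stable sort of the pooled values."""
    V = np.asarray(V, dtype=float)
    xs = np.asarray(xs, dtype=float)
    pooled = np.concatenate([V, xs])
    is_V = np.concatenate([np.ones(len(V), dtype=int),
                           np.zeros(len(xs), dtype=int)])
    # A stable sort keeps V entries ahead of equal xs entries, so ties count.
    order = np.argsort(pooled, kind="stable")
    running = np.cumsum(is_V[order])      # V entries seen up to each sorted slot
    from_xs = order >= len(V)             # sorted slots that hold an xs entry
    K = np.empty(len(xs), dtype=int)
    K[order[from_xs] - len(V)] = running[from_xs]
    return K


V = [3.0, 1.0, 2.0, 2.0]
xs = [2.0, 0.0, 5.0, 1.5]
print(counts_naive(V, xs))
print(counts_by_sorting(V, xs))
```
]]></preformat>
<p>The naive version costs a number of comparisons proportional to the product of the two lengths, while the sorted version is dominated by the sort of the pooled array, matching the order of growth stated in Theorem <xref rid="j_nejsds54_stat_003">2</xref>.</p>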
</sec>
<sec id="j_nejsds54_s_011">
<label>3.2</label>
<title>Multivariate Extension</title>
<p>With an appropriate transformation, we can extend the univariate test to the problem of multivariate two-sample testing. For the purposes of this subsection, let <inline-formula id="j_nejsds54_ineq_215"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo>=</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[$\boldsymbol{X}={\{{\boldsymbol{X}_{i}}\}_{i=1}^{m}}$]]></tex-math></alternatives></inline-formula> be an independent sample from a multivariate distribution <italic>G</italic>, and let <inline-formula id="j_nejsds54_ineq_216"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
<mml:mo>=</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[$\boldsymbol{Y}={\{{\boldsymbol{Y}_{j}}\}_{j=1}^{n}}$]]></tex-math></alternatives></inline-formula> be an independent sample from a multivariate distribution <italic>F</italic>, with <italic>F</italic> and <italic>G</italic> defined on <inline-formula id="j_nejsds54_ineq_217"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\mathbb{R}^{k}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_218"><alternatives><mml:math>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo stretchy="false">≥</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[$k\ge 2$]]></tex-math></alternatives></inline-formula>. We adapt an approach that could be appropriately named <italic>mutual Mahalanobis distance</italic>.</p>
<p>Given a mean <inline-formula id="j_nejsds54_ineq_219"><alternatives><mml:math>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$\mu \in {\mathbb{R}^{k}}$]]></tex-math></alternatives></inline-formula> and invertible <inline-formula id="j_nejsds54_ineq_220"><alternatives><mml:math>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>×</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi></mml:math><tex-math><![CDATA[$k\times k$]]></tex-math></alternatives></inline-formula> covariance matrix Σ, recall that the Mahalanobis distance of <inline-formula id="j_nejsds54_ineq_221"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$\boldsymbol{x}\in {\mathbb{R}^{k}}$]]></tex-math></alternatives></inline-formula> from <italic>μ</italic> with respect to Σ is 
<disp-formula id="j_nejsds54_eq_009">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:mi mathvariant="italic">M</mml:mi>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="normal">Σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mfenced separators="" open="[" close="]">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="normal">Σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ MD(x;\mu ,\Sigma )={\left[{(x-\mu )^{T}}{\Sigma ^{-1}}(x-\mu )\right]^{1/2}}.\]]]></tex-math></alternatives>
</disp-formula> 
Let <inline-formula id="j_nejsds54_ineq_222"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">μ</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\hat{\mu }_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_223"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="normal">Σ</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\hat{\Sigma }_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> be the sample mean and sample covariance matrix of <inline-formula id="j_nejsds54_ineq_224"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula>, where we assume <inline-formula id="j_nejsds54_ineq_225"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="normal">Σ</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\hat{\Sigma }_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> is nonsingular. Consider the transformed collections 
<disp-formula id="j_nejsds54_eq_010">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mtd>
<mml:mtd class="align-even">
<mml:mo>=</mml:mo>
<mml:mfenced separators="" open="{" close="}">
<mml:mrow>
<mml:mi mathvariant="italic">M</mml:mi>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">μ</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="normal">Σ</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>:</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mtd>
<mml:mtd class="align-even">
<mml:mo>=</mml:mo>
<mml:mfenced separators="" open="{" close="}">
<mml:mrow>
<mml:mi mathvariant="italic">M</mml:mi>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">μ</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="normal">Σ</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>:</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[\begin{aligned}{}{\tilde{\boldsymbol{X}}^{(\boldsymbol{X})}}& =\left\{MD({\boldsymbol{X}_{i}};{\hat{\mu }_{\boldsymbol{X}}},{\hat{\Sigma }_{\boldsymbol{X}}}):1\le i\le m\right\}\\ {} {\tilde{\boldsymbol{Y}}^{(\boldsymbol{X})}}& =\left\{MD({\boldsymbol{Y}_{j}};{\hat{\mu }_{\boldsymbol{X}}},{\hat{\Sigma }_{\boldsymbol{X}}}):1\le j\le n\right\},\end{aligned}\]]]></tex-math></alternatives>
</disp-formula> 
where the superscript <inline-formula id="j_nejsds54_ineq_226"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(\boldsymbol{X})$]]></tex-math></alternatives></inline-formula> indicates that means and covariances are estimated using the <inline-formula id="j_nejsds54_ineq_227"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> sample. If <inline-formula id="j_nejsds54_ineq_228"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_229"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula> come from the same multivariate distribution, then the collections <inline-formula id="j_nejsds54_ineq_230"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\tilde{\boldsymbol{X}}^{(\boldsymbol{X})}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_231"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\tilde{\boldsymbol{Y}}^{(\boldsymbol{X})}}$]]></tex-math></alternatives></inline-formula> should have similar univariate distributions. As a result, at a given depth <italic>d</italic>, it is reasonable to test the univariate samples <inline-formula id="j_nejsds54_ineq_232"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\tilde{\boldsymbol{X}}^{(\boldsymbol{X})}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_233"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\tilde{\boldsymbol{Y}}^{(\boldsymbol{X})}}$]]></tex-math></alternatives></inline-formula> as an assessment of the distributional equality of the multivariate samples <inline-formula id="j_nejsds54_ineq_234"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_235"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>. Our choice of Mahalanobis distance is also motivated by the wide class of nonparametric tests based on data depth, particularly Mahalanobis depth [<xref ref-type="bibr" rid="j_nejsds54_ref_005">5</xref>, <xref ref-type="bibr" rid="j_nejsds54_ref_030">30</xref>].</p>
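<p>The transformation above can be sketched in Python as follows. The function name <monospace>mahalanobis_to_sample</monospace> is hypothetical, and the sketch assumes the usual sample covariance with divisor <italic>m</italic> − 1, a detail the text does not specify.</p>
<preformat><![CDATA[
```python
import numpy as np


def mahalanobis_to_sample(points, reference):
    """Mahalanobis distance of each row of points from the reference sample.

    The mean and covariance are estimated from reference, which is assumed
    to have a nonsingular sample covariance matrix (divisor m - 1).
    """
    points = np.asarray(points, dtype=float)
    reference = np.asarray(reference, dtype=float)
    mu = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)
    diff = points - mu
    # Solve cov @ z = diff^T instead of forming the inverse explicitly.
    solved = np.linalg.solve(cov, diff.T).T
    return np.sqrt(np.einsum("ij,ij->i", diff, solved))


rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))            # hypothetical sample from G
Y = 1.5 * rng.normal(size=(150, 3))      # hypothetical sample from F

X_tilde = mahalanobis_to_sample(X, X)    # univariate collection from X
Y_tilde = mahalanobis_to_sample(Y, X)    # univariate collection from Y
print(X_tilde.mean(), Y_tilde.mean())
```
]]></preformat>
<p>The two resulting arrays are univariate and can be passed to any univariate two-sample procedure. Swapping the roles of the samples, so that the mean and covariance are estimated from the second sample, gives the complementary pair of collections.</p>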
<p>From the univariate method, recall that vectors of symmetry statistics quantify regions of imbalance between the univariate samples, as imbalances in distribution appear as non-uniformity in the vector of cell probabilities. Under a Mahalanobis distance transformation, cells in the domain of <inline-formula id="j_nejsds54_ineq_236"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\tilde{\boldsymbol{X}}^{(\boldsymbol{X})}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_237"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\tilde{\boldsymbol{Y}}^{(\boldsymbol{X})}}$]]></tex-math></alternatives></inline-formula> correspond to nested elliptical rings centered on <inline-formula id="j_nejsds54_ineq_238"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">μ</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\hat{\mu }_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula>. This principle extends the interpretability of symmetry statistics to the multivariate case.</p>
<p>As in the univariate case, it is desirable for the test statistic to be invariant to the transposition of <inline-formula id="j_nejsds54_ineq_239"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_240"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>. To achieve this, we can use the statistic 
<disp-formula id="j_nejsds54_eq_011">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mi mathvariant="italic">u</mml:mi>
<mml:mi mathvariant="italic">l</mml:mi>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mo movablelimits="false">max</mml:mo>
<mml:mo maxsize="1.61em" minsize="1.61em" fence="true" mathvariant="normal">(</mml:mo>
</mml:mtd>
<mml:mtd class="align-even">
<mml:mtext>AUGUST</mml:mtext>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="align-odd"/>
<mml:mtd class="align-even">
<mml:mtext>AUGUST</mml:mtext>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo maxsize="1.61em" minsize="1.61em" fence="true" mathvariant="normal">)</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[\begin{aligned}{}{S_{multi}}=\max \Big(& \text{AUGUST}\left({\tilde{\boldsymbol{X}}^{(\boldsymbol{X})}},{\tilde{\boldsymbol{Y}}^{(\boldsymbol{X})}},d\right),\\ {} & \text{AUGUST}\left({\tilde{\boldsymbol{X}}^{(\boldsymbol{Y})}},{\tilde{\boldsymbol{Y}}^{(\boldsymbol{Y})}},d\right)\Big)\end{aligned}\]]]></tex-math></alternatives>
</disp-formula> 
where we apply both possible Mahalanobis distance transformations to <inline-formula id="j_nejsds54_ineq_241"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_242"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>, compute two test statistics, and take the maximum. Aside from ensuring transposition invariance, the simultaneous use of the two test statistics is important for detecting some asymmetric alternatives. As a contrived example, take <inline-formula id="j_nejsds54_ineq_243"><alternatives><mml:math>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[$k=2$]]></tex-math></alternatives></inline-formula>; suppose <inline-formula id="j_nejsds54_ineq_244"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> comprises <italic>m</italic> independent samples of the bivariate standard normal <inline-formula id="j_nejsds54_ineq_245"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">∼</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">I</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${({Z_{1}},{Z_{2}})^{T}}\sim {N_{2}}(0,{I_{2}})$]]></tex-math></alternatives></inline-formula>, while <inline-formula id="j_nejsds54_ineq_246"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula> comprises <italic>n</italic> independent samples of <inline-formula id="j_nejsds54_ineq_247"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">χ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">χ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${2^{-1/2}}{({\chi _{2}},{\chi _{2}})^{T}}$]]></tex-math></alternatives></inline-formula>, where <inline-formula id="j_nejsds54_ineq_248"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">χ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\chi _{2}}$]]></tex-math></alternatives></inline-formula> follows a chi distribution with two degrees of freedom. In this case, both <inline-formula id="j_nejsds54_ineq_249"><alternatives><mml:math>
<mml:mi mathvariant="italic">M</mml:mi>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">I</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$MD({({Z_{1}},{Z_{2}})^{T}};0,{I_{2}})$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_250"><alternatives><mml:math>
<mml:mi mathvariant="italic">M</mml:mi>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">χ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">χ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">I</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$MD({2^{-1/2}}{({\chi _{2}},{\chi _{2}})^{T}};0,{I_{2}})$]]></tex-math></alternatives></inline-formula> follow the same chi distribution with two degrees of freedom, so the two transformed samples are identically distributed and the test based solely on <inline-formula id="j_nejsds54_ineq_251"><alternatives><mml:math>
<mml:mtext>AUGUST</mml:mtext>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:math><tex-math><![CDATA[$\text{AUGUST}\left({\tilde{\boldsymbol{X}}^{(\boldsymbol{X})}},{\tilde{\boldsymbol{Y}}^{(\boldsymbol{X})}},d\right)$]]></tex-math></alternatives></inline-formula> is powerless. (In this highly degenerate situation, the transposed statistic <inline-formula id="j_nejsds54_ineq_252"><alternatives><mml:math>
<mml:mtext>AUGUST</mml:mtext>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:math><tex-math><![CDATA[$\text{AUGUST}\left({\tilde{\boldsymbol{X}}^{(\boldsymbol{Y})}},{\tilde{\boldsymbol{Y}}^{(\boldsymbol{Y})}},d\right)$]]></tex-math></alternatives></inline-formula> is technically undefined, because <inline-formula id="j_nejsds54_ineq_253"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula> has no variance in the <inline-formula id="j_nejsds54_ineq_254"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${(1,-1)^{T}}$]]></tex-math></alternatives></inline-formula> direction.)</p>
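<p>The distributional claim in this example can be checked directly. As a sketch, assume that the center and scatter used in the transformation are the population values <inline-formula><tex-math><![CDATA[$0$]]></tex-math></inline-formula> and <inline-formula><tex-math><![CDATA[${I_{2}}$]]></tex-math></inline-formula> of <inline-formula><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></inline-formula> rather than their sample estimates, and that both coordinates of each observation in <inline-formula><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></inline-formula> share the same chi variable. Under these assumptions the Mahalanobis distance reduces to the Euclidean norm, and 
<disp-formula>
<tex-math><![CDATA[\[\begin{aligned}{}MD({({Z_{1}},{Z_{2}})^{T}};0,{I_{2}})& =\sqrt{{Z_{1}^{2}}+{Z_{2}^{2}}}\sim {\chi _{2}},\\ {} MD({2^{-1/2}}{({\chi _{2}},{\chi _{2}})^{T}};0,{I_{2}})& =\sqrt{\tfrac{1}{2}{\chi _{2}^{2}}+\tfrac{1}{2}{\chi _{2}^{2}}}={\chi _{2}}.\end{aligned}\]]]></tex-math>
</disp-formula> 
The first line uses the fact that the sum of squares of two independent standard normal variables is chi-squared with two degrees of freedom; the second uses the nonnegativity of <inline-formula><tex-math><![CDATA[${\chi _{2}}$]]></tex-math></inline-formula>.</p>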
<p>In practice, we calculate <italic>p</italic>-values for the multivariate statistic using permutation. While relying on permutation currently forgoes the computational advantage of the asymptotic result in Theorem <xref rid="j_nejsds54_stat_004">3</xref>, the multivariate method can still take advantage of Theorem <xref rid="j_nejsds54_stat_003">2</xref>, because it is built upon the univariate procedure. As we show in Section <xref rid="j_nejsds54_s_013">5</xref>, a depth of <inline-formula id="j_nejsds54_ineq_255"><alternatives><mml:math>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[$d=2$]]></tex-math></alternatives></inline-formula> is sufficiently large to detect common multivariate alternatives with empirical power comparable to that of existing tests.</p>
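<p>The permutation procedure for the multivariate statistic can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the Mahalanobis transformations are assumed to use the sample mean and sample covariance of the reference sample, and <monospace>placeholder_stat</monospace> is a simple stand-in for the univariate AUGUST statistic (it ignores the depth <italic>d</italic> and is not the AUGUST statistic itself).</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def mahalanobis_distances(points, reference):
    """Mahalanobis distance of each row of `points` from the sample mean
    of `reference`, scaled by the sample covariance of `reference`."""
    mu = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)
    diff = points - mu
    # Solve cov @ z = diff^T instead of forming the inverse explicitly.
    solved = np.linalg.solve(cov, diff.T).T
    return np.sqrt(np.einsum("ij,ij->i", diff, solved))


def s_multi(x, y, univariate_stat):
    """Maximum of the univariate statistic over both transformations."""
    # Both samples measured relative to the mean and covariance of x.
    stat_x = univariate_stat(mahalanobis_distances(x, x),
                             mahalanobis_distances(y, x))
    # Both samples measured relative to the mean and covariance of y.
    stat_y = univariate_stat(mahalanobis_distances(x, y),
                             mahalanobis_distances(y, y))
    return max(stat_x, stat_y)


def permutation_p_value(x, y, univariate_stat, n_perm=200, seed=0):
    """Permutation p-value for s_multi, pooling and relabeling the rows."""
    rng = np.random.default_rng(seed)
    observed = s_multi(x, y, univariate_stat)
    pooled = np.vstack([x, y])
    m = x.shape[0]
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        stat = s_multi(pooled[idx[:m]], pooled[idx[m:]], univariate_stat)
        exceed += int(stat >= observed)
    # Count the observed labeling as one permutation so that p > 0.
    return (1 + exceed) / (1 + n_perm)


def placeholder_stat(a, b):
    """Hypothetical stand-in for the univariate AUGUST statistic:
    absolute difference of sample medians (NOT the AUGUST statistic)."""
    return abs(np.median(a) - np.median(b))
```
]]></preformat>
<p>Because the placeholder is symmetric in its two arguments, <monospace>s_multi(x, y, placeholder_stat)</monospace> and <monospace>s_multi(y, x, placeholder_stat)</monospace> agree, mirroring the transposition invariance discussed above. Substituting an implementation of the univariate AUGUST statistic at a fixed depth for the placeholder would give the permutation test described in the text.</p>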
</sec>
</sec>
<sec id="j_nejsds54_s_012">
<label>4</label>
<title>Distributional Insights</title>
<p>To simplify <italic>p</italic>-value calculation and simulation analysis, we provide theoretical results regarding the univariate procedure outlined in Section <xref rid="j_nejsds54_s_010">3.1</xref>. These asymptotic results follow from an adaptation of classical <italic>U</italic>-statistic theory. For each <inline-formula id="j_nejsds54_ineq_256"><alternatives><mml:math>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$k\in \{1,\dots ,{2^{d}}\}$]]></tex-math></alternatives></inline-formula>, define the function <inline-formula id="j_nejsds54_ineq_257"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="double-struck">R</mml:mi>
<mml:mo stretchy="false">→</mml:mo>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[${p_{k}^{F}}:\mathbb{R}\to [0,1]$]]></tex-math></alternatives></inline-formula> by 
<disp-formula id="j_nejsds54_eq_012">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mtd>
<mml:mtd class="align-even">
<mml:mo>=</mml:mo>
<mml:mfenced separators="" open="(" close=")">
<mml:mfrac linethickness="0.0pt">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="align-odd"/>
<mml:mtd class="align-even">
<mml:mspace width="1em"/>
<mml:mo>+</mml:mo>
<mml:mfenced separators="" open="(" close=")">
<mml:mfrac linethickness="0.0pt">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[\begin{aligned}{}{p_{k}^{F}}(x)& =\left(\genfrac{}{}{0.0pt}{}{{2^{d+1}}-1}{2k-2}\right)F{(x)^{2k-2}}{(1-F(x))^{{2^{d+1}}-1-(2k-2)}}\\ {} & \hspace{1em}+\left(\genfrac{}{}{0.0pt}{}{{2^{d+1}}-1}{2k-1}\right)F{(x)^{2k-1}}{(1-F(x))^{{2^{d+1}}-1-(2k-1)}},\end{aligned}\]]]></tex-math></alternatives>
</disp-formula> 
with <inline-formula id="j_nejsds54_ineq_258"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">G</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="double-struck">R</mml:mi>
<mml:mo stretchy="false">→</mml:mo>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[${p_{k}^{G}}:\mathbb{R}\to [0,1]$]]></tex-math></alternatives></inline-formula> defined analogously. These functions can be thought of as theoretical analogs of the data-dependent probabilities <inline-formula id="j_nejsds54_ineq_259"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\boldsymbol{p}_{k}}(x)$]]></tex-math></alternatives></inline-formula> from Section <xref rid="j_nejsds54_s_006">2.2</xref>. Further, define the integrated quantities 
<disp-formula id="j_nejsds54_eq_013">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∫</mml:mo></mml:mstyle>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mspace width="2.5pt"/>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∫</mml:mo></mml:mstyle>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">G</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ {p_{k}^{F:G}}=\int {p_{k}^{F}}(x)dG(x),\hspace{2.5pt}{p_{k}^{G:F}}=\int {p_{k}^{G}}(x)dF(x).\]]]></tex-math></alternatives>
</disp-formula> 
For reasons that will soon be apparent, we refer to <inline-formula id="j_nejsds54_ineq_260"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${p_{k}^{F:G}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_261"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${p_{k}^{G:F}}$]]></tex-math></alternatives></inline-formula> as the <italic>limiting cell probabilities</italic> of AUGUST.</p><statement id="j_nejsds54_stat_004"><label>Theorem 3.</label>
<p><italic>Suppose that</italic> <inline-formula id="j_nejsds54_ineq_262"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\{{\boldsymbol{X}_{i}}\}_{i=1}^{m}}$]]></tex-math></alternatives></inline-formula> <italic>and</italic> <inline-formula id="j_nejsds54_ineq_263"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\{{\boldsymbol{Y}_{j}}\}_{j=1}^{n}}$]]></tex-math></alternatives></inline-formula> <italic>are independent univariate observations, where</italic> <inline-formula id="j_nejsds54_ineq_264"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∼</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi></mml:math><tex-math><![CDATA[${\boldsymbol{X}_{i}}\sim G$]]></tex-math></alternatives></inline-formula> <italic>and</italic> <inline-formula id="j_nejsds54_ineq_265"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∼</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi></mml:math><tex-math><![CDATA[${\boldsymbol{Y}_{j}}\sim F$]]></tex-math></alternatives></inline-formula> <italic>for continuous distributions G, F. Let</italic> <inline-formula id="j_nejsds54_ineq_266"><alternatives><mml:math>
<mml:mi mathvariant="italic">N</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi></mml:math><tex-math><![CDATA[$N=n+m$]]></tex-math></alternatives></inline-formula><italic>, and assume that</italic> <inline-formula id="j_nejsds54_ineq_267"><alternatives><mml:math>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo stretchy="false">→</mml:mo>
<mml:mi>∞</mml:mi></mml:math><tex-math><![CDATA[$n,m\to \infty $]]></tex-math></alternatives></inline-formula> <italic>in such a way that</italic> <inline-formula id="j_nejsds54_ineq_268"><alternatives><mml:math>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:mi mathvariant="italic">N</mml:mi>
<mml:mo stretchy="false">→</mml:mo>
<mml:mi mathvariant="italic">λ</mml:mi></mml:math><tex-math><![CDATA[$m/N\to \lambda $]]></tex-math></alternatives></inline-formula> <italic>for some</italic> <inline-formula id="j_nejsds54_ineq_269"><alternatives><mml:math>
<mml:mi mathvariant="italic">λ</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\lambda \in (0,1)$]]></tex-math></alternatives></inline-formula><italic>. Then</italic> 
<disp-formula id="j_nejsds54_eq_014">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mtable equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" columnalign="center">
<mml:mtr>
<mml:mtd class="array">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="array">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo stretchy="false">→</mml:mo>
<mml:mi mathvariant="italic">N</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="normal">Σ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ {N^{1/2}}\left(\left(\begin{array}{c}{\boldsymbol{S}_{\boldsymbol{X}}}\\ {} {\boldsymbol{S}_{\boldsymbol{Y}}}\end{array}\right)-\mu \right)\to N({0_{{2^{d+1}}-2}},\Sigma )\]]]></tex-math></alternatives>
</disp-formula> 
<italic>in distribution, where</italic> 
<disp-formula id="j_nejsds54_eq_015">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mtable columnspacing="10.0pt" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" columnalign="center center">
<mml:mtr>
<mml:mtd class="array">
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold">H</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
</mml:mtd>
<mml:mtd class="array">
<mml:msub>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>×</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="array">
<mml:msub>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>×</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
</mml:mtd>
<mml:mtd class="array">
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold">H</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ \mu =\left(\begin{array}{c@{\hskip10.0pt}c}{\tilde{\mathbf{H}}_{{2^{d}}}}& {0_{({2^{d}}-1)\times {2^{d}}}}\\ {} {0_{({2^{d}}-1)\times {2^{d}}}}& {\tilde{\mathbf{H}}_{{2^{d}}}}\end{array}\right){p^{F,G}}\]]]></tex-math></alternatives>
</disp-formula> 
<italic>for</italic> <inline-formula id="j_nejsds54_ineq_270"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${p^{F,G}}={\left({p_{1}^{F:G}},\dots ,{p_{{2^{d}}}^{F:G}},{p_{1}^{G:F}},\dots ,{p_{{2^{d}}}^{G:F}}\right)^{T}}$]]></tex-math></alternatives></inline-formula><italic>, and</italic> Σ <italic>is a matrix depending on λ, d, F, and G.</italic></p></statement>
<p>Because the exact form of Σ is notation-intensive, we state it in the supplementary materials. In light of the above result, given distributions <italic>F</italic>, <italic>G</italic>, it is possible to compute the limit in probability of the symmetry statistics <inline-formula id="j_nejsds54_ineq_271"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\left({\boldsymbol{S}_{\boldsymbol{X}}},{\boldsymbol{S}_{\boldsymbol{Y}}}\right)^{T}}$]]></tex-math></alternatives></inline-formula>. This limit <italic>μ</italic> encodes asymmetry at the population level, analogous to how <inline-formula id="j_nejsds54_ineq_272"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">T</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\left({\boldsymbol{S}_{\boldsymbol{X}}},{\boldsymbol{S}_{\boldsymbol{Y}}}\right)^{T}}$]]></tex-math></alternatives></inline-formula> encodes asymmetry between the finite samples <inline-formula id="j_nejsds54_ineq_273"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_274"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>. Moreover, using this theorem, one can efficiently simulate the test statistic <italic>S</italic> under the alternative, yielding a benchmark against a predetermined <inline-formula id="j_nejsds54_ineq_275"><alternatives><mml:math>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo stretchy="false">≠</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi></mml:math><tex-math><![CDATA[$F\ne G$]]></tex-math></alternatives></inline-formula> in large samples. In applications that require an <italic>a priori</italic> power analysis, this approach could simplify the process of determining the sample size necessary for detecting a given effect.</p>
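<p>As an illustrative sketch of this computation (not code from the AUGUST software), the limiting cell probabilities can be approximated by numerical integration. The Python snippet below assumes the binomial form of <italic>p</italic><sub><italic>k</italic></sub><sup><italic>F</italic></sup>(<italic>x</italic>) described in this section, namely the probability that 2<italic>k</italic>−2 or 2<italic>k</italic>−1 of <italic>r</italic> = 2<sup><italic>d</italic>+1</sup>−1 independent draws from <italic>F</italic> are at most <italic>x</italic>. The unit-variance normal distributions, the midpoint rule, and all function names are assumptions made only for this example.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def normal_cdf(x, mean=0.0):
    """CDF of N(mean, 1)."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0)))


def normal_pdf(x, mean=0.0):
    """Density of N(mean, 1)."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def cell_prob(k, d, u):
    """Assumed form of p_k: P(Bin(r, u) in {2k-2, 2k-1}) with r = 2^(d+1) - 1."""
    r = 2 ** (d + 1) - 1
    total = 0.0
    for j in (2 * k - 2, 2 * k - 1):
        total += math.comb(r, j) * u ** j * (1.0 - u) ** (r - j)
    return total


def limiting_cell_probs(d, sub_mean, eval_mean, steps=8000):
    """Midpoint-rule approximation of the 2^d limiting cell probabilities.

    The subsample is drawn from N(sub_mean, 1) and the evaluation point
    from N(eval_mean, 1). With F = N(0, 1) and G = N(delta, 1), calling
    (d, 0, delta) gives p^{F:G} and (d, delta, 0) gives p^{G:F}.
    """
    lo = min(sub_mean, eval_mean) - 12.0
    hi = max(sub_mean, eval_mean) + 12.0
    h = (hi - lo) / steps
    probs = [0.0] * (2 ** d)
    for i in range(steps):
        x = lo + (i + 0.5) * h
        u = normal_cdf(x, sub_mean)
        w = normal_pdf(x, eval_mean) * h
        for k in range(1, 2 ** d + 1):
            probs[k - 1] += cell_prob(k, d, u) * w
    return probs


if __name__ == "__main__":
    print(limiting_cell_probs(2, 0.0, 0.0))  # F = G
    print(limiting_cell_probs(2, 0.0, 1.0))  # p^{F:G} for a unit location shift
```
]]></preformat>
<p>Under <italic>F</italic> = <italic>G</italic>, every entry equals 2<sup>−<italic>d</italic></sup>, so deviations from this constant vector indicate population-level asymmetry. Stacking the two output vectors and applying the block matrix in the theorem would then give <italic>μ</italic>.</p>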
<p>Building on these ideas, the limit <italic>μ</italic> suggests a simple framework for understanding how symmetry statistics encode distributional differences. This framework helps explain AUGUST’s power against alternatives at each depth <italic>d</italic>.</p>
<p>For convenience with inverse functions, we assume the CDFs <italic>F</italic> and <italic>G</italic> are differentiable and strictly increasing on <inline-formula id="j_nejsds54_ineq_276"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">R</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{R}$]]></tex-math></alternatives></inline-formula>, though similar reasoning applies under weaker conditions. Let <italic>E</italic> denote the CDF of the uniform distribution on <inline-formula id="j_nejsds54_ineq_277"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$[0,1]$]]></tex-math></alternatives></inline-formula>. By a change of variables, 
<disp-formula id="j_nejsds54_eq_016">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∫</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi>∞</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>∞</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mi mathvariant="italic">g</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>=</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∫</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">E</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">u</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">g</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">u</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">f</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">u</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mi mathvariant="italic">u</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ {p_{k}^{F:G}}={\int _{-\infty }^{\infty }}{p_{k}^{F}}(x)g(x)dx={\int _{0}^{1}}{p_{k}^{E}}(u)\frac{g({F^{-1}}(u))}{f({F^{-1}}(u))}du,\]]]></tex-math></alternatives>
</disp-formula> 
wherein all information about <italic>F</italic> and <italic>G</italic> is contained in the likelihood ratio on the right. Moreover, for any other differentiable and strictly increasing CDF <italic>H</italic>, we have 
<disp-formula id="j_nejsds54_eq_017">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">g</mml:mi>
<mml:mo>∘</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">f</mml:mi>
<mml:mo>∘</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo>∘</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo>∘</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>∘</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo>∘</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ \frac{g\circ {F^{-1}}}{f\circ {F^{-1}}}={(G\circ {F^{-1}})^{\prime }}={\left((G\circ {H^{-1}})\circ {(F\circ {H^{-1}})^{-1}}\right)^{\prime }},\]]]></tex-math></alternatives>
</disp-formula> 
which shows that the limit <italic>μ</italic> is invariant to such transformations of <italic>F</italic> and <italic>G</italic>. That is, <inline-formula id="j_nejsds54_ineq_278"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(F,G)$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_279"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo>∘</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo>∘</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(F\circ {H^{-1}},G\circ {H^{-1}})$]]></tex-math></alternatives></inline-formula> are in an equivalence class of distribution pairs whose symmetry statistics have the same limit. Defining <inline-formula id="j_nejsds54_ineq_280"><alternatives><mml:math>
<mml:mi mathvariant="italic">Q</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo>∘</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$Q=G\circ {F^{-1}}$]]></tex-math></alternatives></inline-formula>, we arrive at 
<disp-formula id="j_nejsds54_eq_018">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∫</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">E</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">u</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mi mathvariant="italic">q</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">u</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mi mathvariant="italic">u</mml:mi>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ {p_{k}^{F:G}}={\int _{0}^{1}}{p_{k}^{E}}(u)q(u)du.\]]]></tex-math></alternatives>
</disp-formula> 
Asymmetry for the equivalence class of <inline-formula id="j_nejsds54_ineq_281"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(F,G)$]]></tex-math></alternatives></inline-formula> is characterized by the deviation of <italic>Q</italic> from the uniform distribution <italic>E</italic>. Similar transformation invariance properties (perhaps phrased differently) are common among distribution-free tests; for AUGUST, we can be quite specific about the implications for <italic>Q</italic>. From the beginning of this section, recall that the population-level probability <inline-formula id="j_nejsds54_ineq_282"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${p_{k}^{F}}(x)$]]></tex-math></alternatives></inline-formula> is the sum of two binomial-like terms, and compare with the following result.</p><statement id="j_nejsds54_stat_005"><label>Theorem 4.</label>
<p><italic>Let</italic> <inline-formula id="j_nejsds54_ineq_283"><alternatives><mml:math>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo stretchy="false">≥</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$t\ge 1$]]></tex-math></alternatives></inline-formula> <italic>be an integer, and assume the CDFs F and G are differentiable and strictly increasing on</italic> <inline-formula id="j_nejsds54_ineq_284"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">R</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{R}$]]></tex-math></alternatives></inline-formula><italic>, with</italic> <inline-formula id="j_nejsds54_ineq_285"><alternatives><mml:math>
<mml:mi mathvariant="italic">Q</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo>∘</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$Q=G\circ {F^{-1}}$]]></tex-math></alternatives></inline-formula><italic>. The following are equivalent:</italic> 
<list>
<list-item id="j_nejsds54_li_004">
<label>1.</label>
<p><inline-formula id="j_nejsds54_ineq_286"><alternatives><mml:math>
<mml:mo largeop="false" movablelimits="false">∫</mml:mo>
<mml:mfenced separators="" open="(" close=")">
<mml:mfrac linethickness="0.0pt">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">t</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">u</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">u</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">t</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mi mathvariant="italic">Q</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">u</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$\textstyle\int \left(\genfrac{}{}{0.0pt}{}{{2^{t}}-1}{k}\right){u^{k}}{(1-u)^{{2^{t}}-1-k}}dQ(u)={2^{-t}}$]]></tex-math></alternatives></inline-formula> <italic>for all integers</italic> <inline-formula id="j_nejsds54_ineq_287"><alternatives><mml:math>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">t</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$k=0,\dots ,{2^{t}}-1$]]></tex-math></alternatives></inline-formula><italic>.</italic></p>
</list-item>
<list-item id="j_nejsds54_li_005">
<label>2.</label>
<p><inline-formula id="j_nejsds54_ineq_288"><alternatives><mml:math>
<mml:mo largeop="false" movablelimits="false">∫</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">u</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mi mathvariant="italic">Q</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">u</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="bold">E</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">U</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$\textstyle\int {u^{k}}dQ(u)=\mathbf{E}[{U^{k}}]$]]></tex-math></alternatives></inline-formula> <italic>for all integers</italic> <inline-formula id="j_nejsds54_ineq_289"><alternatives><mml:math>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">t</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$k=0,\dots ,{2^{t}}-1$]]></tex-math></alternatives></inline-formula><italic>, where</italic> <inline-formula id="j_nejsds54_ineq_290"><alternatives><mml:math>
<mml:mi mathvariant="italic">U</mml:mi>
<mml:mo stretchy="false">∼</mml:mo>
<mml:mtext mathvariant="italic">Unif</mml:mtext>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$U\sim \textit{Unif}[0,1]$]]></tex-math></alternatives></inline-formula><italic>.</italic></p>
</list-item>
</list>
</p></statement>
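<p>To make the equivalence in Theorem 4 concrete, the following Python sketch evaluates the integrals in condition 1 exactly from the moments in condition 2, by expanding (1−<italic>u</italic>)<sup><italic>n</italic>−<italic>k</italic></sup> with the binomial theorem for <italic>n</italic> = 2<sup><italic>t</italic></sup>−1. The function names and the example choice <italic>Q</italic>(<italic>u</italic>) = <italic>u</italic><sup><italic>a</italic></sup> are hypothetical and serve only as an illustration; this is not part of the proof.</p>
<preformat preformat-type="code"><![CDATA[
```python
from fractions import Fraction
from math import comb


def bernstein_integrals(moments, t):
    """Integrals of C(n,k) u^k (1-u)^(n-k) dQ(u), k = 0..n, with n = 2^t - 1.

    `moments[j]` must hold the j-th moment of Q for j = 0..n. Each integral
    is a fixed linear combination of these moments.
    """
    n = 2 ** t - 1
    out = []
    for k in range(n + 1):
        total = Fraction(0)
        # (1 - u)^(n - k) = sum_j C(n - k, j) (-u)^j
        for j in range(n - k + 1):
            total += (-1) ** j * comb(n - k, j) * moments[k + j]
        out.append(comb(n, k) * total)
    return out


def uniform_moments(t):
    """Moments E[U^j] = 1 / (j + 1) of Unif[0, 1], for j = 0..2^t - 1."""
    return [Fraction(1, j + 1) for j in range(2 ** t)]


def power_cdf_moments(t, a):
    """Moments a / (a + j) of the CDF Q(u) = u^a on [0, 1], integer a >= 1."""
    return [Fraction(a, a + j) for j in range(2 ** t)]


if __name__ == "__main__":
    t = 2
    print(bernstein_integrals(uniform_moments(t), t))       # uniform Q
    print(bernstein_integrals(power_cdf_moments(t, 2), t))  # Q(u) = u^2
```
]]></preformat>
<p>With uniform moments, every integral equals 2<sup>−<italic>t</italic></sup>. Because the map from the first 2<sup><italic>t</italic></sup> moments to these integrals is linear and invertible, any departure in one set of quantities forces a departure in the other.</p>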
<p>The importance of this theorem is as follows. In Section <xref rid="j_nejsds54_s_006">2.2</xref>, we define the data-dependent function <inline-formula id="j_nejsds54_ineq_291"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\boldsymbol{p}_{k}}(x)$]]></tex-math></alternatives></inline-formula> to be the probability that <inline-formula id="j_nejsds54_ineq_292"><alternatives><mml:math>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[$2k-2$]]></tex-math></alternatives></inline-formula> or <inline-formula id="j_nejsds54_ineq_293"><alternatives><mml:math>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$2k-1$]]></tex-math></alternatives></inline-formula> elements of the subsampled <inline-formula id="j_nejsds54_ineq_294"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\boldsymbol{Y}^{\ast }}$]]></tex-math></alternatives></inline-formula> are less than or equal to <italic>x</italic>, which reflects the choice of resample size <inline-formula id="j_nejsds54_ineq_295"><alternatives><mml:math>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$r={2^{d+1}}-1$]]></tex-math></alternatives></inline-formula>. In the language of equation (<xref rid="j_nejsds54_eq_006">2.1</xref>), both <inline-formula id="j_nejsds54_ineq_296"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\boldsymbol{p}_{k}}(x)$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_297"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${p_{k}^{F}}(x)$]]></tex-math></alternatives></inline-formula> are a sum of two probability terms because the interval <inline-formula id="j_nejsds54_ineq_298"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$[(k-1)/{2^{d}},k/{2^{d}}]$]]></tex-math></alternatives></inline-formula> contains exactly two point masses of <inline-formula id="j_nejsds54_ineq_299"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>∗</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\hat{F}_{{\boldsymbol{Y}^{\ast }}}}(x)$]]></tex-math></alternatives></inline-formula>. If we instead select <inline-formula id="j_nejsds54_ineq_300"><alternatives><mml:math>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$r={2^{d}}-1$]]></tex-math></alternatives></inline-formula>, then each dyadic interval at depth <italic>d</italic> contains only one point mass of <inline-formula id="j_nejsds54_ineq_301"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\hat{F}_{\boldsymbol{Y}}}(x)$]]></tex-math></alternatives></inline-formula>, and the limiting cell probabilities are instead 
<disp-formula id="j_nejsds54_eq_019">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt">
<mml:mtr>
<mml:mtd class="align-odd">
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="italic">G</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mtd>
<mml:mtd class="align-even">
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∫</mml:mo></mml:mstyle>
<mml:mfenced separators="" open="(" close=")">
<mml:mfrac linethickness="0.0pt">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">F</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mi mathvariant="italic">G</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="align-odd"/>
<mml:mtd class="align-even">
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∫</mml:mo></mml:mstyle>
<mml:mfenced separators="" open="(" close=")">
<mml:mfrac linethickness="0.0pt">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mfenced>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">u</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">u</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mi mathvariant="italic">Q</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">u</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[\begin{aligned}{}{p_{k}^{F:G}}& =\int \left(\genfrac{}{}{0.0pt}{}{{2^{d}}-1}{k}\right)F{(x)^{k}}{(1-F(x))^{{2^{d}}-1-k}}dG(x)\\ {} & =\int \left(\genfrac{}{}{0.0pt}{}{{2^{d}}-1}{k}\right){u^{k}}{(1-u)^{{2^{d}}-1-k}}dQ(u).\end{aligned}\]]]></tex-math></alternatives>
</disp-formula> 
Symmetry statistics are nonzero precisely when the underlying cell probabilities are imbalanced. By Theorem <xref rid="j_nejsds54_stat_005">4</xref>, when <inline-formula id="j_nejsds54_ineq_302"><alternatives><mml:math>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$r={2^{d}}-1$]]></tex-math></alternatives></inline-formula>, these limiting probabilities are balanced exactly when the first <inline-formula id="j_nejsds54_ineq_303"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[${2^{d}}-1$]]></tex-math></alternatives></inline-formula> raw moments of <italic>Q</italic> match the corresponding raw moments of the uniform distribution.</p>
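<p>As a rough numerical illustration of the limiting cell probabilities above, the following sketch evaluates the binomial integral in closed form when <italic>Q</italic> belongs to a Beta family. The Beta family and the names <monospace>log_beta</monospace> and <monospace>limiting_cell_probs</monospace> are assumptions made purely for illustration; they are not part of the AUGUST implementation.</p>
<preformat><![CDATA[
```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def limiting_cell_probs(d, a=1.0, b=1.0):
    """Cell probabilities for r = 2^d - 1 when Q = Beta(a, b) (an assumed family).

    Each p_k is the integral of C(n, k) u^k (1 - u)^(n - k) against Q,
    with n = 2^d - 1, which equals the beta-binomial probability
    C(n, k) * B(k + a, n - k + b) / B(a, b).
    """
    n = 2 ** d - 1
    return [
        comb(n, k) * exp(log_beta(k + a, n - k + b) - log_beta(a, b))
        for k in range(n + 1)
    ]


uniform = limiting_cell_probs(3)        # Q uniform: all cells balanced
skewed = limiting_cell_probs(3, a=2.0)  # Q = Beta(2, 1): cells imbalanced
```
]]></preformat>
<p>Under these assumptions, a uniform <italic>Q</italic> at depth 3 assigns probability 1/8 to every cell, whereas <italic>Q</italic> = Beta(2, 1) assigns probability (<italic>k</italic> + 1)/36 to cell <italic>k</italic>, so the imbalance is visible directly.</p>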
<p>From this perspective, fixing <italic>d</italic> while increasing <italic>r</italic> brings higher moments of <italic>Q</italic> into play while still performing inference at a binary expansion depth of <italic>d</italic>. This suggests that, with <italic>d</italic> fixed, increasing <italic>r</italic> improves performance on more unusual alternatives at the cost of overcomplicating simple cases such as location shift. We observe exactly this phenomenon in empirical studies, which we include in the supplementary materials. Our standard choice of <inline-formula id="j_nejsds54_ineq_304"><alternatives><mml:math>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$r={2^{d+1}}-1$]]></tex-math></alternatives></inline-formula> represents a compromise between these two competing forces.</p>
<p>We conclude by remarking that as a heuristic, this discussion is relevant to interpreting the multivariate version of AUGUST, whose symmetry statistics measure imbalance in the transformed collections <inline-formula id="j_nejsds54_ineq_305"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\tilde{\boldsymbol{X}}^{(\boldsymbol{X})}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_306"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">˜</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="bold-italic">X</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\tilde{\boldsymbol{Y}}^{(\boldsymbol{X})}}$]]></tex-math></alternatives></inline-formula>. Transformations other than Mahalanobis distance may yield more suitable information for some applications, though building on the univariate test provides convenient intuition for the parameters <italic>d</italic> and <italic>r</italic>.</p>
</sec>
<sec id="j_nejsds54_s_013">
<label>5</label>
<title>Empirical Performance</title>
<sec id="j_nejsds54_s_014">
<label>5.1</label>
<title>Univariate Performance</title>
<p>Here, we compare AUGUST to a sampling of other nonparametric two-sample tests: Kolmogorov–Smirnov distance [<xref ref-type="bibr" rid="j_nejsds54_ref_026">26</xref>], Wasserstein distance [<xref ref-type="bibr" rid="j_nejsds54_ref_015">15</xref>], energy distance [<xref ref-type="bibr" rid="j_nejsds54_ref_042">42</xref>], and the recent DTS [<xref ref-type="bibr" rid="j_nejsds54_ref_016">16</xref>]. For these simulations, we use a sample size of <inline-formula id="j_nejsds54_ineq_307"><alternatives><mml:math>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>128</mml:mn></mml:math><tex-math><![CDATA[$n=m=128$]]></tex-math></alternatives></inline-formula>, and for our resolution-based test, we set a depth of <inline-formula id="j_nejsds54_ineq_308"><alternatives><mml:math>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>3</mml:mn></mml:math><tex-math><![CDATA[$d=3$]]></tex-math></alternatives></inline-formula>. For all tests, we use a <italic>p</italic>-value cutoff of <inline-formula id="j_nejsds54_ineq_309"><alternatives><mml:math>
<mml:mi mathvariant="italic">α</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0.05</mml:mn></mml:math><tex-math><![CDATA[$\alpha =0.05$]]></tex-math></alternatives></inline-formula>. Simulation results are graphed in Fig. <xref rid="j_nejsds54_fig_003">1</xref>.</p>
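<p>A power curve of this kind can be approximated by Monte Carlo simulation. The sketch below estimates the rejection rate of a Kolmogorov–Smirnov statistic under a normal location shift. The permutation calibration and the function names are our own assumptions for illustration and need not match the calibration used to produce the figures.</p>
<preformat><![CDATA[
```python
import random


def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the ECDFs."""
    xs, ys = sorted(x), sorted(y)
    i = j = 0
    gap = 0.0
    while i < len(xs) and j < len(ys):
        v = min(xs[i], ys[j])
        # Advance both pointers past every observation equal to v.
        while i < len(xs) and xs[i] <= v:
            i += 1
        while j < len(ys) and ys[j] <= v:
            j += 1
        gap = max(gap, abs(i / len(xs) - j / len(ys)))
    return gap


def permutation_pvalue(x, y, stat, n_perm, rng):
    """Permutation p-value for stat (assumed calibration, for illustration)."""
    observed = stat(x, y)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if stat(pooled[:len(x)], pooled[len(x):]) >= observed:
            hits += 1
    return (1 + hits) / (1 + n_perm)


def estimate_power(shift, n=128, m=128, alpha=0.05, n_rep=200, n_perm=199, seed=0):
    """Fraction of simulated N(0,1) vs N(shift,1) data sets rejected at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_rep):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(shift, 1.0) for _ in range(m)]
        if permutation_pvalue(x, y, ks_statistic, n_perm, rng) <= alpha:
            rejections += 1
    return rejections / n_rep
```
]]></preformat>
<p>Calling <monospace>estimate_power</monospace> over a grid of shifts traces out one curve; any other two-sample statistic can be passed to <monospace>permutation_pvalue</monospace> in place of <monospace>ks_statistic</monospace>.</p>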
<p>The first two plots of Fig. <xref rid="j_nejsds54_fig_003">1</xref> correspond to normal and Laplace location alternatives, situations where differences in the first distributional moment are most diagnostic. Third, we have a symmetric beta versus asymmetric beta alternative, and fourth, we include a Laplace scale family. The last two plots of Fig. <xref rid="j_nejsds54_fig_003">1</xref> focus on families with identical first and second moments: normal versus mean-centered gamma in the fifth position, and standard normal versus symmetric, variance-scaled normal mixture in the sixth. For this final alternative distribution, samples are generated by first drawing from a symmetric mixture of normals with unit variance and then dividing by the theoretical standard deviation of the mixture distribution.</p>
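<p>The sampling scheme for the final alternative can be sketched as follows. We assume, for illustration only, an equally weighted mixture of two unit-variance normal components centered at ±<italic>μ</italic>, whose theoretical variance is 1 + <italic>μ</italic><sup>2</sup>. The exact mixture used in the simulations may differ, and the function name is hypothetical.</p>
<preformat><![CDATA[
```python
import math
import random


def scaled_normal_mixture(n, mu, rng):
    """Draw n points from 0.5*N(-mu, 1) + 0.5*N(mu, 1), rescaled to unit variance.

    The equal weights and the centers -mu, +mu are assumptions for this sketch.
    The mixture has mean 0 and variance 1 + mu^2, so dividing by
    sqrt(1 + mu^2) matches the first two moments of the standard normal.
    """
    sd = math.sqrt(1.0 + mu * mu)
    return [(rng.choice((-mu, mu)) + rng.gauss(0.0, 1.0)) / sd for _ in range(n)]


rng = random.Random(0)
sample = scaled_normal_mixture(20000, 2.0, rng)
sample_mean = sum(sample) / len(sample)
sample_var = sum((v - sample_mean) ** 2 for v in sample) / (len(sample) - 1)
```
]]></preformat>
<p>Because the rescaled mixture shares its mean and variance with the standard normal, any power against it must come from higher-order features such as bimodality.</p>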
<fig id="j_nejsds54_fig_003">
<label>Figure 1</label>
<caption>
<p>Univariate comparison of power between AUGUST in red, Kolmogorov–Smirnov distance in black, Wasserstein distance in green, DTS in blue, and energy distance in yellow. Our method performs comparably to existing approaches, with superior power in some circumstances.</p>
</caption>
<graphic xlink:href="nejsds54_g003.jpg"/>
</fig>
<p>For the location alternatives, the power of each method depends on the shape of the distribution. DTS, Wasserstein, and energy distance tests perform slightly better than ours for normal and beta distributions, and ours in turn outperforms Kolmogorov–Smirnov. In contrast, for a Laplace location shift, Kolmogorov–Smirnov outperforms every test, with our test in second place and DTS last. For the Laplace scale family, Kolmogorov–Smirnov performs poorly, with DTS and our test leading. DTS has the edge on the gamma skewness family, while we outperform all other tests at detecting normal versus symmetric normal mixture.</p>
<p>As expected, no single test performs best in all situations. Even for simple alternatives such as location families, the precise shape of the distribution strongly influences the tests’ relative performance. In fact, the performance rankings of DTS, Wasserstein, energy distance, and Kolmogorov–Smirnov in the Laplace location trials are exactly reversed compared to the normal location trials. We theorize that because the symmetry statistics <inline-formula id="j_nejsds54_ineq_310"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">X</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{X}}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_311"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="bold-italic">Y</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\boldsymbol{S}_{\boldsymbol{Y}}}$]]></tex-math></alternatives></inline-formula> are weighted equally in every coordinate, AUGUST treats the range of potential alternatives evenhandedly. In contrast, other univariate methods show relatively greater sensitivity to location and scale shifts, but may be less robust against more obscure alternatives.</p>
</sec>
<sec id="j_nejsds54_s_015">
<label>5.2</label>
<title>Multivariate Performance</title>
<p>In Fig. <xref rid="j_nejsds54_fig_004">2</xref>, we compare our multivariate resolution-based test at depth <inline-formula id="j_nejsds54_ineq_312"><alternatives><mml:math>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[$d=2$]]></tex-math></alternatives></inline-formula> to some other well-known nonparametric multivariate two-sample tests. We perform these simulations in a low-dimensional context with <inline-formula id="j_nejsds54_ineq_313"><alternatives><mml:math>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[$k=2$]]></tex-math></alternatives></inline-formula>, using sample size <inline-formula id="j_nejsds54_ineq_314"><alternatives><mml:math>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>128</mml:mn></mml:math><tex-math><![CDATA[$n=m=128$]]></tex-math></alternatives></inline-formula> and cutoff <inline-formula id="j_nejsds54_ineq_315"><alternatives><mml:math>
<mml:mi mathvariant="italic">α</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0.05</mml:mn></mml:math><tex-math><![CDATA[$\alpha =0.05$]]></tex-math></alternatives></inline-formula>. In particular, we again consider the energy distance test of [<xref ref-type="bibr" rid="j_nejsds54_ref_042">42</xref>], as well as the classifier test of [<xref ref-type="bibr" rid="j_nejsds54_ref_031">31</xref>], the generalized edge-count method of [<xref ref-type="bibr" rid="j_nejsds54_ref_009">9</xref>], and the ball divergence test of [<xref ref-type="bibr" rid="j_nejsds54_ref_036">36</xref>], where the choice of these comparisons is inspired by [<xref ref-type="bibr" rid="j_nejsds54_ref_041">41</xref>]. For [<xref ref-type="bibr" rid="j_nejsds54_ref_009">9</xref>], we use a 5-minimum spanning tree based on Euclidean interpoint distance.</p>
<p>We consider a variety of alternatives. In order:</p>
<list>
<list-item id="j_nejsds54_li_006">
<label>1.</label>
<p><inline-formula id="j_nejsds54_ineq_316"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">I</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${N_{2}}(0,{I_{2\times 2}})$]]></tex-math></alternatives></inline-formula> vs. <inline-formula id="j_nejsds54_ineq_317"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mtext>center</mml:mtext>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">I</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${N_{2}}(\text{center}\times {1_{2}},{I_{2\times 2}})$]]></tex-math></alternatives></inline-formula></p>
</list-item>
<list-item id="j_nejsds54_li_007">
<label>2.</label>
<p><inline-formula id="j_nejsds54_ineq_318"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">I</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${N_{2}}(0,{I_{2\times 2}})$]]></tex-math></alternatives></inline-formula> vs. <inline-formula id="j_nejsds54_ineq_319"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mtext>scale</mml:mtext>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">I</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${N_{2}}(0,\text{scale}\times {I_{2\times 2}})$]]></tex-math></alternatives></inline-formula></p>
</list-item>
<list-item id="j_nejsds54_li_008">
<label>3.</label>
<p><inline-formula id="j_nejsds54_ineq_320"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mtable columnspacing="10.0pt" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" columnalign="center center">
<mml:mtr>
<mml:mtd class="array">
<mml:mn>1</mml:mn>
</mml:mtd>
<mml:mtd class="array">
<mml:mn>0</mml:mn>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="array">
<mml:mn>0</mml:mn>
</mml:mtd>
<mml:mtd class="array">
<mml:mn>1</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:math><tex-math><![CDATA[${N_{2}}\left(0,\left(\begin{array}{c@{\hskip10.0pt}c}1& 0\\ {} 0& 1\end{array}\right)\right)$]]></tex-math></alternatives></inline-formula> vs. <inline-formula id="j_nejsds54_ineq_321"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mtable columnspacing="10.0pt" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" columnalign="center center">
<mml:mtr>
<mml:mtd class="array">
<mml:mn>1</mml:mn>
</mml:mtd>
<mml:mtd class="array">
<mml:mtext>cov</mml:mtext>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="array">
<mml:mtext>cov</mml:mtext>
</mml:mtd>
<mml:mtd class="array">
<mml:mn>1</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:math><tex-math><![CDATA[${N_{2}}\left(0,\left(\begin{array}{c@{\hskip10.0pt}c}1& \text{cov}\\ {} \text{cov}& 1\end{array}\right)\right)$]]></tex-math></alternatives></inline-formula></p>
</list-item>
<list-item id="j_nejsds54_li_009">
<label>4.</label>
<p><inline-formula id="j_nejsds54_ineq_322"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mtable columnspacing="10.0pt" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" columnalign="center center">
<mml:mtr>
<mml:mtd class="array">
<mml:mn>1</mml:mn>
</mml:mtd>
<mml:mtd class="array">
<mml:mn>0</mml:mn>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="array">
<mml:mn>0</mml:mn>
</mml:mtd>
<mml:mtd class="array">
<mml:mn>9</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:math><tex-math><![CDATA[${N_{2}}\left(0,\left(\begin{array}{c@{\hskip10.0pt}c}1& 0\\ {} 0& 9\end{array}\right)\right)$]]></tex-math></alternatives></inline-formula> vs. <inline-formula id="j_nejsds54_ineq_323"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mtable columnspacing="10.0pt" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" columnalign="center center">
<mml:mtr>
<mml:mtd class="array">
<mml:mn>1</mml:mn>
</mml:mtd>
<mml:mtd class="array">
<mml:mn>0</mml:mn>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="array">
<mml:mn>0</mml:mn>
</mml:mtd>
<mml:mtd class="array">
<mml:mn>9</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:math><tex-math><![CDATA[${R_{\theta }}{N_{2}}\left(0,\left(\begin{array}{c@{\hskip10.0pt}c}1& 0\\ {} 0& 9\end{array}\right)\right)$]]></tex-math></alternatives></inline-formula>, where <inline-formula id="j_nejsds54_ineq_324"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${R_{\theta }}$]]></tex-math></alternatives></inline-formula> is the <inline-formula id="j_nejsds54_ineq_325"><alternatives><mml:math>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[$2\times 2$]]></tex-math></alternatives></inline-formula> rotation matrix through an angle <italic>θ</italic></p>
</list-item>
<list-item id="j_nejsds54_li_010">
<label>5.</label>
<p><inline-formula id="j_nejsds54_ineq_326"><alternatives><mml:math>
<mml:mo movablelimits="false">exp</mml:mo>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">I</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfenced>
</mml:math><tex-math><![CDATA[$\exp \left({N_{2}}(0,{I_{2\times 2}})\right)$]]></tex-math></alternatives></inline-formula> vs. <inline-formula id="j_nejsds54_ineq_327"><alternatives><mml:math>
<mml:mo movablelimits="false">exp</mml:mo>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">μ</mml:mi>
<mml:mo>×</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">I</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfenced>
</mml:math><tex-math><![CDATA[$\exp \left({N_{2}}(\mu \times {1_{2}},{I_{2\times 2}})\right)$]]></tex-math></alternatives></inline-formula></p>
</list-item>
<list-item id="j_nejsds54_li_011">
<label>6.</label>
<p><inline-formula id="j_nejsds54_ineq_328"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">I</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>×</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${N_{2}}(0,{I_{2\times 2}})$]]></tex-math></alternatives></inline-formula> vs. <inline-formula id="j_nejsds54_ineq_329"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">Z</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">B</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(Z,B)$]]></tex-math></alternatives></inline-formula>, where <inline-formula id="j_nejsds54_ineq_330"><alternatives><mml:math>
<mml:mi mathvariant="italic">Z</mml:mi>
<mml:mo stretchy="false">∼</mml:mo>
<mml:mi mathvariant="italic">N</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$Z\sim N(0,1)$]]></tex-math></alternatives></inline-formula> and <italic>B</italic> independently follows the bimodal mixture distribution from Section <xref rid="j_nejsds54_s_014">5.1</xref>.</p>
</list-item>
</list>
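<p>To make the alternatives above concrete, the following sketch draws samples for the rotation alternative (item 4). The function names are hypothetical, and the other alternatives can be generated analogously.</p>
<preformat><![CDATA[
```python
import math
import random


def rotate(point, theta):
    """Apply the 2x2 rotation matrix through angle theta (radians) to a point."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


def sample_rotation_alternative(n, theta, rng):
    """X ~ N2(0, diag(1, 9)); Y is an independent draw of the same law, rotated.

    A variance of 9 in the second coordinate corresponds to a standard
    deviation of 3 in rng.gauss.
    """
    x = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 3.0)) for _ in range(n)]
    y = [rotate((rng.gauss(0.0, 1.0), rng.gauss(0.0, 3.0)), theta) for _ in range(n)]
    return x, y


rng = random.Random(0)
x, y = sample_rotation_alternative(5000, math.pi / 2, rng)
```
]]></preformat>
<p>With a rotation of a quarter turn, the first coordinate of the rotated sample inherits the variance of 9, while the marginal means stay at zero, so the two samples differ only through their covariance structure.</p>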
<p>In Fig. <xref rid="j_nejsds54_fig_004">2</xref>, we see that the energy and ball divergence tests dominate the other methods when mean shift is a factor (i.e. in the normal location and log-normal families). On a scale alternative, AUGUST has the best power, with ball divergence at a close second. In contrast, for correlation, rotation, and multimodal alternatives, the edge-count test has superior power, with ball divergence and energy distance coming at or near last place.</p>
<p>Overall, our test is robust against a wide range of possible alternatives, and it has particularly high performance against a scale alternative, where it outperforms all other methods considered. We theorize that, in part, this is because some of the other methods rely heavily on interpoint distances. The scale alternative does not result in good separation between <inline-formula id="j_nejsds54_ineq_331"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">X</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{X}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_332"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">Y</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{Y}$]]></tex-math></alternatives></inline-formula>, meaning that interpoint distances are not as diagnostic as they would be in, say, a location shift.</p>
<p>In the supplementary materials, we include additional comparisons with <inline-formula id="j_nejsds54_ineq_333"><alternatives><mml:math>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>5</mml:mn></mml:math><tex-math><![CDATA[$k=5$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds54_ineq_334"><alternatives><mml:math>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>20</mml:mn></mml:math><tex-math><![CDATA[$k=20$]]></tex-math></alternatives></inline-formula>, keeping the sample size fixed with <inline-formula id="j_nejsds54_ineq_335"><alternatives><mml:math>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>128</mml:mn></mml:math><tex-math><![CDATA[$m=n=128$]]></tex-math></alternatives></inline-formula>. Performance follows the same general pattern as when <inline-formula id="j_nejsds54_ineq_336"><alternatives><mml:math>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[$k=2$]]></tex-math></alternatives></inline-formula>: we lag against location alternatives but are very strong at scale, with no universal winner across all scenarios.</p>
<fig id="j_nejsds54_fig_004">
<label>Figure 2</label>
<caption>
<p>Multivariate comparison of power between AUGUST in red, [<xref ref-type="bibr" rid="j_nejsds54_ref_009">9</xref>] in green, energy distance in blue, [<xref ref-type="bibr" rid="j_nejsds54_ref_036">36</xref>] in black, and [<xref ref-type="bibr" rid="j_nejsds54_ref_031">31</xref>] in yellow for two dimensions. Our method has comparable power with existing methods, and it outperforms all others against a scale alternative.</p>
</caption>
<graphic xlink:href="nejsds54_g004.jpg"/>
</fig>
</sec>
</sec>
<sec id="j_nejsds54_s_016">
<label>6</label>
<title>A Study of NBA Shooting Data</title>
<p>We demonstrate the interpretability of AUGUST using 2015–2016 NBA play-by-play data.<xref ref-type="fn" rid="j_nejsds54_fn_001">2</xref><fn id="j_nejsds54_fn_001"><label><sup>2</sup></label>
<p>Source: <ext-link ext-link-type="uri" xlink:href="https://www.nbastuffer.com/">nbastuffer.com</ext-link></p></fn> Consider the distributions of throw distances and angles from the net. Do these distributions differ between shots and misses? Do they differ between the first two quarters and the last two quarters? To address these questions, we acquired play-by-play data for the 2015–2016 NBA season. For each throw, the location was recorded as a pair of <italic>x</italic>, <italic>y</italic> coordinates. These coordinates were converted into a distance and angle from the target net, using knowledge of NBA court dimensions. This processing yielded a data set on the order of <inline-formula id="j_nejsds54_ineq_337"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mn>10</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>6</mml:mn>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${10^{6}}$]]></tex-math></alternatives></inline-formula> observations.</p>
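<p>As a minimal sketch of the coordinate conversion described above, the following Python function maps an (<italic>x</italic>, <italic>y</italic>) throw location to a distance and angle relative to the net. The function name, the hoop position arguments, and the angle convention (zero for a throw from directly in front of the net) are illustrative assumptions rather than the preprocessing code used for this study.</p>
<preformat>
```python
import math


def to_distance_and_angle(x, y, hoop_x=0.0, hoop_y=0.0):
    """Convert a throw location (x, y) to (distance, angle) relative to the net.

    Assumed convention (hypothetical, for illustration only): the y direction
    runs along the length of the court, so a throw from directly in front of
    the net has angle 0 and throws from the sides approach +/- pi/2.
    """
    dx = x - hoop_x  # lateral offset from the net
    dy = y - hoop_y  # offset along the court, away from the baseline
    distance = math.hypot(dx, dy)  # Euclidean distance to the net
    angle = math.atan2(dx, dy)  # radians, measured from the straight-ahead axis
    return distance, angle


# Example: a throw 3 units to the side of the net and 4 units out from it.
distance, angle = to_distance_and_angle(3.0, 4.0)
```
</preformat>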
<p>Data were split according to shots versus misses and early game versus late game. Four separate AUGUST tests at a depth of <inline-formula id="j_nejsds54_ineq_338"><alternatives><mml:math>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[$d=2$]]></tex-math></alternatives></inline-formula> were performed to analyze the distribution of throw distances and angles. For shot vs. miss distance, shot vs. miss angle, and early vs. late game distance, AUGUST reports <inline-formula id="j_nejsds54_ineq_339"><alternatives><mml:math>
<mml:mi mathvariant="italic">p</mml:mi>
<mml:mo mathvariant="normal">&lt;</mml:mo>
<mml:mn>0.001</mml:mn></mml:math><tex-math><![CDATA[$p\lt 0.001$]]></tex-math></alternatives></inline-formula>, while for early vs. late game angle, AUGUST returns <inline-formula id="j_nejsds54_ineq_340"><alternatives><mml:math>
<mml:mi mathvariant="italic">p</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0.004</mml:mn></mml:math><tex-math><![CDATA[$p=0.004$]]></tex-math></alternatives></inline-formula>. For comparison, the Kolmogorov-Smirnov test yields the same result for the first three scenarios but gives <inline-formula id="j_nejsds54_ineq_341"><alternatives><mml:math>
<mml:mi mathvariant="italic">p</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0.086</mml:mn></mml:math><tex-math><![CDATA[$p=0.086$]]></tex-math></alternatives></inline-formula> for the fourth. DTS produces <inline-formula id="j_nejsds54_ineq_342"><alternatives><mml:math>
<mml:mi mathvariant="italic">p</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0.033</mml:mn></mml:math><tex-math><![CDATA[$p=0.033$]]></tex-math></alternatives></inline-formula> for the fourth.</p>
<p>To demonstrate interpretability, we provide visualizations in Fig. <xref rid="j_nejsds54_fig_005">3</xref> as alluded to in Section <xref rid="j_nejsds54_s_008">2.4</xref>. Each histogram corresponds to one of the two samples in the test: this reference sample is indicated on the <italic>x</italic>-axis. The shaded rectangles overlaid on these histograms illustrate the largest symmetry statistic from the corresponding AUGUST test. For example, the top plot corresponds to throw distance for shots versus misses. The histogram records the distribution of missed throw distances.</p>
<p>Each plot in Fig. <xref rid="j_nejsds54_fig_005">3</xref> yields a specific interpretation of the greatest distributional imbalance. From the top plot, we see that successful throws tend to be closer to the net than misses. The second plot shows that successful throws come from the side more often than misses. The third shows that throws early in the game come from an intermediate distance more frequently than late-game throws. Finally, throws early in the game come from the side more frequently than throws late in the game. The second of these four findings is perhaps the most counterintuitive, as conventional wisdom suggests that throws from in front of the net are more accurate than throws from the sides. This apparent paradox arises because throws from the sides are typically taken at much closer range.</p>
<fig id="j_nejsds54_fig_005">
<label>Figure 3</label>
<caption>
<p>Greatest asymmetries in NBA data. Successful shots are closer to the net than missed shots and come from more extreme angles. Shots in the early game come from a more intermediate distance than in the late game, as well as from more extreme angles.</p>
</caption>
<graphic xlink:href="nejsds54_g005.jpg"/>
</fig>
</sec>
<sec id="j_nejsds54_s_017">
<label>7</label>
<title>Discussion</title>
<p>An important future direction involves refining the multivariate approach. The simulations in Fig. <xref rid="j_nejsds54_fig_004">2</xref> speak solely to low-dimensional contexts. We emphasize that other multivariate tests such as [<xref ref-type="bibr" rid="j_nejsds54_ref_009">9</xref>] enjoy remarkable power properties in growing dimensions. In contrast, accurate estimation of covariance matrices remains a hindrance to the mutual Mahalanobis distance approach as the dimension <italic>k</italic> approaches a significant fraction of <italic>n</italic>. In a multivariate context, the test of Section <xref rid="j_nejsds54_s_011">3.2</xref> serves as a useful starting point for future multi-resolution methods [<xref ref-type="bibr" rid="j_nejsds54_ref_029">29</xref>], and future work will focus on extending the asymptotic theory to account for the Mahalanobis distance transformation or other transformations. Permutation, especially in the multivariate context, is feasible but still costly. Repeated evaluations of the hypergeometric probability mass function drive up the constant factor on the <inline-formula id="j_nejsds54_ineq_343"><alternatives><mml:math>
<mml:mi mathvariant="italic">O</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo movablelimits="false">log</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$O((n+m)\log (n+m))$]]></tex-math></alternatives></inline-formula> running time, relative to simpler methods of the same order, such as Kolmogorov-Smirnov. Computational burdens could be eased by performing inference across a carefully selected range of binary depths. For example, in the context of multivariate dependence testing, the coarse-to-fine sequential adaptive method of [<xref ref-type="bibr" rid="j_nejsds54_ref_019">19</xref>] chooses a subset of available univariate tests at each resolution using spatial knowledge of dependency structures.</p>
<p>The interpretability of our two-sample test also sheds light on transformations of data from one distribution to the other. This problem is a fundamental subject in transportation theory [<xref ref-type="bibr" rid="j_nejsds54_ref_043">43</xref>]. We plan to study this problem using recent developments in multi-resolution nonparametric modeling [<xref ref-type="bibr" rid="j_nejsds54_ref_008">8</xref>] to provide insights into optimal transportation.</p>
</sec>
</body>
<back>
<ack id="j_nejsds54_ack_001">
<title>Acknowledgements</title>
<p>The authors thank the editor, associate editor, and reviewers for their helpful feedback. The authors additionally thank Shankar Bhamidi, Hao Chen, Jan Hannig, Michael Kosorok, Xiao-Li Meng, and Richard Smith for valuable comments and suggestions.</p></ack>
<ref-list id="j_nejsds54_reflist_001">
<title>References</title>
<ref id="j_nejsds54_ref_001">
<label>[1]</label><mixed-citation publication-type="journal"><string-name><surname>Anderson</surname>, <given-names>T. W.</given-names></string-name> and <string-name><surname>Darling</surname>, <given-names>D. A.</given-names></string-name> (<year>1952</year>). <article-title>Asymptotic theory of certain “goodness of fit” criteria based on stochastic processes</article-title>. <source>The Annals of Mathematical Statistics</source> <fpage>193</fpage>–<lpage>212</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1214/aoms/1177729437" xlink:type="simple">https://doi.org/10.1214/aoms/1177729437</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=0050238">MR0050238</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_002">
<label>[2]</label><mixed-citation publication-type="journal"><string-name><surname>Aslan</surname>, <given-names>B.</given-names></string-name> and <string-name><surname>Zech</surname>, <given-names>G.</given-names></string-name> (<year>2005</year>). <article-title>New test for the multivariate two-sample problem based on the concept of minimum energy</article-title>. <source>Journal of Statistical Computation and Simulation</source> <volume>75</volume>(<issue>2</issue>) <fpage>109</fpage>–<lpage>119</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/00949650410001661440" xlink:type="simple">https://doi.org/10.1080/00949650410001661440</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=2117010">MR2117010</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_003">
<label>[3]</label><mixed-citation publication-type="other"><string-name><surname>Banerjee</surname>, <given-names>B.</given-names></string-name> and <string-name><surname>Ghosh</surname>, <given-names>A. K.</given-names></string-name> (2022). On high dimensional behaviour of some two-sample tests based on ball divergence. <italic>arXiv preprint</italic> <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/arXiv:2212.08566"><italic>arXiv:2212.08566</italic></ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_004">
<label>[4]</label><mixed-citation publication-type="journal"><string-name><surname>Baumgartner</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Weiß</surname>, <given-names>P.</given-names></string-name> and <string-name><surname>Schindler</surname>, <given-names>H.</given-names></string-name> (<year>1998</year>). <article-title>A nonparametric test for the general two-sample problem</article-title>. <source>Biometrics</source> <fpage>1129</fpage>–<lpage>1135</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_005">
<label>[5]</label><mixed-citation publication-type="journal"><string-name><surname>Bhattacharya</surname>, <given-names>B. B.</given-names></string-name> (<year>2019</year>). <article-title>A general asymptotic framework for distribution-free graph-based two-sample tests</article-title>. <source>Journal of the Royal Statistical Society: Series B (Statistical Methodology)</source> <volume>81</volume>(<issue>3</issue>) <fpage>575</fpage>–<lpage>602</lpage>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=3961499">MR3961499</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_006">
<label>[6]</label><mixed-citation publication-type="journal"><string-name><surname>Biswas</surname>, <given-names>M.</given-names></string-name> and <string-name><surname>Ghosh</surname>, <given-names>A. K.</given-names></string-name> (<year>2014</year>). <article-title>A nonparametric two-sample test applicable to high dimensional data</article-title>. <source>Journal of Multivariate Analysis</source> <volume>123</volume> <fpage>160</fpage>–<lpage>171</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.jmva.2013.09.004" xlink:type="simple">https://doi.org/10.1016/j.jmva.2013.09.004</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=3130427">MR3130427</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_007">
<label>[7]</label><mixed-citation publication-type="journal"><string-name><surname>Biswas</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Mukhopadhyay</surname>, <given-names>M.</given-names></string-name> and <string-name><surname>Ghosh</surname>, <given-names>A. K.</given-names></string-name> (<year>2014</year>). <article-title>A distribution-free two-sample run test applicable to high-dimensional data</article-title>. <source>Biometrika</source> <volume>101</volume>(<issue>4</issue>) <fpage>913</fpage>–<lpage>926</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1093/biomet/asu045" xlink:type="simple">https://doi.org/10.1093/biomet/asu045</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=3286925">MR3286925</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_008">
<label>[8]</label><mixed-citation publication-type="other"><string-name><surname>Brown</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>Zhang</surname>, <given-names>K.</given-names></string-name> and <string-name><surname>Meng</surname>, <given-names>X.-L.</given-names></string-name> (2022). BELIEF in dependence: leveraging atomic linearity in data bits for rethinking generalized linear models. <italic>arXiv preprint</italic> <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/arXiv:2210.10852"><italic>arXiv:2210.10852</italic></ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_009">
<label>[9]</label><mixed-citation publication-type="journal"><string-name><surname>Chen</surname>, <given-names>H.</given-names></string-name> and <string-name><surname>Friedman</surname>, <given-names>J. H.</given-names></string-name> (<year>2017</year>). <article-title>A new graph-based two-sample test for multivariate and object data</article-title>. <source>Journal of the American Statistical Association</source> <volume>112</volume>(<issue>517</issue>) <fpage>397</fpage>–<lpage>409</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/01621459.2016.1147356" xlink:type="simple">https://doi.org/10.1080/01621459.2016.1147356</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=3646580">MR3646580</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_010">
<label>[10]</label><mixed-citation publication-type="journal"><string-name><surname>Chen</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Chen</surname>, <given-names>X.</given-names></string-name> and <string-name><surname>Su</surname>, <given-names>Y.</given-names></string-name> (<year>2018</year>). <article-title>A weighted edge-count two-sample test for multivariate and object data</article-title>. <source>Journal of the American Statistical Association</source> <volume>113</volume>(<issue>523</issue>) <fpage>1146</fpage>–<lpage>1155</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/01621459.2017.1307757" xlink:type="simple">https://doi.org/10.1080/01621459.2017.1307757</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=3862346">MR3862346</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_011">
<label>[11]</label><mixed-citation publication-type="journal"><string-name><surname>Chwialkowski</surname>, <given-names>K. P.</given-names></string-name>, <string-name><surname>Ramdas</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Sejdinovic</surname>, <given-names>D.</given-names></string-name> and <string-name><surname>Gretton</surname>, <given-names>A.</given-names></string-name> (<year>2015</year>). <article-title>Fast two-sample testing with analytic representations of probability measures</article-title>. <source>Advances in Neural Information Processing Systems</source> <volume>28</volume> <fpage>1981</fpage>–<lpage>1989</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_012">
<label>[12]</label><mixed-citation publication-type="journal"><string-name><surname>Cramér</surname>, <given-names>H.</given-names></string-name> (<year>1928</year>). <article-title>On the composition of elementary errors: First paper: Mathematical deductions</article-title>. <source>Scandinavian Actuarial Journal</source> <volume>1928</volume>(<issue>1</issue>) <fpage>13</fpage>–<lpage>74</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_013">
<label>[13]</label><mixed-citation publication-type="journal"><string-name><surname>Cucconi</surname>, <given-names>O.</given-names></string-name> (<year>1968</year>). <article-title>Un nuovo test non parametrico per il confronto fra due gruppi di valori campionari</article-title>. <source>Giornale degli Economisti e Annali di Economia</source> <fpage>225</fpage>–<lpage>248</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_014">
<label>[14]</label><mixed-citation publication-type="journal"><string-name><surname>DeCost</surname>, <given-names>B. L.</given-names></string-name> and <string-name><surname>Holm</surname>, <given-names>E. A.</given-names></string-name> (<year>2017</year>). <article-title>Characterizing powder materials using keypoint-based computer vision methods</article-title>. <source>Computational Materials Science</source> <volume>126</volume> <fpage>438</fpage>–<lpage>445</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_015">
<label>[15]</label><mixed-citation publication-type="journal"><string-name><surname>Dobrushin</surname>, <given-names>R. L.</given-names></string-name> (<year>1970</year>). <article-title>Prescribing a system of random variables by conditional distributions</article-title>. <source>Theory of Probability &amp; Its Applications</source> <volume>15</volume>(<issue>3</issue>) <fpage>458</fpage>–<lpage>486</lpage>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=0298716">MR0298716</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_016">
<label>[16]</label><mixed-citation publication-type="other"><string-name><surname>Dowd</surname>, <given-names>C.</given-names></string-name> (2020). A new ECDF two-sample test statistic. <italic>arXiv preprint</italic> <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/arXiv:2007.01360"><italic>arXiv:2007.01360</italic></ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_017">
<label>[17]</label><mixed-citation publication-type="journal"><string-name><surname>Duong</surname>, <given-names>T.</given-names></string-name> (<year>2013</year>). <article-title>Local significant differences from nonparametric two-sample tests</article-title>. <source>Journal of Nonparametric Statistics</source> <volume>25</volume>(<issue>3</issue>) <fpage>635</fpage>–<lpage>645</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/10485252.2013.810217" xlink:type="simple">https://doi.org/10.1080/10485252.2013.810217</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=3174288">MR3174288</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_018">
<label>[18]</label><mixed-citation publication-type="journal"><string-name><surname>Friedman</surname>, <given-names>J. H.</given-names></string-name> and <string-name><surname>Rafsky</surname>, <given-names>L. C.</given-names></string-name> (<year>1979</year>). <article-title>Multivariate generalizations of the Wald-Wolfowitz and Smirnov two-sample tests</article-title>. <source>The Annals of Statistics</source> <fpage>697</fpage>–<lpage>717</lpage>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=0532236">MR0532236</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_019">
<label>[19]</label><mixed-citation publication-type="journal"><string-name><surname>Gorsky</surname>, <given-names>S.</given-names></string-name> and <string-name><surname>Ma</surname>, <given-names>L.</given-names></string-name> (<year>2022</year>). <article-title>Multi-scale Fisher’s independence test for multivariate dependence</article-title>. <source>Biometrika</source> <volume>109</volume>(<issue>3</issue>) <fpage>569</fpage>–<lpage>587</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1093/biomet/asac013" xlink:type="simple">https://doi.org/10.1093/biomet/asac013</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=4472834">MR4472834</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_020">
<label>[20]</label><mixed-citation publication-type="chapter"><string-name><surname>Gretton</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Fukumizu</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Teo</surname>, <given-names>C. H.</given-names></string-name>, <string-name><surname>Song</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Schölkopf</surname>, <given-names>B.</given-names></string-name> and <string-name><surname>Smola</surname>, <given-names>A. J.</given-names></string-name> (<year>2007</year>). <chapter-title>A kernel statistical test of independence</chapter-title>. In: <source>Advances in Neural Information Processing Systems</source> <fpage>585</fpage>–<lpage>592</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_021">
<label>[21]</label><mixed-citation publication-type="journal"><string-name><surname>Gretton</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Borgwardt</surname>, <given-names>K. M.</given-names></string-name>, <string-name><surname>Rasch</surname>, <given-names>M. J.</given-names></string-name>, <string-name><surname>Schölkopf</surname>, <given-names>B.</given-names></string-name> and <string-name><surname>Smola</surname>, <given-names>A.</given-names></string-name> (<year>2012</year>). <article-title>A kernel two-sample test</article-title>. <source>The Journal of Machine Learning Research</source> <volume>13</volume>(<issue>1</issue>) <fpage>723</fpage>–<lpage>773</lpage>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=2913716">MR2913716</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_022">
<label>[22]</label><mixed-citation publication-type="chapter"><string-name><surname>Harchaoui</surname>, <given-names>Z.</given-names></string-name>, <string-name><surname>Bach</surname>, <given-names>F. R.</given-names></string-name> and <string-name><surname>Moulines</surname>, <given-names>E.</given-names></string-name> (<year>2007</year>). <chapter-title>Testing for homogeneity with kernel Fisher discriminant analysis</chapter-title>. In: <source>NIPS</source> <fpage>609</fpage>–<lpage>616</lpage>. <publisher-name>Citeseer</publisher-name>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_023">
<label>[23]</label><mixed-citation publication-type="journal"><string-name><surname>Hazelton</surname>, <given-names>M. L.</given-names></string-name> and <string-name><surname>Davies</surname>, <given-names>T. M.</given-names></string-name> (<year>2022</year>). <article-title>Pointwise comparison of two multivariate density functions</article-title>. <source>Scandinavian Journal of Statistics</source> <volume>49</volume>(<issue>4</issue>) <fpage>1791</fpage>–<lpage>1810</lpage>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=4544820">MR4544820</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_024">
<label>[24]</label><mixed-citation publication-type="journal"><string-name><surname>Hettmansperger</surname>, <given-names>T. P.</given-names></string-name>, <string-name><surname>Möttönen</surname>, <given-names>J.</given-names></string-name> and <string-name><surname>Oja</surname>, <given-names>H.</given-names></string-name> (<year>1998</year>). <article-title>Affine invariant multivariate rank tests for several samples</article-title>. <source>Statistica Sinica</source> <fpage>785</fpage>–<lpage>800</lpage>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=1651508">MR1651508</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_025">
<label>[25]</label><mixed-citation publication-type="other"><string-name><surname>Jitkrittum</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Szabó</surname>, <given-names>Z.</given-names></string-name>, <string-name><surname>Chwialkowski</surname>, <given-names>K. P.</given-names></string-name> and <string-name><surname>Gretton</surname>, <given-names>A.</given-names></string-name> (2016). Interpretable distribution features with maximum testing power. <italic>Advances in Neural Information Processing Systems</italic> <bold>29</bold>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_026">
<label>[26]</label><mixed-citation publication-type="journal"><string-name><surname>Kolmogorov</surname>, <given-names>A.</given-names></string-name> (<year>1933</year>). <article-title>Sulla determinazione empirica di una legge di distribuzione</article-title>. <source>Inst. Ital. Attuari, Giorn.</source> <volume>4</volume> <fpage>83</fpage>–<lpage>91</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_027">
<label>[27]</label><mixed-citation publication-type="journal"><string-name><surname>Lepage</surname>, <given-names>Y.</given-names></string-name> (<year>1971</year>). <article-title>A combination of Wilcoxon’s and Ansari-Bradley’s statistics</article-title>. <source>Biometrika</source> <volume>58</volume>(<issue>1</issue>) <fpage>213</fpage>–<lpage>217</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1093/biomet/58.1.213" xlink:type="simple">https://doi.org/10.1093/biomet/58.1.213</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=0408101">MR0408101</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_028">
<label>[28]</label><mixed-citation publication-type="journal"><string-name><surname>Li</surname>, <given-names>J.</given-names></string-name> (<year>2018</year>). <article-title>Asymptotic normality of interpoint distances for high-dimensional data with applications to the two-sample problem</article-title>. <source>Biometrika</source> <volume>105</volume>(<issue>3</issue>) <fpage>529</fpage>–<lpage>546</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1093/biomet/asy020" xlink:type="simple">https://doi.org/10.1093/biomet/asy020</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=3842883">MR3842883</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_029">
<label>[29]</label><mixed-citation publication-type="journal"><string-name><surname>Li</surname>, <given-names>X.</given-names></string-name> and <string-name><surname>Meng</surname>, <given-names>X.-L.</given-names></string-name> (<year>2021</year>). <article-title>A multi-resolution theory for approximating infinite-<italic>p</italic>-zero-<italic>n</italic>: Transitional inference, individualized predictions, and a world without bias-variance tradeoff</article-title>. <source>Journal of the American Statistical Association</source> <volume>116</volume>(<issue>533</issue>) <fpage>353</fpage>–<lpage>367</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/01621459.2020.1844210" xlink:type="simple">https://doi.org/10.1080/01621459.2020.1844210</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=4227699">MR4227699</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_030">
<label>[30]</label><mixed-citation publication-type="journal"><string-name><surname>Liu</surname>, <given-names>R. Y.</given-names></string-name> (<year>1992</year>). <article-title>Data depth and multivariate rank tests</article-title>. <italic>L1-Statistical Analysis and Related Methods</italic> <fpage>279</fpage>–<lpage>294</lpage>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=1214839">MR1214839</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_031">
<label>[31]</label><mixed-citation publication-type="other"><string-name><surname>Lopez-Paz</surname>, <given-names>D.</given-names></string-name> and <string-name><surname>Oquab</surname>, <given-names>M.</given-names></string-name> (2016). Revisiting classifier two-sample tests. <italic>arXiv preprint</italic> <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/arXiv:1610.06545"><italic>arXiv:1610.06545</italic></ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_032">
<label>[32]</label><mixed-citation publication-type="journal"><string-name><surname>Mahajan</surname>, <given-names>K. K.</given-names></string-name>, <string-name><surname>Gaur</surname>, <given-names>A.</given-names></string-name> and <string-name><surname>Arora</surname>, <given-names>S.</given-names></string-name> (<year>2011</year>). <article-title>A nonparametric test for a two-sample scale problem based on subsample medians</article-title>. <source>Statistics &amp; Probability Letters</source> <volume>81</volume>(<issue>8</issue>) <fpage>983</fpage>–<lpage>988</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.spl.2011.01.018" xlink:type="simple">https://doi.org/10.1016/j.spl.2011.01.018</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=2803733">MR2803733</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_033">
<label>[33]</label><mixed-citation publication-type="journal"><string-name><surname>Mann</surname>, <given-names>H. B.</given-names></string-name> and <string-name><surname>Whitney</surname>, <given-names>D. R.</given-names></string-name> (<year>1947</year>). <article-title>On a test of whether one of two random variables is stochastically larger than the other</article-title>. <source>The Annals of Mathematical Statistics</source> <fpage>50</fpage>–<lpage>60</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1214/aoms/1177730491" xlink:type="simple">https://doi.org/10.1214/aoms/1177730491</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=0022058">MR0022058</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_034">
<label>[34]</label><mixed-citation publication-type="chapter"><string-name><surname>Mueller</surname>, <given-names>J. W.</given-names></string-name> and <string-name><surname>Jaakkola</surname>, <given-names>T.</given-names></string-name> (2015). <chapter-title>Principal differences analysis: Interpretable characterization of differences between distributions</chapter-title>. In: <source>Advances in Neural Information Processing Systems</source> <volume>28</volume>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_035">
<label>[35]</label><mixed-citation publication-type="book"><string-name><surname>Oja</surname>, <given-names>H.</given-names></string-name> (<year>2010</year>). <source>Multivariate Nonparametric Methods with R: An Approach Based on Spatial Signs and Ranks</source>. <publisher-name>Springer Science &amp; Business Media</publisher-name>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/978-1-4419-0468-3" xlink:type="simple">https://doi.org/10.1007/978-1-4419-0468-3</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=2598854">MR2598854</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_036">
<label>[36]</label><mixed-citation publication-type="journal"><string-name><surname>Pan</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Tian</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Wang</surname>, <given-names>X.</given-names></string-name> and <string-name><surname>Zhang</surname>, <given-names>H.</given-names></string-name> (<year>2018</year>). <article-title>Ball divergence: nonparametric two sample test</article-title>. <source>Annals of Statistics</source> <volume>46</volume>(<issue>3</issue>) <fpage>1109</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1214/17-AOS1579" xlink:type="simple">https://doi.org/10.1214/17-AOS1579</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=3797998">MR3797998</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_037">
<label>[37]</label><mixed-citation publication-type="journal"><string-name><surname>Pandit</surname>, <given-names>P. V.</given-names></string-name>, <string-name><surname>Kumari</surname>, <given-names>S.</given-names></string-name> and <string-name><surname>Javali</surname>, <given-names>S.</given-names></string-name> (<year>2014</year>). <article-title>Tests for two-sample location problem based on subsample quantiles</article-title>. <source>Open Journal of Statistics</source> <volume>2014</volume>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_038">
<label>[38]</label><mixed-citation publication-type="journal"><string-name><surname>Stephenson</surname>, <given-names>W. R.</given-names></string-name> and <string-name><surname>Ghosh</surname>, <given-names>M.</given-names></string-name> (<year>1985</year>). <article-title>Two sample nonparametric tests based on subsamples</article-title>. <source>Communications in Statistics-Theory and Methods</source> <volume>14</volume>(<issue>7</issue>) <fpage>1669</fpage>–<lpage>1684</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/03610928508829003" xlink:type="simple">https://doi.org/10.1080/03610928508829003</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=0801632">MR0801632</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_039">
<label>[39]</label><mixed-citation publication-type="journal"><string-name><surname>Rosenbaum</surname>, <given-names>P. R.</given-names></string-name> (<year>2005</year>). <article-title>An exact distribution-free test comparing two multivariate distributions based on adjacency</article-title>. <source>Journal of the Royal Statistical Society: Series B (Statistical Methodology)</source> <volume>67</volume>(<issue>4</issue>) <fpage>515</fpage>–<lpage>530</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1111/j.1467-9868.2005.00513.x" xlink:type="simple">https://doi.org/10.1111/j.1467-9868.2005.00513.x</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=2168202">MR2168202</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_040">
<label>[40]</label><mixed-citation publication-type="journal"><string-name><surname>Rousson</surname>, <given-names>V.</given-names></string-name> (<year>2002</year>). <article-title>On distribution-free tests for the multivariate two-sample location-scale model</article-title>. <source>Journal of Multivariate Analysis</source> <volume>80</volume>(<issue>1</issue>) <fpage>43</fpage>–<lpage>57</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1006/jmva.2000.1981" xlink:type="simple">https://doi.org/10.1006/jmva.2000.1981</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=1889832">MR1889832</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_041">
<label>[41]</label><mixed-citation publication-type="other"><string-name><surname>Song</surname>, <given-names>H.</given-names></string-name> and <string-name><surname>Chen</surname>, <given-names>H.</given-names></string-name> (2020). Generalized kernel two-sample tests. <italic>arXiv preprint</italic> <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/arXiv:2011.06127"><italic>arXiv:2011.06127</italic></ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_042">
<label>[42]</label><mixed-citation publication-type="journal"><string-name><surname>Székely</surname>, <given-names>G. J.</given-names></string-name> and <string-name><surname>Rizzo</surname>, <given-names>M. L.</given-names></string-name> (<year>2013</year>). <article-title>Energy statistics: A class of statistics based on distances</article-title>. <source>Journal of Statistical Planning and Inference</source> <volume>143</volume>(<issue>8</issue>) <fpage>1249</fpage>–<lpage>1272</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.jspi.2013.03.018" xlink:type="simple">https://doi.org/10.1016/j.jspi.2013.03.018</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=3055745">MR3055745</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_043">
<label>[43]</label><mixed-citation publication-type="book"><string-name><surname>Villani</surname>, <given-names>C.</given-names></string-name> (<year>2009</year>) <source>Optimal Transport: Old and New</source> <volume>338</volume>. <publisher-name>Springer</publisher-name>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/978-3-540-71050-9" xlink:type="simple">https://doi.org/10.1007/978-3-540-71050-9</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=2459454">MR2459454</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds54_ref_044">
<label>[44]</label><mixed-citation publication-type="other"><string-name><surname>Yamada</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Wu</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Tsai</surname>, <given-names>Y.-H. H.</given-names></string-name>, <string-name><surname>Takeuchi</surname>, <given-names>I.</given-names></string-name>, <string-name><surname>Salakhutdinov</surname>, <given-names>R.</given-names></string-name> and <string-name><surname>Fukumizu</surname>, <given-names>K.</given-names></string-name> (2018). Post selection inference with incomplete maximum mean discrepancy estimator. <italic>arXiv preprint</italic> <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/arXiv:1802.06226"><italic>arXiv:1802.06226</italic></ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_045">
<label>[45]</label><mixed-citation publication-type="journal"><string-name><surname>Zhang</surname>, <given-names>K.</given-names></string-name> (<year>2019</year>). <article-title>BET on Independence</article-title>. <source>Journal of the American Statistical Association</source> <volume>114</volume>(<issue>528</issue>) <fpage>1620</fpage>–<lpage>1637</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1080/01621459.2018.1537921" xlink:type="simple">https://doi.org/10.1080/01621459.2018.1537921</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds54_ref_046">
<label>[46]</label><mixed-citation publication-type="other"><string-name><surname>Zhang</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Zhao</surname>, <given-names>Z.</given-names></string-name> and <string-name><surname>Zhou</surname>, <given-names>W.</given-names></string-name> (2021). BEAUTY powered BEAST. <italic>arXiv preprint</italic> <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/arXiv:2103.00674"><italic>arXiv:2103.00674</italic></ext-link>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
