The New England Journal of Statistics in Data Science

New Perspectives on Centering
Volume 1, Issue 2 (2023), pp. 216–236
Jack Prothero, Jan Hannig, J.S. Marron

https://doi.org/10.51387/23-NEJSDS31
Pub. online: 26 April 2023
Type: Methodology Article
Open Access
Area: Statistical Methodology

Accepted: 20 March 2023
Published: 26 April 2023

Abstract

Data matrix centering is an ever-present yet under-examined aspect of data analysis. Functional data analysis (FDA) often operates with a default of centering such that the vectors in one dimension have mean zero. We find that centering along the other dimension identifies a novel useful mode of variation beyond those familiar in FDA. We explore ambiguities in both matrix orientation and nomenclature. Differences between centerings and their potential interaction can be easily misunderstood. We propose a unified framework and new terminology for centering operations. We clearly demonstrate the intuition behind and consequences of each centering choice with informative graphics. We also propose a new direction energy hypothesis test as part of a series of diagnostics for determining which choice of centering is best for a data set. We explore the application of these diagnostics in several FDA settings.
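To make the distinction between centering directions concrete, here is a minimal NumPy sketch on a toy matrix. The function names (`center_rows`, `center_columns`, `center_both`) are hypothetical and not taken from the article, and the sketch deliberately does not say which axis corresponds to "object" or "trait" centering, since that depends on the matrix orientation the abstract flags as ambiguous.

```python
import numpy as np


def center_rows(X):
    """Subtract each row's mean, so every row has mean zero."""
    return X - X.mean(axis=1, keepdims=True)


def center_columns(X):
    """Subtract each column's mean, so every column has mean zero."""
    return X - X.mean(axis=0, keepdims=True)


def center_both(X):
    """Double centering: remove row and column means, add back the grand mean."""
    row_means = X.mean(axis=1, keepdims=True)
    col_means = X.mean(axis=0, keepdims=True)
    return X - row_means - col_means + X.mean()


# Toy 4 x 6 data matrix (illustrative random data, not from the article).
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, size=(4, 6))

R = center_rows(X)     # rows have mean zero; columns generally do not
C = center_columns(X)  # columns have mean zero; rows generally do not
D = center_both(X)     # both rows and columns have mean zero
```

Applying the two single centerings in sequence, in either order, gives the same matrix as `center_both`, which is one simple way to see how the two operations interact.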



Copyright
© 2023 New England Statistical Society
Open access article under the CC BY license.

Keywords
Data Matrix, Object Centering, Trait Centering, Functional Data Analysis

Funding
Research partially supported by the National Science Foundation under Grant Nos. IIS-1633074, DMS-1916115, DMS-2113404, and DMS-2210337.

  • ISSN: 2693-7166