New Perspectives on Centering
Volume 1, Issue 2 (2023), pp. 216–236
Pub. online: 26 April 2023
Type: Statistical Methodology
Open Access
Accepted: 20 March 2023
Published: 26 April 2023
Abstract
Data matrix centering is an ever-present yet under-examined aspect of data analysis. Functional data analysis (FDA) often operates with a default of centering such that the vectors in one dimension have mean zero. We find that centering along the other dimension identifies a novel useful mode of variation beyond those familiar in FDA. We explore ambiguities in both matrix orientation and nomenclature. Differences between centerings and their potential interaction can be easily misunderstood. We propose a unified framework and new terminology for centering operations. We clearly demonstrate the intuition behind and consequences of each centering choice with informative graphics. We also propose a new direction energy hypothesis test as part of a series of diagnostics for determining which choice of centering is best for a data set. We explore the application of these diagnostics in several FDA settings.
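To make the distinction between the two centering directions concrete, the following is a minimal sketch in plain Python. It is illustrative only and is not taken from the paper: the function names (`center_columns`, `center_rows`, `center_both`) and the small data matrix are assumptions, and the sketch does not use the paper's proposed terminology or its diagnostics. It shows only that subtracting column means and subtracting row means are different operations, and that applying both leaves every row and every column with mean zero.

```python
from statistics import fmean


def center_columns(matrix):
    """Subtract each column's mean, so every column has mean zero."""
    col_means = [fmean(col) for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, col_means)] for row in matrix]


def center_rows(matrix):
    """Subtract each row's mean, so every row has mean zero."""
    centered = []
    for row in matrix:
        row_mean = fmean(row)
        centered.append([x - row_mean for x in row])
    return centered


def center_both(matrix):
    """Column-center, then row-center (often called double centering)."""
    return center_rows(center_columns(matrix))


# Hypothetical 3 x 4 data matrix, chosen only for illustration.
data = [
    [1.0, 2.0, 3.0, 6.0],
    [2.0, 4.0, 6.0, 8.0],
    [3.0, 9.0, 0.0, 4.0],
]

col_centered = center_columns(data)
row_centered = center_rows(data)
double_centered = center_both(data)
```

Which of the two dimensions holds the data objects depends on the matrix orientation, which is one of the ambiguities the abstract mentions; the sketch deliberately names the operations by the axis they act on rather than by the role of that axis.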