Fairness and Randomness in Machine Learning: Statistical Independence and Relativization
Pub. online: 19 November 2024
Type: Machine Learning And Data Mining
Open Access
Published
19 November 2024
19 November 2024
Abstract
Fair Machine Learning endeavors to prevent unfairness arising in the context of machine learning applications embedded in society. To this end, several mathematical fairness notions have been proposed. The most known and used notions turn out to be expressed in terms of statistical independence, which is taken to be a primitive and unambiguous notion. However, two choices remain (and are largely unexamined to date): what exactly is the meaning of statistical independence and what are the groups to which we ought to be fair? We answer both questions by leveraging Richard Von Mises’ theory of probability, which starts with data, and then builds the machinery of probability from the ground up. In particular, his theory places a relative definition of randomness as statistical independence at the center of statistical modelling. Much in contrast to the classically used, absolute i.i.d.-randomness, which turns out to be “orthogonal” to his conception. We show how Von Mises’ frequential modeling approach fits well to the problem of fair machine learning and show how his theory (suitably interpreted) demonstrates the equivalence between the contestability of the choice of groups in the fairness criterion and the contestability of the choice of relative randomness. We thus conclude that the problem of being fair in machine learning is precisely as hard as the problem of defining what is meant by being random. In both cases there is a consequential choice, yet there is no universal “right” choice possible.
References
Altman, A. Discrimination. In (E. N. Zalta, ed.) The Stanford Encyclopedia of Philosophy (Summerition), 2020th ed. Stanford University (2020). https://plato.stanford.edu/entries/discrimination/
Ambos-Spies, K., Mayordomo, E., Wang, Y. and Zheng, X. Resource-bounded balanced genericity, stochasticity and weak randomness. In Annual Symposium on Theoretical Aspects of Computer Science 61–74. Springer (1996). MR1462086
Bandyopadhyay, P. S. and Forster, M. R. Philosophy of statistics: An introduction. In (P. S. Bandyopadhyay and M. R. Forster, eds.) Philosophy of Statistics 7. Elsevier (2011). https://doi.org/10.1016/B978-0-444-51862-0.50001-0. MR3295937
Solon Barocas, Hardt, M. and Narayanan, A. Fairness and Machine Learning: Limitations and Opportunities (2019). http://www.fairmlbook.org
Bennett, D. Defining randomness. In (P. S. Bandyopadhyay and M. R. Forster, eds.) Philosophy of Statistics 7. Elsevier (2011). https://doi.org/10.1016/B978-0-444-51862-0.50001-0. MR3295937
Berkovitz, J., Frigg, R. and Kronz, F. The ergodic hierarchy, randomness and hamiltonian chaos. Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics 37(4) 661–691 (2006). https://doi.org/10.1016/j.shpsb.2006.02.003. MR2344115
Bienvenu, L., Shafer, G. and Shen, A. On the history of martingales in the study of randomness. Electronic Journal for History of Probability and Statistics 5(1) 1–40 (2009). MR2520666
Buss, S. and Minnes, M. Probabilistic algorithmic randomness. The Journal of Symbolic Logic 78(2) 579–601 (2013). MR3145197
Toon, C. and Verwer, S. Three naive Bayes approaches for discrimination-free classification. Data Mining and Knowledge Discovery 21(2) 277–292 (2010). https://doi.org/10.1007/s10618-010-0190-x. MR2720507
Casella, G. and Berger, R. L. Statistical Inference 2nd ed. Duxbury Advanced Series (2002). MR1051420
Chaitin, G. J. On the length of programs for computing finite binary sequences. Journal of the ACM 13(4) 547–569 (1966). https://doi.org/10.1145/321356.321363. MR0210520
Chichilnisky, G. The foundations of statistics with black swans. Mathematical Social Sciences 59(2) 184–192 (2010). https://doi.org/10.1016/j.mathsocsci.2009.09.007. MR2650318
Church, A. On the concept of a random sequence. Bulletin of the American Mathematical Society 46(2) 130–135 (1940). https://doi.org/10.1090/S0002-9904-1940-07154-X. MR0000911
Dawid, P. On individual risk. Synthese 194(9) 3445–3474 (2017). https://doi.org/10.1007/s11229-015-0953-4. MR3704899
De Cooman, G. and De Bock, J. Randomness is inherently imprecise. International Journal of Approximate Reasoning 141. 28–68 (2022). https://doi.org/10.1016/j.ijar.2021.06.018. MR4364895
De Moivre, A. The Doctrine of Chances: A Method of Calculating the Probabilities of Events in Play 2nd ed. Frank Cass and Company Limited (1738/1967). MR0231695
Deutsch, D. The Beginning of Infinity: Explanations that Transform the World. Penguin (2011). MR2984795
Devroye, L., Györfi, L. and Lugosi, G. A Probabilistic Theory of Pattern Recognition. Springer (1996). https://doi.org/10.1007/978-1-4612-0711-5. MR1383093
Downey, R. G. and Hirschfeldt, D. R. Algorithmic Randomness and Complexity. Springer (2010). https://doi.org/10.1007/978-0-387-68441-3. MR2732288
Durrett, R. Probability: Theory and Examples 5th ed. Cambridge University Press (2019). https://doi.org/10.1017/9781108591034. MR3930614
Dwork, C., Hardt, M., Pitassi, T., Reingold, O. and Zemel, R. Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference – ITCS 2012 214–226 (2012). MR3388391
Fine, T. L. Theories of Probability: An Examination of Foundations. Academic Press (1973). MR0433529
Fröhlich, C., Derr, R. and Williamson, R. C. Towards a strictly frequentist theory of imprecise probability. In (E. Miranda, I. Montes, E. Quaeghebeur and B. Vantaggi, eds.) Proceedings of the Thirteenth International Symposium on Imprecise Probability: Theories and Applications. Proceedings of Machine Learning Research 215 230–240. PMLR (2023). MR4663313
Frongillo, R. M. and Nobel, A. B. Memoryless sequences for general losses. J. Mach. Learn. Res. 21 80 (2020). MR4119148
Grädel, E. and Väänänen, J. Dependence and independence. Studia Logica 101(2) 399–410 (2013). https://doi.org/10.1007/s11225-013-9479-2. MR3038039
Gudder, S. P. Quantum probability spaces. Proceedings of the American Mathematical Society 21(2) 296–302 (1969). https://doi.org/10.2307/2036988. MR0243793
Gudder, S. P. Stochastic Methods in Quantum Mechanics. Dover Publications Inc. (1979). MR0543489
Hájek, A. Fifteen arguments against hypothetical frequentism. Erkenntnis 70(2) 211–235 (2009). https://doi.org/10.1007/s10670-009-9154-1. MR2481794
Hu, L. and Kohler-Hausmann, I. What’s sex got to do with fair machine learning? arXiv preprint arXiv:2006.01770 (2020).
Humphreys, P. W. Randomness, independence, and hypotheses. Synthese 36 415–426 (1977). https://doi.org/10.1007/BF00486105. MR0517129
Ivanenko, V. I. Decision Systems and Nonstochastic Randomness. Springer (2010). https://doi.org/10.1007/978-1-4419-5548-7. MR2606231
Kleinberg, J., Mullainathan, S. and Raghavan, M. Inherent trade-offs in the fair determination of risk scores. In Leibniz International Proceedings in Informatics, LIPIcs 67 1–22 (2017). MR3754967
Kolmogorov, A. N. Grundbegriffe de Wahrscheinlichkeitsrechnung. Springer (1933). MR0362415
Kolmogorov, A. N. Foundations of the Theory of Probability: Second English Edition. Chelsea Publishing Company (1956). MR0079843
Kolmogorov, A. N. Three approaches to the definition of the concept “quantity of information”. Problemy Peredachi Informatsii 1(1) 3–11 (1965). MR0184801
Lafferty, J. and Wasserman, L. Challenges in statistical machine learning. Statistica Sinica 16(2) 307 (2006). MR2267237
Levin, L. A. The concept of random sequence. Soviet Mathematics Doklady 14. 1413–1416 (1973). MR0366096
Loi, M., Herlitz, A. and Heidari, H. A philosophical theory of fairness for prediction-based decisions. Technical report, Politecnico di Milano, 2019. https://ssrn.com/abstract=3450300
Martin-Löf, P. The definition of random sequences. Information and Control 9(6) 602–619 (1966). MR0223179
Muchnik, A. A., Semenov, A. L. and Uspensky, V. A. Mathematical metaphysics of randomness. Theoretical Computer Science 207(2) 263–317 (1998). https://doi.org/10.1016/S0304-3975(98)00069-3. MR1643438
Narens, L. An introduction to lattice based probability theories. Journal of Mathematical Psychology 74. 66–81 (2016). https://doi.org/10.1016/j.jmp.2016.04.013. MR3552130
Nathanson, M. B. Elementary Methods in Number Theory 195. Springer (2008). MR1732941
Naylor, A. W. On decomposition theory: generalized dependence. IEEE Transactions on Systems, Man, and Cybernetics 11(10) 699–713 (1981). https://doi.org/10.1109/TSMC.1981.4308590. MR0641948
Donald and Ornstein, S. Ergodic theory, randomness, and “chaos”. Science 243(4888) 182–187 (1989). https://doi.org/10.1126/science.243.4888.182. MR0981173
Popper, K. The Logic of Scientific Discovery. Routledge (2010). MR1195353
Porter, C. P. Mathematical and philosophical perspectives on algorithmic randomness. PhD thesis, University of Notre Dame, 2012. MR3217940
Pták, P. Concrete quantum logics. International Journal of Theoretical Physics 39(3) 827–837 (2000). https://doi.org/10.1023/A:1003626929648. MR1792201
Bhaskara Rao, K. P. S. and Bhaskara Rao, M. Theory of Charges: A Study of Finitely Additive Measures. Academic Press (1983). MR0751777
Räz, T. Group fairness: Independence revisited. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency 129–137 (2021). MR4297512
Schnorr, C.-P. Zufälligkeit und Wahrscheinlichkeit: eine algorithmische Begründung der Wahrscheinlichkeitstheorie. Lecture Notes in Mathematics 218. Springer (1971). MR0414225
Schnorr, C.-P. The process complexity and effective random tests. In Proceedings of the Fourth Annual ACM Symposium on Theory of Computing 168–176 (1972). https://doi.org/10.1016/S0022-0000(73)80030-3. MR0325366
Schnorr, C.-P. A survey of the theory of random sequences. In Basic Problems in Methodology and Linguistics 193–211. Springer (1977). MR0517133
Schurz, G. and Leitgeb, H. Finitistic and frequentistic approximation of probability measures with or without σ-additivity. Studia Logica 89(2) 257–283 (2008). https://doi.org/10.1007/s11225-008-9128-3. MR2429951
Shafer, G. Discussion on hedging predictions in machine learning by A. Gammerman and V. Vovk. The Computer Journal 50(2) 164–172 (2007). https://doi.org/10.1093/comjnl/bxl066
Shafer, G. and Vovk, V. Game-Theoretic Foundations for Probability and Finance 455. John Wiley & Sons (2019). https://doi.org/10.1002/0471249696. MR1852450
Shalev-Shwartz, S. and Ben-David, S. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press (2014). MR3277164
Spohn, W. Stochastic independence, causal independence, and shieldability. Journal of Philosophical Logic 9(1) 73–99 (1980). https://doi.org/10.1007/BF00258078. MR0563250
Steinwart, I., Hush, D. and Scovel, C. Learning from dependent observations. Journal of Multivariate Analysis 100(1) 175–194 (2009). https://doi.org/10.1016/j.jmva.2008.04.001. MR2460486
Tao, T. 275a, notes 2: Product measures and independence, October 2015. https://terrytao.wordpress.com/2015/10/12/275a-notes-2-product-measures-and-independence/
Uspenskii, V. A., Semenov, A. L. and Shen, A. Kh. Can an individual sequence of zeros and ones be random? Russian Mathematical Surveys 45(1) 121 (1990). https://doi.org/10.1070/RM1990v045n01ABEH002321. MR1050929
Van Lambalgen, M. Von Mises’ definition of random sequences reconsidered. The Journal of Symbolic Logic 52(3) 725–755 (1987). https://doi.org/10.2307/2274360. MR0902987
Van Lambalgen, M. The axiomatization of randomness. The Journal of Symbolic Logic 55(3) 1143–1167 (1990). https://doi.org/10.2307/2274480. MR1071321
Ville, J. Étude critique de la notion de collectif. Gauthier-Villars (1939). MR3533075
Volchan, S. B. What is a random sequence? The American Mathematical Monthly 109(1) 46–63 (2002). https://doi.org/10.2307/2695767. MR1903512
Von Mises, R. Grundlagen der Wahrscheinlichkeitsrechnung. Mathematische Zeitschrift 5(1) 52–99 (1919). https://doi.org/10.1007/BF01203155. MR1544374
Von Mises, R. Probability, Statistics, and Truth. Dover Publications, Inc. (1981). MR0668875
Von Mises, R. and Geiringer, H. Mathematical Theory of Probability and Statistics. Academic Press (1964). MR0178486
Vovk, V. G. A logic of probability, with application to the foundations of statistics. Journal of the Royal Statistical Society: Series B (Methodological) 55(2) 317–341 (1993). MR1224399
Wasserman, L. All of Statistics: A Concise Course in Statistical Inference. Springer (2004). https://doi.org/10.1007/978-0-387-21736-9. MR2055670