Statisticians Against Humanity

There is a famous saying, frequently misattributed to Joseph Stalin, concerning the difference between tragedy and statistics: The death of one man is a tragedy, but the death of ten thousand men is a statistic. Beneath this insight lies an even deeper one; namely, that the habit of treating people as statistics can blind us to the possibility of tragedy, in effect dehumanizing not only the human data points that make up graphs, pie charts, and confidence intervals but also the statisticians who seek to render reality in such abstract and quantitative terms. This lesson is amply demonstrated by the stories of three of the most important figures in the history of statistics, Francis Galton, Karl Pearson, and Ronald Fisher. Each was a mathematical genius, yet each became so habituated to handling human beings in aggregate terms of categories and frequencies that they found themselves mired in eugenics and racism.

Francis Galton, first cousin to Charles Darwin, ranks as one of the 19th century’s great polymaths. Not a great success at school, he made seminal contributions in fields as diverse as biology, criminology, geography, meteorology, psychology, psychometrics, and statistics, largely thanks to his obsession with measurement and quantitative analysis. He was known for saying, “Wherever you can, count.” Statistically, Galton made foundational contributions to our understanding of standard deviation, correlation, regression analysis, and regression toward the mean. Galton’s credentials as a eugenicist are unsurpassed, in part because he coined the term in 1883. He also argued that the institution of marriage should not be allowed to interfere with improvements in the human stock, writing that “marriage places no restraint on debauchery as long as it is monogamic.” By contrast, eugenic breeding would

protect the mothers and fathers of the race from any abuse of their relations. As to the domestic and sympathetic function of marriage, or even its selfishly sexual function, we need not interfere with that. What we need is freedom for [well-born] people who have never seen each other before, and never intend to see one another again, to produce children under certain definite public conditions, without the loss of honor.

Galton’s racism was explicit. He defined eugenics as “the science that deals with all influences that improve the inborn qualities of a race; also those that develop them to the utmost advantage,” saying that it provided the means “to give more suitable races a better chance of prevailing speedily over the less suitable.” He advocated for the reduction and eventual elimination of lesser races, decrying what he called the “unreasonable sentiment”

against the gradual extinction of an inferior race. It rests on some confusion between the race and the individual, as if the destruction of a race was equivalent to the destruction of a large number of men. It is nothing of the kind when the process of extinction works silently and slowly through [the control of reproduction].

Galton’s most important acolyte was Karl Pearson, who studied mathematics, physics, evolutionary biology, law, history, and German before garnering a professorship in mathematics and geometry, authoring a three-volume biography of Galton, and becoming the first holder of the Galton Chair of Eugenics at the University of London. Pearson’s contributions in statistics are extensive and include the founding of the first university statistics department, the development of the chi-square test, the concept of the p-value, and the introduction of the Pearson correlation coefficient, among many others.

Unsurprisingly, Pearson’s approach to eugenics was highly statistical. For example, he developed a proof that, on average, it is twice as good to have a fit parent as a fit grandparent. He sought to advance the fortunes of the British people, writing that “The student of national genetics desires in every way to improve and strengthen his own nation. He would do this by intra-national selection for parentage, and by the admission wherever and whenever possible of superior brains and muscles into his own country.”

Pearson’s racism made him an ardent proponent of colonialism. As the science of eugenics developed, he believed, it would help Britain to advance its domination and thereby promote the flourishing of a superior people. By taking land and resources from “dark-skinned tribes,” who had little grasp of how to use them to good effect, Britain and other colonial powers were advancing the triumph of the fittest groups of human beings over “inferior races.” He wrote, “The time is coming when we must consciously carry out that purification of the state and race which has hitherto been the work of the unconscious cosmic process. The higher patriotism and pride of race must come to our aid in stemming deterioration.”

Like Galton and Pearson, Fisher was a polymath who excelled in mathematics, statistics, and genetics, among other disciplines, and became the Galton Professor of Eugenics at University College London before accepting a professorship of genetics at Cambridge. His contributions to statistics include the principle of randomization, the analysis of variance (ANOVA), which made it possible to vary multiple factors in an experiment simultaneously, and his work on Student’s t-distribution, which is widely used throughout statistics. Fisher founded the Cambridge Eugenics Society, and during his third year of undergraduate studies he highlighted the merits of Galton’s views that

It is of the utmost importance to select [superior] men from whatever class they may be born in, to enable them to rise in the world, to encourage them to marry women of their own intellectual class, and above all to see that their birth-rate is higher than that of the general population . . . , but at present, there is no doubt that the birth-rate of the most valuable classes is considerably lower than that of the population in general.

Fisher’s views on race were somewhat nuanced. He dissented from a 1950s United Nations Educational, Scientific, and Cultural Organization statement on race because, despite good intentions, it overlooked “real differences” that exist between groups of people. Fisher admitted that genetic differences in mental capacity may be less important than those caused by tradition and training, yet held that

In view of the admitted existence of some physically expressed hereditary differences of a conspicuous nature, between the averages or medians of the races, it would be strange if there were not also some hereditary differences affecting the mental characteristics which develop in a given environment. . . . To the great majority of geneticists, it seems absurd to suppose that psychological characteristics are subject to entirely different laws of heredity than other biological characteristics.

How might the deepest possible immersion in statistics predispose bright minds to eugenics and racism? For one thing, statistics deals with human beings in highly abstract terms. The human being is analyzed—etymologically, “cut up”—into various measurable parameters. The statistician then collects data on each parameter and looks for correlations between them—within individuals, between individuals, and across large groups of people. Individual human beings with distinctive characteristics hold little interest, precisely because their distinctiveness makes them resistant to categorization. Statisticians look at the world through a statistical lens and naturally end up viewing their subjects in quantitative terms. From a statistical point of view, there is little to object to. From a moral point of view, however, the situation looks quite different.

Suppose, for example, that a human being is a largely qualitative—as opposed to quantitative—phenomenon. To be sure, we can know someone’s body weight, life expectancy, intelligence quotient, and annual income, but even when we have compiled all such quantitative data, a vast residuum of personality, character, and biography remains unaccounted for. The same can be said about human relationships. A marriage may be described in terms of many quantitative parameters, but no data set can capture such a largely qualitative reality. Likewise, categories into which people can be assigned tell us something about them, but the range of traits within demographic groups often equals or exceeds the range between them. There is a lot more to a human being than statistics, and because statistics overlooks so many distinctive characteristics, it often dehumanizes those it presumes to account for.

Galton, Pearson, and Fisher were ardent partisans of measurement and the aggregation and statistical analysis of data. In their view, the most important truths about humankind emerge from the study of human beings en masse. But many significant insights, including some of the most salient of all, emerge only when we consider humans as individuals. What if the most important scale of study often requires a sample size not of hundreds or thousands but of one? Sophocles, Shakespeare, and Tolstoy offered unsurpassed insights into human life but did so while eschewing quantification and statistical analysis. Properly applied, statistics can enlighten us, but to regard statistics as the best or only window on human reality is to engage in an essentially dehumanizing project with moral and political consequences that can prove nothing short of disastrous.