Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy
Cathy O'Neil
Cathy O’Neil was my professor for number theory in college, I think in 2006. I thought she was a great teacher, but didn’t keep in touch at all after the class. I was somewhat aware that she was involved with Occupy Wall Street’s financial policy arm, and after I heard about this book, also learned that she had been co-hosting a podcast on Slate (which she is now about to leave, though I still have a lot of back episodes to listen to).
I’m broadly in agreement with the thrust of her argument in this book, and my only quibbles concern the ways I think it necessarily simplifies things in order to work as popular-consumption nonfiction. The book’s main argument is that people frequently conflate “algorithmic decision-making” with “neutral decision-making”, and that this is a fallacy (one that algorithm purveyors are, for the most part, happy to perpetuate). As “big data” and quantitative models get applied to more and more aspects of everyday life, it’s incumbent on us to understand this, and to consider the ways in which algorithmic decision-making can be problematic, biased, or dangerous.
O’Neil describes three features that characterize a “weapon of math destruction”: scale, secrecy, and destructiveness. We mostly only need to worry about algorithms that have all three of these features, at least to some degree; a model that is in limited use, transparent, or harmless is not much of a cause for concern. She gives examples of algorithms in many fields that she sees as meeting all three criteria. One example is recidivism risk modeling, which is now used in many states to determine, at least in part, criminal sentencing. Errors or bias in these models may result in additional years behind bars for the individuals they are applied to, and the models are both widespread and not publicly disclosed. There are many other interesting (/troubling) examples in the book, such as teacher value-added models.
An emergent property of many such algorithms is that they may engender undesirable feedback loops. For example, a recidivism risk model will be biased against black people if it is trained on historical data that reflect an environment characterized by bias against black people. (If black people are more heavily prosecuted in general, they are likely to appear as higher recidivism risks, and even if the algorithm doesn’t use race directly, it will pick up on correlated factors and amount to the same thing.) Such problems persist largely because of the opacity characteristic: if an algorithm is publicly disclosed, people can bring scrutiny to it and highlight flaws of this kind.
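To make the mechanism concrete, here is a small simulation I put together (my own toy construction, not anything from the book; all the numbers are invented): two groups have identical underlying behavior, but one group’s neighborhoods are more heavily policed. A scoring rule that looks only at recorded arrests, never at group membership, still ends up rating that group as riskier, and the extra scrutiny its scores attract reinforces the gap each round.

```python
# Hypothetical sketch of the proxy/feedback-loop problem described above.
# All quantities are made up; the point is the mechanism, not the magnitudes.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Two groups with identical underlying behavior...
group = rng.integers(0, 2, n)           # 0 or 1; true offending rate is the same
true_offense_rate = np.full(n, 0.10)

# ...but group 1 lives in more heavily policed neighborhoods, so offenses
# are more likely to be *recorded* as arrests (a proxy the model can see).
policing_intensity = np.where(group == 1, 0.9, 0.4)

def one_round(prior_arrests):
    """Score people on recorded history, then police the 'high risk' more."""
    # The 'model' never sees group, only recorded arrests (the proxy).
    risk_score = prior_arrests / (prior_arrests.max() + 1e-9)
    # Feedback: higher scores draw more scrutiny, raising detection odds.
    detection = np.clip(policing_intensity + 0.3 * risk_score, 0, 1)
    offended = rng.random(n) < true_offense_rate
    new_arrests = prior_arrests + (offended & (rng.random(n) < detection))
    return new_arrests, risk_score

arrests = np.zeros(n)
for _ in range(5):
    arrests, score = one_round(arrests)

for g in (0, 1):
    print(f"group {g}: mean risk score {score[group == g].mean():.3f}")
# Despite identical true behavior, group 1 ends up scored as riskier, and the
# added scrutiny its scores attract keeps reinforcing the gap.
```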
The issue of model opacity is an especially interesting one to me, as someone who works on regulatory financial models that are intentionally not disclosed. There are strong reasons not to disclose models used for high-stakes decisions (in my case, setting minimum capital levels for banks). A primary concern is that a transparent algorithm will be “gamed”, in the sense that those subject to it will figure out ways to make themselves “look good” to the algorithm that are driven more by the details and limitations of the model than by the underlying substance. A second concern is that a transparent regulatory model can encourage a “monoculture”, in which those subject to it simply adopt the model for themselves rather than developing their own models that, while still flawed, would have different flaws than the regulatory model.
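As a toy illustration of the gaming concern (entirely my own, with made-up risk weights and a deliberately simplified formula, not any actual regulatory model): once the scoring rule is public, a bank can reduce its measured requirement by relabeling or repackaging exposures, without changing its actual economic risk.

```python
# Toy illustration of "gaming" a disclosed formula. The buckets, weights, and
# 8% ratio are invented stand-ins, not a real capital rule.

RISK_WEIGHTS = {"sovereign": 0.0, "mortgage": 0.5, "corporate": 1.0}  # public

def required_capital(portfolio, ratio=0.08):
    """Capital requirement = ratio * sum(exposure * risk weight)."""
    return ratio * sum(amt * RISK_WEIGHTS[bucket] for bucket, amt in portfolio.items())

# The same $300 of economically similar credit risk, labeled two different ways.
before = {"corporate": 300.0}
after = {"corporate": 100.0, "mortgage": 150.0, "sovereign": 50.0}  # repackaged

print(f"{required_capital(before):.2f}")  # 24.00
print(f"{required_capital(after):.2f}")   # 14.00 -- lower requirement, same risk
```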
I don’t think there is an obvious solution to this transparency problem. One solution that I definitely don’t agree with is to eschew quantitative decision-making altogether. As O’Neil clearly states in the book, we shouldn’t assume that pre-algorithmic decision-making was unbiased either; it seems quite apparent that, for example, there is bias in judgmental sentencing, perhaps more than in algorithmic sentencing. O’Neil herself has proposed one solution (which I don’t think she really discusses in the book): she has started a consulting company whose aim is to audit existing algorithms for potential biases or other damaging impacts. This would allow some degree of independent assessment without disclosing the model generally. I think this is an interesting idea and I hope it takes off, but there are limitations. In the private sector in particular, it’s not clear why a company would voluntarily request such an audit, especially if an algorithm is making it lots of money. Fear of regulatory penalties could be one motivation, but we’re clearly entering an era of deregulation. Regulators themselves might force audits, but again that requires a strong regulator (also, who watches the watchers?).
One approach that I think could make sense in at least some contexts is a hybrid algorithmic-judgmental process. (At least for now, hybrid processes are often the most effective in many fields; in chess, for example, the strongest player is neither a computer nor a human alone, but a human assisted by a computer.) To take the example of recidivism risk, we might have a publicly disclosed algorithm that produces a publicly disclosed recommendation to the judge (perhaps a recommended sentencing range). The judge may then choose to depart from the recommendation, but needs to give a written description of her reasoning for doing so. In this way, the algorithm can be audited by any outside party for potential biases, while the final judgmental step serves as insurance against flagrant cases of gaming the system, or against cases with significant factors that the algorithm does not consider.
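Here is a minimal sketch of what that workflow could look like (again, my own hypothetical construction; the model, the range mapping, and the field names are all invented): the disclosed model yields a recommended range, and any sentence outside that range is rejected unless a written rationale is attached.

```python
# Hypothetical sketch of the hybrid process described above: a disclosed model
# produces a recommended range, and a judge may depart from it only with a
# written justification, which is recorded so outside parties can audit both
# the algorithm and the overrides.
from dataclasses import dataclass, field

@dataclass
class SentencingDecision:
    defendant_id: str
    recommended_range: tuple[float, float]   # (min_years, max_years) from the public model
    sentence_years: float
    judge_rationale: str = ""
    departed: bool = field(init=False)

    def __post_init__(self):
        lo, hi = self.recommended_range
        self.departed = not (lo <= self.sentence_years <= hi)
        if self.departed and not self.judge_rationale.strip():
            raise ValueError("Departing from the recommended range requires a written rationale.")

def recommend_range(risk_score: float) -> tuple[float, float]:
    """Stand-in for the publicly disclosed model: maps a 0-1 risk score to a range."""
    return (2.0 + 6.0 * risk_score, 4.0 + 8.0 * risk_score)

# Within the recommended range: no rationale required.
ok = SentencingDecision("A-123", recommend_range(0.25), sentence_years=4.0)

# A departure without a rationale raises an error; with one, it is logged for audit.
flagged = SentencingDecision("B-456", recommend_range(0.25), sentence_years=9.0,
                             judge_rationale="Aggravating factors not captured by the model.")
print(ok.departed, flagged.departed)   # False True
```

The design point is the audit trail: both the public recommendation and any written departures are recorded, so the algorithm and the human overrides can each be scrutinized.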
As I said earlier, my only real quibbles with the book concern simplifications that I think are a reality of publishing nonfiction for popular consumption. For example, the “WMD” terminology encourages us to think in a binary way (is it or isn’t it one?), rather than seeing a continuum from OK to troubling, which I think is a better reflection of reality.
Finally, I’ll add that this book, published in September 2016, proved to be quite prescient. The public controversy about “fake news” and the Facebook newsfeed algorithm arose shortly after its publication. Interestingly, I think before this happened, few would have identified the newsfeed algorithm as a potential WMD, because the vector for “destructiveness” was non-obvious. O’Neil has written some articles on this topic since the publication of the book, which are worth looking up.