"Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it." (Dan Ariely)
The above quote from the Duke University Professor of Psychology and Behavioural Economics Dan Ariely is not only funny, but it also says something about the 'big data hype' that we've been witnessing in the past couple of years. It seems like everyone indeed is talking about it, and everyone - companies, organisations, academics - should be doing it. Some of the reasons for the hype, and for the justifications as to why everyone should be doing it, have to do with the mythology around big data. In the words of danah boyd and Kate Crawford, that mythology is about "the widespread belief that large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy." (p. 663)
Regardless of the hype, the big data phenomenon has not been without its critics - and several headlines have gone as far as to suggest that big data is, in fact, already dead. While the reports of the death of big data do seem to be exaggerated, more and more critical voices are emerging and it seems like a broader public debate on the phenomenon is finally in the making. Cathy O'Neil's new book Weapons of Math Destruction is a timely and highly accessible intervention in the discussion.
O'Neil is a data scientist with a PhD in mathematics from Harvard, also known for her blog mathbabe. After her PhD, she taught at Barnard College, and then moved to the private sector, working as a Wall Street quant for "the Harvard of hedge funds" (p. 33), i.e. the leading hedge fund D. E. Shaw, and later at various start-ups, where she worked on models predicting people's online clicks and purchases.
In her book, O'Neil discusses what she calls Weapons of Math Destruction (WMDs), i.e. harmful, opaque mathematical models which "are, by design, inscrutable black boxes" (p. 29) and which are, due to their opaque nature, difficult to hold accountable. The book covers different domains in society and everyday life - such as education, health, online advertising, the justice system, and insurance - where O'Neil shows the disastrous effects of bad models. She debunks the big data mythology - how "the Big Data economy (...) promised spectacular gains" and "not only saved time but also was marketed as fair and objective. After all, it didn't involve prejudiced humans digging through reams of paper, just machines processing cold numbers." (p. 3)
A self-described 'math nerd', O'Neil was left disillusioned after the 2008 financial crash, having witnessed through her job as a Wall Street quant what kind of damage her beloved mathematics had helped cause. "It was when the markets collapsed in 2008 that the ugly truth struck home in a big way. (...) the finance industry was in the business of creating WMDs, and I was playing a small part." (p. 35) She realised how mathematics "was not only deeply entangled in the world's problems but also fueling many of them. The housing crisis, the collapse of major financial institutions, the rise of unemployment - all had been aided and abetted by mathematicians wielding magic formulas." (p. 2)
"Big Data has plenty of evangelists, but I'm not one of them."
Based on her extensive experience in the world of big data and algorithms, O'Neil concludes that "Big Data has plenty of evangelists, but I'm not one of them." (p. 13). Through her story of personal disillusionment as a data scientist, and at times truly shocking cases of models gone wrong, she illustrates how the myth about the 'neutrality' of big data is just that - a myth. As the book makes clear, "models, despite their reputation for impartiality, reflect goals and ideology." (p. 21)
O'Neil's book is also very timely reading in relation to recent events such as the heated discussion on fake news and its spread on social media - for instance the kinds of 'news' that were spread on Facebook at the time of the 2016 US presidential election. O'Neil herself addresses another algorithmic social media phenomenon: the micro-targeting of potential voters by politicians on social media and its potential consequences. Micro-targeting, says O'Neil, "in part, explains why in 2015 more than 43 percent of Republicans, according to a survey, still believed the lie that President Obama is a Muslim." (p. 194) She points out that "As [microtargeting] happens, it will become harder to access the political messages our neighbors are seeing - and as a result, to understand why they believe what they do, often passionately." (p. 195) The kinds of new powerful models of political messaging that O'Neil discusses can thus contribute to the creation of the comfortable echo chambers that social media users find themselves in, oblivious to what is going on outside of their bubbles.
Apart from its timeliness and accessible style, what makes Weapons of Math Destruction such an interesting book is that it's written by an 'insider' - a data scientist who knows how things work, and who speaks from experience. O'Neil makes a very compelling case for people to start asking more questions and demanding more transparency regarding the kinds of algorithmic models that are being used to regulate so many aspects of their lives - in the hope of creating big data models "that follow our ethical lead. Sometimes that will mean putting fairness ahead of profit." (p. 204)
While Weapons of Math Destruction focuses more on problems than solutions, this is a book that can be recommended to everybody: O'Neil does a kind of public service in explaining, in an engaging and accessible manner, the ways in which highly opaque models lead to harmful effects on individuals, and what exactly is at stake. She shows how models can disadvantage the already disadvantaged, perpetuating inequalities, and how fatal the consequences can be on the level of (often unassuming) individuals. Rather than providing a critique on an abstract level, O'Neil makes the discussion understandable for the mathematically impaired among us through concrete examples, such as good teachers being fired due to the use of poor models in evaluating their performance as educators. The book powerfully shows why we need to "get a grip on our techno-utopia, that unbounded and unwarranted hope in what algorithms and technology can accomplish." (pp. 207-208)