‘Rogue Algorithms’ And The Dark Side Of Big Data
Most of us, unless we’re insurance actuaries or Wall Street quantitative analysts, have only a vague notion of algorithms and how they work. Yet they affect our daily lives considerably. An algorithm is a set of instructions that a computer follows to solve a problem. The hidden algorithms of Big Data might connect you with a great music suggestion on Pandora, a job lead on LinkedIn or the love of your life on Match.com.
These mathematical models are supposed to be neutral. But former Wall Street quant Cathy O’Neil, who had an insider’s view of algorithms for years, believes that they are quite the opposite. In her book, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, O’Neil says these WMDs are ticking time-bombs that are well-intended but ultimately reinforce harmful stereotypes, especially of the poor and minorities, and become “secret models wielding arbitrary punishments.”
Models and Hunches
Algorithms are not the exclusive focus of Weapons of Math Destruction. The focus is more broadly on mathematical models of the world — and on why some are healthy and useful while others grow toxic. Any model of the world, mathematical or otherwise, begins with a hunch, an instinct about a deeper logic beneath the surface of things. Here is where the human element, and our potential for bias and faulty assumptions, creeps in. To be sure, a hunch or working thesis is part of the scientific method. In this phase of inquiry, human intuition can be fruitful, provided there is a mechanism by which those initial hunches can be tested and, if necessary, corrected.
O’Neil cites the new generation of baseball metrics (a story told in Michael Lewis’s Moneyball) as a healthy example of this process. Moneyball began with Oakland A’s General Manager Billy Beane’s hunch that traditional performance metrics such as runs batted in (RBIs) were overrated, while more obscure measures (like on-base percentage) were better predictors of overall success. Statistician Bill James began crunching the numbers and putting together models that Beane could use in his decisions about which players to acquire and hold onto, and which to let go.
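For readers unfamiliar with the statistic, on-base percentage counts every way a batter reaches base, not just hits. The sketch below uses the standard definition, which the article itself does not spell out; the two players and their season totals are invented for illustration.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """Standard on-base percentage: times on base divided by the
    plate appearances that count toward the statistic."""
    denominator = at_bats + walks + hit_by_pitch + sacrifice_flies
    if denominator == 0:
        return 0.0
    return (hits + walks + hit_by_pitch) / denominator


# Two hypothetical players with the same batting average (140 / 500 = .280).
patient = on_base_percentage(
    hits=140, walks=90, hit_by_pitch=5, at_bats=500, sacrifice_flies=5
)
free_swinger = on_base_percentage(
    hits=140, walks=20, hit_by_pitch=5, at_bats=500, sacrifice_flies=5
)

print(f"patient hitter: {patient:.3f}")
print(f"free swinger:   {free_swinger:.3f}")
```

Both invented players hit .280, yet the one who draws more walks reaches base far more often, which is the kind of difference a hits-only statistic hides.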
While sports enthusiasts love to debate the issue, this method of evaluating talent is now widely embraced across baseball, and gaining traction in other sports as well. The Moneyball model works, O’Neil says, for a few simple reasons. First, it is relatively transparent: Anyone with basic math skills can grasp the inputs and outputs. Second, its objectives (more wins) are clear, and appropriately quantifiable. Third, there is a self-correcting feedback mechanism: a constant stream of new inputs and outputs by which the model can be honed and refined.
Where models go wrong, the author argues, all three healthy attributes are often lacking. The calculations are opaque; the objectives attempt to quantify that which perhaps should not be; and feedback loops, far from being self-correcting, serve only to reinforce faulty assumptions.
WMDs on Wall Street
After earning a doctorate in mathematics at Harvard and then teaching at Barnard College, O’Neil got a job at the hedge fund D.E. Shaw. At first, she welcomed the change of pace from academia and viewed hedge funds as “morally neutral — scavengers in the financial system, at worst.” Hedge funds didn’t create markets like those for mortgage-backed securities, in which complicated derivatives played a key part in the financial crisis — they just “played in them.”
But as the subprime mortgage crisis spread, and eventually engulfed Lehman Bros., which owned a 20% stake in D.E. Shaw, the internal mood at the hedge fund “turned fretful.” Concern grew that the scope of the looming crisis might be unprecedented and beyond anything the firm’s mathematical models could account for. O’Neil eventually realized, as did others, that math was at the center of the problem.
The cutting-edge algorithms used to assess the risk of mortgage-backed securities became a smoke screen. Their “mathematically intimidating” design camouflaged the true level of risk. Not only were these models opaque; they lacked a healthy feedback mechanism. Importantly, the risk assessments were verified by credit-rating agencies that collected fees from the same companies that were peddling those financial products. This was a mathematical model that checked all the boxes of a toxic WMD.
Disenchanted, O’Neil left Shaw in 2009 for RiskMetrics Group, which provides risk analysis for banks and other financial services firms. But she felt that people like her who warned about risk were viewed as a threat to the bottom line. A few years later, she became a data scientist for a startup called Intent Media, analyzing web traffic and designing algorithms to help online companies maximize e-commerce. O’Neil saw disturbing similarities in the use of algorithms in finance and Big Data.
In both worlds, sophisticated mathematical models lacked truly self-correcting feedback. They were driven primarily by the market. So if a model led to maximum profits, it was on the right track. “Otherwise, why would the market reward it?” Yet that reliance on the market had produced disastrous results on Wall Street in 2008. Without countervailing analysis to ensure that efficiency was balanced with concern for fairness and truth, the “misuse of mathematics” would only accelerate in hidden but devastating ways. O’Neil left the company to devote herself to providing that analysis.
Misadventures in Education
Ever since the passage of the No Child Left Behind Act in 2002 mandating expanded use of standardized tests, there has been a market for analytical systems to crunch all the data generated by those tests. More often than not, that data has been used to try to identify “underperforming” teachers. However well-intentioned, O’Neil finds these models promise a scientific precision they can’t deliver, victimizing good teachers and creating incentives for behavior that does nothing to advance the cause of education.
In 2009, the Washington, D.C., school system implemented a teacher assessment tool called IMPACT. Using a complicated algorithm, IMPACT measured the progress of students and attempted to isolate the extent to which their advance (or decline) could be attributed to individual teachers. The lowest-scoring teachers each year were fired, even when they had received excellent evaluations from parents and the principal.
O’Neil examines a similar effort to evaluate teacher performance in New York City. She profiles a veteran teacher who scored a dismal 6 out of 100 on the new test one year, only to rebound the next year to 96. One critic of the evaluations found that, of teachers who had taught the same subject in consecutive years, 1 in 4 registered a 40-point difference from year to year.
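A toy simulation can show how small classes alone might produce swings of this kind. The sketch below is not the New York City model or IMPACT; it assumes each hypothetical teacher has a fixed true effect, each student adds random noise, and a teacher’s score is the percentile rank of the class average. Every number in it (500 teachers, the spread of teacher effects, the class sizes) is an assumption chosen for illustration.

```python
import random
import statistics


def percentile_ranks(scores):
    """Map each score to a 0-100 percentile rank within the group."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    for position, idx in enumerate(order):
        ranks[idx] = 100.0 * position / (len(scores) - 1)
    return ranks


def yearly_scores(rng, true_effects, class_size):
    """One year of scores: each teacher's true effect plus the
    average of that year's student-level noise."""
    scores = []
    for effect in true_effects:
        student_gains = [effect + rng.gauss(0.0, 1.0) for _ in range(class_size)]
        scores.append(statistics.fmean(student_gains))
    return scores


def share_with_big_swing(class_size, n_teachers=500, teacher_sd=0.2,
                         threshold=40.0, seed=0):
    """Fraction of teachers whose percentile moves by at least
    `threshold` points between two years with unchanged true effects."""
    rng = random.Random(seed)
    # Hypothetical true effects, held constant across both years.
    true_effects = [rng.gauss(0.0, teacher_sd) for _ in range(n_teachers)]
    year1 = percentile_ranks(yearly_scores(rng, true_effects, class_size))
    year2 = percentile_ranks(yearly_scores(rng, true_effects, class_size))
    big = sum(abs(a - b) >= threshold for a, b in zip(year1, year2))
    return big / n_teachers


small_classes = share_with_big_swing(class_size=25)
large_samples = share_with_big_swing(class_size=400)
print(f"25 students per teacher:  {small_classes:.1%} swing 40+ points")
print(f"400 students per teacher: {large_samples:.1%} swing 40+ points")
```

Under these assumptions no teacher gets better or worse between the two years, so any large swing comes purely from which students happened to be in the room. With a few dozen students the noise is large enough to move many teachers 40 points or more; with hundreds of students per teacher such swings become rare.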
There is little transparency in these evaluation models, O’Neil writes, making them “arbitrary, unfair, and deaf to appeals.” Whereas a company like Google has the benefit of large sample sizes and constant statistical feedback that allow it to identify and correct errors immediately, teacher evaluation systems attempt to render judgments based on annual tests of just a few dozen students. Moreover, there