As computing power has increased and data science has expanded into nearly every area of our lives, we have entered the age of the algorithm. While our personal and professional data is being compiled and crunched, mathematical models are informing and even making essential decisions that have a direct impact on us — from what university we will attend and what job we will take, to our access to and costs for car loans, mortgages and health insurance. Even the news we see on social media is guided by algorithms.
In some cases, these models are useful and enhance our lives, suggesting music, books, or films we may be interested in based on our past online interactions, for example. But, as Cathy O’Neil, a data scientist, researcher and entrepreneur, pointed out in a talk at the Bloomberg Quant Seminar in New York, when it comes to truly important, life-shaping decisions, the models being used today are opaque, unregulated and uncontestable — even when they are wrong. Nevertheless, they are often viewed as being fair, scientific and objective, because they draw on vast collections of data processed by dispassionate machines.
A central question in the era of machine learning and data science is how we can assess and redefine whether an algorithm works, taking all stakeholders into consideration. If it fails on some levels, what are those levels and what are the implications for all parties involved? “We need to appreciate what we can and cannot expect from artificial intelligence,” states Dr. O’Neil. “It has been the subject of tremendous hype, but, in fact, it is fairly limited. This doesn’t mean AI is not useful, but it should be treated with scientific skepticism — don’t blindly trust the answers, you should also verify them.”
The ethical matrix: judging algorithms that judge human behavior
Algorithms can be, as Dr. O’Neil has written in her recent book, Weapons of Math Destruction, dangerous tools if used without careful critique. Some of the most striking cases of misuse are found in the criminal justice and child protective services systems. She has examined data and decisions related to both recidivism and child abuse and highlighted this work during her talk. In the realm of public policy, these are prominent examples of efforts to tackle difficult social problems with the tools of data science. However, the resulting analyses are subject to serious biases and defects — to the detriment of the populations they are intended to assess and serve.
In the case of recidivism, the research looked at data on individuals who had been arrested and tried to predict the likelihood of their being arrested again. The output can lead to longer prison sentences, even though the algorithms are trained only on the fact of an arrest and are not linked to whether or not the defendant actually committed a violent crime. Unfortunately, the model produced false positives for African-American men at twice the rate it produced them for white men. Part of Dr. O’Neil’s analysis entails constructing an “ethical matrix,” a grid that depicts each stakeholder and their perspective on certain outcomes.
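The disparity described above can be made concrete as a per-group false positive rate: among people who were not in fact rearrested, what fraction did the model flag as high risk? The counts below are entirely hypothetical, chosen only to illustrate a two-to-one gap of the kind described; they are not the actual figures from the study.

```python
# Per-group false positive rate from confusion counts.
# All counts below are hypothetical, illustrating a 2x disparity
# of the kind described in the text -- not real study data.
def false_positive_rate(fp, tn):
    """Fraction of truly negative cases wrongly flagged as high risk."""
    return fp / (fp + tn)

counts = {
    # group: (false positives, true negatives) -- hypothetical
    "group_a": (45, 55),
    "group_b": (23, 77),
}

for group, (fp, tn) in counts.items():
    print(group, round(false_positive_rate(fp, tn), 2))
```

The point of the metric is that overall accuracy can look acceptable while the error burden falls very unevenly across groups, which is exactly the failure mode an ethical matrix is meant to surface.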
In this case, there are three types of stakeholders: the court, African-American men and white men. Clearly, the court is concerned with false negatives — such results suggest weakness in the justice system and pose a threat to society. However, a false positive is an even greater concern for African-American men, as they may spend more time in prison (or even be imprisoned when innocent) if the judge acts on the analysis. In the graphical matrix, the court’s perspective on false positives — shown in yellow — is a serious concern; the African-American male perspective on false positives is shown in red — a major problem both for those individuals and for the system as a whole, since erroneous decisions could constitute a violation of the civil rights of those so identified.
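The ethical matrix described above is, structurally, just a grid of stakeholders against error types, with each cell graded by severity. The sketch below encodes the two gradings the text states explicitly (yellow for the court on false positives, red for African-American men on false positives); every other cell value is an assumption added to make the structure complete.

```python
# A minimal sketch of an "ethical matrix": stakeholders x error types,
# each cell graded by severity. Only the two cells noted in the text
# (court/false_positive = yellow, african_american_men/false_positive = red)
# come from the source; the remaining gradings are illustrative assumptions.
ethical_matrix = {
    "court": {
        "false_negative": "yellow",  # assumed: a released re-offender is a serious concern
        "false_positive": "yellow",  # stated in the text
    },
    "african_american_men": {
        "false_negative": "green",   # assumed
        "false_positive": "red",     # stated: longer sentences, possible civil-rights violation
    },
    "white_men": {
        "false_negative": "green",   # assumed
        "false_positive": "yellow",  # assumed
    },
}

def cells_at(matrix, severity):
    """List the (stakeholder, error_type) cells at a given severity level."""
    return [(s, e) for s, row in matrix.items()
            for e, level in row.items() if level == severity]

print(cells_at(ethical_matrix, "red"))  # the cells demanding most attention
```

Laying the matrix out this way makes the policy question explicit: which red cells exist, whose perspective they represent, and what change in methodology would move them toward yellow.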
An expanded example of the ethical matrix included the public and Northpointe, Inc., the vendor of the commercial tool (the COMPAS recidivism algorithm) in question. The research uncovers the need for philosophical and ethical discussion of how we should define fairness and racism in an algorithmic context. Yet, the core of the problem is the decision to take arrest data as a good proxy for crime. Machines will not critique such decisions and methodologies; the inventors and users must think more deeply about how such tools are constructed and deployed.
Making the right call — hotlines and action
The research on child protective services tells a similar story. The data was taken from a hotline for suspected child abuse in Allegheny County, Pennsylvania, where people who had concerns for a child’s well-being (e.g., teachers, neighbors, doctors) could leave information. Based on these calls, if the state decided a child might be at risk, a social worker would be sent to the home. From the outset, two problems were apparent with the data and analytics: there was far more data on poor, black families as they were already in the social welfare system, and the algorithm was trained with “success” defined as a child being removed from the home. The issue here is that children could be removed from their homes for reasons other than actual abuse (e.g., poverty, lack of heat or food), so “removal” is not a clear signal. A better definition for the algorithm would have been “if abuse was substantiated.”
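The label-definition problem above can be shown with a toy comparison of the two candidate training labels. The case records below are invented purely for illustration: wherever a child was removed for reasons other than substantiated abuse, the two label definitions disagree, and a model trained on "removal" learns the wrong target.

```python
# Hypothetical case records illustrating why "removed from home" is a
# noisy training label compared with "abuse substantiated".
cases = [
    {"id": 1, "removed": True,  "abuse_substantiated": True},
    {"id": 2, "removed": True,  "abuse_substantiated": False},  # removed for poverty, not abuse
    {"id": 3, "removed": False, "abuse_substantiated": False},
]

noisy_labels  = [c["removed"] for c in cases]
better_labels = [c["abuse_substantiated"] for c in cases]

# Count cases where the two label definitions disagree.
mismatches = sum(a != b for a, b in zip(noisy_labels, better_labels))
print(mismatches)
```

Every mismatched case teaches the model that the circumstances of that family are what "success" looks like, which is precisely the bias the text describes.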
In the ethical matrix, those most concerned are families, who would fear false positives, and the children themselves, who would fear both false positives and false negatives. Developing an understanding of the key stakeholders and their perspectives can help to refine the research methodology and reframe the questions that we are trying to answer through data analysis. Such actions can also help us to move some of the red categories into the yellow zone through more thoughtful and informed policies and responses.
Raising public interest
Other research areas involving algorithms include voter fraud, college admissions and the teacher value-add model (VAM) — this last resulted in layoffs based on inconsistent results. Stepping back from these areas of statistical vulnerability, Dr. O’Neil emphasized that there should be public debate and a high level of attention given to the ethical issues involved in data science, particularly in the context of public policy. “When it comes to worst case scenarios,” says Dr. O’Neil, “we know they are there, but we are not necessarily focusing on them or thinking about how to prevent them. In the cases we have discussed, it’s important to decide how to address the problems directly and it may involve examining what our underlying values are and what we hope to achieve with our data analysis.”
In the realm of the financial markets, awareness and understanding of the underlying assumptions in our quant models are crucial. In addition, how questions are framed will influence how answers are formulated. And finally, there is always a balance to be struck between overcomplication and oversimplification. If we can find the right blend, then perhaps we can avoid destruction!
Insights from quant industry experts
Following a short Q&A session, Bruno Dupire, the host of the event, kicked off a series of “lightning talks,” 5-minute presentations where industry experts, researchers and academics present a wide range of subjects to stimulate fresh thinking and interaction between various disciplines. Each talk examines a way that the industry is evolving and serves as an essential exploratory aspect of the Bloomberg Quant Seminar series.
Ioana Boier, an independent researcher, spoke about the nuances of neural nets; David Mitchell of Bloomberg L.P. demonstrated how to visualize China’s debt; Markus Dochantschi, an architect at Studio MDA, presented a series of art galleries along The High Line, in “Artseen”; and Menglu Jiang of the Stevens Institute of Technology offered an in-depth study on oil demand in the United States.
In addition, Mohsen Mazaheri of FF Capital Partners explained recent work on expected global convexity, and Luca Mertens of Bloomberg L.P. showed how state space models can be an effective tool for evaluating equity market impact.
About the Bloomberg Quant seminar series
The Bloomberg Quant (BBQ) seminar series takes place in New York and covers a wide range of topics in quantitative finance. Each session is chaired by Bruno Dupire, head of Quantitative Research at Bloomberg LP, and features a keynote speaker presenting his or her current research. This presentation is followed by several “lightning talks” of five minutes each in quick succession. This format gives the audience the opportunity to be exposed to a wider variety of topics. Sign up to receive invitations to future events in this series.