Algorithms Can Be Pretty Crude Toward Women

Computers are only as good as the data that humans provide.

My friend Michael Harris, a fellow mathematician and author, recently sent me his psychological profile as compiled by Apply Magic Sauce, an algorithm operated by the Cambridge University Psychometrics Centre. This is a cousin of an algorithm that profiles people via access to their Facebook accounts, some version of which was used last year by the software company Cambridge Analytica to help sway U.S. voters.

Michael opted to input some of his writing, and was immediately profiled as “highly masculine,” among other personality traits. This intrigued me, and not only because I don’t think of Michael as the most macho of men (he’s got the heart of a poet), so I immediately tried it on myself. One of my recent Bloomberg View columns, about fake news and algorithms, was rated 99 percent masculine, while another, about Snap’s business model, was rated 94 percent masculine. Even my New Year’s resolutions were determined to be 99 percent masculine. Maybe because I discussed my favorite planar geometry app?

This gave me an idea. What would it say about another woman writing about math? I found a recent blog post by mathematician Evelyn Lamb discussing the Kakeya needle problem. Magic Sauce says: 99 percent masculine. What about a man writing about fashion? Just 1 percent masculine.

That’s not an enormous amount of testing, but I’m willing to wager that this model represents a stereotype, assigning the gender of the writer based on the subject they’ve chosen. Math and algorithms, from this point of view, are “male” topics, so I have been assigned a male “psychological gender.” Pretty crude.

Even so, I know enough about algorithms to know that an effect along these lines can arise naturally, depending on the training set for the model. Imagine that the creators collected Facebook updates and interactions, with associated labels indicating gender and age. The resulting data set, and the model trained on it, would reflect whatever bias is embedded in the population that created it.
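To make that concrete, here is a minimal sketch of how a skewed training set produces a skewed model. Everything in it is an assumption for illustration: the six training sentences, the labels and the word-counting scorer are invented, and nothing here is known to resemble how Apply Magic Sauce actually works.

```python
from collections import Counter

# Hypothetical toy training set of (text, label) pairs. The skew is
# deliberate: every math post is labeled "male", every fashion post "female".
TRAINING = [
    ("proof of the geometry theorem", "male"),
    ("algorithm for the needle problem", "male"),
    ("geometry and algorithm notes", "male"),
    ("spring fashion and fabric trends", "female"),
    ("fashion week dress review", "female"),
    ("fabric and dress patterns", "female"),
]


def train(examples):
    """Count how often each word appears under each label."""
    counts = {"male": Counter(), "female": Counter()}
    for text, label in examples:
        counts[label].update(text.split())
    return counts


def masculine_score(counts, text):
    """Share of recognized word occurrences that were seen under "male"."""
    words = text.split()
    male = sum(counts["male"][w] for w in words)
    female = sum(counts["female"][w] for w in words)
    total = male + female
    # With no recognized words the model has no evidence either way.
    return 0.5 if total == 0 else male / total


counts = train(TRAINING)
math_post = "a new algorithm for a geometry problem"
fashion_post = "my favorite dress and fabric"
print(masculine_score(counts, math_post))
print(masculine_score(counts, fashion_post))
```

The scorer never sees who wrote the text. It only sees the topic words, so a woman writing about geometry gets the score the training data attached to geometry.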

This is not a new phenomenon, nor is it confined to social media data. A July 2016 paper written by researchers from Boston University and Microsoft Research uncovered similar sexist bias embedded in Google News articles. Specifically, they trained an algorithm on that data set to answer old-fashioned SAT analogy questions such as “man is to computer programmer as woman is to what?” To this question, the algorithm spat back “homemaker.” Oops.
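Analogy questions of this kind are commonly answered with vector arithmetic on word embeddings: each word is a list of numbers, and "a is to b as c is to ?" is answered by the word closest to b - a + c. The sketch below assumes made-up two-dimensional vectors. The numbers are invented to carry a gender skew and are not taken from the Google News data.

```python
import math

# Hypothetical toy embeddings: (gender component, occupation component).
# Real embeddings have hundreds of dimensions learned from text; these
# values are invented so that the occupations carry a gender skew.
VECTORS = {
    "man": (1.0, 0.0),
    "woman": (-1.0, 0.0),
    "programmer": (0.5, 1.0),
    "homemaker": (-0.5, 1.0),
    "doctor": (0.3, 1.0),
}


def cosine(u, v):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def analogy(a, b, c, vectors=VECTORS):
    """Answer "a is to b as c is to ?" with the word nearest b - a + c."""
    target = tuple(
        vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    )
    # The three question words are excluded from the candidate answers.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("man", "programmer", "woman"))  # prints "homemaker"
```

The answer comes out as "homemaker" only because the toy vectors place that word on the "woman" side of the gender component, which is the kind of skew a model can pick up from its training text.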

What’s interesting is that the researchers managed to pick apart this sexism and even adjust for it, so that the resulting algorithm would be able to answer SAT questions in a gender-neutral way. This is an important advance because machine learning doesn’t simply reflect back our current reality; it’s often used to create reality. If a machine-learning algorithm at a job-search website knows I’m a woman and decides to send me job listings that are considered “interesting to women,” then it not only propagates stereotypes but actually magnifies them, by preventing me from getting certain jobs. Better for me, and for our future society, that the algorithms are deliberately made gender neutral.
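One way such an adjustment can work, sketched here under heavy simplification, is to estimate a "gender direction" from a pair like man and woman and then subtract each occupation word's component along it. The two-dimensional vectors are invented for illustration, and the researchers' actual procedure may differ in its details.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def neutralize(vec, direction):
    """Remove the component of vec that lies along direction."""
    scale = dot(vec, direction) / dot(direction, direction)
    return tuple(x - scale * d for x, d in zip(vec, direction))


# Hypothetical toy embeddings: (gender component, occupation component).
man, woman = (1.0, 0.0), (-1.0, 0.0)
programmer, homemaker = (0.5, 1.0), (-0.5, 1.0)

# Estimate the gender direction from a definitional pair of words.
gender_direction = tuple(m - w for m, w in zip(man, woman))

neutral_programmer = neutralize(programmer, gender_direction)
neutral_homemaker = neutralize(homemaker, gender_direction)
print(neutral_programmer, neutral_homemaker)
```

After the projection, neither occupation leans toward "man" or "woman" in this toy setup, so the gender of the person in the question no longer pulls the answer toward one job or the other.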

This column does not necessarily reflect the opinion of the editorial board or Bloomberg LP and its owners.

    To contact the author of this story:
    Cathy O'Neil at

    To contact the editor responsible for this story:
    Stacey Shick at
