The Future of AI Depends on a Huge Workforce of Human Teachers

VCs are investing heavily in startups that tap people to make artificial intelligence better at speaking, seeing, and driving.
Illustration: Nejc Prah

When Katharine Rubin has a spare moment on the way to school, she helps a big-name tech company smarten up its artificial intelligence. Rubin, a 22-year-old accounting major at New York City’s Baruch College, is part of a growing workforce that spends anywhere from 5 minutes to 40 hours a week increasing the I in AI. Specifically, Rubin and others provide training data for machine learning algorithms, a form of AI that can be taught from experience.

For an autonomous car to recognize pedestrians and stop signs, it’s typically fed thousands or millions of photos, all hand-labeled. To nail a conversation, a digital assistant needs to be told over and over when it’s failed. And so Rubin spends 10 to 30 hours a week on her phone or computer evaluating search results and chat retorts through a site called Clickworker. Her income, generally $10 to $14 an hour, pays for part of her college commute from New Jersey and some of her mom’s groceries. Tasks pay 3¢ to 15¢ apiece, she says, and “they’re easy, so it quickly adds up.”

As automation and AI eliminate a range of relatively rote jobs, the need to train software is also creating other employment opportunities. People must label massive collections of unsorted data so computers can perform more complex tasks, such as driving cars and carrying on conversations. Clickworker GmbH is one of several companies feeding the need for training data as machine learning spreads into more business processes. Altogether, more than 1 million people around the world are chipping in, one click at a time.

Many of the startups are being fed by eager venture capitalists. So far this year, Alegion, Scale, CloudFactory, Mighty AI, and CrowdFlower have received about $50 million in investment funding, and is expecting to raise a few million this month.

Some of these companies have specialties. Mighty AI Inc. and focus on annotating images for autonomous driving. DefinedCrowd tackles natural language processing, so workers record or transcribe speech samples, among other tasks. Microwork photographs and tags brand logos to, say, track exposure on Instagram. Other companies are generalists, tagging vehicle damage, categorizing media, handwriting notes, or assessing product reviews as needed.

Clients range from startups to the likes of Google parent Alphabet, Apple, Facebook, International Business Machines, Microsoft, and big automakers. (At that level, most also have in-house sorters.) Jacques Bughin, a director of the McKinsey Global Institute, speculates that the nine-figure market could hit $5 billion in five years. Jonathan Roosevelt, a partner at Industry Ventures who led CrowdFlower’s $20 million round of funding in June, says that’s optimistic but possible. “One of the things that got us excited is how valuable this is to some very rich companies,” he says.

Beyond recruiting workers and sorting data, AI training companies typically create the software interfaces for workers to label data, as well as the quality-control methods. Some of them hire people one task at a time. Alegion Inc. and Clickworker each have about 1 million data sorters, with most of the tasks aimed at machine learning. Daryn Nakhuda, the chief executive officer of Mighty AI, says his company tries to add gamelike elements (experience points, badges, online discussion forums) to make the jobs more fun and less fatiguing. These services pay anywhere from a penny a task to $2,000 a pop for a radiologist to tag a medical image.

Other companies offer full-time work. IndiVillage Tech Solutions LLP hosts about 100 women and youth at its office in the Indian town of Yemmiganur and spends profits on education and drinking water for the community. “We think it’s pretty cool that a tiny community in a rural Indian village today is helping with providing data for artificial intelligence,” says Chirasmita Amin, the company’s business development manager. In Serbia, Microwork pays at least $3 an hour, more than twice the local minimum wage, to 100 people in an area where jobs are scarce, and it says it aims to expand its ranks to 1,000 this year. Samasource trains and employs people in Africa, India, and Haiti.

“I can imagine an AI that is connected to all of humanity,” says Andy Gough, the CEO of Microwork, “and whenever it needs to learn something, it simply employs humans to generate the data it needs.” Rubin, the Baruch student, doesn’t worry about possibly training her AI replacement someday. “No matter what the profession,” she says, “we will be working alongside AI in our everyday lives.” 

    BOTTOM LINE - The nine-figure market for AI trainers has attracted about $50 million in venture funding this year to a cadre of startups dependent on armies of workers to sort data.