It's "less expensive" than beat reporters, says a customer
Below are the opening lines of three stories written about a recent college baseball game. Two are from schools' sports information departments. The other was produced by software that takes box scores and spits out news articles. Which one was done by machine?
a) "The University of Michigan baseball team used a four-run fifth inning to salvage the final game in its three-game weekend series with Iowa, winning 7-5 on Saturday afternoon (April 24) at the Wilpon Baseball Complex, home of historic Ray Fisher Stadium."
b) "Michigan held off Iowa for a 7-5 win on Saturday. The Hawkeyes (16-21) were unable to overcome a four-run sixth inning deficit. The Hawkeyes clawed back in the eighth inning, putting up one run."
c) "The Iowa baseball team dropped the finale of a three-game series, 7-5, to Michigan Saturday afternoon. Despite the loss, Iowa won the series having picked up two wins in the twinbill at Ray Fisher Stadium Friday."
The correct answer: b). It was composed by the computers of Narrative Science, a five-month-old company in Evanston, Ill., that specializes in "machine-generated content." "There's no human author and no human editing," says Stuart Frankel, 44, the company's CEO and a former executive at DoubleClick. "But the stories sound really good." Narrative Science licenses the software from Northwestern University, where a team of computer science and journalism professors developed the technology. (The professors' name for the project: "Stats Monkey.")
Frankel says his company has three customers. One is the Big Ten Network, a joint venture between the collegiate athletic conference and Fox Cable, which began using the service for baseball and softball coverage on its Web site this spring. "It's considerably less expensive for us to go this route than for us to try to have our own beat reporters at each one of these games," says Michael Calderon, Big Ten's director of new media. "In fact, it would be logistically impossible for us to do that." After a game, scorekeepers e-mail game data to Narrative Science, which feeds it into a computer. A story can appear online in minutes.
Frankel and the Big Ten Network won't say how much Narrative Science charges. The company says it is continually trying to make the system "a less bad writer," as Kristian Hammond, a Northwestern computer scientist and Frankel's partner in Narrative Science, puts it. Adds Frankel: "The 1,000th story of a subject is materially better than the first."
Frankel says his service can render stories about crime stats, medical study results, surveys, financial announcements, or any other data-intensive subject matter. Hammond says the company is starting with athletics because only about 1% of U.S. sporting events are covered by reporters. Next year it plans to approach Little League about using the service. It's "certainly something I think we would consider," says Little League spokesman Steve Barr. "It sounds pretty innovative."
The bottom line: Narrative Science can make some writing by humans obsolete. After tackling sports, it will move on to medical, financial, and survey data.