How the New York Times uses clouds

One paragraph that didn’t make it into the Google cover described how the New York Times used Amazon’s cloud to great effect. The Times’ Derek Gottfrid blogged about it six weeks ago, but I think it’s worth revisiting.

His challenge was to turn 11 million archived articles, Times coverage from 1851 to 1980, into PDF files so that all readers could access them quickly from the Web. I won’t go into detail here, but he used the open-source software Hadoop, bought computing power on Amazon’s cloud, and finished the entire job in one day. That’s something that would have taken weeks, or longer, on in-house machines at most publishing companies.

Two important points:

1) If he had screwed up, it would have taken only another day to try something else. This is how cloud computing fuels innovation (and why it’s so important for scientists). It’s fast and you can experiment.

2) Media companies, including the NYTimes (and BusinessWeek), are increasingly competing against cloud giants such as Google, Yahoo and Microsoft. If media companies don’t harness these tremendous computing resources, they’ll be ceding precious speed and innovation to the Web companies, and to competitors like the NYT that are already riding the clouds.

Gottfrid adds: “The one caveat I will offer to people who are interested in doing something like this is that it is highly addictive. We have already completed the [Amazon] portion of another project and I have ideas for countless more.”

UPDATE: In a comment, Gottfrid notes that this was a one-person show. In my first version, I assumed he was working with a team.
