Why Technorati feels slow

Responding to a complaint that Technorati is running slowly, the company's chief engineer details the challenges in blog search.
Stephen Baker

The last few weeks, I've found Technorati frustratingly slow. So I called the company and asked what's up. Adam Hertz, vice president and chief engineer, gave me the lowdown. In short, Technorati is struggling to keep pace with the explosive growth of blogs. Adjustments, he says, are like "changing a flat tire on a moving car." New services on the site add more complications. These challenges show no signs of slowing. The upshot? While Technorati is the leading brand in blog search, it's in a daunting tech race. This spells opportunities for others, from Google to PubSub, if they muster the machinery and algorithms to master the blogosphere.

In the last year, with the blogosphere doubling in size twice, Technorati has had to re-engineer its system. Originally, says Hertz, it dealt with all the data in one big (and ever-expanding) pool. In the last nine months, engineers have rearranged the data into separate segments. At the same time, they're enabling the system to comb through the data more intelligently, sorting each piece so that it can be cross-referenced. For example, this post can be associated with me as a blogger, with Blogspotting, with BW, with Technorati, with the search industry, and with any of you who link to it. Each one of those relations has meaning and value. But offering all these dimensions adds layer upon layer of complexity to blog search. "In general, our traffic isn't the big gating factor," Hertz says. "It's the amount of new data that we're managing."
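
To make the segmenting and cross-referencing concrete, here is a minimal Python sketch. It is only an illustration, not Technorati's actual design: the class name SegmentedIndex, the four-segment count, the hash-based routing and the dimension labels ("author", "blog", "company", "topic") are all assumptions made for the example.

```python
from collections import defaultdict
from zlib import crc32

NUM_SEGMENTS = 4  # hypothetical segment count, chosen only for illustration


class SegmentedIndex:
    """Toy cross-reference index split across several segments.

    Each (dimension, value) pair, such as ("author", "Stephen Baker"),
    lives in exactly one segment, so no single pool holds everything.
    """

    def __init__(self, num_segments=NUM_SEGMENTS):
        # One independent mapping per segment: key -> set of post ids.
        self.segments = [defaultdict(set) for _ in range(num_segments)]

    def _segment_for(self, key):
        # Stable hash, so the same key always routes to the same segment.
        digest = crc32(repr(key).encode("utf-8"))
        return self.segments[digest % len(self.segments)]

    def add_post(self, post_id, relations):
        # File the post under every relation it has.
        for dimension, value in relations:
            key = (dimension, value)
            self._segment_for(key)[key].add(post_id)

    def lookup(self, dimension, value):
        # Return a copy of the post ids filed under one relation.
        key = (dimension, value)
        return set(self._segment_for(key).get(key, set()))


index = SegmentedIndex()
index.add_post(
    "why-technorati-feels-slow",
    [
        ("author", "Stephen Baker"),
        ("blog", "Blogspotting"),
        ("company", "Technorati"),
        ("topic", "search industry"),
    ],
)
index.add_post("another-post", [("company", "Technorati")])

technorati_posts = index.lookup("company", "Technorati")
blogspotting_posts = index.lookup("blog", "Blogspotting")
```

Every new post touches one entry per relation, which gives a rough feel for why the volume of incoming data, rather than reader traffic, is the pressure point Hertz describes.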

New services will continue to add to the complexity. In the future, says Hertz, Technorati will organize bloggers by their specialties, and perhaps even rank the authority they have on certain subject matters. (Just imagine the controversies that will create: A blogger writes a post slamming Intel's new chip, and another, boasting a far higher semiconductor rank in Technorati, rebuts it.)

For many, the first impulse when faced with a crush of blog data would be to add servers. That's the easiest part, says Hertz. "The trick here is when we have to break things into pieces, or invent brand new systems to do the data management."

What's more, blog search engines, unlike Google, have to update this data continuously. They're providing a look at time as it passes. Yesterday, with the London bombing, traffic exploded, taxing the Technorati system. Instead of the usual 800,000 new posts, Technorati was on track yesterday to process 1.2 million of them.

I'll attach the notes here for those who want to read more. Download file