With Yandex at CERN, Search and Science Collide
As a physicist for the European Organization for Nuclear Research, or CERN, Andrei Golutvin spends his days smashing subatomic particles into one another at the Large Hadron Collider, a 16.8-mile ring of superconducting magnets buried 328 feet under Switzerland and France. The high-energy collisions of his experiment, one of four currently under way at the LHC, hint at the answers to some of nature’s greatest mysteries—and generate about 20 billion data points each year. Searching the enormous archive for collisions that match specific criteria can take hours.
Golutvin got weary of waiting. A few months ago he asked for help from Yandex, the dominant Web search company in his native Russia, and on April 10 the two organizations unveiled the result of their collaboration. The custom-built search engine lets more than 700 physicists working on Golutvin’s experiment instantly sift through about one-twentieth of the data they produce, and tailor searches by 600 criteria, such as time of collision. The Yandex software also produces QR codes to embed into scientific papers, so that other scientists can easily access the underlying data with their smartphones. Overall, the technology “can shorten the chain from idea to realization” of an experiment, says Golutvin.
Yandex is helping the research group for free, and even lending the scientists server capacity: About 13 percent of the computing power for Golutvin’s experiment is supplied by the Moscow-based company. Andrey Ustyuzhanin, a Yandex researcher, headed the search company’s five-person team, which created the CERN tool in three months. The software crawled tens of thousands of files spread across CERN’s servers, working at night while the scientists slept. Only a portion of CERN’s existing records have been crawled, but Ustyuzhanin wants to index all of the 20 billion or so particle collisions recorded this year—a number that exceeds the total volume of indexed Web pages.
The pro bono work could help Yandex hold its ground against Google, which has slowly but steadily gained market share in Russia and now controls 26 percent of searches. (Yandex accounts for 60 percent.) Giving away technology to CERN—where a young Tim Berners-Lee wrote the paper that laid the groundwork for the World Wide Web in 1989—is a great marketing move, says Ram Akella, a professor of information systems and technology management at the University of California at Santa Cruz. Consumers might think, “If those smart people are using Yandex, maybe we should, too,” he says. “It’s about branding.” It could help Yandex abroad, too. Since its $1.3 billion U.S. initial public offering in May 2011, Yandex has opened two offices in Switzerland and launched its search service in Turkey.
Yandex’s Ustyuzhanin says the company hasn’t decided whether to undertake similar science projects in the future. There’s definitely demand for it, according to Akella. As research becomes more computational, especially in fields such as genomics and biomedicine, scientists “are drowning in data, and they don’t know how to find what they want,” says Akella. Customized search for scientists “is the Holy Grail,” he says. Google offers a tool called Fusion Tables that helps scientists collect, crunch, and share data. The company also donates some of its server capacity to run computations for select researchers.
For Yandex, building the technology was fairly straightforward; several of the team members had backgrounds in physics and understood the scientists’ needs. The hardest part was figuring out where CERN’s IT specialists stored the data. “You have to find the person who maintains the infrastructure, and to have coffee with him,” says Ustyuzhanin, who visited CERN twice during the project. “I drank lots of coffee.”