Q&A with Tim Berners-Lee

The inventor of the Web explains how the new Semantic Web could have profound effects on the growth of knowledge and innovation

Tim Berners-Lee is far from finished with the World Wide Web. Having invented the Web in 1989, he's now working on ways to make it a whole lot smarter.

For the last decade or so, as director of the World Wide Web Consortium (W3C), Berners-Lee has been working on an effort he's dubbed the "Semantic Web." At the heart of the Semantic Web is technology that makes it easier for people to find and correlate the information they need, whether that data resides on a Web site, in a corporate database, or in desktop software.

The Semantic Web, as Berners-Lee envisions it, represents a change so profound that it's not always easy for others to grasp. This isn't the first time he's encountered that problem. "It was really hard explaining the Web before people just got used to it because they didn't even have words like click and jump and page," Berners-Lee says. In a recent conversation with BusinessWeek.com writer Rachael King, Berners-Lee discussed his vision for the Semantic Web and how it can alter the way companies operate. Edited excerpts follow.

It seems one of the problems the Semantic Web can solve is unlocking information that sits in various silos, in different software applications and different places, and currently cannot be connected easily.

Exactly. When you use the word "silos," that's the word we hear when somebody in the enterprise talks about the stovepipe problem. Different words for the same problem: that business information inside the company is managed by different sorts of software, and you have to go to a different person and learn a different program to see it. Any enterprise CEO really ought to be able to ask a question that involves connecting data across the organization, to run a company effectively, and especially to respond to unexpected events. Most organizations are missing this ability to connect all the data together.

Even outside data can be integrated, as I understand it.

Absolutely. Anybody making real decisions uses data from many sources, produced by many sorts of organizations, and we're stymied. We tend to have to use backs of envelopes to do this and people have to put data in spreadsheets, which they painfully prepare. In a way, the Semantic Web is a bit like having all the databases out there as one big database. It's difficult to imagine the power that you're going to have when so many different sorts of data are available.

It seems to me that we're overwhelmed with data and this might be a good way to help us find the data we need.

When you can treat something as data, your querying can be much more powerful.

In your speech at Princeton last year, you said that maybe you had made a mistake in naming it the Semantic Web. Do you think the name confuses some people?

I don't think it's a very good name but we're stuck with it now. The word semantics is used by different groups to mean different things. But now people understand that the Semantic Web is the Data Web. I think we could have called it the Data Web. It would have been simpler. I got in a lot of trouble for calling the World Wide Web "www" because it was so long and difficult to pronounce. In the end, once people see a few general Semantic Web applications, they understand what it is: something that connects all applications together or gives them access to data across the company.

Some of the early work with the Semantic Web seems to have been done by government agencies such as the Defense Advanced Research Projects Agency and the National Aeronautics & Space Administration. Why do you think the government has been an early adopter of this technology?

I understand that DARPA had its own serious problems with huge amounts of data from all different sources about all sorts of things. So, they saw the Semantic Web rightly as something that was aimed directly at solving the problems they had on a large scale. I know that DARPA then funded some of the early development.

You have touched on the idea that the Semantic Web will make it easier to discover cures for diseases. How will it do that?

Well, when a drug company looks at a disease, they take the specific symptoms that are connected with specific proteins inside a human cell which might lead to those symptoms. So the art of finding the drug is to find the chemical that will interfere with the bad things happening and encourage the good things happening inside the cell, which involves understanding the genetics and all the connections between the proteins and the symptoms of the disease.

It also requires looking at all the other connections, such as whether there are federal regulations about the use of the protein and how it's been used before. We've got government regulatory information, clinical trial data, the genomics data, and the proteomics data that are all in different departments and different pieces of software. A scientist who is going through that creative process of brainstorming to find something that could possibly solve the disease has to somehow keep everything in their head at the same time or be able to explore all these different axes in a connected way. The Semantic Web is a technology designed specifically to do that: to open up the boundaries between the silos, to allow scientists to explore hypotheses, to look at how things connect in new combinations that have never before been dreamt of.

The Semantic Web makes it so much easier to find and correlate information about nearly anything, including people. What happens if that information gets into the wrong hands? Is there anything that can be done to safeguard privacy?

Here at [MIT], we are doing research and building systems that are aware of the social issues. They are aware of privacy constraints, of the appropriate uses of information. We think it's important to build systems that help you do the right thing. We're also building systems that, when they take data from many, many sources, combine it, and allow you to come to a conclusion, are transparent: you can ask them what they based their decision on, and they can go back so that you can check whether those sources are appropriate to use and whether you feel they are trustworthy.

Developing Semantic Web standards has taken years. Has it taken a long time because the Semantic Web is so complex?

The Semantic Web isn't inherently complex. The Semantic Web language, at its heart, is very, very simple. It's just about the relationships between things.
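To make "relationships between things" concrete, here is a minimal sketch, not the Semantic Web standards themselves. It assumes each fact is stored as a (subject, relationship, object) triple, which is the shape of statement the Semantic Web's data model is built on. The drug, protein, and symptom names, the relationship labels, and the helper functions `match` and `drugs_for_symptom` are all hypothetical and chosen only for illustration.

```python
# Hypothetical facts, each a (subject, relationship, object) triple.
# In a real setting these could come from separate sources (silos).
TRIPLES = [
    ("DrugA", "targets", "ProteinX"),
    ("ProteinX", "linked_to", "SymptomY"),
    ("DrugA", "tested_in", "Trial1"),
    ("DrugB", "targets", "ProteinZ"),
]


def match(triples, subject=None, relation=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        (s, r, o)
        for (s, r, o) in triples
        if subject in (None, s) and relation in (None, r) and obj in (None, o)
    ]


def drugs_for_symptom(triples, symptom):
    """Follow two relationships: symptom <- protein <- drug."""
    # Step 1: which proteins are linked to the symptom?
    proteins = {s for (s, _, _) in match(triples, relation="linked_to", obj=symptom)}
    # Step 2: which drugs target any of those proteins?
    drugs = {s for (s, _, o) in match(triples, relation="targets") if o in proteins}
    return sorted(drugs)


if __name__ == "__main__":
    print(match(TRIPLES, subject="DrugA"))
    print(drugs_for_symptom(TRIPLES, "SymptomY"))
```

Because every fact has the same simple shape, a question that crosses sources (here, from a symptom to a protein to a drug) becomes a matter of chaining pattern matches rather than learning a different program for each silo.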