The Next Web

Think the World Wide Web is a godsend? By 2005, Tim Berners-Lee aims to be replacing it with the Semantic Web, which will understand human language

Whatever else 1955 is remembered for, it boasts two notable birthdays. That June, Timothy J. Berners-Lee popped into the world in London, and a few months later, William H. Gates III opened his eyes in Seattle. Gates went on to become the richest person on earth as head of Microsoft Corp. (MSFT). Tim Berners-Lee might be giving Gates a run for the money, but he passed up his shot at fabulous wealth--intentionally--in 1990. That's when he decided not to patent the technology used to create the most important software innovation in the final decade of the 20th century: the World Wide Web. Berners-Lee wanted to make the world a richer place, not amass personal wealth. So he gave his brainchild to us all.

Now, the idealistic father of the Web plans an even grander gift: a next-generation Web that almost certainly will rank as the most important software of this decade. Berners-Lee regards today's Web as a rebellious adolescent that can never fulfill his original expectations. By 2005, he hopes to begin replacing it with the Semantic Web--a smart network that will finally understand human languages and make computers virtually as easy to work with as other humans.

This new project is a collaborative effort of hundreds of minds, with Berners-Lee as maestro. The ultimate goal: to turn the Web into a gigantic brain. Every computer connected to the Internet would have access to all the knowledge that humankind has accumulated in science, business, and the arts since we began painting the walls of caves 30,000 years ago. This racial memory would be a constant source of inspiration for dreaming sublime dreams, boosting human creativity, and solving previously intractable problems. Online commerce chores and Web services would be handled by software modules that snap together like toy Lego blocks. "We expect the Semantic Web to be as big a revolution as the original Web itself," says Richard Hayes-Roth, Hewlett-Packard Co.'s (HWP) chief technology officer for software.

To get there, though, Berners-Lee must navigate some very muddy waters. Development of the Semantic Web is being funded mainly by the World Wide Web Consortium (W3C), which he heads. Founded in 1994 and based at Massachusetts Institute of Technology, the W3C is the guardian of Web technology and standards. Its budget relies heavily on membership dues from more than 400 companies. And while making money may not be a primary motivator for Berners-Lee, it's what business is all about. Conflicts, in short, were inevitable--and not just centering around Berners-Lee. Indeed, mediating the inevitable clashes among W3C's hundreds of companies, each with its own agenda, will be the acid test of Berners-Lee's leadership.

A particularly thorny issue cropped up last August. A W3C committee of 13 members, including IBM (IBM) and Microsoft, proposed installing tollbooths on the Information Highway by allowing patented software to be included in W3C-approved standards. The committee reasoned that as online offerings grow more sophisticated, the developers of software for handling advanced Web services, such as supply-chain management and collaborative engineering, should be permitted to collect royalties on their investments. But Berners-Lee is philosophically opposed to standards that would impose fees, and many other W3C members, such as the Free Software Foundation and the Open Source Initiative, also denounced the notion. "Things have calmed down a bit," says Robert S. Sutor, IBM's director of e-business standards, and the committee is now rethinking its stance. Berners-Lee says the mood has now shifted "strongly toward a royalty-free position."

Meanwhile, the W3C is taking heat on other fronts. Critics say the organization is moving too slowly on developing standards to ensure that different Web-service offerings can work together. Business sees major revenue growth from better tools that can deal with complicated travel arrangements, say, or deliver new entertainment options. But companies are reluctant to invest in developing such software until big corporations are on the same page. What good would it do, for example, to create a program under Microsoft's Web-services initiative, dubbed .Net, if it couldn't link up with a related program written in Java for Sun Microsystems Inc.'s (SUNW) counterpart? Or if a computer-aided design program at Boeing Co. were unable to talk to the company's engineering or manufacturing software?

A W3C draft specification aimed at harmonizing Web services was published in January, 2001, "but the W3C then sat on its hands for a whole year," complains Uttam M. Narsu, an analyst at Giga Information Group. Not until late January did the W3C organize several working groups to tackle standards for Web services. "My sense is that [W3C staffers] are too visionary," Narsu says. "They're devoting too much effort to the Semantic Web, believing it will change the world yet again, and not enough effort to less sexy things that are important to business in the near term."

The Semantic Web is certainly sexy. As envisioned by Berners-Lee, it would understand not only the meaning of words and concepts but also the logical relationships among them. That has awesome potential. Most knowledge is built on two pillars: semantics and mathematics. In number-crunching, computers already outclass people. Machines that are equally adroit at dealing with language and reason won't just help people uncover new insights; they could blaze new trails on their own.

Even with a fairly crude version of this future Web, mining online repositories for nuggets of knowledge would no longer force people to wade through screen after screen of extraneous data. Instead, computers would dispatch intelligent agents, or software messengers, to explore Web sites by the thousands and logically sift out just what's relevant. That alone would provide a major boost in productivity at work and at home. But there's far more.

Software agents could also take on many routine business chores, such as helping manufacturers find and negotiate with lowest-cost parts suppliers and handling help-desk questions. The Semantic Web would also be a bottomless trove of eureka insights. Most inventions and scientific breakthroughs, including today's Web, spring from novel combinations of existing knowledge. The Semantic Web would make it possible to evaluate more combinations overnight than a person could juggle in a lifetime. "A lot of scientific research is now interdisciplinary, like global climate change, and the scientists need to talk to each other," says Chaitanya Baru, a data-mining expert at the San Diego Supercomputer Center. "But they use different jargon."

Sure, scientists and other people can post ideas on the Web today for others to read. But with machines doing the reading and translating jargon terms, related ideas from millions of Web pages could be distilled and summarized. That will lift the ability to assess and integrate information to new heights.

As a result, Berners-Lee envisions a new age of enlightenment. The Semantic Web, he predicts, "will help more people become more intuitive as well as more analytical. It will foster global collaborations among people with diverse cultural perspectives, so we have a better chance of finding the right solutions to the really big issues--like the environment and climate warming." In short, it will change the world even more than his original creation.

The capital-Q question is: Can he pull it off? There's no shortage of doubters. Still, most people who know the reclusive Berners-Lee are optimistic. "Tim has a gift for seeing the future and making it happen," says John R. Patrick, a retired IBM senior exec who helped found the W3C. Eric E. Schmidt, formerly of Sun and now chairman of search-engine innovator Google Inc., says Berners-Lee would be a shoo-in for a Nobel prize--if Nobels were given in computer science. And Larry L. Smarr, director of the California Institute of Telecommunications & Information Technology at the University of California at San Diego, predicts the Semantic Web will cast Berners-Lee as "an historic-level figure."

What impresses those elder statesmen of computing is Berners-Lee's leadership track record. For a somewhat shy software nerd, he has demonstrated a surprising flair for diplomacy, combined with bulldog tenacity. In the midst of the dot-com bust two years ago, Berners-Lee persuaded the W3C's hard-nosed denizens of commerce to begin developing the Semantic Web. And before that, in 1998, he persuaded them to approve extensible markup language (XML), an important new Web lingo. "Tim did a great job shepherding XML through the W3C," notes Smarr.

Indeed, the evolution of XML may be a useful foretaste of what's in store for Berners-Lee's new vision. In the late 1990s, this language was constructed to help computers identify different types of data on the Web. "When we started work on XML, it was considered pretty esoteric," recalls Sutor of IBM. "But now it's the underpinnings of everything we're doing in e-business." Ditto for hundreds of others, including the 300 companies already using XML software from Open Applications Group Inc. OAGI predicts that number will double this year.

Berners-Lee worked tirelessly to win support for XML because it's a quantum leap beyond today's witless hypertext markup language (HTML)--and it's the cornerstone of the Semantic Web. HTML is the language that Berners-Lee concocted while on a fellowship as a database engineer at the European Organization for Nuclear Research (CERN) in Geneva. But the language merely specifies the appearance of a Web page: what colors go where, which type sizes to use, and where to put graphic elements. To a Web browser, or most other computer programs, these words and numbers are just squiggles of gibberish. Without some kind of clue, computers parsing a Web page can't determine if "buy" is a noun or a verb, or whether "20031" is a Zip Code, a price, or the number of orders placed last month.

In contrast, XML tags imbue the Web with meaning. Examples might be such labels as <patient ID>, <drug name> and <known interaction> for medical records. The "name" tag would have links to relevant sections of online literature, also coded with XML, and "interaction" would point to other drugs that interfere with the medication. Then, when a doctor bats out a prescription on a computer, a software agent could verify that the drug is appropriate for the diagnosis, check the patient's records to see what other medicines the person is taking, and determine whether any of them is likely to interfere with the new prescription. A group of university and industrial researchers is already working on such a scheme with the Veterans Administration and the National Library of Medicine.
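The prescription check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names (patient_id, drug_name, known_interaction), the sample records, and the function find_conflicts are all hypothetical and do not come from any real medical schema. Because XML element names cannot contain spaces, the labels are written here with underscores.

```python
import xml.etree.ElementTree as ET

# Hypothetical patient record; tag names are illustrative, not a real schema.
RECORD = """
<patient>
  <patient_id>12345</patient_id>
  <medication><drug_name>warfarin</drug_name></medication>
  <medication><drug_name>lisinopril</drug_name></medication>
</patient>
"""

# Hypothetical reference entry for the drug being prescribed.
DRUG_INFO = """
<drug>
  <drug_name>aspirin</drug_name>
  <known_interaction>warfarin</known_interaction>
  <known_interaction>ibuprofen</known_interaction>
</drug>
"""


def find_conflicts(record_xml, drug_xml):
    """Return the patient's current drugs that the new drug lists as interacting."""
    record = ET.fromstring(record_xml.strip())
    drug = ET.fromstring(drug_xml.strip())
    # Drugs the patient is already taking, read from the tagged record.
    current = {m.findtext("drug_name") for m in record.findall("medication")}
    # Drugs the new prescription is tagged as interacting with.
    interacting = {i.text for i in drug.findall("known_interaction")}
    return sorted(current & interacting)


conflicts = find_conflicts(RECORD, DRUG_INFO)
print(conflicts)
```

The point of the sketch is that the program never has to guess what a string means: the tags tell it which values are drug names and which are interactions, so the check reduces to a set intersection.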

Without Berners-Lee's dogged persistence, today's Web might never have hatched at CERN. But he pulled it off, right under the disapproving noses of senior management.

To physicists around the world, CERN's huge atom-smashing collider, which traces a 17-mile-long circle beneath the Swiss-French border, is a kind of Alpine mecca. Part of Berners-Lee's job was to keep track of who was where, doing what, and using which kind of computer. That was a major headache for the young researcher, who admits to having a bad memory for names and faces. Twice he proposed a pre-Web database that would store data on thousands of researchers and help organize the results of their work into an institutional memory. Both times his idea got spurned.

Undeterred, Berners-Lee then went underground and cajoled a small band of cohorts, including his Belgian-born supervisor, Robert Cailliau, to help create the original Web on the sly. Even after the first prototype was done and winning converts in the outside world, convincing CERN's staff of its utility took several months, with Cailliau playing the role of Web salesman. Berners-Lee praises the contributions of Cailliau and the other co-conspirators in his 1999 book, Weaving the Web. "Tim's a very modest person, and he has been careful to credit the people who've worked with him," says Vinton G. Cerf, senior vice-president at WorldCom Inc., who helped spawn the Internet in the early 1970s as a medium for scientific collaborations.

Cailliau and Berners-Lee considered forming a startup to commercialize the Web, but they quickly nixed the notion. That was 1990, a hectic year for Berners-Lee. He got married, made the momentous decision not to patent the Web software, and, on Christmas Day, switched on the world's first Web-site server.

Even though Berners-Lee has little time for anything other than software and family, he does occasionally play the piano, "but I'm terrible at it," he confesses. He won't talk much about his private life, but he admits to feeling a tug from the theater. While at CERN, he helped out backstage at the Geneva English Drama Society and played bit parts, including Nana, the dog in Peter Pan. That's where he met his future wife, Nancy Carlson, a software analyst from Fairfield, Conn. She was working at the World Health Organization and managing the little theater group in the evening.

Today, Berners-Lee presides over a research octopus whose tentacles extend to all five continents. The 60 staffers at W3C headquarters coordinate the efforts of hundreds of researchers at 50 university and government laboratories that are W3C members, plus two-score additional universities around the world. For now, most of the actual work on the Semantic Web is being done by academics because, Berners-Lee quips, "only a few industry people have been given a little leeway to go off and explore my crazy ideas."

Those ideas spring from a childhood fascination with computers--encouraged by parents who both were mathematicians and computer-science pioneers. Conway Berners-Lee and Mary Lee Woods met in the early 1950s while on the team that developed the Ferranti Mark 1 computer. Their first son was treated to breakfast discussions about programming and dinner-table talk of abstract math and imaginary numbers. As a child, Tim built let's-pretend computers from cardboard boxes and drew miniature pictures by connecting the holes in the punched paper tape that was then used to load programs into computers. At Oxford University's Queen's College, he studied physics, graduating with honors in 1976. "Physics was a compromise," he says, between two loves--math and engineering.

As Berners-Lee readily admits, all the components of the Web already existed when he arrived at CERN. His main contribution was writing the software to combine them cohesively. The Internet was commonly used to exchange scientific reports, but there were no point-and-click links: You had to type in arcane commands and addresses. There were hypertext programs with clickable links, but each version was tailored for a specific breed of computer--and often for a single location--each with its own peculiar command-line entries.

Berners-Lee tore down this Tower of Babel, making it a breeze to share information. Uniform resource locators (URLs) established a common format for Web-page addresses, and HTML ensured that Web pages always looked the same. The new tools spread quickly among researchers, but the public didn't pay much heed because the Web remained a slave to abstruse typed-in commands.

What launched the Web's explosive growth was the now-familiar browser, pioneered by Netscape Communications Corp. Berners-Lee still chafes at Netscape's one-way limitation. From the outset, he thought the Web should be a two-way street, with browsers making it easy to create and annotate Web pages. Removing this limitation is one of the goals of the Semantic Web.

XML is a start--but only the tip of the iceberg. XML tags are essentially just labels that point to a definition in a combination dictionary and thesaurus. That's how a software agent can determine that two different tags actually mean the same thing--say, <purchase> and <buy>. When an agent needs further details, there's an online encyclopedia, called an ontology. It lays out the logical rules and relationships among XML terms.
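A toy version of that lookup might run as follows in Python. Everything here is an assumption made for illustration: the THESAURUS and ONTOLOGY tables, the concept names, and the helper functions same_meaning and related are hypothetical stand-ins, not any W3C format or API.

```python
# Hypothetical shared vocabulary: each tag points to one canonical concept.
THESAURUS = {
    "purchase": "acquire_for_payment",
    "buy": "acquire_for_payment",
    "sell": "transfer_for_payment",
}

# Hypothetical ontology: rules relating one concept to another.
ONTOLOGY = {
    "acquire_for_payment": {"inverse_of": "transfer_for_payment"},
    "transfer_for_payment": {"inverse_of": "acquire_for_payment"},
}


def same_meaning(tag_a, tag_b):
    """Two tags mean the same thing if they resolve to the same known concept."""
    concept_a = THESAURUS.get(tag_a)
    return concept_a is not None and concept_a == THESAURUS.get(tag_b)


def related(tag, relation):
    """Follow one ontology rule from a tag's concept, or return None."""
    concept = THESAURUS.get(tag)
    return ONTOLOGY.get(concept, {}).get(relation)


print(same_meaning("purchase", "buy"))
print(related("buy", "inverse_of"))
```

An agent built this way treats <purchase> and <buy> as interchangeable because both resolve to the same concept, and it consults the ontology only when it needs a relationship the labels alone cannot supply.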

Merging these elements is where semantics gets sticky. Because we humans assimilate language gradually, we end up unaware of how complicated things are--until we try to construct a new digital grammar from scratch, with numerous dialects for various industries. Devising software that can comprehend words, concepts, and relationships has long been a major hangup in artificial intelligence (AI) research. Adding a pervasive layer of standardization will test the limits of human ingenuity--and patience.

In the fast-paced Internet Age, the time needed to build consensus on the smallest of these details could be the Semantic Web's chief obstacle, says WorldCom's Cerf. He worries that standards could "fall victim to business maneuvering" by the W3C's corporate members. The result might end up similar to today's systems for electronic data interchange (EDI)--with a lot of proprietary systems, each with its own lingo. On the other hand, partly because the industry is acutely aware of EDI's problems and limitations, executives are optimistic. "It'll be a chicken-or-egg situation until a killer app comes along--but I'm very confident that that will happen," says W. Daniel Hillis, a supercomputer pioneer who now heads startup Applied Minds Inc.

Some academics are enthusiastic about the corporate involvement that Berners-Lee has attracted. James A. Hendler, a computer scientist at the University of Maryland, says he has worked on AI for 20 years and "it has been almost impossible to get the attention of business." But now, he says, "the advances we made in the 1990s are being readied for actual use with the Semantic Web, out there in the real world."

One other factor could give Berners-Lee's vision an enormous boost: The Pentagon's Defense Advanced Research Projects Agency (DARPA) is pushing it. This is the outfit that created the guts of the Internet three decades ago. In 1998, it launched the DARPA Agent MarkUp Language (DAML) program--initially managed by Hendler, who took a leave of absence from Maryland. DARPA is now a W3C member, and DAML is being developed in concert with XML.

DARPA wants to develop agent-based systems for command-and-control jobs in joint military operations, whether they be multiservice or multinational. For example, an international team of 16 organizations--led by a spin-off of Britain's Defense Ministry called QinetiQ Ltd.--is working on a "coalition of agents" project. With DAML tags pointing to online databases, plus access to satellite reconnaissance images, the agents would be aware of the capabilities and locations of the many different weapons and logistics systems deployed to such spots as Afghanistan. So they could provide commanders with instant advice for coping with shifting conditions.

DARPA is also funding research at MIT, headed by Berners-Lee but separate from the W3C, aimed at creating new AI tools for tomorrow's Web. One result would be Semantic Web logic language (Swell). Another goal is to marry the Semantic Web with MIT's Oxygen project, which aims to make various digital systems as easy to use as breathing, thanks to advanced machine-learning tricks and new AI software. Cailliau, Berners-Lee's former boss at CERN, figures the Web's inventor relishes this research. "I think Tim does not really like the role" of leading a big outfit like the W3C, says Cailliau. "He is more comfortable with a small team [and] joining in the fun of writing actual code."

Berners-Lee admits that building consensus among the W3C's members can be trying at times. But someone needs to keep development of the Semantic Web on course toward enriching the world--and nobody is better qualified than Tim.

By Otis Port
