From Mouse To Mouth

Speech and translation software is transforming computing of all kinds

It's a glimpse of the world before Babel. In a laboratory in Heidelberg, Germany, researchers sit down for a marathon teleconference with colleagues in Japan, South Korea, Italy, France, and the U.S. Everyone speaks his native tongue, and the computer system, put together by global speech consortium C-Star, translates into any of six languages. As long as the conversation sticks to the computer's specialty--travel--communication is glitchless. "You can say, 'I, uh, sort of would like to find a place to, you know, sleep,' and the computer knows you're looking for a hotel," says Alex Waibel, a computer science professor at both Germany's Karlsruhe University and Pittsburgh's Carnegie Mellon University.

After 25 years of development, the computer is growing ever closer to mastering the spoken word. To date, the $1.3 billion voice software industry has stuck largely to rudimentary applications, from telephone computers that understand spoken numbers to dictation devices that take notes for slow-talking doctors. But now, machines talk and understand well enough to start taking on headier assignments. Today's muscle-bound personal computers are better equipped than ever to host these large programs.

Europe is smack in the middle of this digital rush for the spoken word. European companies, such as Philips Electronics, Siemens, and Lernout & Hauspie, are right up there with America's IBM and Dragon Systems Inc. as leaders in the field of voice-recognition software. Earlier this year, Intel Corp. poured $30 million into L&H, aiming to develop speech chips for customer service call centers. Microsoft Corp. is working with L&H and others to free the computer from the keyboard--a prerequisite for Bill Gates's dream of a car PC to handle everything from navigation to E-mail on the road.

The Europeans, with their bent for languages, lead in translation programs. The Continent boasts top players in cellular telephony, a key technology for voice. And Europe, led by the British, will soon plunge into interactive TV, a technology that lends itself to voice commands. "Speech is where consumer electronics meets telecommunications," says Ralph Preclik, associate director of Philips Electronics' speech division in Vienna.

Add it all up, and voice software has the power to transform the Internet. Indeed, while the Internet was designed for machines equipped with keyboards and screens, voice systems promise an entire supplementary Internet based on audio. This would permit people, whether on car PCs or mobile phones, simply to ask for football scores or directions, or to tune the Web radio to a Mozart channel. Untethered from the PC and the phone line, the voice-powered Internet could see explosive growth. Mobile phones alone are expected to reach 1 billion by 2003--about one-quarter of them Web-surfing computer phones. "With the keyboards on phones as small as they are, voice becomes the natural alternative," says Charles Rutstein, an analyst at Forrester Research in Cambridge, Mass.

The U.S. and Europe each account for about 40% of the global speech business. While much of the early research funding came from the U.S. Defense Dept., companies are now picking up the slack. The C-Star Consortium of research universities, which produced the six-language translation program, is funded by a Who's Who of communications companies, from AT&T and Siemens to Japan's ATR.

Most of these companies have their eyes on customer-service programs. Many companies rely on call centers that give customers a machine droning through a menu, usually followed by a long wait. The idea now is to push smarter and more articulate voice programs into these centers. "The telephone companies are all very hungry for call-center applications," says Alfred Hauenstein, director of product management for speech-recognition products at Siemens. And for international businesses, such as hotels, the speech systems could fold in translating programs, such as C-Star's.

Even as speech software takes off, Europe's market leader, L&H, has been having a rough year. The $211 million Belgian company's roaring stock fell by a quarter in April, as the U.S. Securities & Exchange Commission disallowed research-and-development write-offs, leading L&H to restate and lower its earnings. Still, the company is pushing ahead on projects with minority shareholders Microsoft and Intel. And its Voice Xpress, an off-the-shelf computer-dictation program, is battling with IBM, Dragon Systems, and Philips in the under-$100 market.

At the same time, L&H is pushing translation applications onto the Internet. It offers a translation browser that scares up rough summaries of foreign-language Web pages. Such services should grow in demand as the Web, 54% of which is now in English, continues to grow faster in other languages. With Intel, L&H is developing machines that understand verbal queries, hunt down information, and provide answers over the phone. "We're talking about machines that can figure things out," says L&H Chief Executive Gaston Bastiaens. "You ask it about L&H stock, and it knows to call up"

STANDBY. Cell-phone companies already are programming voice-activated dialing into their handsets. Now, manufacturers are wondering how much more speech to give their machines. The trouble is that speech programs require computing and battery power, both at a premium in credit-card-size phones. Standby time, the number of hours a phone can remain on, is a key selling point, says Bernd Burchard, who leads the speech recognition unit at Infineon, the former semiconductor division of Siemens. "None of them wants to sacrifice any standby."

The likely solution is to host the speech-recognition programs on servers and to encourage customers to call in to talking computers. Eventually, customers will have a choice between a heavier, smarter phone in hand and a lighter, dumber model that feeds more off the network.

Perhaps the biggest hurdle for interactive television is the lack of a keyboard substitute. This makes it a natural for voice systems, many of which will be unveiled at a trade show in late August in Berlin. The challenge, of course, is to filter out the TV noise so that the machine can hear what the viewer is yelling. "We have a lot of experience with noise cancellation from our work with cars," says Infineon's Burchard.

And cars? They may be the last to debut, possibly in another five years. Lengthy tests by car companies, led by DaimlerChrysler, are sure to drag out the process, say software developers. That means it's safe to keep swearing at the car for a few years, as long as the phone's out of range.