I've been talking to my computers for years. Recently, they've started listening. New software lets PCs understand spoken words well enough that dictating is becoming an alternative to typing. The programs remain a bit crude. But you can see the potential when you speak, and your words--or most of them--appear on the screen.
I spent a good amount of time with two new Windows products, NaturallySpeaking from Dragon Systems and ViaVoice from IBM, that represent a real breakthrough. Until now, I've had trouble with dictation programs that expected me to make a distinct pause between words. These new programs support "continuous" speech, which lets you speak, well, naturally.
SLIGHT EDGE. This is not software that you can install and immediately start using. First, you have to spend a tiresome half-hour or so reading text off the screen so the program can train itself to your voice. I much preferred Dragon's offering of a chapter of Dave Barry in Cyberspace to IBM's deadly text.
After training, both programs are surprisingly accurate at converting spoken words to text, though I'd give a slight edge in accuracy to Dragon. There are things you can do to improve the results. First, use a good, noise-canceling headset microphone. I like the Andrea ANC-500, available for around $40. I got another gain in accuracy by running the text of several of my columns through both programs, which added words I typically use to their vocabularies. The programs also learn as you correct mistakes. And finally, I modified my speaking style to avoid swallowing words and letting my voice trail off at the ends of sentences.
The two products also share one big drawback. You can dictate only into the rudimentary word processors they provide or, in the case of ViaVoice, Microsoft Word. To send an E-mail message, for example, you first speak the text into the dictation program, then use your mouse to cut and paste it into your mail application. To address the message, you have to type. Future versions will work better with existing programs, and the next update of Lotus SmartSuite will have ViaVoice support built in. But don't expect seamless integration of voice technology into your E-mail, spreadsheets, or other applications for some time.
There's a big difference in style between NaturallySpeaking and ViaVoice. IBM seems to assume that the folks who dictate have secretaries. So ViaVoice lets you pass rough dictation on to an assistant for corrections. To deal with the inevitable flubs, the file includes your voice recording, any section of which can be played back. But navigating through the file to make corrections must be done by mouse or keyboard.
NaturallySpeaking is better suited for folks like me who produce their own finished copy. For example, in one dictation session, I said: "A long wait makes it," and the program understood: "Along Lake Mesa." When I said: "Select 'along Lake Mesa,'" the program highlighted the garble for me to correct either by speaking or typing. If you make your own corrections, I think the Dragon product is well worth the extra $100 it costs.
I found the programs' raw ability to recognize speech very impressive. But the usefulness of these products will remain limited until they let you speak into any application and control the programs themselves by voice. This will take a complete rethinking of how people use software. Just replacing mouse clicks with talking your way through existing menu trees, as some expensive programs do now, is not much of an improvement. Instead, programs have to learn to deal with commands such as "boldface the second paragraph and indent it."
Systems that can do this are on the way, not in computers but at those horrible telephone-call centers that expect you to remember that you're supposed to press 1 to order widgets and 7 to inquire about wickets. Companies such as Registry Magic and Applied Language Technologies are running prototypes that replace keypad-based menus with natural-language understanding. These systems are much simpler than most PC software because they deal with a limited range of speech; an airline's call center doesn't have to take a pizza order. But they help develop computer systems that really understand human beings.
For most people, spoken language is both the richest and easiest way to communicate. Once you can talk to your computer naturally in every sense, a huge barrier between man and machine will be gone.