Giving PCs The Power To Read
I've spent much of the past 30 years sitting at keyboards of one sort or another, but I've never managed to become more than a wretched four-fingers-and-two-thumbs typist. Needless to say, I'm interested in finding another way to get stuff into my computer.
By far the most advanced alternatives are scanning devices, and I'm finding more and more tasks for them. While voice and handwriting recognition still face fundamental challenges before becoming truly useful, the optical character recognition (OCR) technology that turns a scanned image into text is relatively mature. The trick in making it work on the desktop has been having enough computing power to throw at the task. Fortunately, today's personal computers are up to a job that until recently could be handled accurately and quickly only by high-powered workstations.
CHEAP HARDWARE. OCR, which produces text that can be searched, edited, and copied into other documents, is a tough challenge in artificial intelligence. In effect, it requires teaching a computer to read. Accuracy increases when the software can recognize whole words and is able to determine which letter combinations rarely or never occur in English (or whatever language the software has been specialized for). Pentium PCs or Power Macs with 16 megabytes of memory have enough horsepower to make this sort of heavy lifting practical.
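To make the word-level idea above concrete, here is a minimal Python sketch of how a program might choose among alternative readings of a scanned word. It is an illustration only, not how any product named in this column works: the tiny LEXICON, the UNLIKELY_PAIRS set, the CONFUSIONS table of look-alike glyphs, and the scoring weights are all invented for the example.

```python
# Illustrative only: every table and weight below is a made-up assumption.
LEXICON = {"modern", "scanner", "clear", "fax", "quick"}

# Letter pairs that rarely or never occur in English words.
UNLIKELY_PAIRS = {"qv", "vq", "jq", "qj", "zx", "xz", "wq"}

# Glyph groups a recognizer might mistake for one another.
CONFUSIONS = {"rn": "m", "m": "rn", "cl": "d", "d": "cl",
              "1": "l", "0": "o", "v": "u", "vv": "w"}


def variants(word):
    """Return the word plus every reading made by swapping one glyph group."""
    found = {word}
    for seen, meant in CONFUSIONS.items():
        start = word.find(seen)
        while start != -1:
            found.add(word[:start] + meant + word[start + len(seen):])
            start = word.find(seen, start + 1)
    return found


def score(word):
    """Reward whole-word matches and penalize implausible letter pairs."""
    points = 10 if word in LEXICON else 0
    points -= sum(1 for a, b in zip(word, word[1:]) if a + b in UNLIKELY_PAIRS)
    return points


def best_reading(raw):
    """Pick the highest-scoring reading, keeping the raw text on a tie."""
    raw = raw.lower()
    return max(sorted(variants(raw)), key=lambda w: (score(w), w == raw))


print(best_reading("c1ear"))  # clear (whole-word match wins)
print(best_reading("qvote"))  # quote (avoids the unlikely pair "qv")
```

A real recognizer would weigh thousands of words and letter statistics, along with how confident it is in each character shape, which is part of why the job needs so much computing power.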
Getting started in scanning is easy these days because good hardware has gotten cheap. Handheld scanners, available for as little as $100, are generally not suitable for OCR. But a new generation of sheet-fed scanners is bringing inexpensive and compact power to the desktop. Virtually any scanner you buy will come bundled with a competent OCR program, usually Caere Corp.'s WordScan or Xerox Corp.'s WordBridge. Of course, any scanner can read in black-and-white drawings. (I did not look at more complex and expensive color scanners.) And in conjunction with software, a fax modem, and a printer, a scanner can turn your personal computer into a fax machine and copier.
Possibilities include the $350 PaperPort from Visioneer (800 787-7007), the $450 PageOffice from Umax Technologies (510 651-8883), and the $299 WinFax Scanner from Delrina (800 268-6082). If you want the best, fastest scanning, it's hard to top a flatbed unit such as the $549 Hewlett-Packard IIp.
ROLODEXING. Corex Technologies Corp. (617 277-5344), a Boston startup, has taken an interesting approach to OCR that demonstrates the technology's maturity. Automatically transferring information from business cards to a computerized contact list is a big help to anyone who collects scads of cards. But few character-recognition challenges are harder. Not only must the software read the often highly stylized typefaces of business cards, it must also figure out what each bit of information is so that it can be transferred to the correct field in your database of contacts.
Corex CardScan software, which lists for $99 by itself or $299 in a bundle with a specialized card reader, does a surprisingly good job, good enough that I found using it significantly easier, even after making manual corrections, than typing in the cards by hand. It does the critical job of recognizing numbers with almost perfect accuracy, and most of the time it separates voice and fax numbers correctly. Only E-mail addresses consistently caused the program to stumble. CardScan can work with about two dozen personal-information and contact-management programs.
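The field-sorting half of the problem described above, deciding which field each bit of recognized text belongs in, can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not CardScan's actual method: the patterns, the function names classify_line and card_to_record, and the sample card are all hypothetical, and the sketch assumes the text has already been read correctly.

```python
import re

# Hypothetical patterns; real cards vary far more than these allow.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")
FAX_HINT_RE = re.compile(r"\bfax\b|\bfacsimile\b", re.IGNORECASE)


def classify_line(line):
    """Guess which contact field a line of recognized text belongs to."""
    text = line.strip()
    if EMAIL_RE.search(text):
        return "email"
    if PHONE_RE.search(text):
        # A phone-shaped number counts as fax only if the line says so.
        return "fax" if FAX_HINT_RE.search(text) else "voice"
    return "other"


def card_to_record(lines):
    """Sort the lines of one card into a simple contact record."""
    record = {"voice": [], "fax": [], "email": [], "other": []}
    for line in lines:
        text = line.strip()
        if not text:
            continue
        field = classify_line(text)
        if field == "email":
            record["email"].append(EMAIL_RE.search(text).group())
        elif field in ("voice", "fax"):
            record[field].append(PHONE_RE.search(text).group())
        else:
            record["other"].append(text)
    return record


# An invented card, used only to exercise the sketch.
card = ["Pat Example", "Example Widgets Inc.",
        "Tel 617 555-0100", "Fax 617 555-0101", "pat@example.com"]
print(card_to_record(card))
```

Telling a name from a title or a company, which this sketch lumps together as "other", is the harder part and depends on cues such as position and type size that plain text patterns cannot see.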
The fumble-fingered won't be truly comfortable until our computers understand what we want when we speak into their microphones and scribble on their screens. But scanning and optical character recognition technology are work savers that have a place on every desktop.