Online Extra: How the "Digital Object Identifier" Works

Think of it as a kind of catalog number that can instantly provide all kinds of information about printed material

The book industry and the record industry are deeply intertwined. The retail sale of recordings is based on the methods of book publishing, and many bookstores derive much of their revenue from the sale of recordings. Some of the book industry's giants, such as AOL Time Warner and Germany's Bertelsmann, are also huge players in recorded music.

On one critical issue, however, what are in some ways two branches of the same industry have taken dramatically different courses. Faced with the opportunities and challenges of digital distribution, the music industry has chosen to fight the new technology at every turn. While the lawsuit that has effectively shut down Napster's music-sharing service was the most visible part of the campaign, the record companies have opposed even paid downloads of music until an ironclad system of preventing unauthorized copying is in place.

The book industry, hesitantly at first but lately with much more enthusiasm, has embraced the idea of electronic books, or e-books. The DOI Foundation's Digital Object Identifier eBook project is part of a far-reaching industry effort to promote the availability of books and other published material in electronic form.

Want to know more about digital books, digital music, and DOI? Here are some questions and answers:

Q: Aren't the piracy problems facing book publishers and record companies very different?

A: In some ways, yes. The music industry faces the possibility that its rights will be violated by people passing around perfect digital copies of the original recording. (Although most MP3 versions are of significantly lower quality than the originals, perfect versions are possible, given enough bandwidth.) An electronic version of a book, on the other hand, strikes me as a markedly inferior product compared to a printed and bound version. Then again, publishers have been dealing with widespread distribution of unauthorized copies ever since copying machines came into common use in the late 1960s.

Q: Why have the strategies of book and record companies evolved so differently?

A: Recorded music is overwhelmingly bought by individuals, and the record industry felt it had the upper hand, legally and in the market. The situation in books is very different. Libraries are a major force in the publishing industry, and both public lending libraries and university research libraries feel a strong interest in getting the most information to the largest number of users at the lowest price. They have been pushing hard for digital distribution (and have been strenuous opponents of copyright law changes that have strengthened the hand of rights owners vs. buyers).

Educators, from kindergarten through college, are anxious for alternatives to conventional textbooks, which are heavy (a major concern to parents who worry about the health impact of bulging backpacks), expensive, and quickly outdated. Some educators also want to pick and choose, assembling custom course materials using parts of several texts. Magazine publishers have also become increasingly interested in digital distribution because printing and mailing costs are huge budget items that could be eliminated if they could ship bits instead of paper. Print publishers are responding to big incentives for change.

Q: What is a Digital Object Identifier, and how does it work?

A: A DOI is just a number consisting of two parts. The first part, the prefix, identifies the original publisher of the material. The word "original" is important, because the prefix is permanent, remaining the same even if the rights to the content are transferred to another publisher. Any publisher can obtain a prefix for a payment of $1,000 to the DOI Foundation.

The second part, the suffix, uniquely identifies the work, which could be a book, a part of a book, or any other text. Publishers are responsible for assigning their own numbers, and they can be in any format. Since every edition of every book currently receives an International Standard Book Number (ISBN), these, perhaps modified to allow sale of partial books, will be used as the suffixes.
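To make the two-part structure concrete, here is a minimal sketch in Python. It assumes the usual written form of a DOI, in which the prefix begins with "10." and a slash separates it from the suffix. The prefix 10.9999, the ISBN-style suffix, and the function names are all made up for illustration.

```python
from typing import NamedTuple


class Doi(NamedTuple):
    prefix: str  # identifies the original publisher
    suffix: str  # identifies the work; chosen by the publisher


def build_doi(prefix: str, suffix: str) -> str:
    """Join a publisher prefix and a work suffix with a slash."""
    if not prefix or not suffix:
        raise ValueError("both prefix and suffix are required")
    if "/" in prefix:
        raise ValueError("prefix must not contain a slash")
    return f"{prefix}/{suffix}"


def split_doi(doi: str) -> Doi:
    """Split at the first slash; the suffix may itself contain slashes."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix identifier: {doi!r}")
    return Doi(prefix, suffix)


# Hypothetical publisher prefix and ISBN-style suffix.
doi = build_doi("10.9999", "0-000-00000-0")
print(doi)
print(split_doi(doi))
```

Because the split happens at the first slash only, a publisher remains free to use any format it likes for the suffix, including one that marks a chapter or other part of a book.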

DOIs can be incorporated into Web pages much like current links. But instead of pointing to a specific Web location, the DOI sends the browser off to a database, where it retrieves and displays whatever information the publisher chooses to offer. At a minimum, this will be catalog information about the book, but more likely it will also include links to excerpts and to places where you can buy electronic or print copies. Assuming the publishers do their job of maintaining the databases, these centralized references, unlike current Web links, should never become outdated or broken.
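The reason such a link need not break can be shown with a toy model in Python. This is only a sketch of the idea, not the DOI Foundation's actual database: the Registry class, its method names, and the example.com addresses are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Record:
    """What a publisher chooses to show for one identifier."""
    title: str
    links: Dict[str, str] = field(default_factory=dict)


class Registry:
    """Toy stand-in for the central database described above."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def register(self, doi: str, record: Record) -> None:
        self._records[doi] = record

    def update_link(self, doi: str, name: str, url: str) -> None:
        # The publisher edits the record; the identifier never changes.
        self._records[doi].links[name] = url

    def resolve(self, doi: str) -> Record:
        if doi not in self._records:
            raise KeyError(f"unknown identifier: {doi}")
        return self._records[doi]


registry = Registry()
doi = "10.9999/0-000-00000-0"  # hypothetical identifier
registry.register(
    doi, Record("An Example Book", {"buy": "https://example.com/old-store"})
)

# The store moves; pages that cite the identifier need no edits.
registry.update_link(doi, "buy", "https://example.com/new-store")
print(registry.resolve(doi).links["buy"])
```

Web pages store only the identifier, so a change of address is a single update in the publisher's record instead of a hunt for every page that linked to the old location.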

Q: How does the DOI help control piracy?

A: In and of itself, the DOI does nothing to prevent copying or collect fees for use. But it can be vital in enabling both. The first step in any rights-management system is positive identification of material and of the rights owner. In the case of shared music, a la Napster, this has proven exceedingly difficult because as far as computers are concerned, two titles that differ by a single character are two completely different works.

If DOIs are used, typos and variant titles don't matter as long as the identifier code is entered correctly. Second, the DOI system has been designed from the beginning to integrate with existing digital-rights management systems such as InterTrust and with the copy protection schemes of e-book readers such as Microsoft Reader and Adobe eBook Reader. Though a foolproof way to identify the rights owner is not sufficient by itself to make a payments system work, it is a necessary condition.
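A few lines of Python illustrate why titles make poor keys and identifiers make good ones. The catalog, the title, the rights owner, and the identifier below are invented for the example.

```python
from typing import Optional

# Hypothetical catalog keyed by identifier.
WORKS = {
    "10.9999/0001": {
        "title": "An Example Novel",
        "rights_owner": "Example House",
    },
}


def owner_by_title(title: str) -> Optional[str]:
    """Exact string match: any typo or variant spelling finds nothing."""
    for record in WORKS.values():
        if record["title"] == title:
            return record["rights_owner"]
    return None


def owner_by_doi(doi: str) -> Optional[str]:
    """Identifier lookup: how the title is spelled does not matter."""
    record = WORKS.get(doi)
    return record["rights_owner"] if record else None


print(owner_by_title("An Example Novel"))  # exact title matches
print(owner_by_title("An Exampel Novel"))  # one transposed pair: no match
print(owner_by_doi("10.9999/0001"))        # identifier always matches
```

The identifier still has to be entered correctly, as noted above, but it is a single short code to get right, not a title that every user may type a little differently.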

Q: Where did the idea of the DOI come from?

A: It's actually one implementation of a more general method for identifying digital content called the Handle System. The Handle System was developed by the Corporation for National Research Initiatives, a think tank funded mostly by the Defense Advanced Research Projects Agency and the National Science Foundation. Those are the agencies that sponsored the infant Internet, and CNRI's president, Robert E. Kahn, is generally regarded as one of the creators of the Internet.

The Handle System is modeled on the Internet's domain name service, which allows any computer on the net to find a computer by its name knowing only where the database containing the most basic information about its top-level domain, such as .com, is kept. Similarly, a computer that wants to turn a DOI or other "handle" into a location where the content may be found need only know where to look for information on the owner of the prefix.
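The two-step lookup can be sketched in Python as a pair of tables. This is an illustration of the delegation idea only, not the Handle System's real protocol; the table names, the service name, and the address are assumptions.

```python
# Global level: who runs the service for each prefix (hypothetical names).
PREFIX_DIRECTORY = {
    "10.9999": "publisher-a-service",
}

# Local level: each service knows the locations of its own handles.
LOCAL_SERVICES = {
    "publisher-a-service": {
        "10.9999/0-000-00000-0": "https://example.com/books/example",
    },
}


def resolve_handle(handle: str) -> str:
    """Resolve in two hops: prefix to service, then handle to location."""
    prefix, sep, _suffix = handle.partition("/")
    if not sep:
        raise ValueError(f"handle has no prefix/suffix separator: {handle!r}")
    service = PREFIX_DIRECTORY[prefix]      # hop 1: find the prefix owner
    return LOCAL_SERVICES[service][handle]  # hop 2: ask that owner


print(resolve_handle("10.9999/0-000-00000-0"))
```

The caller needs to know only where the prefix directory lives. Everything else is delegated to whoever owns the prefix, which is the same division of labor the domain name service uses.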

Q: How long will it take before every book has a DOI?

A: If the idea really takes hold with publishers, all books currently in print could receive identifiers very quickly. The actual assignment is as simple as sticking the publisher's prefix in front of the ISBN, though of course setting up and maintaining the database required to make the system work will take a good bit more effort.

The hope is that publishers will also assign DOIs to out-of-print books in their backlists to which they still hold the rights. Probably the biggest part of the job is assigning identifiers to millions of long out-of-print works that live on in libraries. The burden of this will fall on the libraries themselves, but the payoff could be a giant, unified catalog of the world's research libraries, potentially an enormous benefit to scholars and researchers.
