In humanity's ongoing quest for knowledge, explorers are about to reach one of history's great milestones: sequencing the human genome--the entire set of human genes and everything in between. They have traveled the entire length of our twisting ladder-like strands of DNA, reading the code of each of the three billion individual "rungs" along the way. That will throw the door open to a vast continent containing the answers to many of the riddles of how we grow, how we get sick, and how we die.
In June, J. Craig Venter, the controversial president and chief scientific officer of Celera Genomics in Rockville, Md., will announce that he has virtually finished the task. Indeed, he has already turned to the next prize, disclosing on June 1 that Celera has decoded one-third of the 3 billion "rungs" in the mouse genome. That will give Celera a crucial leg up on rivals in trying to find and understand human genes.
Venter has plenty of competition, though. The federal government's genome project will soon announce that it, too, has completed a draft of the human genome. A goal that seemed impossible to some when it was first discussed in the mid-1980s will have been reached years sooner than expected. For the winners, "Nobel prizes and huge potential profits are out there," says geneticist Frederick R. Blattner of the University of Wisconsin.
The ultimate impact will be almost unfathomable. Trying to understand biology from the few thousand genes now known has been like exploring the U.S. with only the dim light of a few street-lamps scattered here and there across the nation. Now, suddenly, the entire country will be illuminated--every mountain peak, every cul-de-sac, every blade of grass.
Yet for today's genetic Lewises and Clarks, the journey of exploration has only just begun. Researchers can now see the genetic code, but that doesn't mean they understand it. Indeed, they know so little that they can't even say how many genes are there to be discovered. In late May, a team of scientists at the University of Washington estimated that the 3 billion bits of human DNA harbor 34,000 genes. Yet in the same scientific journal, a group at The Institute for Genomic Research (TIGR) used similar methods to put the estimate at 120,000 genes.
With the genome sequence in hand, scientists will be able to determine exactly how many genes there are and discover what they do. That will pave the way for a revolution in medicine, making today's treatments seem like "medieval anachronisms," as Venter puts it. But that will take years--perhaps even decades. And the crucial question is: Who will be the quickest and the best at turning the treasures of the genome into medical advances and commercial profits? "Once the sequence is there, the game begins," says Stanford biochemist and Nobel laureate Paul Berg.
Already, companies like Rockville's Human Genome Sciences and Incyte Genomics in Palo Alto, Calif., have staked out claims for thousands of genes they have fished out of cells. Others are using a variety of "DNA chips" and innovative laboratory techniques to speed the race to understand biology and develop new treatments. And still others are experimenting with fruit flies, zebra fish, and mice, because these simpler organisms share much of their fundamental biology with humans.
"BABBLING PHASE." Meanwhile, big pharmaceutical companies are seeking the best targets for drugs amid the flood of new gene discoveries. Even IBM is getting into the act. Big Blue has focused its most advanced computing resources on the problem of analyzing genomic data. The whole field of genome-based medicine is like a rapidly growing child, says analyst Viren Mehta, director of Mehta Partners. "Right now, we're in the babbling phase and are on the verge of writing paragraphs," he says. "The companies that can turn those paragraphs into masterpieces are the companies of the future."
If Venter has his way, one of those companies will be his upstart, Celera. Can he pull it off? Few are counting him out. After all, Venter is someone who has always taken his own path--and accomplished what many believed was not possible.
His career had an unlikely beginning. After high school, "I worked as a night clerk at Sears to support my surfing habit," Venter recalls. Then, faced with the Vietnam-era draft, he signed up with the Navy--on the promise that he, a champion backstroker, would be on the swim team. The team was disbanded, but Venter, who had aced the Navy's intelligence tests, had his pick of assignments. He chose the duty that "gave me the least chance of being killed"--hospital corpsman in Da Nang. "I learned that by taking active control to whatever extent I can, I can have a dramatic effect on the outcome," he says. His time in Vietnam "was a lifetime of education packed into one year," he recalls.
He came back with driving ambition. He zipped through college and earned a biochemistry PhD in a mere six years, while supporting himself as a respiratory therapist, and immediately landed a faculty post. "I basically caught up with my peers, only with a hell of a lot more experience of life than they had," he says. Winning the respect of his fellow scientists took a lot longer. He burst into the limelight in the early 1990s, at a time when it took years to find a single gene. Then at the National Institutes of Health, Venter developed an automated method for discovering hundreds in a few weeks.
BOLD PLANS. It was a direct challenge to the plans of the Human Genome Project, a 15-year effort to sequence the entire genome that began in 1990. And some of the project's leaders, who had dismissed Venter as a second- or third-tier scientist, saw him as "an intruder, a faker, and a blabbermouth," recalls one top scientist. Says Venter: "I had one of the most impressive enemy lists anyone had accumulated."
After being turned down for funding for his bold plans to discover thousands of genes, Venter jumped from NIH to his own gene-hunting institute, TIGR. By 1998, he felt he could beat the genome project at its own game--and he said so. With backing from equipment maker PE Corp. and its aggressive CEO Tony L. White, the one-time surfer set up a PE unit, Celera Genomics, to use a risky, unproven approach that would ultimately beat the public effort to the prize. "The remarkable thing is that we set out to do this without actually knowing how we would do it," says Venter.
The method worked better than anyone--Venter included--had expected. In the traditional approach to sequencing, scientists divide human DNA into large pieces. They first painstakingly figure out where each piece belongs, then they begin decoding each piece. Celera, in contrast, shortened the process with a whole-genome "shotgun" approach: Venter devised a way to blow the genome into many bits and sequence them without regard to their position. The challenge was whether, unlike all the king's horses and all the king's men, he could put the parts together again. For that, he relied on a supercomputer and clever computer algorithms.
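The reassembly step can be illustrated with a toy sketch. This is not Celera's algorithm, which the article does not describe. It is a minimal greedy assembler, assuming error-free reads from a single strand and a genome with no long repeats. The function names, the minimum-overlap threshold, and the sample sequence are all made up for illustration.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap.

    Toy assumptions (hypothetical, for illustration only): reads contain
    no errors, all come from the same strand, and the genome has no
    repeats longer than `min_len`.
    """
    # Drop exact duplicates and any read fully contained in another read.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left overlaps; return the separate pieces
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for n, f in enumerate(frags) if n not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Three overlapping reads taken from a made-up 15-letter "genome",
# given in scrambled order with no position information.
reads = ["TACGTTAGC", "ATGGCGTA", "GCGTACGT"]
print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGC']
```

Even this toy version compares every fragment with every other fragment on each pass. That cost grows quickly with the number of fragments, which suggests why a genome-scale version would need a supercomputer.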
To test the method, Venter picked a difficult genome to sequence--that of the fruit fly. "It was a gutsy move," he says. "We knew it would either work spectacularly or be the biggest flameout in history."
It worked. One early indicator came when Venter sold his 82-foot sailboat, Sorcerer. "I said I would sell Sorcerer and get a bigger boat if Celera was successful," he says. "So when I sold it, everyone was nervous." On March 24, just over a year after he started, Venter and his team published the entire sequence of the fruit fly, along with dazzling new information about the numbers and functions of its genes. Now he's building a replica of the legendary 144-foot Nova Scotia schooner Bluenose.
By 1999, the Human Genome Project had responded by dramatically stepping up its own pace. And Venter and Dr. Francis S. Collins, the head of the government project, began a war of words that put Wall Street on a roller-coaster. In April, for instance, Celera announced that it had finished sequencing--but not yet assembling--one person's entire genome. The company's stock jumped 24%. A few days later, Collins fired back, saying Venter's data was incomplete and possibly flawed. Celera stock plunged 18%. Collins, derisively dubbed St. Francis by his critics for his unique blend of do-gooding and ruthlessness, also has accused Venter of trying to lock up the secrets of the genome for commercial gain. And he and his team say Venter hasn't played fair because Celera has used data from the public project. Venter retorts that Collins doesn't understand what Celera is doing, and besides, why shouldn't he make use of taxpayer-funded research that's available to everyone else? Collins' behavior, Venter says, has been "despicable."
"WASTE OF TIME?" But for all the drama behind the unveiling of humanity's genetic code, the race marks a beginning, not an end. In fact, the pharmaceutical and biotech industries are already drowning in a flood of genetic information, says Mihael Polymeropolous, vice-president for pharmacogenetics at Novartis. "That's why this race for me is a little silly," he says. "The real race is who will develop the tools to analyze the genome first."
There, Venter has another set of rivals. Celera may have won the headlines, "but they are too late," scoffs Incyte CEO Roy A. Whitfield. "From a commercial point of view, sequencing the genome is just a complete waste of time."
That's too skeptical for most experts. But all agree that making sense of the genome will not be easy. The first problem is that only about 3% of the three billion rungs on the DNA ladder make up genes. The rest of the DNA, commonly called junk, fills up not only the gaps between genes but also holes within the genes themselves. No one really knows why most of the "junk" exists. Because the genes, and the proteins they make, are the key to devising new treatments, "there's gold in that 3% if you can find it," says Elliott Sigal, senior vice-president of early discovery and applied technology at Bristol-Myers Squibb Co.
That's a big "if." The difficulty, says Gerald M. Rubin, the University of California at Berkeley geneticist who collaborated with Celera to sequence the fruit fly genome, is that the genome "is written in a language no one knows how to read." Scientists know that there are certain patterns that are telltale signs of genes, and they have developed sophisticated computer programs for spotting those patterns. On May 8, for instance, DoubleTwist Inc., an Internet startup in Oakland, Calif., grabbed headlines when it announced that it had pinpointed the probable locations of 105,000 genes by analyzing the genome sequence information already available in public databases.
JAMBOREE. Unfortunately, such computer predictions are merely good guesses. After Celera finished sequencing the fruit fly DNA, for example, it used two computer programs to identify fruit fly genes and came up with significantly different results. "No algorithm can say, 'that's a gene,'" explains Venter. "You have to have good people sitting in front of the computer screen to make sense of what's really a gene and what isn't."
That's why Celera plans to host an unusual meeting in late June of top geneticists and biologists from around the world. They will gather for what's playfully called an "annotation jamboree" as they attempt to use their lifetimes of knowledge and intuition--along with the latest computer programs--to decide where genes actually lie on the human genome and what they might do.
Other outfits have taken a rival tack. In a method pioneered, ironically, by Venter, Incyte and Human Genome Sciences are identifying genes by searching for the instructions they send to protein factories elsewhere in the cell. Both Incyte and HGS say they've already found pieces of almost every human gene, though critics say the data is incomplete and often wrong. "When we publish the genome, the Incyte and HGS databases will be worthless," Venter claims.
What's more, the genes aren't actually the only interesting parts of the genome. Why does Julia Roberts look different from a chimp? It can't be just the genes, since those of humans and our simian cousins are virtually identical. The real difference lies in how those genes are turned on and off. It turns out that some bits of DNA act as master switches--and they can be located far from the genes they turn on and off, hidden in the junk DNA.
How can scientists find these regulatory regions? Celera has an ace up its sleeve--the mouse genome. The mouse has just about all the same genes and genetic switches as humans, yet its junk is quite different. So as Venter's army of robots reads the code of the billions of rungs on the mouse DNA ladder, his scientists are laying the sequence alongside the human genome. The comparison immediately reveals not only the genes but also the shared genetic switches. And Celera's scientists are discovering that the approach is also bringing key insights about the structures of individual genes, revealing which parts contain instructions for proteins and which parts are the so-called noncoding regions, or introns. Venter believes that the mouse will give him a crucial leg up on competitors like Incyte and HGS. "The mouse genome does for us what no one else can do," he explains. "It tells us the exact structure of the unknown gene. That gives Celera and its subscribers a tremendous edge."
Celera plans to publish the human genome and basic information about it by the end of the year. But it is also making money by selling access to its much larger database of information about DNA sequences and their functions. Venter figures he needs about 10 major pharmaceutical companies to sign up at a rate of some $10 million a year to pay for much of the company's operations. So far, he has deals with Pharmacia, Novartis, Amgen, Pfizer, and Takeda. And on May 7, he signed up his first academic paying customer, Vanderbilt University. Because Venter's human genome database will be more complete than its rivals', "all the companies that develop drugs may have to pass largely through Celera," says Dr. Faraz Naqvi, co-manager of Dresdner RCM Biotechnology fund. "It's like controlling the gate to the Internet."
To turn that information into drugs and treatments, researchers must determine what individual genes do. A whole biotech sector, called functional genomics, has sprung up to unlock these mysteries. Dr. Robert Tepper, chief scientific officer at Millennium Pharmaceuticals Inc. in Cambridge, Mass., wryly dubs the field "dysfunctional" genomics because it poses such difficult challenges. One major strategy relies on comparing human genes with those of other organisms, such as the mouse, the fruit fly, and a worm called C. elegans. Exelixis Inc. in South San Francisco, for example, has used this approach to try to understand the biological pathways underlying diseases like Alzheimer's.
In another example, researchers at Immunex Corp. discovered an interesting-looking molecule sticking out of the surface of the cell--a so-called receptor. They knew that the receptor was part of a large family of similar molecules that play key roles in inflammation, the activation of cells, or the ability of cells to survive. But they didn't know what this particular receptor did. So they created mice that lacked a functioning gene for the receptor--and thus lacked the receptor itself. The resulting mice had a distinctive feature. Their cells that are normally responsible for breaking down unwanted bone didn't develop or work properly, causing the animals to have extra-thick bones. The discovery suggests that the receptor might be a good target for drugs to prevent osteoporosis.
WRONG TARGET. Leading this new field of functional genomics are Millennium, Incyte, and Human Genome Sciences, with many others following behind. But that understanding of basic biology will still take medicine only partway toward its real goal: creating revolutionary new drugs and treatments.
Through most of the 1990s, drugmakers had hoped that when they discovered a gene that causes a disease, they could use knowledge of that gene to devise a new drug. "It was the wrong promise. It oversold and stifled the development of the whole industry," says Novartis' Polymeropolous, who discovered the gene that causes Parkinson's disease. "Now we know that the gene that causes a disease is actually irrelevant as a drug target."
Instead, the key is understanding the whole biological chain of events that occurs when disease strikes--and then picking the best place in that chain to intervene. Genentech researchers, for instance, realized that some breast cancer cells turn on a gene that makes a unique molecule appear on the cancer cell surface. So they devised a drug, Herceptin, that binds to the molecule and helps rein in the cancer. Herceptin's sales for the first quarter of 2000 were $68.7 million. Similarly, scientists at startup Hyseq are collaborating with Chiron to analyze 6 million samples from cancer patients, trying to find genetic clues that explain why some cancers spread and others don't. Their search has turned up 31 genes, 25 of them previously unknown, that may play crucial roles in metastasis.
Fueling these discoveries are advances in scientific tools. To study the changes in activity of many genes at once, companies like Affymetrix and Agilent Technologies Inc. have created "chips" that contain probes that recognize tens of thousands of genes. The chips--typically pieces of glass on which the probes are attached--enable researchers to almost instantly see which genes are active in a particular cell. In another strategy, Aurora Biosciences Corp. in San Diego has developed collections of thousands of cells, each designed to study the effects of one drug on one gene. If the drug turns the gene on, the cell turns blue; if not, it stays green. Using the technology, researchers can evaluate as many as 100,000 drug compounds in a single day. "It's this kind of killer application that's going to rule," proclaims John D. Mendlein, chief knowledge officer at Aurora. On May 31, the Cystic Fibrosis Foundation announced that it would invest up to $47 million in Aurora to screen for potential new cystic fibrosis drugs. Such innovative approaches, analysts say, have turned Aurora, Affymetrix, Agilent (a spin-off from Hewlett-Packard Co.), and others into promising opportunities for investors.
CRAP SHOOT. The tools promise to slash the time--and thus the cost--of drug development. How? The old-fashioned approach is a bit of a crap shoot. A company always hopes that the drug candidates it has created hit only the intended target. But until the compound is tested or used widely in people, there's no way of knowing if it causes damaging side effects. The diabetes drug Rezulin, for instance, had to be pulled from the market because it caused severe liver damage in a few patients. With a full collection of genes and rapid screening tests, "it will be possible to try a drug against all possible targets," explains the University of Wisconsin's Blattner. That will enable companies to weed out problematic drugs long before lengthy and expensive clinical trials begin.
In a glimpse of the future, Vertex Pharmaceuticals Inc. signed a groundbreaking deal on May 9 with Novartis to develop drugs aimed not at one or even several targets but rather at every member of an entire family of proteins--about 1,000 in total. "This is the natural follow-on to having the human genome," explains Vertex CEO Joshua S. Boger.
What's more, the ability to study many genes at once also means that researchers are able to piece together complete biological pathways. Until now, "biologists have mostly studied individual genes, not the larger system," says gene pioneer Dr. Leroy Hood, now director of the Institute for Systems Biology in Seattle, Wash. That, he says, is like trying to figure out a car's function by having one mechanic study the ignition, another the brakes, another the suspension. You can discover the functions of the parts, says Hood, "but what does the hunk of metal actually do?"
At Millennium, for example, researchers are figuring out the biology underlying one of America's great obsessions--fat. Working with colleagues at Hoffmann-LaRoche, they have uncovered a whole network of genes that help control body weight. Some govern the production and metabolism of fat inside cells. Others operate in the gut, controlling the absorption of fat. Still others exert their influence in the brain to control appetite. The discoveries have pinpointed a wealth of new targets for drugs--and Millennium now has 10 candidates ready for clinical trials in obesity. "This disease is just being broken wide open," says CEO Mark J. Levin. "That could never have happened without these technologies."
Some companies, working on the proteins rather than the genes, have created an entirely new field, called proteomics. Their aim is to find and understand the estimated 1 million human proteins. Leading this booming new field are companies like Myriad Genetics, Cytogen, and CuraGen. Myriad's scientists have devised a method for rapidly discovering which proteins interact with which others. As a result, they are now constructing networks of proteins--and using them to find new proteins that may be promising drug targets in diseases like breast cancer.
Many other biotech companies are jumping in as well. Incyte, for instance, has ambitious plans to chart proteins in every part of the body. "We want to create a profile of human anatomy at the molecular level," explains President and Chief Scientific Officer Randal W. Scott. Working with AstraZeneca, Incyte has already begun testing drug candidates in the lab to see how they affect the levels of thousands of proteins.
PROTEIN PAYOFF. Celera, meanwhile, is building a facility that will be able to sequence a million proteins a day. And it plans to develop protein "chips"--protein counterparts of the DNA chips already in use--and other technologies that can be used to chart protein activity in every type of cell in the body. Venter figures the effort will have a tremendous payoff. He expects to uncover new hormones and other proteins that could be used as drugs. Moreover, he expects that combining the genomic and protein data will offer far more value than either alone. While other companies, such as Incyte, are following the same strategy, Venter believes that his data are far more complete and will be indispensable. "We can give the genome sequence away for free because we know that no other institution can process it or build the kind of data sets we are building," Venter says. Indeed, he predicts that, eventually, the information in his databases will be so valuable--and accessible--that ordinary people will use it to learn about their own genes. "Celera's ultimate customers are the 6 billion people on the planet," Venter argues.
Another consequence of this flood of information is that the computer has become one of the most important tools in biology. Consider these experiments. You want to measure how each of tens of thousands of drugs affects every one of humanity's 34,000 to 120,000 genes and its 1 million proteins. Or you want to compare the sequences of thousands of unknown proteins with the 3 billion bits of DNA in the human genome. In each case, the amount of data to analyze is mind-boggling. "We have reached a point where processing information is one of the major bottlenecks," says Sharon L. Nunes, senior researcher at the computational biology center at IBM's T.J. Watson Research Center.
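A back-of-the-envelope count gives a rough sense of scale for the first of those experiments. The gene and protein figures are the estimates quoted above. The figure of 10,000 drugs is an assumption standing in for "tens of thousands," and the function name is purely illustrative.

```python
def measurement_count(drugs, genes, proteins):
    """One measurement per drug for every gene and every protein."""
    return drugs * (genes + proteins)


DRUGS = 10_000          # assumed low end of "tens of thousands"
PROTEINS = 1_000_000    # estimated number of human proteins

low = measurement_count(DRUGS, 34_000, PROTEINS)    # low gene estimate
high = measurement_count(DRUGS, 120_000, PROTEINS)  # high gene estimate

print(f"{low:,} to {high:,} measurements")
# 10,340,000,000 to 11,200,000,000 measurements
```

Even at the low end of the assumed drug count, a single pass of the experiment means more than 10 billion individual measurements to store and analyze.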
Once the information problem has been solved, scientists will be left with a wealth of possibilities. Having the full human genome sequence and all these new tools "will keep researchers busy for a long time," says Vincent Dauciunas, head of strategic planning in the chemical analysis group at toolmaker Agilent. "I call it the Full-Employment Act for the millennium."
It won't happen overnight. The lesson of the gene sleuthing of the past is that wonderful new science takes longer to pay off than first hoped, and finding the treasure in the human genome is no exception. "Most of the miracles are going to involve unknown genes and unknown functions that we are going to take decades to solve," says Venter.
Undoubtedly, Venter will be in the thick of that gene-sleuthing. Leading a recent tour of Celera, he proudly shows off black banks of supercomputers and a vast room filled with silent sequencing robots. But as he passes down the hallway papered with press clippings on the way to his office, he points out that one article is conspicuously missing--a 1995 Business Week cover story labeling Venter and then-partner William A. Haseltine, CEO of Human Genome Sciences, as biotech's two "Gene Kings." Forget Haseltine, he says. "I think I should be on the cover of Business Week as the single gene king," he says. Well, Craig, you've earned it.