Kaiser Permanente's Genetic Database Is Boon to Medical Research

Scientists are tapping Kaiser Permanente’s trove of genetic data

Illustration by Kris Mukai

Over the past decade, Kaiser Permanente has spent more than $4 billion building the world’s largest private-sector collection of electronic health-care records. The data have become the cornerstone of a new scientific resource: a biobank that links the health records of more than 210,000 Kaiser members with samples of their DNA. The Oakland (Calif.)-based health network has teamed up with the University of California at San Francisco so scientists can use the collection to search for the genetic roots of diseases including glaucoma and prostate cancer.

Kaiser has 9.5 million enrollees in eight states and the District of Columbia, and members can see a wide variety of medical specialists without leaving the network. Every visit, lab test, prescription, and procedure is logged into a member’s electronic health record. This gives Kaiser an edge over other genomics projects, which, besides collecting DNA samples, must go through the expense and trouble of amassing information on subjects’ medical histories. Last year biologist Craig Venter founded Human Longevity, with plans to sequence the genomes of as many as 100,000 people annually, and this summer Google said it had launched its own genomics project, called Baseline Study. Both are hunting for genes associated with health and longevity.

“We recognized that a lot of the basic resources that would be needed to make a fabulous research program on genes, environment, and health—that those were attributes of Kaiser,” says Catherine Schaefer, who has served as executive director of Kaiser’s Research Program on Genes, Environment, and Health since its launch in 2005. She and her colleagues began by recruiting volunteers in Kaiser’s Northern California system, which has electronic records on patients dating to 1995. Recruits were asked to provide samples of blood or saliva—from which DNA was extracted and analyzed—and fill out behavioral surveys about their exercise habits, alcohol consumption, and sleep patterns. Researchers also compiled lists of patients’ home addresses, which could be cross-referenced with databases on air and water quality and other environmental factors.

So far researchers at Kaiser and UCSF have collected samples from 210,249 people and analyzed the DNA of more than 100,000 of them. (Most of the samples were genotyped, an analysis that catalogs the genetic variants present at specific locations along the genome, rather than sequenced in their entirety.) Researchers are beginning to comb through the data, hoping to identify genes that influence a variety of conditions, including bipolar disorder, hypertension, and cardiovascular disease. The goal is to use such findings to improve diagnostics, treatment, and prevention.

Neil Risch, a co-director of the Kaiser-UCSF venture, says his research using the data indicates there are genetic variants that increase the likelihood that a patient will have an adverse reaction to statins, drugs used to treat high cholesterol. Although Risch cautions that his findings are preliminary, it’s not difficult to conjure a future in which Kaiser patients would be able to take a genetic test before their doctors prescribed a medicine. That might prevent serious and costly side effects such as myopathies, muscle diseases associated with statin use.

Kaiser is also taking applications from outside researchers who want to use the banked data. Of the 75 studies that have been approved, 43 involve non-Kaiser scientists. The health network charges an administrative fee for assembling the relevant data, which are stripped of all identifying information. “What’s possible with this resource is beyond what we can do with our hands, as it were,” says Risch. “It’s almost unlimited.”

In February, Kaiser deposited information on more than 78,000 of its study subjects into the National Institutes of Health’s Database of Genotypes and Phenotypes (dbGaP), a collection of genetic and health data that scientists can tap into online at no cost. Says Schaefer: “The more people who are able to use and make use of it to do good research with it, the more benefits there will be for our members and the more benefits there will be for the public at large.”