BGI's Young Chinese Scientists Will Map Any Genome

In an old shoe factory in Shenzhen, thousands of young scientists have set out to map the DNA of ... pretty much everything
Deng Wenxi with plant genomes. Photograph by Luke Casey for Bloomberg Businessweek

When the workday ends at BGI’s factory in Shenzhen, the headquarters of the largest genome mapping company in the world, it’s like a bell has gone off at math camp. The company’s scientists and technicians spill out of the doorways of the building, baby-faced and wearing jeans and sneakers. Some still have braces. Several young women link arms and skip toward a bus line. Others head next door to the dorm or over to the canteen where young couples are holding hands across plastic trays. “This work we do is tiring and requires focus,” says Liu Xin, a 26-year-old team leader in the bioinformatics division, as he sinks into a couch in one of BGI’s conference rooms. “So it’s good that they allow us to date.”

Liu is one of a small army of recent college graduates at BGI’s largest facility, a former shoe factory. Two gray buildings, the factory and the dorm, are wedged between one of Shenzhen’s industrial zones—a grid of high-rises, apartment buildings, and several hospitals and medical equipment companies—and a lush, jungly hill that’s in the process of being bulldozed. Liu is stocky and serious, glad that he already has a steady girlfriend so he can focus on his career. He arrived at BGI three years ago, a biology major from Peking University with little experience in the study of the genome, the term for the entirety of an organism’s genetic information. Now he’s one of the senior people in his department. He works 12-hour days and oversees the sequencing of multiple genomes at a time. He specializes in plants—his team is currently sequencing a species of orchid. The bioinformatics teams around him are picking through the genomes of animals, microbial organisms, humans, and anything else that comes with a genetic code. “Everyone is just out of college,” he says. “I am now more sophisticated than most of the newcomers.”

Ten years after the mapping of the human genome, BGI has established itself as the world’s largest commercial genetic sequencer. The ranks of China’s college graduates are expanding faster than the country can employ them, and BGI is leveraging this cheap, educated labor pool. At the factory in Shenzhen, more than 3,000 employees (average age, 26) spend their days preparing DNA samples, monitoring sequencing machines, and piecing together endless strings of A’s, C’s, T’s, and G’s, the building blocks of genetic material.

“This is big data analysis,” says Wang Jun, BGI’s 36-year-old executive director. Wang, who regularly wears tennis shoes and untucked polo shirts, has published more than 35 articles in Science and Nature magazines and also teaches at the University of Copenhagen. Genomics, he says, is a new field and experts are being created from scratch. “We don’t need Ph.D.s to do this work,” Wang says. Instead, he believes genomics is best learned the old-fashioned way. “You just throw them in,” he says of BGI’s technicians. “The best way is hands-on experience.”

When the first draft of the human genome was released in 2000 as part of the international Human Genome Project, it seemed inevitable that scientists would soon crack the codes of disease, health, and human development. But the genome has proved more complicated. What scientists produced in 2000 was a long list of nucleotides, the combinations of markers in DNA that specify the makeup of an organism. It was just a list, and only a fraction of it is understood. Scientists were quick to identify fragments of the genome that translate into proteins, which control things like eye color, but these make up only 1.5 percent of the entire thing. As geneticists like to put it, they produced a map without a legend. This is where BGI comes in.

The company was founded in 1999 with state funding to lead China’s participation in the Human Genome Project. “We didn’t think about any business model; we basically didn’t plan further than the human genome,” says Wang, who was brought on in the early days of BGI to provide expertise in computers. China, he points out, was the only developing country working on the international project, and although the BGI team contributed only 1 percent of the finished project, it did it quickly and with little previous experience. “Even Bill Clinton thanked us for our participation,” he says. Wang joined the project when he was just 22 and worked under BGI’s two founders, the scientists Wang Jian, then 45, and Yang Huanming, then 47.

For its next challenge, BGI decided to tackle rice, whose genome is significantly shorter than that of humans but still large enough to impress. “We recruited a bunch of undergraduates, and lots of them had no working experience on any project,” Wang Jun says. The schedule was tight; Wang and his team barely slept. “We can do these kind of crazy things in BGI,” he says. “We can get 100 people together, very fresh, no experience at all, and get it done.”

In 2002, BGI published a paper on the rice project in Science and again attracted attention and money from the Chinese government, though it’s a private company. The company was rewarded with entry into the state-run Chinese Academy of Sciences, a distinction that secured additional funding. As part of CAS, however, BGI was limited to only 90 scientists. Its leaders had their eyes on expansion. “Our boss wanted to buy more sequencing machines,” says Deng Wenxi, a 24-year-old communications officer at the BGI factory. “But the Beijing government would not support us.” In 2007 the company found a solution by way of Shenzhen’s city government, which offered the factory 10 million yuan (about $1.6 million at today’s exchange rates) to cover startup fees and 20 million yuan in annual grants. The company changed its name from Beijing Genomics Institute to BGI Shenzhen and moved to the shoe factory. “Beijing is more strict,” says Deng. “Shenzhen wanted to welcome us.” The factory, she says, actually belongs to the Shenzhen government. When asked about the move, Wang Jun answers a little more vaguely. “Well,” he says, “the weather is definitely nicer here.”

Today, BGI organizes its operations into three categories—health care, agriculture, and the environment. When scientists look at the genome, they’re looking for variations from one individual to another, from species to species, or population to population. They’re looking to understand which variations link to specific traits or diseases.

As Wang Jun says, decoding any genome is a big data endeavor, and there’s no other research institution or for-profit sequencing company in the world that has the capacity of BGI. In health care, it offers straightforward sequencing services for universities and corporations globally, which ask BGI to sequence a genome and send it back for analysis. More often than not, BGI works in partnerships to map, analyze, and publish the findings.

When Deng meets me in the morning, the first place we visit is a kind of trophy room on the top floor of the factory where the walls are decorated with copies of Science and Nature magazines, each containing a paper from BGI. The subjects include the company’s part in the ICGC Cancer Genome Projects; its work with 2,000 families to map the genomes of children with autism; its mapping of the epigenetic differences (differences in gene expression not the result of a variation in the genetic code) between 5,000 twins; and a project to increase the number of identified Mendelian, or inherited, genetic disorders.

In addition to linking more disorders to variations in the genome, BGI’s research could change the way medical providers and governments understand and respond to outbreaks of disease. BGI’s partners include GE Healthcare, Merck, and Novo Nordisk, and the work they’re doing will help pharmaceutical companies understand why some drugs are more effective in some populations and less so in others. In May 2011, BGI flexed its muscle during a deadly outbreak of E. coli in Germany. As soon as the outbreak began, BGI began to piece together the genome of the strain from samples provided by the University Medical Center Hamburg-Eppendorf. Within five days, the company released sequencing reads on the strain, leading to the crowd-sourced assembly and analysis of the genome. In the future, BGI’s expertise could be applied to viruses.

Wang Jun says BGI’s first goal is to “find ways that genomics can serve society.” The company, he emphasizes, is not state-owned, and the profits it makes are cycled back into research. The company has been steadily increasing its profits over the past few years. In 2011, BGI reported revenue of 1.2 billion yuan. Many projects the company takes on reflect this policy of for-profit science. In agriculture, BGI is mapping genome sequences it considers proprietary and using them to engineer superior strains of rice, millet, and even fish. Technicians do this by using genomic information to breed for certain traits. Hybrid millet, says Deng, could improve yields and help alleviate hunger in Africa. Balsa trees designed by BGI can withstand colder temperatures, which means they could be grown in China. Sharing the trophy room with BGI’s published papers is a single large fish, mottled green and gray, swimming in a tank. “That is our hybrid grouper,” Deng says. It grows three times as fast as a regular grouper, she says, and according to a BGI brochure, it tastes better. When I ask Wang how BGI determines which plants and animals to sequence as part of its “1,000 Plants and Animals” project, he answers, “We start with anything tasty.”

The company is also taking part in the sequencing of the earth’s microbiome, meaning all microscopic organisms. This is an effort to identify the functional and evolutionary diversity of microbial organisms across the globe. (BGI has sequenced more than 1,000 such organisms in the human gut.) Many of the plant and animal genomes it has sequenced, such as the giant panda and Liu’s orchid, are beneficial mainly to scientists studying the traits and evolution of those species.

BGI has also made forays into cloning and has invented a simplified technique. Called “handmade cloning,” it cuts costs and makes large-scale cloning more realistic for use in animal and plant research. So far BGI has applied the technique to clone mice, sheep, and a mini pig that glows in the dark. In an office, a slightly desiccated stuffed piglet sits in a small display case. A better-looking piglet, Deng says apologetically, had been misplaced.

BGI’s footprint is expanding. It recently received approval from the U.S. government to acquire its biggest competitor, Mountain View (Calif.)-based Complete Genomics, which also provides commercial DNA sequencing. The go-ahead for the $117 million deal came after a counterbid and regulatory challenge from San Diego-based Illumina. The majority of BGI’s sequencing machines, at the moment, are purchased from Illumina, and the acquisition of Complete Genomics could give BGI a new source of technology. The complaint of Illumina’s chief executive officer, Jay Flatley, however, is about national security. In a memo addressed to Complete Genomics, Flatley warned the deal would give BGI access to American DNA, possibly posing “national security, industrial policy, personal identifier information protection, and other concerns.”

Complete Genomics will extend BGI’s reach, not only in terms of customers and sequencing power, but also in terms of data storage. Complete Genomics has established its own database of genetic information, complementing BGI’s efforts to build a cloud computing platform capable of holding large amounts of genomic data.

Even as BGI improves its technology, its biggest strength remains all those cheap, highly educated analysts. The amount of data available on the genome has outstripped the ability to analyze it. Laboratories around the world are in need of more experienced and reliable bioinformatics experts—people such as Liu.

“It’s the Wild West,” says George Church, a professor of genetics at Harvard University and an adviser to BGI. “This is a field that has arisen overnight, and the number of discoveries is going up exponentially.” A single genome contains a massive amount of data (a human genome, for example, contains about 3 billion nucleotides, or data points), and a bioinformatics expert’s work requires sifting through, comparing, and testing the information in multiple genomes. While sequencing costs have dropped dramatically in the last 10 years, the process is far from automated. Companies that offer personalized genetic testing, such as 23andMe, typically test only for a sampling of 100 traits and diseases, or about 1/3,000th of the entire genome, Church says. For about $4,000, BGI does the whole thing.

BGI’s electronic sequencers (11 are in Shenzhen, 77 in Hong Kong, and more than 66 scattered throughout the rest of China and the world) are imposing-looking black-and-white boxes, slightly taller than the technicians who run them. They don’t churn out fully formed genomes; rather, they handle fragments, reading each nucleotide from signals emitted as the machine resynthesizes a template DNA strand. These out-of-order sections of the genome require piecing together. Once assembled, a genome sequence still has to be interpreted to find the source of whatever trait or disease a particular study aims to find. This process, even with a reference genome fully in place, is difficult to hand over to a computer program. “The software basically doesn’t exist yet,” Church says.
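
To give a feel for what "piecing together" out-of-order fragments involves, here is a toy sketch in Python. It is not BGI's software, and the article does not describe the company's actual methods. The function names (`overlap`, `greedy_assemble`) and the three short reads are invented for illustration, and the approach (repeatedly merging the two fragments with the longest matching overlap) is a deliberately simplified stand-in that ignores sequencing errors, repeats, and the fact that fragments can come from either DNA strand.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` letters exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Merge the best-overlapping pair of fragments until none overlap.

    Toy illustration only: hypothetical helper, not a production assembler.
    """
    reads = list(reads)
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge; return the separate pieces
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads


# Three made-up fragments that overlap by three letters each.
fragments = ["ATGCGT", "CGTACC", "ACCTTG"]
print(greedy_assemble(fragments))
```

Even in this miniature version, every fragment is compared with every other fragment on each pass, which hints at why the same job on billions of real fragments takes a great deal of computing power and human judgment.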

BGI’s Shenzhen factory is organized so that a genetic sample travels from floor to floor as it goes through the sequencing process. When a sample first arrives—it usually comes in a test tube—it’s taken to the fourth floor, where workers in different colored coats prepare and expand the genetic material (coat color signifies the kind of DNA being handled). Workers bend over tiny vials, mechanically separating genetic material with a syringe. They’re splitting DNA samples into single strands and will soon put them through a chemical process called polymerase chain reaction, or PCR. This will copy a single DNA fragment about 10 million times. Microscopic chains of beads holding the DNA fragments are then loaded onto a sheet with tiny cups and sent to the sequencers on the fifth floor. When the machines are finished, the information is delivered electronically to the second floor, where Liu, the bioinformatics team leader, works.
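
The figure of about 10 million copies can be related to the number of PCR cycles with some back-of-the-envelope arithmetic. The sketch below assumes the textbook idealization that each cycle doubles the number of copies; the article does not say how many cycles BGI runs or how efficient its reactions are, so the `efficiency` parameter and both function names are assumptions made for illustration.

```python
import math


def copies_after(cycles: int, efficiency: float = 1.0, start: int = 1) -> float:
    """Copies present after `cycles` rounds of PCR.

    efficiency=1.0 is the idealized case in which every cycle doubles
    the number of copies; lower values model imperfect cycles.
    """
    return start * (1 + efficiency) ** cycles


def cycles_needed(target: float, efficiency: float = 1.0, start: int = 1) -> int:
    """Smallest whole number of cycles that reaches at least `target` copies."""
    return math.ceil(math.log(target / start) / math.log(1 + efficiency))


target = 10_000_000
n = cycles_needed(target)
print(n, copies_after(n - 1), copies_after(n))
```

Under perfect doubling, 23 cycles fall just short of 10 million copies (about 8.4 million) and 24 cycles overshoot it (about 16.8 million), so a single fragment would need roughly two dozen cycles to reach the scale the technicians describe.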

In a large, open room, more than 1,000 young scientists sit in cubicles, staring at strings of computer code, piecing together sections of whatever genome they’ve been assigned. Liu’s team is slightly apart from the rest. “You’re looking for variants or parts of the genome that are hard to map,” he says. Computer programs have difficulty identifying a new variation unless a spot on the genome has already been pinpointed and entered into the computer program. Recently, with the orchid, Liu’s team had problems interpreting a certain section. Liu was assembling his species of the plant according to an orchid reference genome, and certain sections of the code were just not lining up the way researchers (and the computer) had expected. Trying to tie these sections to certain orchid traits was proving difficult. Liu calls it a “weird region.”
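
Lining a fragment up against a reference genome and looking for variants can be illustrated with a small Python sketch. It is a toy, not the software Liu's team uses: the function name `best_alignment` and both sequences are made up, and the method (slide the fragment along the reference and keep the position with the fewest mismatched letters) ignores insertions, deletions, and the indexing tricks real tools rely on.

```python
def best_alignment(reference: str, read: str):
    """Find where `read` fits the reference with the fewest mismatches.

    Returns (position, mismatches), where each mismatch is a tuple of
    (reference index, reference letter, read letter). Returns (-1, None)
    if the read is longer than the reference. Hypothetical helper for
    illustration only.
    """
    best_pos, best_mismatches = -1, None
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = [(pos + i, r, q)
                      for i, (r, q) in enumerate(zip(window, read))
                      if r != q]
        if best_mismatches is None or len(mismatches) < len(best_mismatches):
            best_pos, best_mismatches = pos, mismatches
    return best_pos, best_mismatches


reference = "ATGCGTACCTTGAC"   # made-up reference sequence
read = "GTACGTTG"              # made-up fragment with one changed letter
position, variants = best_alignment(reference, read)
print(position, variants)
```

Here the fragment fits at one position with a single differing letter, a candidate variant. In a region like the one Liu describes, no position would give a clean fit, and a person has to work out why.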

“We had to figure out how to analyze this,” he says. “It required us to try different solutions, look through sets of data that could be important, and figure out why we were having trouble mapping that section.” Researchers tried different solutions and found that some of the orchid’s traits were heterozygous—there were two spots on the genome responsible for their development. Weird regions of a genome, Liu says, are the most exciting part of his job.

A decade ago young people arriving in Shenzhen would have hoped to land a job building iPods or sewing jeans, a wholly different career track from that of Liu’s colleagues. “This is the virtue of Shenzhen,” Liu says. “People are all coming from other places, they are here trying to make money or to find some opportunities. We all have the same kind of ambitions.” An opening salary at BGI runs around 3,000 yuan ($481) a month. “It’s not great, but it is competitive,” Deng, the communications officer, says.

Executive Director Wang could easily disappear in the crowd of recent graduates on BGI’s campus if it weren’t for his imposing height. He doesn’t like calling BGI a factory—he’s more interested in creating the feel of a college campus. In addition to encouraging dating, BGI promotes the creation of clubs and the enjoyment of free time. “On weekends we like to climb North Mountain,” Deng says, pointing to a hill in the distance. Wang likes to play basketball, and BGI has an annual tournament. According to Deng, Wang’s team always wins. He has a suspicious number of tall people on his team. “We think he might hire people just for the basketball team,” she says, giggling.

After six o’clock, when most of BGI’s staff is done with work, a basketball court outside the dorm quickly gets crowded. Some of Liu’s colleagues from the bioinformatics division stick around to watch the games. One of the dorm’s oldest residents, Tai Shuaishuai, says he’s just taking a break before heading back to work. “For those of us who always stay in the office, the dorm is more convenient,” he says, smiling through braces. Tai is 31, and his first name translates to “handsome handsome.” Like Liu, he’s been at BGI since 2009, an eternity at the Shenzhen factory, and he heads a team using sequencing to improve what he calls “molecular breeding,” the same process responsible for BGI’s grouper. Tai is also responsible for reviewing potential employees.

“China has a lot of universities, but we prefer candidates from the top universities,” he says. “To be a BGIer means you have to be creative as a scientific researcher, and you have to have team spirit. We take a lot of things into consideration—skills, knowledge, educational background, and working style.” According to Tai, an offer made to a potential employee is rarely turned down.

One reason may be that BGI offers employees the chance to study while working. If Liu hadn’t joined BGI, he says, he probably would have pursued graduate studies somewhere else. “I would not be getting this hands-on experience,” he says. “Working here is basically a Ph.D. program.” Nonetheless, he’s starting at the University of Hong Kong this year in a program that will only require that he leave work for a day or two each week.

Most of the employees on the basketball court seem to be participating in one of BGI’s work-study programs. One group of four says they’re still college students, living at BGI on a full-time internship. “It’s just as comfortable as the dorms in our universities,” one says. “And Shenzhen is a great place to be for young people.” On weeknights, he says, karaoke halls offer a discount.

Around 6:30 it begins to rain, and the BGI basketball court empties. Tai ducks into the entrance of the dorm, where a janitor is mopping under fluorescent lights and BGI employees queue up to buy snacks. A couple of people from the bioinformatics team gather around Tai and talk about their plans. “I would like to go abroad to the U.S.,” says a team leader named Gao Zhibo. “Not to get my Ph.D. but just to improve my language skills and my social skills.” Tai, for his part, is hard-pressed to imagine why anyone would ever leave. “Doing scientific study is my passion,” he says. “It’s my belief that science has no limits.”
