What is a pangenome? Scientists just published the first draft.


When the Human Genome Project was launched in 1990, it was hailed as one of the greatest scientific endeavors of all time. The 13-year project identified around 20,000 genes and gave researchers a genetic blueprint to transform modern medicine. Doctors can now use genetic information to better diagnose debilitating diseases and conditions, such as linking a rare case of leg pain to a single mutation. The research also gave rise to hopes for an era of precision medicine, in which each treatment would be tailored to the individual. There was only one problem: the job wasn’t really done.

Humans are 99.9 percent genetically identical, but the remaining 0.1 percent of differences explains our uniqueness and may also explain why some people are more susceptible to disease. A single genome map, which is what the 1990s-era project produced, does not adequately represent the breadth of the human population.

An international study published today in Nature is filling these gaps by analyzing a much more diverse set of genetic sequences. “We are restructuring the foundation of genomics to create a diverse and inclusive representation of human variation as the fundamental reference structure,” says lead study author Benedict Paten, associate director of the Institute for Genomics at the University of California, Santa Cruz.

By removing bias and analyzing more inclusive genomic data, geneticists will better understand how mutations affect a person’s genes, moving us closer to a future with equitable healthcare.

What is a pangenome?

The research focused on creating a pangenome, a collection of DNA sequences within a single species. Previous work focused on a reference genome, built from just a few individuals, which was supposed to represent a larger set of genes. A pangenome, on the other hand, is created from various people around the world to more accurately reflect our genetic diversity.

It’s not that geneticists of the past didn’t want to sequence more genetic variations; they just couldn’t. Erich Jarvis, a professor of genetics at Rockefeller University and the Howard Hughes Medical Institute and a co-author of the study, says that technology in the 1990s and early 2000s did not allow researchers to see large variations between haplotypes (groups of genes inherited together from one parent) within and between individuals.

The focus of a pangenome is to study genetic differences between individuals around the world. Jarvis says that knowing genomic variations is important because some mutations are associated with different traits and diseases. For example, the gene for lipoprotein(a) has a complex structure that has not been sequenced in humans. But variations in the gene are known to be associated with increased risk of heart disease among Black people. By sequencing the entire gene and understanding its variations, doctors can re-examine and treat previously unexplained cases of coronary artery disease.

“This article helps us understand that DNA [is] more than a sequence of letters; DNA is structurally organized, and human variation in that structure is important for genomic function and trait diversity,” says Sarah Fong, a postdoctoral fellow studying human population variation at the University of California, San Francisco, who did not participate in the study.

What does the first draft reveal?

The authors collected data on 47 genetically diverse individuals. About half came from Africa, and the rest represented four other continents (excluding Australia and Antarctica). The genomic data added information about 119 million base pairs and 1,115 duplications, mutations in which a piece of DNA in a gene is repeated. As expected, more than 99 percent of the gene sequences were similar between individuals. But by including the remaining variation, less than one percent, in this new draft of the pangenome, the authors found that structural changes in the genes accounted for 90 million of the base pairs identified.

“By moving beyond a single, arbitrary, linear representation of the genome, the work of the Pangenome Reference Consortium more accurately describes the diversity that exists in our species,” says Rajiv McCoy, an assistant professor of biology at Johns Hopkins University who was not involved in the current study but recently took part in the first complete sequencing of the human genome.

With the latest pangenome model, it may be easier for geneticists to detect and characterize hard-to-find genetic mutations. When the authors analyzed a separate set of genetic information using the draft pangenome as a reference, they detected 104 percent more structural variants. They also improved sequence comparison accuracy, reducing the variant error rate by 34 percent.

Still a work in progress

Creating the first draft of the pangenome is only the first phase of this two-part project. The second phase will take a couple of years, as the authors build collaborations with other international researchers and conduct community outreach in areas where there is less genomic data, including among members of Indigenous cultures.

It could be decades before we see a finalized, complete picture of the human genome. There are several challenges to address, Fong says, such as developing an efficient strategy for comparing multiple human genomes and a concrete plan for testing genetic variations in the medical field.

Still, Fong says the benefits will be worth it. Having complete and diverse human genomes will advance the way genetics is studied and create a future where people’s genes are more fully taken into account when treating disease.


