This is Part 2 of a two-part story. See also Part 1.
20 September 2012. With the advent of inexpensive genotyping technology, genomewide association studies (GWAS) have turned up thousands of point changes in DNA that can alter risk for disease. Most of those “hits” appear in genomic regions that do not code for any gene, leaving researchers puzzled about how they exert their influence. Now, scientists led by John Stamatoyannopoulos, University of Washington, Seattle, provide some clues. The researchers correlated newly catalogued functional regions in the human genome with published GWAS. They found not only that the majority of reported variants fall in regulatory DNA that controls gene expression, but also that this regulatory DNA is often active only in cells linked to the pathology. That suggests these variants could indirectly influence coding genes that then go on to affect disease. "It basically says we've only been looking at the tip of the iceberg with these GWAS," said Stamatoyannopoulos. The findings, reported in the September 7 Science, come hot on the heels of a slew of related papers in Nature, revealing new insights into functional elements in the human genome (see Part 1). Together, the work could help researchers interpret some seemingly obscure links between genetic polymorphisms and neurodegenerative diseases.
GWAS of diseases, or clinical traits such as plasma cholesterol, look for genomic changes that could explain the phenotype at hand. Some studies reported GWAS hits in regulatory regions of the genome (see, e.g., Pomerantz et al., 2009 and Musunuru et al., 2010), but it was unclear whether the observation was specific to those particular diseases or true in general. "We showed that it is really across the board. Every single disease or trait we looked at shows the same phenomenon," said
To broadly examine regulatory regions, joint first authors Matthew Maurano, Richard Humbert, and Eric Rynes constructed an overall genomic map by analyzing 349 cell and tissue types (including fetal tissues, tumor cells, pluripotent cells, hematopoietic cells, and cultured cells) as part of the Encyclopedia of DNA Elements (ENCODE) project and the Roadmap Epigenomics Program (see ARF related news story). Both of these projects probe the finer points of genomic function. Maurano and colleagues cut the DNA into pieces with the nuclease DNase1. This enzyme preferentially snips DNA that has been exposed, such as when regulatory proteins unwind the nucleic acid to activate gene transcription. After the DNA had been cut, the research group analyzed the hundreds of millions of DNA fragments to pinpoint the DNase1-hypersensitive sites (DHSs).
Almost four million DHSs pervade the genome, the team found. On average, about 200,000 DHSs were active in any one cell. Each type of cell had a unique DHS pattern, depending on which areas of DNA were active. With this DHS map in hand, the team looked at more than 5,500 single-nucleotide polymorphisms (SNPs) found in GWAS of hundreds of diseases and quantitative traits. About 75 percent of these were in or near DHSs, suggesting that a considerable number of non-coding GWAS hits are functional and exert effects on coding genes through regulation, the authors wrote.
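The overlap tally described above can be illustrated with a minimal sketch. This is not the authors' pipeline; it only assumes that DHSs are available as half-open genomic intervals and GWAS SNPs as single positions. The function names, the `flank` parameter (a stand-in for "near"), and the toy coordinates are all hypothetical.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def build_index(dhs, flank=0):
    """Group DHS intervals by chromosome, optionally padding each by `flank` bases.

    `dhs` is an iterable of (chrom, start, end) with half-open coordinates.
    `flank` is a hypothetical way to model SNPs that lie "near" a DHS.
    """
    by_chrom = {}
    for chrom, start, end in dhs:
        by_chrom.setdefault(chrom, []).append((max(0, start - flank), end + flank))
    return {chrom: merge_intervals(ivs) for chrom, ivs in by_chrom.items()}


def in_dhs(index, chrom, pos):
    """Return True if position `pos` on `chrom` falls inside any indexed interval."""
    intervals = index.get(chrom, [])
    # Locate the last interval whose start is <= pos, then check its end.
    i = bisect_right(intervals, (pos, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos < intervals[i][1]


def fraction_in_dhs(snps, index):
    """Fraction of (chrom, pos) SNPs that fall inside the indexed intervals."""
    if not snps:
        return 0.0
    hits = sum(in_dhs(index, chrom, pos) for chrom, pos in snps)
    return hits / len(snps)


# Toy data, invented purely for illustration.
toy_dhs = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
toy_snps = [("chr1", 120), ("chr1", 299), ("chr1", 300), ("chr2", 90), ("chr3", 10)]

print(fraction_in_dhs(toy_snps, build_index(toy_dhs)))
print(fraction_in_dhs(toy_snps, build_index(toy_dhs, flank=10)))
```

Merging the intervals first keeps each lookup to a single binary search, which matters when millions of sites are tested against thousands of SNPs. Running the same tally separately on each cell type's DHS set would be one way to look for the tissue-specific patterns the study goes on to describe.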
Probing further, the authors found that specific SNP-containing genome regions often sported DHSs only in disease-relevant cells. As an example, a DHS popped up in a fetal heart cell (but not a brain cell) around a coronary heart disease-associated mutation. In a few hundred cases, DHSs harboring GWAS-related variants seemed to control distant genes, up to 500 kilobases—that is, several genes—away. In addition, seemingly unrelated SNPs that had previously been linked to individual diseases within a family of disorders (such as autoimmune disease) often struck elements that were recognized by the same transcription factor. "[That correlation] allows you to construct relationships among diseases in ways that nobody had previously anticipated," said Stamatoyannopoulos.
About 88 percent of non-coding SNPs lay in DHSs active in early fetal development. Most of the diseases or traits tied to these SNPs are thought to start in the womb or are influenced by development. The remaining SNPs, including some related to Alzheimer's disease, breast cancer, and lupus, occurred in DHSs found only in adult tissues. The majority of those diseases and traits have not previously been associated with any early causation. "This would suggest that pathology likely begins in the adult stage," said Stamatoyannopoulos. Researchers studying AD debate how early in life the disease begins to take hold. Though the researchers possess scant data so far on DHSs encompassing AD-linked SNPs, Stamatoyannopoulos hopes to sample adult brains in the coming year.
"These results are potentially very important to people carrying out GWAS in neurodegenerative disorders," wrote Peter Holmans, Cardiff University School of Medicine, U.K., in an e-mail to Alzforum. Not only will these findings help weed out true GWAS signals, but "they will have profound implications for pathway analyses, both on the way that GWAS SNPs are assigned to genes and how genes are grouped into pathways."
The study lends support to an idea that scientists have held for some years: that several genes, each imparting a subtle effect on biology, may together cause a disease, rather than one mutation having a profound effect on its own, said Julie Williams, also at Cardiff University. "It also tells us that there's yet another layer of complexity, that [DNA] regions may actually affect genes that are some distance away, not necessarily the most proximal gene."
If this study can be validated, researchers may look specifically for mutations in the regulatory DNA in GWAS, increasing statistical power and the ability to detect disease-associated variants, wrote Christiane Reitz, Columbia University, New York, to Alzforum in an e-mail.
In addition to studying the human brain, the Seattle group will sample other tissues associated with human disease, expand the range of GWAS hits they explore, and continue to improve the technology. "The ultimate goal of this line of research is to have a complete map of the regulatory circuitry of the human genome," Stamatoyannopoulos said. With such a map, researchers could understand how the genome controls everything from normal developmental processes to disease.—Gwyneth Dickey Zakaib.
Maurano MT, Humbert R, Rynes E, Thurman RE, Haugen E, Wang H, Reynolds AP, Sandstrom R, Qu H, Brody J, Shafer A, Neri F, Lee K, Kutyavin T, Stehling-Sun S, Johnson AK, Canfield TK, Giste E, Diegel M, Bates D, Hansen RS, Neph S, Sabo PJ, Heimfeld S, Raubitschek A, Ziegler S, Cotsapas C, Sotoodehnia N, Glass I, Sunyaev SR, Kaul R, Stamatoyannopoulos JA. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012 Sep 7;337(6099):1190-5.
Schadt E, Chang R. A GPS for navigating DNA. Science. 2012 Sep 7;337(6099):1179-80.