This is Part 2 of a two-part story. See also Part 1.

20 September 2012. With the advent of inexpensive genotyping technology, genomewide association studies (GWAS) have turned up thousands of point changes in DNA that can alter risk for disease. Most of those “hits” appear in genomic regions that code for no particular gene, leaving researchers puzzled about how they exert their influence. Now, scientists led by John Stamatoyannopoulos, University of Washington, Seattle, provide some clues. The researchers correlated newly catalogued functional regions in the human genome with published GWAS. They found not only that the majority of reported variants fall in regulatory DNA that controls gene expression, but also that they often do so only in cells linked to pathology. That suggests these variants could indirectly influence coding genes that then go on to affect disease. "It basically says we've only been looking at the tip of the iceberg with these GWAS," said Stamatoyannopoulos. The findings, reported in the September 7 Science, come hot on the heels of a slew of related papers in Nature that reveal new insights into functional elements in the human genome (see Part 1). Together, the work could help researchers interpret some seemingly obscure links between genetic polymorphisms and neurodegenerative diseases.

GWAS of diseases, or clinical traits such as plasma cholesterol, look for genomic changes that could explain the phenotype at hand. Some studies reported GWAS hits in regulatory regions of the genome (see, e.g., Pomerantz et al., 2009 and Musunuru et al., 2010), but it was unclear whether the observation was specific to those particular diseases or true in general. "We showed that it is really across the board. Every single disease or trait we looked at shows the same phenomenon," said Stamatoyannopoulos.

To broadly examine regulatory regions, joint first authors Matthew Maurano, Richard Humbert, and Eric Rynes constructed an overall genomic map by analyzing 349 cell and tissue types (including fetal tissues, tumor cells, pluripotent cells, hematopoietic cells, and cultured cells) as part of the Encyclopedia Of DNA Elements Project (ENCODE) and the Roadmap Epigenomics Program (see ARF related news story). Both of these projects probe the finer points of genomic function. Maurano and colleagues cut the DNA into pieces with the nuclease DNase1. This enzyme preferentially snips DNA that has been exposed, such as when regulatory proteins unwind the nucleic acid to activate gene transcription. After the DNA had been cut, the research group analyzed the hundreds of millions of DNA fragments to pinpoint the DNase1-hypersensitive sites (DHSs).

Almost four million DHSs pervade the genome, the team found. On average, about 200,000 DHSs were active in any one cell type. Each type of cell had a unique DHS pattern, depending on which areas of DNA were active. With this DHS map in hand, the team looked at more than 5,500 single-nucleotide polymorphisms (SNPs) found in GWAS of hundreds of diseases and quantitative traits. About 75 percent of these were in or near DHSs, suggesting that a considerable number of non-coding GWAS hits are functional and exert effects on coding genes through regulation, the authors wrote.
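The overlap analysis described above can be pictured with a toy example. The sketch below is not the authors' pipeline; it assumes a single chromosome, made-up coordinates, and a hypothetical helper named fraction_in_dhs, and it counts only SNPs that fall directly inside a DHS (it ignores the "near" cases the study also counted).

```python
from bisect import bisect_right


def fraction_in_dhs(snp_positions, dhs_intervals):
    """Return the fraction of SNP positions lying inside any DHS interval.

    dhs_intervals is a list of (start, end) half-open intervals on one
    chromosome, assumed to be sorted by start and non-overlapping.
    All names and coordinates here are illustrative assumptions.
    """
    if not snp_positions:
        return 0.0
    starts = [start for start, _ in dhs_intervals]
    hits = 0
    for pos in snp_positions:
        # Find the last interval whose start is <= pos, then check its end.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < dhs_intervals[i][1]:
            hits += 1
    return hits / len(snp_positions)


# Made-up DHS intervals and GWAS SNP positions, for illustration only.
toy_dhs = [(100, 250), (400, 520), (900, 1000)]
toy_snps = [120, 300, 450, 999]
print(fraction_in_dhs(toy_snps, toy_dhs))  # 3 of 4 toy SNPs overlap a DHS
```

A real analysis would repeat this per chromosome and per cell type, which is how cell-specific patterns would emerge.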

Probing further, the authors found that specific SNP-containing genome regions often sported DHSs only in disease-relevant cells. As an example, a DHS popped up in a fetal heart cell (but not a brain cell) around a coronary heart disease-associated mutation. In a few hundred cases, DHSs harboring GWAS-related variants seemed to control distant genes, up to 500 kilobases—that is, several genes—away. In addition, seemingly unrelated SNPs that had previously been linked to individual diseases within a family of disorders (such as autoimmune disease) often struck elements that were recognized by the same transcription factor. "[That correlation] allows you to construct relationships among diseases in ways that nobody had previously anticipated," said Stamatoyannopoulos.

About 88 percent of non-coding SNPs lay in DHSs active in early fetal development. Most of the diseases or traits tied to these SNPs are thought to start in the womb or are influenced by development. The remaining SNPs, including some related to Alzheimer's disease, breast cancer, and lupus, occurred in DHSs found only in adult tissues. The majority of those diseases and traits have not previously been associated with any early causation. "This would suggest that pathology likely begins in the adult stage," said Stamatoyannopoulos. Researchers studying AD debate how early in life the disease begins to take hold. Though the researchers possess scant data so far on DHSs encompassing AD-linked SNPs, Stamatoyannopoulos hopes to sample adult brains in the coming year.

"These results are potentially very important to people carrying out GWAS in neurodegenerative disorders," wrote Peter Holmans, Cardiff University School of Medicine, U.K., in an e-mail to Alzforum. Not only will these findings help weed out true GWAS signals, but "they will have profound implications for pathway analyses, both on the way that GWAS SNPs are assigned to genes and how genes are grouped into pathways."

The study adds support to an idea that scientists have held for some years: that several genes, each imparting a subtle effect on biology, may together cause a disease, rather than one mutation having a profound effect on its own, said Julie Williams, also at Cardiff University. "It also tells us that there's yet another layer of complexity, that [DNA] regions may actually affect genes that are some distance away, not necessarily the most proximal gene."

If this study can be validated, researchers may look specifically for mutations in the regulatory DNA in GWAS, increasing statistical power and the ability to detect disease-associated variants, wrote Christiane Reitz, Columbia University, New York, to Alzforum in an e-mail (see full comment below).
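The power argument rests on multiple-testing correction: the fewer variants tested, the less stringent the per-variant significance threshold needs to be. A minimal sketch, assuming a simple Bonferroni correction and purely hypothetical test counts (neither the function name nor the numbers come from the study):

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical numbers: one million tests genome-wide versus a quarter of
# that if only variants in regulatory DNA were examined.
genome_wide = bonferroni_threshold(0.05, 1_000_000)
regulatory_only = bonferroni_threshold(0.05, 250_000)
print(genome_wide, regulatory_only)
```

Under these assumed counts, a variant would need a p-value four times less extreme to pass the regulatory-only threshold, which is where the gain in power would come from.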

In addition to studying the human brain, the Seattle group will sample other tissues associated with human disease, expand the range of GWAS hits they explore, and continue to improve the technology. "The ultimate goal of this line of research is to have a complete map of the regulatory circuitry of the human genome," Stamatoyannopoulos said. With such a map, researchers could understand how the genome controls everything from normal developmental processes to disease.—Gwyneth Dickey Zakaib




Comments on this content

  1. If these findings can be validated by additional studies, they will have a significant impact on the analysis and interpretation of GWAS data. GWAS commonly identify genes whose function is, at first sight, hard to link to the phenotype studied. The connection of numerous DNase-hypersensitive sites harboring GWAS SNPs with promoters of distant genes, as suggested by this study, offers a plausible explanation for these observed associations. In addition, the work suggests that seemingly unconnected variants share common transcription factor networks. If both findings turn out to be true, they would greatly help us to understand and disentangle the specific disease studied, since the authors show how apparently unrelated genes can be linked to disease.

    These findings also help us understand the links among different diseases, because they provide a framework for elucidating whether diseases share a common transcription factor network and thus may be mechanistically related.

    Going forward, association studies could be performed differently than they are at present, namely by examining hits only in regulatory DNA. This would increase statistical power (and thus the ability to detect disease-associated variants), as it would reduce the statistical correction needed for the multiple testing that genomewide studies require.


News Citations

  1. ENCODE Turns Human Genome From Sequence to Machine
  2. Bethesda: "Ome" Sweet "Ome"—Epigenome Joins Genome, Proteome

Paper Citations

  1. Pomerantz et al. The 8q24 cancer risk variant rs6983267 shows long-range interaction with MYC in colorectal cancer. Nat Genet. 2009 Aug;41(8):882-4. PubMed.
  2. Musunuru et al. From noncoding variant to phenotype via SORT1 at the 1p13 cholesterol locus. Nature. 2010 Aug 5;466(7307):714-9. PubMed.

External Citations

  1. Encyclopedia Of DNA Elements Project
  2. Roadmap Epigenomics Program


Primary Papers

  1. Maurano et al. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012 Sep 7;337(6099):1190-5. PubMed.
  2. Genetics. A GPS for navigating DNA. Science. 2012 Sep 7;337(6099):1179-80. PubMed.