Epigenomic Roadmap Points to Causal Genes
Genetic-association studies link loci—certain addresses in a person’s DNA—to disease, but finding the causal genes within those regions is anything but straightforward. In the October 26 Nature Genetics, researchers led by Thomas Montine and Howard Chang at Stanford University in Palo Alto, California, lay out a roadmap for how to identify single-nucleotide polymorphisms that are likely to affect gene expression. By analyzing the structure of chromatin, they homed in on GWAS SNPs that disrupted transcription-factor binding. This approach enabled them to nominate, with high confidence, candidate functional genes for 13 Alzheimer’s and 17 Parkinson’s loci. In some cases, these were new genes not previously associated with the disease. More broadly, the chromatin map could aid researchers studying other neurological diseases, the authors suggested.
- A new map of chromatin structure in healthy brain identifies regulatory regions.
- SNPs that disrupt transcription-factor binding in these regions are likely to be functional.
- This approach nominated causal variants and genes for several AD and PD GWAS hits.
Others agreed. “This work is well-performed, interesting, and generates a large amount of data that will be very useful to the scientific community to make sense of GWAS data,” Jean-Charles Lambert at the Institut Pasteur de Lille in France wrote (full comment below). Alice Chen-Plotkin at the University of Pennsylvania, Philadelphia, told Alzforum, “They’ve provided a molecular atlas of normal human brain that will serve as a starting point for finding causal variants and target genes.”
In recent years, researchers have sought to decipher noncoding genetic variants by examining their ties to genetic regulation in specific cell types. This has linked many AD risk loci to microglia (Aug 2019 news; Nov 2019 news).
Got That? By comparing GWAS hits (blue line) and linked SNPs (gray lines) to regions of open chromatin (green peaks), researchers identify possible functional variants (red lines). Those that disrupt transcription factor binding (thick shaded bar) and interact with downstream genes (thin shaded bars) become lead candidates. [Courtesy of Corces et al., Nature Genetics.]
Montine and colleagues took a similar approach, first mapping chromatin structure in the human brain to find regulatory regions. First author Ryan Corces analyzed postmortem tissue from the hippocampi, substantia nigras, striata, and cortices of 39 healthy controls using Assay for Transposase-Accessible Chromatin sequencing (ATAC-Seq). In this technique, an enzyme binds to regions of open chromatin, snips the DNA strands, and adds a tag. By sequencing these tagged DNA fragments, researchers identify regions that are accessible to transcription factors and thus likely to contain active enhancers and promoters. In addition to using bulk tissue, the authors also performed ATAC-Seq on 70,631 single nuclei isolated from the four brain regions. This uncovered additional regions of open chromatin present only in specific cell types.
Armed with this chromatin map, Corces and colleagues used a three-step process to find functional disease variants. First, they overlapped their map with GWAS data for 6,496 SNPs across 86 loci associated with PD, and 3,245 SNPs across 44 loci associated with AD. Of these, 1,175 SNPs were located within regions of open chromatin, suggesting they might fall into an enhancer or promoter sequence.
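The overlap in this first step amounts to asking whether each SNP position falls inside an accessible-chromatin interval. The following is a minimal sketch of that idea, not the authors' pipeline. The function name, the toy coordinates, and the SNP labels are all hypothetical, and the peaks are assumed to be non-overlapping, half-open intervals.

```python
from bisect import bisect_right


def snps_in_open_chromatin(snps, peaks):
    """Return IDs of SNPs whose position lies inside any peak.

    snps:  list of (snp_id, chrom, pos) tuples
    peaks: list of (chrom, start, end) half-open intervals,
           assumed not to overlap one another (an assumption of this sketch)
    """
    # Build per-chromosome sorted lists of peak starts and ends.
    index = {}
    for chrom, start, end in sorted(peaks):
        starts, ends = index.setdefault(chrom, ([], []))
        starts.append(start)
        ends.append(end)

    hits = []
    for snp_id, chrom, pos in snps:
        starts, ends = index.get(chrom, ([], []))
        # Find the last peak that starts at or before the SNP.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < ends[i]:
            hits.append(snp_id)
    return hits


# Toy data; coordinates and labels are invented for illustration only.
peaks = [("chr1", 100, 200), ("chr1", 500, 650), ("chr2", 50, 80)]
snps = [
    ("snp_a", "chr1", 150),  # inside the first chr1 peak
    ("snp_b", "chr1", 300),  # between peaks
    ("snp_c", "chr2", 79),   # last base of the chr2 peak
    ("snp_d", "chr2", 80),   # just past the half-open end
    ("snp_e", "chr3", 10),   # chromosome with no peaks
]
print(snps_in_open_chromatin(snps, peaks))  # ['snp_a', 'snp_c']
```

Sorting the peaks once and using binary search keeps each lookup fast, which matters when thousands of SNPs are checked against hundreds of thousands of peaks.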
Enhancers are often located far from the genes they influence, and the DNA strand has to loop to make the two meet. In the second step, the authors employed High-throughput Chromosome Conformation Capture (Hi-C) to pinpoint enhancers and their target genes. This technique cross-links nearby pieces of DNA, immobilizing interacting regions. The Hi-C data further whittled down the pool of SNPs, finding 516 PD and 433 AD variants located in enhancers that contacted downstream genes.
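Conceptually, the second step treats each chromatin contact as a pair of anchored intervals: if a SNP sits in one anchor and a gene promoter sits in the other, that gene becomes a candidate target. The sketch below illustrates this under stated assumptions. The loop list, promoter table, gene names, and function names are hypothetical, and real contact data would first need statistical filtering that is not shown here.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def target_genes(snp, loops, promoters):
    """Return genes whose promoter is looped to the anchor holding the SNP.

    snp:       (chrom, pos)
    loops:     list of (anchor, anchor) pairs, each anchor (chrom, start, end)
    promoters: dict mapping gene name -> (chrom, start, end)
    """
    chrom, pos = snp
    point = (chrom, pos, pos + 1)
    genes = set()
    for left, right in loops:
        # A loop is symmetric, so test the SNP against both anchors.
        for near, far in ((left, right), (right, left)):
            if overlaps(point, near):
                genes.update(
                    gene for gene, prom in promoters.items()
                    if overlaps(far, prom)
                )
    return sorted(genes)


# Toy data; all coordinates and gene names are invented for illustration.
loops = [
    (("chr11", 1000, 2000), ("chr11", 90000, 91000)),
    (("chr11", 1000, 2000), ("chr11", 150000, 151000)),
]
promoters = {
    "GENE_A": ("chr11", 90500, 90900),
    "GENE_B": ("chr11", 150200, 150700),
    "GENE_C": ("chr11", 300000, 300500),
}
print(target_genes(("chr11", 1500), loops, promoters))  # ['GENE_A', 'GENE_B']
print(target_genes(("chr11", 5000), loops, promoters))  # []
```

In the toy data, one enhancer anchor contacts two promoters, so a single SNP can nominate more than one gene.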
In the third step, the authors used machine learning to determine how many of these 949 variants would actually affect transcription-factor binding. They trained an algorithm to recognize functional enhancers by feeding it enhancer sequences and inactive sequences. It found 174 SNPs spread among 48 loci that were likely to disrupt transcription-factor binding, and thus might skew gene expression. Examining each locus further, the authors pinned down putative causal variants for 30 of these loci.
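The general logic of such allele scoring is to run a sequence model on the reference and the alternate version of the same stretch of DNA and compare the two scores. The article does not describe the model itself, so the sketch below swaps in a much simpler stand-in: a toy position weight matrix for a made-up "GGAA" motif. The matrix values, sequences, and function names are assumptions for illustration and do not represent the trained classifier used in the study.

```python
def column(preferred):
    """One motif position: reward the preferred base, penalize the rest."""
    return {base: (1.0 if base == preferred else -1.0) for base in "ACGT"}


# Hypothetical 4-bp motif; a stand-in for a trained sequence model.
PWM = [column(base) for base in "GGAA"]


def best_motif_score(seq, pwm):
    """Highest motif score over all windows of the sequence."""
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def allele_effect(ref_seq, alt_seq, pwm):
    """Alternate minus reference score; negative means weaker predicted binding."""
    return best_motif_score(alt_seq, pwm) - best_motif_score(ref_seq, pwm)


# Invented sequences differing at one base (G -> C) inside the motif.
ref = "TCGGAAGT"
alt = "TCGCAAGT"
print(best_motif_score(ref, PWM))    # 4.0
print(best_motif_score(alt, PWM))    # 2.0
print(allele_effect(ref, alt, PWM))  # -2.0
```

A variant would be flagged as potentially disruptive when the score difference is large relative to what is seen for background variants. Choosing that threshold is a separate modeling decision not covered here.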
Then ATAC This. Single-cell ATAC-Seq fingers an SNP, rs1237999, that disrupts a PICALM enhancer only in oligodendrocytes (purple). Mapping chromatin interactions (HiChIP, scATAC co-accessibility, and PLAC-Seq) suggests the same enhancer modulates expression of the transcriptional repressor EED. [Courtesy of Corces et al., 2020 Nature Genetics.]
The data turned up some surprises. For example, the AD gene PICALM has already been linked to cell-specific functions in neurons, astrocytes, and microglia (Oct 2020 news). The authors fingered SNP rs1237999, which disrupted a transcription-factor binding site that regulates PICALM only in oligodendrocytes. This hints that this risk variant might work its mischief through these cells. Curiously, the same enhancer also interacted with another gene, the transcriptional repressor EED, implicating an additional gene in the disease.
For BIN1, previously linked to the functional variant rs6733839 in a microglial enhancer, the authors uncovered a second functional variant, rs13025717, in a different microglial enhancer. Although the enhancers bind different transcription factors, both regulate BIN1 expression. Both SNPs have been shown to affect BIN1 protein levels, suggesting both are functional (Novikova et al., 2019).
In some cases, the SNPs affecting transcription-factor binding pointed to a different gene than expected. In the SLC24A4 locus linked to AD, the analysis singled out rs10130373, located in a microglia-specific enhancer that binds the transcription factor SPI1. This enhancer modulates expression of the nearby endocytic gene RIN3, but not SLC24A4. Likewise, in PD, analysis of the ITIH1 locus homed in on rs181391313, located in an intron of the gene for the transmembrane receptor STAB1. The SNP disrupts a binding site for the transcription factor KLF4. The intron regulates STAB1 expression, but only in microglia. STAB1 aids endocytosis, dovetailing with other data highlighting the importance of microglial phagocytosis in neurodegenerative disease.
Chen-Plotkin liked the multistep approach to tracking down functional variants. “They’re layering all these levels together, and it’s almost like a detective story,” she told Alzforum. She believes the data lay the groundwork for future mechanistic experiments to validate these candidate genes.
To that end, the authors have already begun to examine the effects of these transcriptional modulator SNPs on disease markers using iPS-derived cell cultures. Corces said that understanding the mechanisms behind GWAS associations is the ultimate goal. “The real promise of this approach is to identify new genes and pathways to target therapeutically,” he said.—Madolyn Bowman Rogers
References
News Citations
- AD Genetic Risk Tied to Changes in Microglial Gene Expression
- Cell-Specific Enhancer Atlas Centers AD Risk in Microglia. Again.
- In Astrocytes, ApoE4 Bungles Endocytosis, PICALM Picks Up the Slack
Paper Citations
- Novikova G, Kapoor M, TCW J, Abud EM, Efthymiou AG, Cheng H, Fullard JF, Bendl J, Roussos P, Poon WW, Hao K, Marcora E, Goate AM. Integration of Alzheimer’s disease genetics and myeloid cell genomics identifies novel causal variants, regulatory elements, genes and pathways. bioRxiv. 2019 Jul 6.
Further Reading
Primary Papers
- Corces MR, Shcherbina A, Kundu S, Gloudemans MJ, Frésard L, Granja JM, Louie BH, Eulalio T, Shams S, Bagdatli ST, Mumbach MR, Liu B, Montine KS, Greenleaf WJ, Kundaje A, Montgomery SB, Chang HY, Montine TJ. Single-cell epigenomic analyses implicate candidate causal variants at inherited risk loci for Alzheimer's and Parkinson's diseases. Nat Genet. 2020 Nov;52(11):1158-1168. Epub 2020 Oct 26.
Comments
Jean-Charles Lambert, Institut Pasteur de Lille, INSERM
This work has several objectives: (i) to establish a systematic mapping of transposase-accessible chromatin (ATAC) regions in seven brain regions from 39 healthy individuals; (ii) from some of the samples used in the previous bulk assay, to perform a similar ATAC analysis at the cell-type level by incorporating HiChIP analyses in these samples (chromatin immunoprecipitation for H3K27ac); (iii) to use these data to develop a pipeline to prioritize the functional variants responsible for GWAS signals in noncoding regions.
This work is well-performed, really interesting, and generates a large amount of data that will be very useful to the scientific community as it works to make sense of GWAS data.
The pipeline has been tested on Alzheimer's and Parkinson's data, but could obviously be extended to other brain diseases or phenotypic traits. Moreover, this work could also make it possible to elucidate why a gene can be involved in two different pathologies through differential patterns of expression in different cells that, for example, depend on different epigenetic characteristics.
Of course, it is necessary to keep in mind that these omic approaches can systematically generate false-positive and false-negative results (even if the authors took some precautions to control for them). This study also did not interrogate potential disease-specific epigenetic characteristics. In addition, no biological validation has been performed, and the final functional prioritization is mainly based on in silico and statistical approaches. As a consequence, these data have to be used with some caution, keeping in mind that they cannot be considered fully exhaustive or fully biologically validated.
In addition, the selection of the SNPs of interest in AD and PD led to the potential exclusion of secondary signals in the loci analyzed. Such fine mapping would have been very useful to reinforce the implication of some risk factors in specific cellular types. However, it is difficult to assess whether the machine-learning approach used in this paper would have been able to handle such complexity at the genetic level.