Whiffin N, Karczewski KJ, Zhang X, Chothani S, Smith MJ, Evans DG, Roberts AM, Quaife NM, Schafer S, Rackham O, Alföldi J, O'Donnell-Luria AH, Francioli LC, Genome Aggregation Database Production Team, Genome Aggregation Database Consortium, Cook SA, Barton PJ, MacArthur DG, Ware JS. Characterising the loss-of-function impact of 5' untranslated region variants in 15,708 individuals. Nat Commun. 2020 May 27;11(1):2523. PubMed.
Comments
Institut Pasteur de Lille
The gnomAD, a new experimental model and a potentially useful tool for ND genomics
Several articles recently published in Nature, Nature Medicine, and Nature Communications gave an overview of the potential of the Genome Aggregation Database (gnomAD) for better understanding genetic variation. gnomAD is a resource developed through an international collaboration whose goal is to aggregate and harmonize exome and whole-genome sequencing data from a large number of large-scale sequencing projects, and to offer this invaluable resource to the scientific community.
The V2 release of this database spans 125,748 exome sequences and 15,708 whole-genome sequences from 141,456 unrelated humans sequenced as part of various disease-specific and population genetic studies. Several teams of scientists have mined this database deeply and begun to realize part of its discovery potential: 443,769 high-confidence predicted loss-of-function variants, allowing researchers to classify human protein-coding genes along a spectrum representing tolerance to inactivation (Karczewski et al., 2020); a roadmap for “human knockout” studies that should guide the interpretation of loss-of-function variants in drug development (Minikel et al., 2020); 433,371 structural variations for medical and population genetics (Collins et al., 2020); 1,792,248 multinucleotide variants (Wang et al., 2020); and the characterization of the loss-of-function impact of 5' untranslated region variants (Whiffin et al., 2020).
With these types of analyses, we are entering a phase of deeper understanding of the impact of genetic variation, thanks to long-term quality control of annotations and the largest compendium of DNA sequences. For decades we have played with the Cyrillic alphabet; now we begin to read and understand the Bible in Russian. The main value of such a large database is to identify rare variations that may have a biological effect, and to understand the impact of these variations not only on the protein structure itself but also on its regulatory elements.
Indeed, when we perform genome-wide association studies (GWAS), we identify numerous single-nucleotide polymorphisms that point to a chromosomal region associated with a neurodegenerative disease. However, this is only the beginning of the story, because a position obtained from a statistical test tells us nothing about the function of these susceptibility loci in the pathophysiology of the disease. Once we have accumulated all these susceptibility loci by aggregating more and more studies to increase the statistical power to identify rare variants, gathering the experimental evidence needed to characterize their function may take years. And among the 45 loci you have identified, which one will you analyze experimentally first? We have very powerful high-throughput discovery tools, but we still lack high-throughput functional in silico studies to accelerate the understanding of pathophysiology, the only way to invent new treatments.
These articles offer new hope in this hunt for pathophysiological knowledge, especially in the field of neurodegenerative disease. We can reanalyze our GWAS hits, explore the impact of mutations on nearby gene function through a mutational constraint spectrum, and perhaps validate new therapeutic drug targets. The identification of new structural variations may help us decipher the hidden heritability that captivates so many scientists involved in chronic disease genomic research.
The only way to progress in all these domains, and to face the ever-growing complexity of the biological systems involved in the pathophysiology of ND, is to continue to develop such huge databases, to facilitate public access to summary data, and to implement global collaborations for the greatest benefit of our patients.
References:
Wang Q, Pierce-Hoffman E, Cummings BB, Alföldi J, Francioli LC, Gauthier LD, Hill AJ, O'Donnell-Luria AH, Genome Aggregation Database Production Team, Genome Aggregation Database Consortium, Karczewski KJ, MacArthur DG. Landscape of multi-nucleotide variants in 125,748 human exomes and 15,708 genomes. Nat Commun. 2020 May 27;11(1):2539. PubMed.
View all comments by Philippe Amouyel
National Institute on Aging
The overall collection of gnomAD papers is an important resource for the field. That the underlying data is in the public domain is really important—it’s a common first stop to evaluate whether disease variants are rare and more likely to be pathogenic, so the resource is important in neurodegenerative disease research.
We also know that some of the specific findings are highly robust. For example, Whiffin et al. examine loss-of-function variants in LRRK2, and their findings are similar to the lack of difference in frequency between controls and PD cases that we previously reported (Blauwendraat et al., 2018). Neither study is big enough to say whether this partial loss of LRRK2 is protective against Parkinson’s disease, but both indicate that a 50 percent reduction in expression of this key PD gene is tolerated throughout life.
References:
Blauwendraat C, Reed X, Kia DA, Gan-Or Z, Lesage S, Pihlstrøm L, Guerreiro R, Gibbs JR, Sabir M, Ahmed S, Ding J, Alcalay RN, Hassin-Baer S, Pittman AM, Brooks J, Edsall C, Hernandez DG, Chung SJ, Goldwurm S, Toft M, Schulte C, Bras J, Wood NW, Brice A, Morris HR, Scholz SW, Nalls MA, Singleton AB, Cookson MR, COURAGE-PD (Comprehensive Unbiased Risk Factor Assessment for Genetics and Environment in Parkinson’s Disease) Consortium, the French Parkinson’s Disease Consortium, and the International Parkinson’s Disease Genomics Consortium (IPDGC). Frequency of Loss of Function Variants in LRRK2 in Parkinson Disease. JAMA Neurol. 2018 Nov 1;75(11):1416-1422. PubMed.
View all comments by Mark Cookson
King's College London
GnomAD is an essential resource for anyone with an interest in ALS genetics, particularly those researchers like me who are focused on the search for new, highly penetrant dominant pathogenic mutations in our patient cohorts.
The initial release was of limited value to us, as it incorporated variants from the exomes of about 3,000 ALS patients, but this was soon rectified when the “non-neurological” subset of variants from over 115,000 individuals free of neurological conditions became available. Being able to assess accurately the frequency (or novelty) of patient-derived variants in such a large sample of controls, from a diverse group of populations, is a fantastic aid to selecting and prioritizing candidate mutations and genes.
The very recent release of gnomAD version 3, with whole-genome variants from over 70,000 individuals, means that we will now be able to assess the frequency of intronic and intergenic variants, in regions that have often been neglected in ALS research, with the same accuracy as has previously been applied to the coding portion of the human genome.
View all comments by Simon Topp
University of Brescia
VIB, University of Antwerp, Center for Molecular Neurology
We have also examined an LRRK2 frameshift variant (c.6187_6191delCTCTA; p.L2063fs*) in lymphoblastoid cell lines (LCLs) of an individual affected by amnestic MCI, without Parkinson’s disease (Perrone et al., 2018). The variant is a five-base-pair deletion in LRRK2 that is predicted to cause a frameshift and a premature termination codon after amino acid residue 2063, and it was previously reported in one patient with Parkinson’s disease and two control individuals (Ross et al., 2011).
We analyzed the expression at both protein and transcript levels and showed that this LRRK2 mutation had little effect on transcript levels but seemed to result in a nearly complete protein loss in LCLs of the patient carrier.
We further investigated LRRK2 protein levels in control individuals without any LRRK2 mutations and observed highly variable expression. Some individuals showed near-null LRRK2 expression, comparable to the LRRK2 loss observed in the patient carrier. Our protein expression results differ slightly from those reported by Whiffin et al., though we also concluded that low LRRK2 expression is unlikely to interfere with normal biological processes.
References:
Perrone F, Cacace R, Van Mossevelde S, Van den Bossche T, De Deyn PP, Cras P, Engelborghs S, van der Zee J, Van Broeckhoven C. Genetic screening in early-onset dementia patients with unclear phenotype: relevance for clinical diagnosis. Neurobiol Aging. 2018 Sep;69:292.e7-292.e14. Epub 2018 May 9 PubMed.
Ross OA, Soto-Ortolaza AI, Heckman MG, Aasly JO, Abahuni N, Annesi G, Bacon JA, Bardien S, Bozi M, Brice A, Brighina L, Van Broeckhoven C, Carr J, Chartier-Harlin MC, Dardiotis E, Dickson DW, Diehl NN, Elbaz A, Ferrarese C, Ferraris A, Fiske B, Gibson JM, Gibson R, Hadjigeorgiou GM, Hattori N, Ioannidis JP, Jasinska-Myga B, Jeon BS, Kim YJ, Klein C, Kruger R, Kyratzi E, Lesage S, Lin CH, Lynch T, Maraganore DM, Mellick GD, Mutez E, Nilsson C, Opala G, Park SS, Puschmann A, Quattrone A, Sharma M, Silburn PA, Sohn YH, Stefanis L, Tadic V, Theuns J, Tomiyama H, Uitti RJ, Valente EM, van de Loo S, Vassilatis DK, Vilariño-Güell C, White LR, Wirdefeldt K, Wszolek ZK, Wu RM, Farrer MJ, Genetic Epidemiology Of Parkinson's Disease (GEO-PD) Consortium. Association of LRRK2 exonic variants with susceptibility to Parkinson's disease: a case-control study. Lancet Neurol. 2011 Oct;10(10):898-908. Epub 2011 Aug 30 PubMed.
View all comments by Christine Van Broeckhoven