Expanded repetitive sequences are infamous for causing inherited disease. Think poly-Q tracts that cause Huntington’s or the C9orf72 hexanucleotide run-ons that trigger ALS/FTD. However, given people have an estimated 1 million short tandem repeat sequences scattered throughout their genome, these infamous examples are likely the tip of the iceberg. Because these pesky repeats evade sequencing, they represent a potentially under-recognized source of genetic variability. Now, using new approaches to detect them, two new studies implicate expanded repeats in AD.

  • AD risk tied to the number of short repeat sequences scattered across a person’s genome.
  • Having these in more than 30 loci triples a person’s odds of getting AD.
  • An intronic repeat in CASP8 yields dipeptide repeat aggregates in some people with AD.
  • This doubles the chances of having AD.

One was led by Michael Guo and Jennifer Phillips-Cremins at the University of Pennsylvania in Philadelphia and published January 28 in Nature Communications. It tied the total polygenic burden of expanded short tandem repeat sequences to elevated AD risk. People carrying more than 30 different expansions had triple the risk of AD. Many of these run-on repeats landed within active promoters near genes involved in neuronal function, suggesting they could skew expression of neuronal genes.

The second was led by Lien Nguyen and Laura Ranum of the University of Florida and published February 10 in Proceedings of the National Academy of Sciences. They zeroed in on a specific repeat sequence identified within an intron of the CASP8 gene, and found it led to production of poly-glycine-arginine repeat peptides. Aggregates from such dipeptide repeats are already known to associate with C9orf72 ALS/FTD, which carries a similar expansion, but they also appear in many brain samples from people with AD. At least some of them can now be explained by repeats found in CASP8, the scientists reported. People with this intronic repeat have double the risk of AD.

To Ranum and Nguyen, the findings of both studies dovetail nicely, adding to a growing appreciation for the contribution of repeat expansions to human diseases. “While repeat expansions were initially discovered in rare diseases, both papers now strongly link microsatellite repeats to [sporadic] Alzheimer’s disease,” they wrote to Alzforum.

The contribution of repeat sequences to disease risk has been difficult to pin down, because the repetitive sequences escape detection by traditional short-read sequencing methods. Long-read sequencing picks them up but is too costly to run on large numbers of samples across the entire genome. For this reason, how repeat sequences skew disease risk remains largely unexplored.

“To some extent, repeat expansions have been the dark matter of human disease,” commented John Hardy of University College London. 

To take a stab at short tandem repeat expansions in AD, the UPenn scientists deployed ExpansionHunter and GangSTR, aptly named computational tools that detect repeat sequences within existing short-read sequencing data. Defined as DNA sequences composed of repeating motifs two to six base pairs long, short tandem repeats (STRs) exist throughout the genome, and their tract lengths vary from person to person. Guo and colleagues hunted for repeat sequences in DNA from blood samples from 2,981 people, including 1,489 AD cases and 1,492 controls, in the AD Sequencing Project cohort. They detected a total of 321,731 polymorphic STRs. Fewer than 1 percent of these were in protein-coding regions. About 47 percent sat within introns, 36 percent in regions far from genes, and 14 percent in promoters.
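As a rough illustration of what a short tandem repeat looks like in sequence data, the sketch below scans a DNA string for motifs two to six bases long that recur back to back. It is a toy and does not reproduce how ExpansionHunter or GangSTR work, since those tools estimate repeat lengths from short-read sequencing data. The function name `find_strs`, the `min_copies` cutoff, and the example sequence are all assumptions made for this illustration.

```python
import re

def find_strs(seq, min_motif=2, max_motif=6, min_copies=3):
    """Return (start, motif, copies) for tandem repeats of 2-6 bp motifs.

    Hypothetical helper for illustration only; min_copies is an arbitrary cutoff.
    """
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-mer followed by at least (min_copies - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated, e.g. "TT" or "CAGCAG".
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits

# Made-up sequence: four copies of GGGAGA and five copies of CAG.
example_seq = "TTT" + "GGGAGA" * 4 + "CC" + "CAG" * 5 + "T"
example_hits = sorted(find_strs(example_seq))
print(example_hits)
```

In this toy input the scan reports the GGGAGA tract with four copies and the CAG tract with five. Real STR genotyping has to cope with sequencing errors, interrupted repeats, and tracts longer than a single read, none of which is handled here.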

How did the number and length of STRs compare between AD cases and controls? Firstly, no single STR associated with AD risk, except for one near ApoE that turned out to be co-inherited with that risk gene. Instead, people with AD tended to carry a higher cumulative burden of longer STR expansions. For example, people with AD harbored three times as many expansions that were more than 20 repeats long. These expansions tended to be rare, arising in a single person in the cohort.

Did the total burden of these expansions push up AD risk? To address this question, the scientists tallied the number of expansions in each person. For this, they defined an expansion as an STR with a repeat length well above average for that STR. People with AD carried an average of just 6.27 of these outlier expansions, while controls carried 5.27. A more striking disease relationship emerged when the scientists stratified by number of expansions. People who carried more than 30 had a 3.69-fold higher risk of AD, while those who carried at least 20 had nearly double the odds. People who carried fewer than 10 seemed to be protected (image below).

Repeats Raise Risk. The more STR expansions a person carried, the higher their odds for AD (left). People with more repeat expansions tended to have worse tau pathology as per Braak stage (right). [Courtesy of Guo et al., Nature Communications, 2025.]

What’s more, among the 1,188 people with neuropathological data in the cohort, the number of expansions tracked with Braak stage, such that people with more than 30 expansions had more severe tau pathology than those with fewer than 10 (image at right). Guo told Alzforum that less than 5 percent of the cohort carried more than 30 expansions.
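The burden tally and the stratified odds described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: the article says only that an expansion was an STR with a repeat length well above average for that locus, so the z-score cutoff used here is a stand-in, and the function names and all numbers in the example are invented for illustration and are not study data.

```python
from statistics import mean, pstdev

def count_outlier_expansions(person_lengths, cohort_lengths, z_cutoff=3.0):
    """Count loci where one person's repeat length is far above the cohort mean.

    z_cutoff is an assumed threshold, not the one used in the study.
    """
    n = 0
    for locus, length in person_lengths.items():
        values = cohort_lengths[locus]
        mu, sd = mean(values), pstdev(values)
        if sd > 0 and (length - mu) / sd > z_cutoff:
            n += 1
    return n

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio from a 2x2 table of high-burden status versus AD status."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

# Toy cohort: repeat lengths per locus across 20 people (invented values).
toy_cohort = {"locusA": [10] * 19 + [40], "locusB": [12, 13] * 10}
toy_person = {"locusA": 40, "locusB": 13}
toy_burden = count_outlier_expansions(toy_person, toy_cohort)

# Toy 2x2 table: 30 of 100 cases and 10 of 100 controls carry a high burden.
toy_or = odds_ratio(30, 70, 10, 90)
print(toy_burden, round(toy_or, 2))
```

A published analysis would also adjust for covariates such as age, sex, ancestry, and APOE genotype, which a raw 2x2 odds ratio does not do.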

Finally, the scientists looked at where these AD-linked STR expansions resided across the genome. Using bioinformatic techniques, they found that the expansions were enriched in active promoters. Expansions also lurked in SINE-VNTR-Alus (SVAs), a type of transposable element that, during evolution, hominids co-opted for use in enhancers and promoters (Wang et al., 2005).

Many of the AD-linked expansions lay within promoters for neuronal genes, including those involved in synaptic function. To the authors, this suggests that by skewing expression of neuronal genes, these expanded STRs might tip the balance toward AD with age.

Guo told Alzforum that these expanded STRs were likely inherited, rather than arising somatically, as can happen in some neurodegenerative diseases. In support of this idea, most STRs existed in one or two copies of discrete length in each person, rather than in a variety of lengths that one might expect if mutations were arising during development. Still, Guo said that future studies will examine expansions found in brain samples rather than blood. These might reveal somatic expansions that arose in individual neurons as the genome becomes increasingly unstable with age, especially in people with AD pathology (Apr 2022 news). Guo is also hunting for expansions in much larger cohorts now, he said, as the cost of running these analyses has dropped considerably.

To Julie Williams, Rebecca Sims, and Rebecca Maloney of Cardiff University, U.K., who study polygenic contributors to AD risk, the study suggests that genetic variation in the form of short tandem repeats may explain a proportion of AD heritability so far not accounted for by single nucleotide associations. “This is an important observation and one which needs further investigation in the field,” they wrote (comment below).

CASP8 Intronic Expansion Pegged in AD
In their study, Nguyen, Ranum and colleagues went hunting for a specific flavor of repeat expansion, namely those that get translated into aggregation-prone dipeptide repeat proteins. The ALS/FTD-causing hexanucleotide repeat expansion within the C9orf72 gene is a well-known example. Though they reside within an intron, these sneaky repeats still manage to be transcribed and, ultimately, translated by a process called repeat-associated non-AUG (RAN) translation, which was discovered in Ranum’s lab (Zu et al., 2011). Curiously, the poly-GR aggregates that arise from this type of repeat have been spotted not only in C9orf72 mutation carriers, but also among noncarriers with AD. This suggests other sources of repeat expansions lurk within the genome, and may relate to AD risk.

Nguyen and colleagues set out to find these culprits. First, they confirmed the presence of poly-GR aggregates in sporadic AD brain samples, where immunohistochemistry detected them in 45 of 80 cases. The inclusions congregated around the nucleus in neurons and glia in the hippocampus, and weren’t found in any of 18 controls. Notably, within the hippocampus, poly-GR inclusions overlapped with tau tangles in some subregions and accumulated apart from them in others. Some cells contained either poly-GR or tau tangles, and others had both. Regions with the highest burden of poly-GR aggregates also had high levels of tau pathology.

To find the poly-GR-encoding repeat expansions within the genome, the scientists devised a CRISPR-dCas9-based technique to fish them out. The method uses repeat sequences within single guide RNAs to latch onto, and pull down, short repeat sequences along with flanking sequences for identification. They applied this technique to genomic DNA from five poly-GR+ AD cases and two controls, including one without AD and one with AD but no poly-GR aggregates. Among these samples, the scientists identified more than 2,000 potential GR repeat loci, of which 19 were more readily pulled down by the CRISPR assay in the poly-GR+ AD cases. Of these, an expanded repeat sequence in an intron of the CASP8 gene was the most abundant.

Using long-read sequencing, the scientists detected 44 to 64 copies of the CASP8 sequence—GGGAGA—in AD cases and in controls. Interspersed with interrupting sequences, the repeats were part of an SVA transposable element within the eighth intron. This gene encodes caspase-8, rare coding variants of which have been tied to AD (Rehker et al., 2017). A broader look at samples from three independent cohorts, including 1,174 AD cases and 1,195 controls, suggested that CASP8 repeat expansion carriers had double the risk of AD.

Were these CASP8 repeats the source of the poly-GR aggregates found in AD cases? The scientists first generated antibodies specific for the predicted dipeptides that would be translated from the repeats. RAN translation can occur in three reading frames in both the sense and anti-sense directions. Because the interrupting sequences between some of the repeat tracts are predicted to shift the reading frame, the resulting dipeptides would be chimeric glycine-arginine (GR), arginine-glutamic acid (RE), and glycine-glutamic acid (GE) repeat proteins, with unique C-terminal regions for each reading frame. With antibodies against these potential products, the scientists were able to detect the predicted CASP8 repeat-derived dipeptides from two of the sense reading frames. These overlapped with poly-GR aggregate staining. Ultimately, they found that many, but not all, of the poly-GR inclusions found in people with AD appeared to have come from the CASP8 repeat expansion.

Red-Handed. Immunohistochemistry detected CASP8-GGGAGA repeat derived protein aggregates (red) in the hippocampi of AD cases who carried the CASP8 repeat expansions (top left), but not in AD cases who did not (bottom left) or controls (right). [Courtesy of Nguyen et al., PNAS, 2025.]
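The reading-frame logic behind those dipeptides can be checked with a short worked example. The sketch below translates an uninterrupted GGGAGA tract in the three sense-strand frames using the standard genetic code. It is a simplification: the real CASP8 repeat carries interrupting sequences that shift the frame and produce chimeric proteins, and antisense frames are not shown. The function name `translate_frame` and the six-copy tract length are assumptions chosen for illustration.

```python
# Only the codons that occur in a pure GGGAGA tract (standard genetic code):
# G = glycine, R = arginine, E = glutamic acid.
CODONS = {"GGG": "G", "GGA": "G", "AGA": "R", "AGG": "R", "GAG": "E"}

def translate_frame(dna, frame):
    """Translate one sense-strand reading frame; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODONS.get(c, "X") for c in codons)

tract = "GGGAGA" * 6  # toy 36-nucleotide tract with no interruptions
frames = {f: translate_frame(tract, f) for f in range(3)}
for f, peptide in frames.items():
    print(f, peptide)
```

Frame 0 reads GGG-AGA and gives glycine-arginine, frame 1 reads GGA-GAG and gives glycine-glutamic acid, and frame 2 reads GAG-AGG and gives alternating glutamic acid and arginine, matching the GR, GE, and RE repeats named in the text.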

Cell culture experiments revealed that, like the C9orf72 hexanucleotide expansion, these CASP8 repeats were transcribed into RNA that formed foci within the cells, and, via RAN translation, spun into poly-GR dipeptides. Like all RAN translation, this process was exacerbated by cellular stress. Strikingly, inducing expression of the repeats in neuronal cell lines spurred production of phosphorylated tau.

Notably, not all poly-GR aggregates harbored the CASP8-derived sequence, suggesting that there were other genomic sources of repeats. Furthermore, some controls carried CASP8 repeat expansions but no poly-GR aggregates; this could mean other hits, such as cellular stress, are needed to trigger their production, which, in turn, might spur more p-tau production, creating a vicious cycle, the authors proposed.—Jessica Shugart

Comments

  1. To some extent, repeat expansions have been the dark matter of human disease. They are generally missed by SNP arrays and by short-read sequencing technologies and it is only with long-read sequencing we can see them and also appreciate their high mutation rates and perhaps appreciate also their roles as risk loci. What is not yet clear is what pathologies they are associated with: Is it plaque and tangle disease or (like C9orf72 and PGRN genes) is it more with TDP-43 pathology?

  2. This study examined the relationship between short tandem repeat (STR) expansions, a frequently overlooked type of genetic variation, and the risk of AD. While one STR was significantly associated with AD risk, its association could be almost fully accounted for by the APOE-e4 genotype. This lack of single STR association is likely explained by a lack of power due to small sample size (~ 1,500 cases and 1,500 controls).

    However, aggregating all STRs into a burden test yielded exciting results. Individuals with more than 30 STR expansions had more than a three-fold risk of AD. An odds ratio of this magnitude is similar to that of the largest genetic effect for late-onset AD, namely APOE4. Current odds-ratio GWAS estimates of this locus range from 2 to 5. Reassuringly, and in contrast to the single STR association, the authors show this STR burden effect is independent of APOE4.

    It remains unclear how well common-variant, SNP-based GWASs tag STR expansions. If SNPs tag STR expansions well, then they would be unable to explain much of the missing heritability, which is part of the study's motivation. To address this, an analysis that adjusts the STR burden effect for a polygenic risk score (PRS), or an analysis that includes both scores, should be performed.

    Nevertheless, this is an important study examining the association of non-SNP genetic variation with AD risk. In the future, such studies may combine several types of genetic variation (SNPs, copy number variations, STRs) across the full frequency spectrum.

  3. The paper presents evidence that genetic variation in the form of short tandem repeats may explain a proportion of Alzheimer’s disease heritability so far not accounted for by single nucleotide associations. This is an important observation and one which needs further investigation in the field.

    There are major advantages in identifying STRs as they are more likely to have direct functional effects. Identifying their disease-relevant effects is more tractable. Furthermore, the STRs themselves offer the possibility for gene editing therapies which are already being developed for other diseases with similar causal pathways.

    Further, larger scale studies must now be considered to capture specific disease-associated STRs to aid our understanding and support future therapeutic advances.

  4. We commend the recent advancements in understanding the role of expanded repeat sequences in Alzheimer's disease (AD) risk, as highlighted by the studies from the University of Pennsylvania and the University of Florida. These findings align with our ongoing research into the impact of intermediate-length CAG repeat expansions, particularly within the huntingtin (HTT) gene, on neurodegenerative diseases.

    In our 2019 study, we observed a significantly higher frequency of HTT intermediate alleles (IAs) in AD patients (6.03 percent) compared to healthy controls (2.9 percent), suggesting a potential role for these alleles in AD pathogenesis (Menéndez-González et al., 2019). Further, our 2020 research expanded this investigation to include intermediate repeats in the ATXN1 and ATXN2 genes. We found an increased frequency of ATXN2 IAs in AD cases (4.1 percent vs. 1.8 percent in controls) and a notable association of HTT and ATXN1 IAs with progressive nonfluent aphasia, a subtype of frontotemporal dementia (Rosas et al., 2020).

    Building upon these findings, our most recent study focused on the caudate nucleus, a region susceptible to HTT CAG expansions. We discovered that HTT IAs in late-onset AD patients are associated with altered microRNA profiles, leading to dysregulation of gene expression. This dysregulation affects key components of the spliceosome, resulting in an increased presence of the tau 3R isoform and a higher number of ghost tangles, potentially accelerating disease progression. These insights underscore the importance of genetic screening for HTT alleles in clinical practice to enable more accurate classification and personalized therapeutic interventions for AD patients (Castilla-Silgado et al., 2025).

    Collectively, these studies emphasize the significance of intermediate repeat expansions in neurodegenerative diseases. We encourage continued research in this area to deepen our understanding of these associations, which hold promise for advancing both research and clinical practices in neurodegenerative disorders.

    References:

    Menéndez-González et al. HTT gene intermediate alleles in neurodegeneration: evidence for association with Alzheimer's disease. Neurobiol Aging. 2019 Apr;76:215.e9-215.e14. Epub 2018 Nov 28.

    Rosas et al. Role for ATXN1, ATXN2, and HTT intermediate repeats in frontotemporal dementia and Alzheimer's disease. Neurobiol Aging. 2020 Mar;87:139.e1-139.e7. Epub 2019 Nov 1.

    Castilla-Silgado et al. Synergistic impact of CAG intermediate alleles in the HTT gene and microRNA dysregulation exacerbates spliceosome impairment and accelerates Tau pathology in the caudate nucleus of late-onset Alzheimer's disease. medRxiv. 2025 Jan 19. doi:10.1101/2025.01.19.25320764 (version 1).

References

News Citations

  1. Somatic Mutations Accrue in Alzheimer's Neurons

Paper Citations

  1. Wang et al. SVA elements: a hominid-specific retroposon family. J Mol Biol. 2005 Dec 9;354(4):994-1007. Epub 2005 Oct 19.
  2. Zu et al. Non-ATG-initiated translation directed by microsatellite expansions. Proc Natl Acad Sci U S A. 2011 Jan 4;108(1):260-5. Epub 2010 Dec 20.
  3. Rehker et al. Caspase-8, association with Alzheimer's Disease and functional analysis of rare variants. PLoS One. 2017;12(10):e0185777. Epub 2017 Oct 6.

Primary Papers

  1. Guo et al. Polygenic burden of short tandem repeat expansions promotes risk for Alzheimer's disease. Nat Commun. 2025 Jan 28;16(1):1126.
  2. Nguyen et al. CASP8 intronic expansion identified by poly-glycine-arginine pathology increases Alzheimer's disease risk. Proc Natl Acad Sci U S A. 2025 Feb 18;122(7):e2416885122. Epub 2025 Feb 12.