Introduction

Tiny and timid though it may be, the humble house mouse is not only one of the most successful mammalian species on Earth, but is also the dominant model organism for medical research, especially because of the ease with which it can be genetically manipulated to express human disease-causing genes. Yet time and again, “cures” effected in mice have failed when tried in human patients. Is this simply because “mice just aren’t humans,” as researchers are wont to say, or is there something else going on?

A recent study by a team of investigators at the ALS Therapy Development Institute (ALS-TDI) suggests another reason why mouse studies fail to replicate in humans: uncontrolled biological variables in underpowered studies. The authors recommend a minimum study design to manage the inherent noise in the system. Similar issues may arise in mouse studies in other fields.

Sean Scott, president of ALS-TDI, presented the findings from the study. Other featured participants included Ben Barres, Mike Sasner, Lucie Bruijn, Greg Cox, Stan Appel, Cathy Lutz, Gene Johnson, Jeff Rothstein, and Jonathan Glass. We thank Melanie Leitner of Prize4Life for helping to organize this discussion.

Transcript:
A previous version was posted on June 2, 2008. It has since been reorganized and slightly edited for clarity and accuracy.

Participants: Sean Scott (ALSTDI), Brian Johnstone, Jonathan Glass (Emory University), Tennore Ramesh (Ohio State University), Ben Barres (Stanford University), Gabrielle Strobel (Alzheimer Research Forum), Yun Li, Melanie Leitner (PRIZE4LIFE), Anatoly Chernyshev (University of Iowa), Mike Sasner (The Jackson Laboratory), Greg Cox (The Jackson Laboratory), Patrizia Fanara (KineMed Inc.), Cat Lutz, Kirsten Carlson (The Michael J Fox Foundation for Parkinson's Research), Eugene Johnson (Washington University), Jeyanthi Ramasubbu (ALSTDI), Steve Perrin, Jennifer Gatchel, Avichai Kremer, Stanley Appel (Methodist Neurological Institute), Huan Ngo (Northwestern University), Nico Stanculescu (Alzheimer Research Forum).

Ben Barres
Sean, I am wondering how the authors of the questioned results have responded?

Sean Scott
Ben, actually, folks have been pretty receptive. I was surprised.

Brian Johnstone
Thanks to the organizers for providing this forum and the speaker for his stimulating talk. Question 1—what is your opinion on the timing of the dosing regimen with regard to ALS mouse model study design? Specifically, please provide a rationale for presymptomatic dosing as opposed to beginning treatment following onset of symptoms. Should not all preclinical testing be performed with agent given after onset of symptoms to more accurately reflect the clinical scenario? This is especially relevant given the elegant studies by Dr. Cleveland which show that the mechanism underlying disease onset fundamentally differs from that driving disease progression.

Sean Scott
Brian, we don't actually believe this animal is ever presymptomatic. With that said, we start dosing at day 50 as a uniform start.

Brian Johnstone
Sean, thank you for your answer. The mice may not ever be without disease, but they do not display differences in symptoms compared to control until much later in life. Certainly ALS patients are the same, but they never are treated before manifestation of disease symptoms (e.g., post-diagnosis). Peak weight or rotarod tests are reliable measures of onset (diagnosis) in mouse.

Sean Scott
Brian, my concern is the subtlety of symptom detection. If mice played golf, would they start noticing errant shots around day 30?

Brian Johnstone
Question 2—the use of total lifespan as a measure of efficacy in preclinical studies is artificial and not relevant to clinical scenario, where post-onset survival time is relevant. Has ALS-TDI applied its statistical methods toward analysis of post-onset survival time in SOD1-G93A mice?

Sean Scott
We are doing that analysis now. It is very evidently much noisier than survival. In the end, simply moving a pathway or marker relevant to the human disease may be all we can hope for.

Ben Barres
Presumably if the published positive results simply reflect noise, then there should have been an equal number of negative studies conducted that simply were not published (were not publishable). Is there any evidence for that? If so, does this mean we need to set up a mechanism where people can report their negative results, as they are an important part of the statistical data?

Sean Scott
Ben, I do not really believe that most of these studies were conducted blind; therefore, I do not believe that the distribution is even. I believe it is skewed toward the positive.
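
To illustrate the question of noise raised in this exchange, here is a minimal simulation sketch. It is not ALS-TDI's analysis. It assumes a hypothetical untreated lifespan of 130 days with a 10-day standard deviation, normally distributed, and it ignores censoring and litter effects entirely. Both groups are drawn from the same distribution, so any "effect" is chance. The function name and the 6.5-day (5 percent) threshold are illustrative choices.

```python
import random
import statistics

def apparent_effect_rate(n_per_group, trials=2000, mean_days=130.0,
                         sd_days=10.0, threshold_days=6.5, seed=0):
    """Fraction of null studies in which the 'treated' group appears to
    outlive controls by at least threshold_days purely by chance.

    All default values are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Both groups come from the SAME distribution: the drug does nothing.
        control = [rng.gauss(mean_days, sd_days) for _ in range(n_per_group)]
        treated = [rng.gauss(mean_days, sd_days) for _ in range(n_per_group)]
        if statistics.mean(treated) - statistics.mean(control) >= threshold_days:
            hits += 1
    return hits / trials

# Compare a very small study with the 24-per-group design discussed here.
for n in (4, 10, 24):
    print(n, "mice per group:", apparent_effect_rate(n))
```

Under these assumptions, small groups yield a chance "life extension" far more often than groups of 24. If only the positive runs are published, the literature skews positive even when no drug works.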

Jonathan Glass
Sean, it must take a large number of monetary resources to do all of these survival studies on these mice. Is this the best way to spend research dollars? Are there too many eggs in this basket?

Sean Scott
Jonathan, it's not the only way we spend ALS research dollars. However, I believe that it is critical to have a robust preclinical validation system before spending the dramatic quantities of money required for human clinical trials.

Jonathan Glass
Scott, I certainly agree with that, but I am not sure that the G93A mouse is necessarily going to provide that robust preclinical validation system.

Sean Scott
Jonathan, I agree, but there is no way to really validate a model without doing some level of scaled screening. However, we wouldn't have gone this far had we not assumed that this model was valid in the first place. Certainly the community accepted it as such when we started.

Yun Li
There are other very relevant models to examine as well: the ALS2 mice (not much of a phenotype) and the potentially better p150 dynactin mutation mice from Phil Wong. Also, the new story on TDP-43 will likely supersede the SOD1 biology of the last 15 years. TDP-43 mice are now being made by multiple labs (in both the ALS and dementia fields); perhaps effort should shift in that direction, rather than wasting another $50 million on ALS SOD1 mice (unless it’s for ALS SOD1 people!).

Sean Scott
Yun, I am in favor of any model that can be generated using a demonstrable change from the human disease; so far, it is not clear that any exist. Perhaps I am wrong.

Tennore Ramesh
Have you done a simple t-test of individual apparent effect and measured the frequency of significant apparent effect? Then you would be comparing apples to apples.

Sean Scott
Tennore, as we discussed in the talk, since the underlying data structure is not suitable for a t-test, the statisticians felt that it would not be an appropriate comparator.

Ben Barres
As a general comment, not addressed to anyone specifically, this question of how good the mouse models of disease actually are is a critical one. Perhaps one of the most exciting home runs for neurological disease is the discovery by Hauser et al., recently published in the New England Journal of Medicine, that the rituximab monoclonal antibody that targets CD20+ B cells leads to a 99 percent decrease in new MRI lesions in multiple sclerosis over one year. This drug was developed as a direct result of research conducted in a marmoset model of MS, which the authors claim better reflects human neuropathology. It is now possible to make transgenic marmosets and it might be very interesting to use these to create a new ALS model (perhaps by expressing the mutant form of SOD).

Anatoly Chernyshev
Just want to mention one more recent study on a drug extending lifespan of ALS mice (this study was not covered in the Webinar paper, so it might be useful for the statistics on the subject). The work is done by Dr. Engelhardt's team at the University of Iowa (Harraz et al., 2008). The drug was apocynin, a known inhibitor of NADPH oxidase.

Sean Scott
Anatoly, we've already re-run apocynin at multiple doses according to the publication's specifications. Our first run suggests a very modest difference in weight. We are not sure if this is a true effect and are re-running it. It is certainly true that the effect is nowhere near the magnitude of the publication.

Anatoly Chernyshev
Are you going to publish your apocynin findings sometime?

Sean Scott
It's already available on our website, but we will want to do it a couple more times before we draw any public conclusion, but eventually, yes.

Melanie Leitner
Sean (and folks from non-ALS fields), do the findings from the ALSTDI study translate to other models of neurodegenerative disease (especially APP, α-synuclein, and other overexpression models)?

Sean Scott
Melanie, my belief would be that the problems of censoring and scale would be present in any lab that only runs studies occasionally, regardless of the model.

Greg Cox
Melanie, yes, I believe the ALSTDI findings will reach across many of the different mouse models used in neurodegenerative research. I view this work as a primer in reasonable study design for mouse genetic models.

Gabrielle Strobel
Melanie, certainly in AD, there has been long and intense discussion about how much of human AD the APP and/or PS-overexpressing mice truly model. Not so much on the separate issues of drug study design and interpretation that Sean's study has raised. Now there is a triple transgenic model that has a tau mutation, also. But there is no mouse model for late-onset AD. And most drugs for which we have had clinical trials following successful mouse studies have failed—HRT, NSAIDs, Pfizer's Lipitor trial. For the immunotherapy, the meningo-encephalitis that ended the first active vaccine did not show up in mice (or primates, for that matter), but on immunotherapies generally, it's too early to tell. What do others here think?

Gabrielle Strobel
Karen [Chen], if you are still here: you have worked extensively with the PDAPP mouse model for AD. Do you see any lessons from this ALSTDI study that would be applicable? Or, to ask about your new topic, are there mouse models for SMA that could take its implications on board?

Melanie Leitner
Jackson Laboratory folks please chime in.

Mike Sasner
Sean, wouldn't some of the variability come from using the mixed hybrid background? Would you expect to see less noise with a B6 congenic background?

Cat Lutz
Although controlling these variables is important, I am concerned that the genetic background of the B6:SJL mouse is also a huge, uncontrolled variable. The B6 congenic seems to be the more logical model.

Sean Scott
Mike, we used the B6 extensively and surprisingly it is no less noisy. My interpretation is that the noise comes from the transgene.

Cat Lutz
Sean, certainly the rate of metabolism for any compound can differ in the segregating genetic background. It seems like we are adding to the number of animals that need to be used....

Sean Scott
Cat, we look at drug exposure by mass spectrometry for every study. We have not done a lot of comparison between B6 and B6SJL; however, with minocycline, for example, there was no real difference in exposure at equivalent doses.

Sean Scott
Just as an FYI: we re-ran lithium according to the publication and it has no effect.

Greg Cox
Cat and Mike follow-up: as a mouse geneticist, the litter effects you described and need to control for are completely expected in a mixed hybrid background mating scheme such as the one used for this model (B6SJL F1 female x mixed hybrid B6SJL-SOD1 male). In other words, who the father is (what combination of B6 and SJL alleles are present) limits the possible genotypes of the offspring. From our own data and from others, we know that there is a huge lifespan difference of approximately 40 days between transgenic mice congenic on these two different genetic backgrounds. The B6-SOD1(G93A) congenic mice have a median lifespan of approximately 161 ± 10 days and the SJL-SOD1(G93A) mice have a median lifespan of approximately 119 ± 10 days with no difference in copy number or mRNA expression based on Q-PCR.

Melanie Leitner
Greg, that is very interesting given that there seems to be almost a twofold difference in endogenous SOD expression in those two strains (B6 are low and SJL are twice as high).

Brian Johnstone
Sean, what is your method for determining onset? In our testing by rotarod (15 rpm, 10 minute criterion) the display of symptoms is quite clear.

Patrizia Fanara
Hello, everyone. Thanks, Sean, for presenting today to address the implications of your study. I agree with you that the SOD1 mouse model can still be used to eventually achieve control over this disease. We have recently published the use of “authentic” biomarkers (disease-specific) in preclinical animal models to reveal the actual dynamics of a biological system affected by disease and to develop novel mechanistic-based therapies. Through this approach, we were the first to publish the null efficacy of Riluzole. Our studies used Riluzole as a negative control for what a non-effect on mechanistically based therapy looks like. Questions for discussion: in addition to establishing more rigorous constraints and thus ensuring minimal variability in future preclinical studies, are we considering the impact of using biological-based biomarkers (metrics that are intrinsically linked to the pathogenesis, progression, and reversal of the disease) to bridge the animal neurological score with molecular changes underlying this disease? Should we include in vivo readouts in animal models to: 1) maximize certainty of therapeutic effect before human trials begin, 2) improve the validity of the SOD1 mouse model, and 3) develop novel mechanistically based therapeutic interventions?

Sean Scott
Patrizia, all of our efforts these days focus specifically on repeatable panels of biomarkers. We think any future utility of this mouse will depend on achieving reliable biomarkers.

Patrizia Fanara
Sean, are you referring to proteomic-based biomarkers, ones that reflect the expression levels of proteins?

Sean Scott
Patrizia, most of our biomarker panels at this point are derived from gene chip studies that are later turned into low-density arrays on a TaqMan®.

Melanie Leitner
Gene, is the Parkinson's community also struggling with this model issue like the ALS and AD communities?

Eugene Johnson
Melanie, I am not aware of a systematic analysis such as this. Part of the reason is that there are no models that seem to have the validity of the SOD1 mouse in producing the full spectrum of pathology and symptoms. Also, part of it is that the PD field does not have as definitive an endpoint (death, however defined) to score. I do not think the HD mouse, which is perhaps more similar to the ALS model situation, has been similarly analyzed.

Kirsten Carlson
Melanie, in PD research, one of the biggest hurdles is the lack of a progressive animal model of neurodegenerative processes. In addition, existing genetic models are generally not well characterized in a systematic way. Multiple promoters and phenotypes of genetic models in PD are further compounded by general "drift" of the genotype between investigators and over time.

Mike Sasner
Kirsten, we at Jackson Laboratory are doing our best to minimize the "drift" you mention by monitoring copy number (and rebuilding colonies from frozen stocks when necessary) and moving alleles to congenic backgrounds.

Mike Sasner
Melanie, certainly some of the same criteria apply to other models and other diseases. We found that the J20 APP line from the Mucke lab had a similar loss of copy number (and therefore later and less severe phenotype) in some litters, and thus more noise within an experiment. We are now monitoring things like copy number as best we can. There are other sources of genetic variation that people need to be aware of, for example, the paper (Watkins-Chow and Pavan, 2008) showing that some B6 colonies have copy number variants for the Ide gene, which is known to be relevant to expression of AD phenotype.

Kirsten Carlson
Mike, thank you for your comment and the work that Jackson Laboratory is doing to address this problem. I think it may apply more in situations where strains are shared directly among researchers.

Jeyanthi Ramasubbu
Copy number and genetic background are critical variables, particularly for efficacy endpoints such as survival, and are certain to influence other neurodegeneration models that use mixed hybrid, high-copy animals in their studies.

Melanie Leitner
Mike and Cat, don't the Huntington's animals also have some of the issues seen in the SOD animals? This seems to be a very widespread problem. Is Jackson Laboratory doing anything to help its customers cut through some of the confusion?

Cat Lutz
Melanie, loss of CAG repeat size is a huge issue in the HD models; the key is to be constantly monitoring the phenotype of the mice. A long-lived mouse will soon take over your colony. We are likely to see a good part of this "drift" in all transgenics.

Yun Li
Sean, I would encourage you to seriously carry out a proper positive control. Riluzole has been shown to have clear statistical efficacy in multiple human trials (comparable to many, many drug trials for real chemotherapeutics). Does it not work at all in this mouse model of FALS? Perhaps the model is just not appropriate for evaluating drugs for sporadic ALS. You appear to run very reproducible mouse trials, and the human effects of Riluzole, although small, have been reproduced in four separate large human trials, so again, is this mouse just not suitable for preclinical discovery in the more common form of ALS?

Sean Scott
Yun, I agree, it would be great to have that data. However, we are talking about close to 600 total mice to get the answer because the human clinical effect is extremely borderline. We will probably chase that using gene chip profiles as opposed to survival.
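
To show why a borderline effect demands so many animals, here is a rough sketch using the standard normal-approximation sample-size formula for comparing two group means. It is a simplification and not the method used by ALS-TDI: it ignores censoring, and the discussion above notes that survival data of this kind are not well suited to a t-test. The 10-day standard deviation, the effect sizes, and the function name are hypothetical.

```python
import math
from statistics import NormalDist

def mice_per_group(effect_days, sd_days, alpha=0.05, power=0.80):
    """Approximate animals needed per group to detect a difference in mean
    lifespan of effect_days (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    n = 2 * ((z_alpha + z_beta) * sd_days / effect_days) ** 2
    return math.ceil(n)

# Hypothetical effect sizes, in days of added lifespan, with a 10-day spread.
for effect in (10, 5, 2):
    print(effect, "days:", mice_per_group(effect, sd_days=10), "mice per group")
```

Because the required number scales with the inverse square of the effect size, halving the expected effect roughly quadruples the number of mice. Under this approximation a very small expected benefit quickly pushes a study into the hundreds of animals.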

Brian Johnstone
Sean, I understand your rationale for simplifying the system as much as possible to facilitate mass screening of candidates. Ideally validated in vitro models would be available to fulfill the same function. It is important that we all understand, though, that the model must not be oversimplified to the extent that good drugs are thrown out with the bad. If we accept that the SOD1 mouse is a valid model for the disease, then we need to (at least at some point in development) validate candidates using experimental designs that are closer to the clinical scenario rather than the opposite.

Sean Scott
Brian, your point is well taken. My stress is that because the mouse is so overdriven I think we will be lucky to see any effect under any conditions. Once we do, we can tease it out using multiple study designs and biomarkers and biomarker panels.

Ben Barres
In general, probably before there is a good treatment, there will need to be a much deeper understanding of the pathophysiology of the disease process. So far most of the drugs tested have been stabs in the dark....

Steve Perrin
I agree with Ben, and in addition we should all keep in mind that the animal model is just a tool to test therapeutics against therapeutic hypotheses.

Melanie Leitner
Sean, I would like to raise an issue that is in some ways the flipside of the issues we have been wrestling with here; namely, while it is almost impossible to interpret mouse study data if the study in question doesn't account for some essential variables (as I think your data beautifully illustrates), the dark flipside is that it is a major challenge to translate information from a study using a highly controlled single strain of mouse to a heterogeneous population of humans. Some have proposed that we should be testing drugs using set panels of mice of differing strains so as to have greater similarity to the diversity of a true clinical trial.

Sean Scott
Melanie, I agree with that point; however, the cost will be prohibitive. We are running close to $200,000 per study by the time you add in survival, surrogate markers, and pharmacology. Imagine if you did that study using multiple strains. I think our first order is to more closely tie affected pathways in the mouse to correspondingly affected pathways in humans so that the targets being chased are not so random as they have been in the past.

Greg Cox
Melanie, I agree that replication in a second model or genetic background would be ideal. Unless a drug effect is specifically blocked by a polymorphism in one inbred strain, inbred mouse studies are the most powerful study designs you can work with and have the greatest sensitivity to see an effect if it exists.

Eugene Johnson
Sean, from your perspective, using 24/group, how long a life extension do you think you need to see to justify moving that compound forward?

Sean Scott
Gene, anything repeatable and statistically significant is worth following up on, even if it is only a repeatable effect on a relevant molecular target and not survival.

Gabrielle Strobel
Sean, your conclusion that previous drug efficacy studies in the SOD1 model have basically measured noise is pretty disheartening. Have you received any technical/methodological criticism of your method that refutes your conclusion? If not, I'd say we can assume that the same labs are already trying to adjust their study design to match yours as best they can?

Sean Scott
Gabrielle, we have not really received any criticisms after sharing the specific methodologies. However, for anyone who has displayed stress about the conclusions, we have offered to allow them to audit our process and retest the compound on our dime. To date, nobody has taken us up on this offer. The flipside is that this data has come as a relief to many folks who were very frustrated with the lack of translation between the mouse and clinical trials.

Jeyanthi Ramasubbu
Greg, what are your thoughts on the effect on the transgene during subsequent inbreeding in individual labs?

Greg Cox
Jeyanthi, in my lab, we have made five different congenic lines from the high-copy G93A transgenic mice and see major differences in lifespan. We are currently trying to map these modifier genes in crosses to see what genes are controlling this effect. In other labs that are not controlling for genetic background, much or all of their effects can be mimicked just from the background modifier effects.

Cat Lutz
Jeyanthi, the B6SJL line at Jackson Laboratory is bred to an F1 female. If people order these mice and then breed transgenic males to non-transgenic females, they will eventually fix different alleles between B6 and SJL. Even worse is if they start with just a few mice and bottleneck that effect.

Melanie Leitner
Sean (and others), one of the participants has asked: in light of this study, what is your position on proceeding to clinical trials without robust efficacy in the mouse model?

Sean Scott
Melanie, my position is that if you are targeting an altered pathway and can measure effect on that pathway, then it is completely reasonable to proceed without a mouse trial.

Greg Cox
Melanie, I have always believed that if a drug is attacking the underlying mechanisms of disease (instead of a secondary symptom), then one should have a huge effect on either the onset or progression of the disease in the mice and not just a subtle effect. Based on the ALSTDI results, almost none of the drugs should have been considered for clinical trial and in this way the mouse has been completely informative; there is no effect in either the mice or patients for the currently tested drugs.

Steve Perrin
I agree with Greg. So far the mouse model is 100 percent predictive of clinical translation.

Patrizia Fanara
Sean, simply listing an inventory of the expression level of thousands of genes in a complex network does not reveal the actual dynamics of the biological system, and this type of approach would not lead to “authentic” metrics that can be used to define any future utility of this mouse.

Yun Li
Sean, it’s interesting that you are carrying out genetic biomarker studies in the mice. Are you looking at specific cell populations (gross tissue analysis would seem to drown out potentially valuable cell-based mechanistic pathways)? As you know, much of this has already been published by excellent labs over the last 10 years. Do you plan to make your data public? How will it be different from what’s been done by others? Hopefully you will incorporate some of their excellent work to aid your own discovery effort?

Sean Scott
Yun, we are indeed performing both whole tissue and Laser Capture Microdissection (LCM) studies. Our studies differ in that they are very high-power studies and they are done by a team recruited from industry that has hundreds of thousands of samples’ worth of experience. Normalization and control in these studies is not trivial. As patterns emerge, we will undoubtedly publish them. Patrizia, you are correct; however, if the pathway is affected in both mice and humans, then at least we are truly modeling a component of the human disease.

Yun Li
Sean, that’s good to know, but it’s somewhat biased (or maybe so) since the Cleveland labs and many others suggest that non-neuronal cells are important, but LCM is really only good for neurons...not good at all for non-neuronal cells.

Steve Perrin
Yun and Patrizia, we are combining gene expression studies in the mouse from spinal cord, brain, skeletal muscle, blood, adipose, sciatic nerve, LCM-captured motor neurons, glial cells, and NMJs with biopsies from ALS muscle, blood, skin, and adipose. The goal is to combine these data sets into a complex map of molecular mechanisms leading to disease pathology for therapeutic hypothesis testing and development. Standard pharmaceutical-driven approach.

Patrizia Fanara
Steve, standard pharmaceutical-driven approaches of this type have high attrition rates and poor predictive power.

Steve Perrin
Patrizia, all therapeutic development processes have high attrition rates. The highest rate is in Phase 3 trials. The approach is just to have better hypotheses about what molecular mechanisms to target.

Patrizia Fanara
Steve, when viewed from a broad perspective, the modern DDD paradigm has created an untenable system. The combination of enormously efficient tools for identifying leads, particularly those active against biologically novel or “unvalidated” targets, with no equally efficient process for filtering, eliminating, or optimizing these leads on the basis of their actions in living organisms, has resulted in a clogged pipeline. The current system is, unfortunately, perfectly designed for the exorbitant cost per drug approved that characterizes this era....

Steve Perrin
Patrizia, I agree that the current model has become laborious, expensive, and even unpredictable. Maybe the low-hanging fruit is gone?

Patrizia Fanara
Steve, pharmaceutical researchers are unable to predict the likely success or failure of agents using the tools available. An inability to link molecular events (i.e., actions on the physical targets of drugs) to functional outcomes (i.e., macroscopic events that beneficially alter disease processes without causing undesired toxicities) is responsible for the extremely low success rates of leads that are prosecuted in modern DDD. This situation can be improved.

Steve Perrin
Patrizia, we are getting off target, but I don't know if I whole-heartedly agree. Often drugs fail late in clinical development because of very unpredictable off-target effects that we don't understand and would never see in a preclinical model. Often we don't see them even in Phase 3; these adverse events arise only once the drugs become commercial products.

Patrizia Fanara
Steve, maybe. The absence of authentic biomarkers is arguably the key to the lag in drug development and is the major current impediment to advancing molecules to drugs. However, the ability to objectively measure a biochemical action of agents on their true targets in living systems would provide an objective means of establishing efficacy and predicting clinical response. We are working on true functional biochemical targets of drug: fluxes of molecules through the pathways that are responsible for disease in fully assembled systems.

Steve Perrin
A good example of off-target effects is minocycline in ALS. At the dose given in clinical trial, it would have been well tolerated in healthy individuals. Yet it was toxic in ALS patients and in the preclinical model. At least in our hands.

Eugene Johnson
Patrizia, I agree that biomarkers, or lack thereof, are critical roadblocks in the process, especially at the level of a Phase 2 trial, to demonstrate the drug does anything biologically in humans and to titrate dose.

Jennifer Gatchel
I wonder if your group has acquired data on other disease phenotypes in this SOD model using the multiple compounds retested, such as weight, motor function/performance, activity, strength, and also the biomarker issue. While survival is obviously of the most interest as you mentioned above, compounds that show efficacy on other phenotypes could potentially benefit quality of life for patients; as well, I don't know if your group also carries out dose response studies such that compounds that show a modest effect, if used at a slightly higher dose, might have more widespread benefits?

Sean Scott
Jennifer, our group collects about 300 data points per mouse including daily body weight, daily neuroscore, achieved drug levels, and for some drugs that are not repeat studies we look at effect on target.

Jeyanthi Ramasubbu
Jennifer, we currently approach drug testing in our model by first confirming the target effect in vitro in a relevant cell line, then ensuring that we can achieve therapeutically relevant levels at the target site (spinal cord or as appropriate); we perform parallel studies to measure biological effects and neurological effects (neuroscore) besides survival, all under systematic, standardized protocols.

Brian Johnstone
Sean, good points about enhancing the robustness of the SOD1 model. I do want to again emphasize my point that if one carefully analyzes many of the studies that use pre-onset dosing, it is apparent that onset is delayed, but post-onset survival time is actually reduced. On first principles, this indicates that the agent acts differentially on mechanisms of onset versus those involved in progression. If the delay in progression was sufficiently robust, then it would appear that survival is enhanced.... If this translates to humans, then a drug that appeared positive in mice due to delaying onset would surely fail in the clinic where patients are invariably treated well after becoming symptomatic.

Steve Perrin
Brian brings up a critical point. If we don't develop a good diagnostic biomarker in ALS, we will probably miss our therapeutic window with even a very good drug.

Sean Scott
Brian, I believe that onset measures are dramatically noisier than survival or body weight measures. As such, it is difficult for me to believe even with our own studies that onset is truly affected. We have seen onset changes scores of times that will not repeat when retested or powered up.

Yun Li
Steve, how do you account for Riluzole? (Some try to discount its effect in humans, but there's no question it is reproducible in real patients.)

Steve Perrin
Yun, as Sean mentioned, if we designed a highly powered study with >500 mice we would probably measure a marginal therapeutic benefit just like in patients.

Yun Li
Steve, then why not do that?? That’s good science: having the right control that matches the human effect! Obviously the genetic knockdown of SOD1 is another (but there is no human counterpart to it, yet).

Yun Li
Steve/Sean, it seems that the validated pathway approach is much better, but if you already have knowledge of the human pathway, why worry about any mouse data (other than to validate the effects in a given pathway), mainly since the mouse just does not predict (yet) any human outcome (especially with regard to magnitude)? That said, as glad as I am that you're pursuing pathways, you're in the same league as hundreds of others doing the same thing!

Sean Scott
Yun, the scale at which we're doing it, combined with the fact that we are validating the utility of altered pathways using gene therapy approaches and doing it all in an assembly line fashion, will hopefully allow us to get to the answer faster than we could doing it solely in the clinic. Modulating some of these pathways is lethal and must be done in a model system.

Anatoly Chernyshev
From the human perspective: how valid are the pre-onset dosing studies in mice? Nobody's going to administer a drug before illness begins.... Just a thought.

Sean Scott
Anatoly, I actually think that worrying about pre- versus post-onset in this mouse raises the bar on an already overdriven model. What we need to be doing in my view is finding pathways that are actually provably active in disease.

Avichai Kremer
Sean, did you do any analysis of efficacy regarding delaying symptoms (of motor-function rotarod test)? In light of this study, what is your position on proceeding to clinical trials without robust efficacy in the mouse model? Would in-vitro efficacy models be sufficient for backing the rationale of the drug effect? Have you done similar analyses on results obtained by the inbred strains produced at the Jackson Laboratory, where presumably there would be less noise?

Sean Scott
Avi, we don't use rotarod. It seems as though a mouse in a bad mood will jump right off. I am okay with proceeding to trial without robust efficacy as long as a pathway that is altered in both mice and humans can be changed for the better by the test agent. However, I must caution that I believe quantitative measures employed for measuring things such as protein levels leave a lot to be desired in terms of sample number, standardization, and dynamic range of the assay. With respect to the inbred strain, they do live 30 days longer but they don't really have less noise, per se. Remember that the biggest source of noise is censoring criteria which have nothing to do with genetic background.

Melanie Leitner
Sean and Steve, have you considered using a lower copy number transgenic and/or another SOD1 mutation and/or another strain as a validation? (If using a panel of mice would be cost-prohibitive, would this be an intermediate solution?)

Sean Scott
Melanie, my stress about the low copy animals is that their disease course, once disease begins, is not very different from the high copy G93A. So, in essence, it takes longer for disease to onset but it's just as severe once it does. If there was a mouse with a very gentle disease slope I would prefer that, but I do not see one.

Melanie Leitner
D90A?

Steve Perrin
Melanie, to add to Sean's point, the slopes for the mixed strain and the BL6 isogenic strain from Greg's lab are the same. The only advantage may be reducing the number of animals per study.

Gabrielle Strobel
All, it seems to me the question of whether drug efficacy studies in mice should be considered indispensable to develop a candidate drug further is a good one in light of Sean's/ALSTDI's work. To throw in one example from AD: the PET imaging tracer PIB is invaluable in AD studies right now, done in some 50 labs worldwide, longitudinally in prospective cohorts, and also increasingly in some drug trials, potentially as an antecedent marker. It never worked in mice. If the PIB investigators had predicated their pursuit of it on it working in mice, we would not have it.

Melanie Leitner
This has been an awesome discussion. My one concern is that we haven't discussed whether there is anything the neurodegenerative disease research community can do to 1) alert more researchers to the issues raised here, and 2) do something about them, come up with solutions (whether this be a new animal model as Ben Barres raised earlier, or some other solution).

Stanley Appel
Melanie, I wanted to comment on the issue of human studies without mouse testing. The key issue is to understand the basic pathophysiology of motor neuron injury, and the mouse has provided extremely meaningful insights. Based on such insights and further understanding of pathophysiology in mouse (and humans), we certainly should be planning small pilot trials in the heterogeneous disease we call human ALS. Clearly the human trial efforts of the last decade based on therapeutic trials in the mouse have been failures. But pilot studies based on a clearer understanding of the pathophysiology of motor neuron injury and the non-cell autonomous insights might offer more promising approaches.

Melanie Leitner
Stan, I hope so!

Cat Lutz
Melanie, at Jackson Laboratory we are looking to improve our methods of copy number detection and would be interested in any potential collaborations to that effect. In addition, it would be great if we could get support to test tissue samples for copy number for those researchers who might not have the ability or experience. Two percent detectable copy number drop doesn't sound too insurmountable, but what about the copy number drops that go undetected because the qPCR isn't sensitive enough?

Huan Ngo
Sean, should mouse preclinical models include environmental risk factors, on top of the SOD1 mutations? Would that provide more accurate data for human trials? After all, pathology is not based on genetics alone, right?

Sean Scott
Huan, I wouldn't mind that idea if one could establish a real link between the risk factor and human disease ahead of time and then use that risk factor to generate neuronal pathology in animals.

Gabrielle Strobel
We are nearing the end of our time. For all who want to keep chatting, feel free to; the page will stay open. As others begin to leave, however, let me thank you all for this chastening but constructive discussion. If as the result of it, you have ideas for how to improve drug study design, or even ALS research more broadly—there were good remarks about intensifying pathophysiology research—please send us an e-mail and we will be glad to share your thoughts as an appendix to this discussion.

Sean Scott
Thanks everyone for having me and spending your time here this morning. I appreciate all the comments.

Gabrielle Strobel
I'll take my leave now, with thanks again and goodbye to Sean, Melanie, and all who helped and contributed. As I said, the room will stay open for a while longer.

Melanie Leitner
Thanks to Sean, Nico, our great discussants and all of you for your interest, and, of course, to Alzforum for hosting.

Brian Johnstone
Melanie, thanks for the excellent job with moderating and keeping the discussions on track.

Jeyanthi Ramasubbu
Melanie, Alzforum, and all discussants, thanks.

Background

By Melanie Leitner, Prize4Life

This forum built on a recent study by Sean Scott and his colleagues at the ALS Therapy Development Institute, published in the journal Amyotrophic Lateral Sclerosis, which indicated that failure to control for biological variables, common in the design of mouse drug efficacy studies, can explain why promising drug studies in mice have resulted in dashed hopes when the compounds reached clinical trials in ALS patients. These observations raise the possibility that these common issues in study design may pose similar problems for other neurodegenerative diseases.

In the 1990s, a mutation in superoxide dismutase 1 (SOD1) was identified as the cause of a significant subset of familial amyotrophic lateral sclerosis (FALS) cases. This discovery led to the generation of transgenic rodent models of autosomal dominant SOD1 FALS. Mice carrying 23 copies of the human SOD1G93A transgene have become the standard model for FALS and ALS therapeutic studies. To date, there have been at least 50 publications describing therapeutic agents that extend the lifespan of this mouse. However, no therapeutic agent besides riluzole has shown corresponding clinical efficacy.

Using computer modeling and statistical analysis of over 5,000 SOD1G93A mice, Scott et al. quantified the impact of several critical confounding biological variables frequently present in transgenic mouse studies, and developed an optimal study design that controlled for these variables. When the authors retested various compounds previously reported to be efficacious in major animal studies using this optimal study design, the authors found no survival benefit in the SOD1G93A mouse for any of these compounds (including riluzole), all of which were administered by their previously reported routes and doses. The compounds retested in this way included minocycline, creatine, celecoxib, and sodium phenylbutyrate, all of which were followed up in ultimately unsuccessful human clinical trials.

The results of this paper suggest that historically there has been a profound and widespread problem in the design and therefore interpretation of many drug efficacy studies in the most commonly used and widely accepted mouse model of ALS. The primary aim of this discussion forum is to invite experts and interested researchers to examine the implications of this study both for the ALS field itself as well as for related fields of neurodegenerative disease research. This discussion seeks to 1) raise the question of whether this problem is not unique to the G93A SOD1 model of disease but may rather be endemic to other transgenic mouse studies, particularly in overexpression paradigms in transgenic mice on hybrid backgrounds, and 2) if the community deems this to be a widespread problem, develop ideas to minimize the impact of this problem and to come up with approaches that might facilitate more significant and reproducible results among all laboratories employing mouse models of neurodegenerative disease.

Questions for discussion:

1. Do the findings from the ALS-focused study presented here translate to other mouse models of neurodegenerative disease? What, if any, are the implications of these findings for other mouse models of neurodegenerative disease (specifically APP, α-synuclein, and other overexpression models)?

2. What are the implications of this study for earlier and ongoing mouse studies that do not follow these rigorous guidelines?

3. What, if any, are the obligations of the research community a) when reviewing articles for publication that do not follow these strict criteria for design, and b) when reviewing grant applications that do not follow these strict criteria for design? Should the NIH and/or other funders take a position on this issue?

Reference:
Scott S, Kranz JE, Cole J, Lincecum JM, Thompson K, Kelly N, Bostrom A, Theodoss J, Al-Nakhala BM, Vieira FG, Ramasubbu J, Heywood JA. Design, power, and interpretation of studies in the standard murine model of ALS. Amyotroph Lateral Scler. 2008;9(1):4-15.

Comments

  1. Picking the Right Model of Neurodegeneration for Drug Discovery for Patients With Sporadic Amyotrophic Lateral Sclerosis
    Comment by John Q. Trojanowski and Virginia M.-Y. Lee

    Despite significant heterogeneity within frontotemporal lobar degeneration (FTLD) and amyotrophic lateral sclerosis (ALS), TDP-43 has emerged as the common pathological substrate linking FTLD with ubiquitin inclusions (FTLD-U) and ALS since the initial report describing ALS and FTLD-U as TDP-43 proteinopathies in 2006 (1). Subsequent studies support the hypothesis that FTLD-U and ALS represent two extremes of a clinico-pathological spectrum of TDP-43 proteinopathies. However, pathological TDP-43 inclusions are absent in familial ALS (FALS) with SOD1 mutations (SOD1-FALS) yet are present in all cases of sporadic ALS (SALS) and some cases of non-SOD1-dependent FALS. This implies that SOD1-FALS is not the familial counterpart of SALS nor of FALS cases caused by other genetic abnormalities (2).

    Indeed, despite some early skepticism about this view, a flurry of recent reports establish that point mutations in the TDP-43 gene (TARDBP), especially in the glycine-rich region that is essential for RNA splicing, cause FALS and SALS (3-7), and that TARDBP variants may be genetic risk factors for disease (8). Moreover, recent studies suggest that ALS is a multi-system TDP-43 proteinopathy rather than being a disorder restricted to the pyramidal motor system. That is because neuronal and glial TDP-43 inclusions are present throughout the CNS, not just in upper and lower motor neurons; these TDP-43 lesions are always associated with loss of nuclear TDP-43, thereby resulting in a loss of TDP-43 nuclear functions (9,10).

    In light of these and the more than 90 studies published on TDP-43 in the last 20 months, it is reasonable to ask whether transgenic SOD1 mice are models only of FALS due to SOD1 gene mutations. Do they perhaps fail to model other forms of ALS, including SALS and FALS caused by mutations in genes other than SOD1, because the underlying disease mechanisms differ between SOD1 FALS and other forms of ALS? If so, proof-of-concept studies of potential ALS therapies that target SOD1-mediated neurodegeneration may yield effective therapies for SOD1-dependent FALS, but not for other forms of ALS. This may be why such therapies have not shown efficacy in patients with SALS and are unlikely to work in patients with SOD1-independent forms of FALS.

    Recognition that TDP-43 pathology underlies FTLD-U and ALS opens up new avenues for drug discovery focusing on TDP-43-related targets to develop mechanistically based therapies for these disorders. Many of us who work on ALS research are actively pursuing efforts to develop TDP-43 transgenic mouse models of ALS that all of us hope will accelerate the pace of drug discovery for this disorder and other TDP-43 proteinopathies.

    References:

    1. Ubiquitinated TDP-43 in frontotemporal lobar degeneration and amyotrophic lateral sclerosis. Science. 2006 Oct 6;314(5796):130-3. PubMed.

    2. Pathological TDP-43 distinguishes sporadic amyotrophic lateral sclerosis from amyotrophic lateral sclerosis with SOD1 mutations. Ann Neurol. 2007 May;61(5):427-34. PubMed.

    3. TDP-43 A315T mutation in familial motor neuron disease. Ann Neurol. 2008 Apr;63(4):535-8. PubMed.

    4. TARDBP mutations in individuals with sporadic and familial amyotrophic lateral sclerosis. Nat Genet. 2008 May;40(5):572-4. Epub 2008 Mar 30. PubMed.

    5. TDP-43 mutations in familial and sporadic amyotrophic lateral sclerosis. Science. 2008 Mar 21;319(5870):1668-72. Epub 2008 Feb 28. PubMed.

    6. TARDBP mutations in amyotrophic lateral sclerosis with TDP-43 neuropathology: a genetic and histopathological analysis. Lancet Neurol. 2008 May;7(5):409-16. Epub 2008 Apr 7. PubMed.

    7. TDP-43 mutation in familial amyotrophic lateral sclerosis. Ann Neurol. 2008 Apr;63(4):538-42. PubMed.

    8. A90V TDP-43 variant results in the aberrant localization of TDP-43 in vitro. FEBS Lett. 2008 Jun 25;582(15):2252-6. Epub 2008 May 27. PubMed.

    9. Evidence of multisystem disorder in whole-brain map of pathological TDP-43 in amyotrophic lateral sclerosis. Arch Neurol. 2008 May;65(5):636-41. PubMed.

    10. Sporadic amyotrophic lateral sclerosis: two pathological patterns shown by analysis of distribution of TDP-43-immunoreactive neuronal and glial cytoplasmic inclusions. Acta Neuropathol. 2008 Aug;116(2):169-82. PubMed.

  2. Thank you, Sean, for presenting yesterday to address the implications of your study.

    I agree with you that the SOD1 mouse model can still be used to eventually achieve control over this disease. We have recently published the use of “authentic,” i.e., disease-specific biomarkers in preclinical animal models to reveal the actual dynamics of a biological system affected by disease and to develop novel mechanistic-based therapies. Through this approach, we were the first to publish the null efficacy for Riluzole. Our studies used Riluzole as a negative control for what a non-effect on mechanistically based therapy looks like.

    Questions I'd suggest for further discussion:

    • In addition to establishing more rigorous constraints and thus ensuring minimal variability in future preclinical studies, are we also considering the impact of using biological-based biomarkers (metrics that are intrinsically linked to the pathogenesis, progression, and reversal of the disease) to bridge the animal neurological score with molecular changes underlying this disease?
    • Should we include in-vivo readouts in animal models to 1) maximize certainty of therapeutic effect before human trials begin, 2) improve the validity of the SOD1 mouse model, and 3) develop novel mechanistically based therapeutic interventions?

    References:

    Stabilization of hyperdynamic microtubules is neuroprotective in amyotrophic lateral sclerosis. J Biol Chem. 2007 Aug 10;282(32):23465-72. PubMed.

  3. Based on the large-scale analysis of survival in the BL6/SJL mixed hybrid strain of SOD1 G93A (Gur high copy) mice, it is clear that survival as a measure of drug efficacy is dogged by issues of appropriate N, gender, and litter matching. With all the necessary controls in place, it is possible to reliably detect a 3 percent effect on survival using N = 20 animals/group (litter matched). This is based on the only positive control in the study, which is the effect of gender on survival. This effect is reliable and reproducible in almost every study with N = 60/group and is statistically significant.

    Regarding the authors' statement that the effects in most of the published studies are primarily due to inherent noise in the system: I disagree with this blanket statement. The figures on apparent effect are misleading and measure the frequency of an overall effect (like tossing a coin), rather than the frequency of a statistically significant overall effect (representative of most published studies). Although I agree that the t-test is not the appropriate test and is not stringent, I wanted to see whether a simple t-test would give a large number of false positives. When I reanalyzed the data with historic controls and randomly assorted them into control and treatment groups, I found a very different outcome. Analysis by randomization of data as performed by SimLIMS (with 974 randomization experiments) shows that while a positive effect in a study can occur 48 percent of the time, it is very unlikely that one would see a statistically significant effect repeatedly in the same direction (Table 1, Figure 1 [.pdf]). In fact, the random distribution of control animals into treated and untreated groups resulted in statistically significant (P [...]

    Sod1 G93A high copy mice obtained from Jackson labs show gender differences in survival (1). The magnitude of this effect is small, ~3 percent. This small but consistent effect was tested in this SimLIMS randomization model. When a realistic gender effect was measured, 12 percent of the experiments showed significant gender effects (p [...]

    I do not believe that all drug effects seen to date are just noise. For example, drug studies with Celebrex showed a 15-20 percent effect and these studies had high N (N = 40). Yet, ALSTDI was unable to reproduce this effect. This raises a fundamental question about the genetic background of each lab's mouse colony. Many researchers who do drug studies in ALS have their own colony that they themselves breed. One interpretation is that this closed breeding among mice in each laboratory can be enough to bring about such differences in drug effects.
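
The randomization check described in this comment (randomly assigning untreated control animals to sham "treated" and "control" groups and applying a t-test) can be sketched as follows. This is not SimLIMS and uses no real data: the survival times are synthetic draws from a normal distribution with an assumed mean of 130 days and SD of 10 days, the group size of 20 is an assumption, and `simulate` and `pooled_t` are hypothetical helper names.

```python
import random
import statistics

# Two-sided 5% critical value of Student's t with 38 degrees of freedom
# (two groups of 20 animals); hard-coded to keep the sketch stdlib-only.
T_CRIT_DF38 = 2.024


def pooled_t(a, b):
    """Classic equal-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / (
        pooled_var * (1 / na + 1 / nb)) ** 0.5


def simulate(n_experiments=974, n_per_group=20, seed=0):
    rng = random.Random(seed)
    # Synthetic stand-in for a pool of historic control survival times (days).
    controls = [rng.gauss(130.0, 10.0) for _ in range(500)]
    apparent = significant = 0
    for _ in range(n_experiments):
        sample = rng.sample(controls, 2 * n_per_group)
        sham_treated = sample[:n_per_group]
        sham_control = sample[n_per_group:]
        t = pooled_t(sham_treated, sham_control)
        apparent += t > 0                    # "treated" group lived longer
        significant += abs(t) > T_CRIT_DF38  # nominally significant at 5%
    return apparent / n_experiments, significant / n_experiments


frac_apparent, frac_significant = simulate()
print(f"apparent benefit: {frac_apparent:.2f}, "
      f"significant: {frac_significant:.2f}")
```

Under these assumptions an apparent benefit shows up in about half of the sham experiments, while nominally significant differences stay near the 5 percent level expected by chance, which is the distinction the comment draws between apparent and statistically significant effects.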

    There are other reasons why some published studies are not reproducible:

    1. segregation of drug metabolism genes and disease-modulating genes among different in-house bred lines;
    2. basing papers on just one experiment, rather than performing repeat experiments and demonstrating a similar trend in survival;
    3. using a simple t-test instead of appropriate survival analysis;
    4. not using appropriate variables in statistical analysis (as discussed in the paper).
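
To illustrate what "appropriate survival analysis" means in contrast to a t-test on mean lifespan, here is a minimal sketch of the Kaplan-Meier estimator, which keeps censored animals (for example, those removed for reasons unrelated to disease) in the at-risk count until the time they leave the study. The function name `kaplan_meier` and the toy data are assumptions for illustration only, not data from any study discussed here.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed death time.

    times  -- day on which each animal died or was censored
    events -- 1 if the animal died of disease, 0 if it was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)          # deaths plus censorings per time point
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Animals censored at time t still count as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Toy example: four animals, one censored on day 2.
print(kaplan_meier([1, 2, 2, 3], [1, 1, 0, 1]))
```

In a real analysis two such curves would be compared with a log-rank test or a Cox model that includes covariates such as gender and litter; a t-test cannot account for censored animals at all.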

    I hope that the guidelines in this discussion will be of value in designing and evaluating future studies.

    See Table 1, Figure 1 [.pdf]

    References:

    1. Background and gender effects on survival in the TgN(SOD1-G93A)1Gur mouse model of ALS. J Neurol Sci. 2005 Sep 15;236(1-2):1-7. PubMed.

References

Paper Citations

  1. Design, power, and interpretation of studies in the standard murine model of ALS. Amyotroph Lateral Scler. 2008;9(1):4-15. PubMed.
  2. Genomic copy number and expression variation within the C57BL/6J inbred mouse strain. Genome Res. 2008 Jan;18(1):60-6. PubMed.
