This is Part 5 of a six-part series. See also Part 1, Part 2, Part 3, Part 4, Part 6. View a PDF of the entire series.
16 March 2011. As discouragement about a decade of negative clinical trials in Alzheimer’s disease is spreading through industry and academia alike, AD researchers are getting curious about an innovative type of trial design that is showing success in cancer and in medical device development. Called an adaptive trial, it rests on Bayesian probability statistics and works quite differently from the traditional trials with which the field is familiar. Adaptive trials grabbed the limelight at a scientific and regulatory meeting of the Alzheimer’s Prevention Initiative held on 7 January 2011 in Washington, D.C. The API is a concerted, increasingly broad-based drive by researchers in Arizona and the South American nation of Colombia to get secondary prevention trials up and running in people who face a high risk of Alzheimer’s because they carry either an autosomal-dominant AD mutation or ApoE4.
Besides news on API preparations on the recruitment (see Part 2 of this series), scientific (Part 3), and regulatory fronts (Part 4), the D.C. meeting featured increasingly concrete discussions on how to design the trials. The stakes are high because treating AD years prior to dementia pushes researchers into uncharted territory with what is considered to be an especially vulnerable population. Donald Berry shook up the conversation. He advocated a type of trial that may seem radical to a field using mostly traditional randomized controlled trials (RCTs) based on what is called frequentist statistics. Besides being a statistician at the University of Texas MD Anderson Cancer Center in Houston, Berry runs a business designing adaptive trials for companies. Berry introduced adaptive trials to an audience comprising AD scientists and statisticians in academia, industry, at regulatory agencies, and public and private funders. He urged the API to consider adaptive designs because, by their very nature, they can make a virtue of the uncertainties of secondary prevention trials that can hobble conventional designs. Rather than forcing the investigator to “guesstimate” parameters they understand poorly and then hinging success or failure on the guesstimate, an adaptive trial flexibly explores that parameter while the trial unfolds. In this way, it is more likely to deliver an answer with fewer patients, Berry argued.
In broad terms, Janet Woodcock of the Food and Drug Administration (FDA) had for years called on trialists to use adaptive designs to boost the success rate and control costs of Phase 3 trials, and the FDA issued guidelines to help the transition. Calls for adaptive trial designs come up in the context of the FDA’s Critical Path Initiative. At the API meeting, regulators took the same stance. Rusty Katz of the FDA and Cristina Sampaio of the European Medicines Agency (EMA) urged the API and ADCS to venture into this new territory.
Adaptive trials are increasingly used in drug and device development, particularly in cancer, but also in migraine, stroke, diabetes, and other conditions, Berry said. “Some companies have hired whole teams; others are getting their feet wet to make sure if the train really leaves the station, they won’t miss it,” Berry said, adding that, to date, the FDA has approved one drug, the pravastatin-aspirin combination pill Pravigard, based on a wholly Bayesian efficacy analysis.
Berry claimed that an adaptive trial not only answers the question at hand faster, with fewer patients, and cheaper than a traditional trial would, but that it also gives the trial participants better medical care along the way. Typically, such trials adapt what doses or treatment arms patients get randomized to, or when to declare success or futility.
Learn as You Go
So what are adaptive trials, exactly? Starting from the underlying statistics, the main difference is that traditional RCTs regard parameters as fixed, whereas adaptive trials view them in terms of changing probability distributions. Adaptive trials measure all uncertainties by probability. Everything that is unknown has a probability distribution, and every probability is calculated conditionally on known values. As results roll in, those values go into the computer model and the numbers get re-crunched. That means incoming trial data serve to better simulate the probability of success if the trial keeps going as is, or if it changes a given parameter. Trialists then adapt that parameter to match a higher probability of success. In essence, Bayesian trials continually incorporate the latest trial data, recalculate probabilities to update knowledge, and in this way, inform ongoing decisions by the trial leaders about how to tweak the design of the trial or when to end it. “The frequentist approach typically forces you to set all assumptions, lock them in, and run with it to the end. The Bayesian approach says we can revise what the assumptions should be by monitoring them, and as the trial accrues data, you may have more accurate evidence,” said Pierre Tariot of the Banner Alzheimer’s Institute in Phoenix, Arizona.
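To make the idea of "recalculating probabilities as results roll in" concrete, here is a minimal sketch of Bayesian updating for a single response rate. It is an illustration only, not the model used in any trial mentioned here: the flat Beta(1, 1) prior, the interim counts, the 30 percent threshold, and the function names update and prob_exceeds are all assumptions made for the example.

```python
import random

def update(prior, responders, non_responders):
    """Beta-binomial update: add the newly observed counts to the prior."""
    a, b = prior
    return (a + responders, b + non_responders)

def prob_exceeds(posterior, threshold, draws=20000, seed=0):
    """Monte Carlo estimate of P(response rate > threshold) under the posterior."""
    rng = random.Random(seed)
    a, b = posterior
    hits = sum(rng.betavariate(a, b) > threshold for _ in range(draws))
    return hits / draws

# Hypothetical numbers: a flat prior, then two interim looks at the data.
prior = (1, 1)
interim_1 = update(prior, responders=3, non_responders=7)
interim_2 = update(interim_1, responders=9, non_responders=11)

print("Posterior after look 1:", interim_1)
print("Posterior after look 2:", interim_2)
print("Estimated P(rate > 0.30):", prob_exceeds(interim_2, 0.30))
```

At each interim look, the posterior from the previous look serves as the new prior, so the estimate of the response rate sharpens as patients accrue. A real design would pre-specify what the trial leaders do when such a probability crosses a given bound.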
What does this mean in practice? For example, in a trial of adjuvant chemotherapy for early-stage breast cancer in older women, the National Cancer Institute had originally required 1,800 patients; however, an adaptive design cut that number down to 600 (Muss et al., 2009). A recent trial comparing treatment options in atrial fibrillation answered the question with 167 patients and was published in the Journal of the American Medical Association (Wilber et al., 2010).
In Alzheimer’s research, finding the right dose in Phase 2, and using fewer participants, are two goals adaptive trials could accomplish, Berry claimed. Several AD scientists at the API meeting, including William Potter, who retired from Merck, had cautioned that finding the right dose was both critically important and a highly uncertain process in preclinical patients. In practice, dose finding often involves little more than guesswork, Berry charged, and by Phase 3, patient numbers in the thousands are routine. “Typically, in a dose-finding trial you administer maybe six doses. At the end, you find that all the action was between two doses where you have relatively few patients, and most patients were on doses that had no effect or were too high,” Berry said.
As an example of how to do things better, Berry cited an adaptive trial by Abbott reported at the International Society for CNS Clinical Trials and Methodology conference in San Diego in 2009. It started out with five patients on each dose, and as results rolled in, it randomized more patients to the higher doses that appeared to elicit a response and fewer to the low doses that proved early on to be ineffective. In this case, Abbott scientists declared futility, stopped the trial, and abandoned the drug, but the point is they were able to do that having used 320 patients instead of the 700 that were initially projected, Berry said. The drug flopped, but the trial was informative. Bayesian trials recruit fairly slowly to allow time to learn from incoming information and to react to it. “If you are all done enrolling before you get any information, you cannot adapt,” Berry said.
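The general mechanism of steering new patients toward better-performing doses can be sketched as follows. This is one common textbook approach (allocation in proportion to the posterior probability that each dose is best), not a reconstruction of the Abbott design, whose algorithm is not described here. The four doses, the interim counts, the flat priors, and the function name allocation_probabilities are hypothetical.

```python
import random

def allocation_probabilities(posteriors, draws=5000, seed=0):
    """Estimate, for each dose, the posterior probability that it has the
    highest response rate; use these as randomization weights."""
    rng = random.Random(seed)
    wins = [0] * len(posteriors)
    for _ in range(draws):
        # Draw one plausible response rate per dose from its Beta posterior.
        samples = [rng.betavariate(a, b) for a, b in posteriors]
        wins[samples.index(max(samples))] += 1
    return [w / draws for w in wins]

# Hypothetical interim data: (responders, non-responders) on each of
# four doses, lowest dose first, after five patients per dose.
counts = [(0, 5), (1, 4), (3, 2), (4, 1)]

# Flat Beta(1, 1) prior on every dose, updated with the interim counts.
posteriors = [(1 + r, 1 + n) for r, n in counts]

probs = allocation_probabilities(posteriors)
for dose, p in enumerate(probs, start=1):
    print(f"Dose {dose}: randomize next patients with weight {p:.2f}")
```

With these made-up counts, most of the weight shifts to the two highest doses, while the lowest dose receives very few new patients. The weights would be recomputed at each interim look, which is why enrollment has to be slow enough for outcome data to arrive before the next patients are randomized.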
Besides finding the right dose with the minimum number of patients, adaptive trials can compare several drug candidates in one trial and help scientists decide which one to pick for Phase 3. This can be done by measuring biomarker responses to a given therapy and finding the therapy that has the highest chance of subsequently showing a clinical benefit. These twin goals of comparing drugs and using biomarkers to move a drug from Phase 2 to Phase 3 touch a nerve with AD trialists. They have a wealth of experimental drugs in their pipeline but no expeditious way of evaluating in Phase 2 which one to bet on for expensive Phase 3 registration trials. In particular, trials in asymptomatic mutation carriers will need to do this based largely on drug effects on biomarkers. Because scientists don’t know for certain how a biomarker change relates to any future clinical benefit, the fixed parameters required in frequentist trials make such RCTs inflexible and risky, Berry said. Adaptive trials could start out with, for example, control, two drugs, even a combination arm, and then drop the less effective arms. They can also test a therapy effect on a range of biomarkers initially and then drop those markers that do not respond.
An adaptive trial can accommodate up to 10 treatment arms, Berry said. Perhaps the most innovative example of that—and of data-sharing in the highly competitive world of pharmaceutical drug development—is the multicenter Phase 2 breast cancer trial I-SPY 2. It is managed by the Biomarkers Consortium, a public-private partnership led by the Foundation of the NIH (FNIH). Berry co-designed the trial, and the FNIH worked out a regulatory path for participating drug companies with the FDA. I-SPY 2 started out with five different investigational drugs but intends to test up to a dozen (Patlak, 2010).
I-SPY 2 is more like a screening process than a trial, Berry said. After being adaptively evaluated for biomarker responses and a clinical outcome, a given investigational drug either graduates to a larger, specific Phase 3 trial if it performs well, or is declared futile if it fails to best standard therapy or causes a serious side effect. When a drug leaves the trial for either of these reasons, a new one enters. Drugs from Abbott, Amgen, Pfizer, and other companies are being evaluated in this single adaptive trial. “This is the most amazing piece to me,” Berry said. “Ten years ago, I’d go to one pharma company and they said, ‘This sounds like a good idea, but I don’t want you comparing my drug in the same trial to my competitor’s.’ That is different now.”
Given its own litany of failure and millions of dollars lost, why has the AD clinical trial research community not embraced adaptive designs? Part of the reason is technical. Adaptive designs require statisticians trained in Bayesian methods, plus massive computing power. Errors can happen, especially early on when statisticians recalculate the likelihood of success based on incoming data on the first, small numbers of patients. The FDA’s Katz said that with adaptive trials, type 1 errors, where the null hypothesis is rejected even though it is, in fact, true, are a concern. Put simply, the fear is that adaptive trials trade scientific rigor for nimbleness. In AD in particular, the endpoints that matter to the patients are thought to emerge long after a person’s initial response to the new drug. Finally, researchers don’t know enough yet about preclinical biomarkers to build the simulation models that underpin adaptive trials, said Tariot. “We know what to expect at baseline and over time for certain biomarkers in certain clinical groups, such as ApoE4 carriers; we know very little about these markers in PS1 carriers. And response to treatment is speculative in any case. Therefore, it is incumbent on us to design efficient trials with these humbling limitations acknowledged.”
Some industry scientists believe that regulators frown on adaptive trials. The opposite was the case at the API meeting. Both Katz and Sampaio spoke personally, not formally, on behalf of their respective agencies. That said, they encouraged API scientists to try adaptive designs, especially to determine the right dose and to learn what the best endpoints might be for preclinical treatment/secondary prevention trials. API leader Eric Reiman asked regulators how much flexibility the group had with regard to pre-specifying endpoints versus determining them adaptively. Their advice came down to, “The less you know for sure, the more you should adapt.” Here are excerpts from the discussion.
Sampaio: I see a lot of potential in the use of adaptive designs. You face many uncertainties, and in that situation, adaptive designs are good. If you know everything, you do not need to be adaptive. You can adapt almost every variable, though not in one and the same trial.
EMA is open to see trials with adaptive designs. Some adaptations are innocuous, others are troublesome. Adaptation of the primary endpoint is the single one the EMA usually disapproves of; we have issued guidance on that. But ignore the guidance in this case. If we always stick to guidance, we will not open new avenues. Ignoring what was written was what allowed the Portuguese to open the seas for exploration. With the trials you are proposing, you have to write a new story.
This is an extremely difficult field. You could risk doing an adaptive design even on an endpoint because you really do not know yet what the endpoint should be in asymptomatic trials. Each trial proposed today—the extremely important Colombian autosomal-dominant Alzheimer’s disease (ADAD) trial, the ApoE4 trial, the ADCS biomarker trial—is a different setting. But for each, the choice of the endpoint is the most uncertain issue. If you have the guts, that should be your adaptation.
I have thought a lot about the uncertainty regarding what is the best endpoint. Among four or five candidate endpoints, it really is guesswork these days. So why not assess them all in a trial, model them based on early data, and then find the best one? You can incorporate four or five outcomes into one adaptive study.
This may not be your single pivotal trial. It would more likely be an exploratory one that ensures the most learning.
Katz: The FDA has been encouraging more creative, adaptive Phase 2 or even Phase 2/3 trials for some time. Even so, I see very few coming across my desk.
On the dilemma of how to pick a primary outcome in patients who have no symptoms, we would like to see studies include many surrogates. Although we usually require prospective designation of key secondary outcomes, this may be a case where we have to assess the totality of the data until a surrogate emerges as the critical one. The talk on adaptive designs was pertinent here. You can start with an array of outcomes, and during the trial see which ones are responding.
The idea of using an early biomarker to predict outcome, or to identify likely responder populations, or to determine future study conduct is very intriguing. We encourage a protocol like that. It will take a lot of thought, but the FDA stands ready to entertain adaptive proposals.
In discussion, industry scientists expressed interest in the API adopting something similar to the I-SPY concept, and encouraged Berry to develop a proposal. Others pointed to how complex that would be legally, computationally, and practically. William Potter said that the FNIH Biomarkers Consortium could pursue the idea much like it had supported shared research in ADNI and I-SPY 2. Below are some excerpts.
Paul Aisen, UC San Diego/ADCS: To play Devil’s advocate: in AD, we design a trial to give us an answer at the end of the trial with just enough patients as we need to get the answer. With adaptive designs, aren’t you taking a shortcut? You are making decisions about dropping arms, for example, before you have reached the number of subjects that you need to make that decision. There is a significant risk of making mistakes.
Berry: Frequentist trial designs force you to make so many thinly supported assumptions—especially on dose and sample size—that making mistakes has become the status quo. In AD, many frequentist trials ended with inconclusive results.
Katz: I agree about dose finding. Sometimes a company picks a dose seemingly randomly, and if they get lucky and the dose works, we approve. But in this case, where asymptomatic people take a drug a long time, it really behooves a company to find that minimally necessary dose a lot better than we often see.
Laurel Beckett, UC Davis/ADNI: Both patient burden and cost go up when we use biomarkers. Adaptive trials are the direction we need to look because they allow us to say: Let’s stop burdening the patient with this; we can already see it will not work. Or we can add patient visits if we see it will work.
David Bennett, Rush University, Chicago: A well-designed adaptive trial could take advantage of the heterogeneity of the sporadic AD group. I worry about populations that go into a trial. If you do a study in E4 homozygotes and it fails, then do you repeat the study in E4 heterozygotes anyway because that might be the subset that responds? Can an adaptive design look at heterogeneous populations?
Berry: Yes. We tend to define narrowly who enters a trial, and then when the drug is approved, everyone gets it and that leads to problems with lack of response and unanticipated side effects in Phase 4. With adaptive trials, you can learn how more different types of people respond. So for sporadic AD, I recommend to start out with a broad population and then home in where you see an effect.
Lon Schneider, University of Southern California, Los Angeles: This resonates with me. AD is a heterogeneous illness. We give that lip service, but then treat it like a homogeneous one. Doing good trials in AD is hugely complex. I want to see something like I-SPY in AD that includes and explores the heterogeneity.—Gabrielle Strobel.