In June 2024, the FDA’s Psychopharmacologic Drugs Advisory Committee voted 9-2 against the effectiveness evidence for Lykos Therapeutics’ MDMA-assisted therapy for PTSD, and 10-1 against the benefit-risk balance. The single most cited reason, in the committee’s own discussion and in the agency’s later Complete Response Letter, was functional unblinding. The FDA’s analysis concluded that nearly all participants in the MDMA arm correctly identified that they were receiving MDMA rather than placebo. The committee treated this as a fatal flaw that left the trial results uninterpretable.
Two years later, similar arguments have been deployed against Compass Pathways’ COMP360 program. The molecule is different, the design is different, the results are different, but the underlying critique is the same: how can you interpret a psychedelic Phase 3 trial when the participants know what they got?
The persistence of this argument is doing real damage to the field’s ability to design trials that the FDA will accept. Part of the problem is that the argument is sometimes correct and sometimes not, and distinguishing the cases requires unpacking a conflation that has shaped industry response to FDA guidance since 2017.
Blinding and expectancy are two different things. The methodological response to one is not the methodological response to the other. Treating them as the same problem produces trial designs that fail to address either.
What blinding actually is, and why it matters
The randomized controlled trial inherits its design from agricultural research in the 1920s. The basic premise: divide subjects randomly into groups, give one group the intervention, give the other group a control (placebo, standard care, or another active intervention), measure the outcome, compare. Randomization controls for selection effects. Blinding controls for measurement effects.
Blinding comes in layers. Single-blind means the participant does not know which arm they are in. Double-blind means neither the participant nor the assessor knows. Triple-blind, in some formulations, extends the requirement to the statistician analyzing the data. The deeper the blinding, the harder it is for any party’s beliefs about which arm a participant is in to influence what they report or how they measure.
The reason blinding matters is not abstract. In psychiatric trials specifically, the outcome measures are usually clinician-rated scales (the MADRS, the CAPS-5, the HAM-D). A clinician scoring a patient on these scales is making judgments that involve interpretation. If the clinician believes the patient received the active drug, the clinician’s interpretation of borderline responses can drift toward better. If the patient believes they received the active drug, the patient’s self-report can drift the same way. These effects are real, replicated, and large. In some psychiatric trials, the improvement seen in the placebo arm accounts for the majority of the improvement seen in the drug arm.
A trial in which the blinding holds is one where these biases are controlled. A trial in which the blinding does not hold is one where they are not. The FDA cares about blinding because the agency’s job is to determine whether the measured effect reflects the drug, or reflects bias, or reflects some combination. The cleaner the blind, the more confidently the agency can attribute effect to drug.
Why blinding cannot hold in a psychedelic trial
The structural problem is straightforward. Classical psychedelics (psilocybin, LSD) and entactogens (MDMA) produce acute psychoactive effects that are difficult to mistake for anything else. A participant receiving a therapeutic dose of psilocybin experiences perceptual alterations, profound shifts in mood and cognition, and a subjective state lasting four to eight hours that has no realistic placebo analog. The participant knows. The clinician sitting in the room with the participant for those hours knows. There is no consistent way to hide this.
Several solutions have been attempted. Active placebos (low doses of the same drug, or different drugs that produce some shared somatic effects) have been used. None has produced a credible blind in practice. In Lykos’s MAPP1 and MAPP2 trials, participants assigned to placebo correctly identified their assignment at very high rates. In Compass’s COMP005 study, the 1 mg “subperceptual” dose was demonstrably distinguishable from the 25 mg therapeutic dose in participant reports.
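How badly a blind failed can be put on a scale. The sketch below computes a per-arm blinding index from end-of-study treatment guesses, in the spirit of Bang's index: correct guesses minus incorrect guesses, divided by everyone who was asked. The function name and the guess counts are invented for illustration; they are not data from MAPP1, MAPP2, or any Compass study.

```python
def blinding_index(correct: int, incorrect: int, dont_know: int = 0) -> float:
    """Per-arm index in [-1, 1].

    0 is what random guessing produces on average, 1 means every
    participant guessed their assignment correctly, and -1 means
    every participant guessed wrong.
    """
    total = correct + incorrect + dont_know
    if total == 0:
        raise ValueError("no guesses recorded for this arm")
    return (correct - incorrect) / total


# Hypothetical guess counts, for illustration only.
active_arm = blinding_index(correct=45, incorrect=3, dont_know=2)
placebo_arm = blinding_index(correct=38, incorrect=8, dont_know=4)

print(f"active arm index:  {active_arm:.2f}")
print(f"placebo arm index: {placebo_arm:.2f}")
```

An index near 1 in both arms is what functional unblinding looks like on paper. Reporting it does not rescue the blind, but it tells a reviewer how much weight the other design elements have to carry.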
The honest acknowledgment, which the field has been slow to adopt, is that classical placebo blinding is structurally impossible in trials of acutely psychoactive substances. This is not a problem that can be solved by trying harder. It is a feature of what these drugs do.
This is where the field’s response has gone in different directions. Some sponsors have responded by treating blinding failure as something the FDA should accept and overlook. Others have responded by designing trials that address what blinding was supposed to address through different mechanisms. The difference between these two responses is most of what separates the Lykos rejection from the Compass voucher.
Why expectancy is a different problem
Expectancy is the participant’s belief, before the trial begins, about what is going to happen to them. Specifically: how much they expect the treatment to help.
Expectancy effects are large in psychiatric trials regardless of blinding, because patient symptoms in psychiatric conditions are influenced by belief states in ways that are not true for, say, blood pressure or tumor size. A patient who believes their depression will improve often does report some improvement, particularly on subjective measures. This is part of why placebo arms in depression trials routinely improve by 30-40 percent of baseline severity.
Expectancy effects are particularly large in psychedelic trials for three reasons.
First, the participants self-select with very high expectancy. People who enroll in psychedelic-assisted therapy trials have typically read about positive results in earlier studies, watched documentaries, listened to podcasts, sought out the trial, traveled significant distances to participate, and arrived at the trial site already convinced that this treatment is likely to help them. The participants are pre-selected for high expectation.
Second, the trials are conducted by clinicians who are themselves believers. The treating clinicians in psychedelic trials are typically practitioners who entered the field because they believed in the therapy. They convey that belief, intentionally or not, in their interactions with participants. This compounds the participants’ baseline expectancy.
Third, the pre-dosing preparation itself is designed to maximize therapeutic expectation. The standard psychedelic-assisted therapy protocol includes multiple preparation sessions before the dose, in which the clinician and the participant build rapport, discuss intentions for the dosing session, and frame the experience as potentially transformative. This preparation is itself a powerful expectancy-shaping intervention.
The result is that even a perfectly blinded psychedelic trial, if such a thing were possible, would still face the expectancy problem. The patients are already expecting improvement. The patients in the active arm would report improvement consistent with their expectations. The patients in the placebo arm would too, but possibly less, because the placebo experience does not match the expectancy they brought in.
This is the conflation that has cost the field a decade. The argument “the blind failed, therefore the trial is uninterpretable” elides what blinding was supposed to do. The argument that survives the elision is: “expectancy effects are the underlying concern, blinding was the mechanism for controlling them, blinding failed, therefore expectancy effects are uncontrolled.”
That argument is correct, in the absence of any other mechanism for controlling expectancy effects. Where the argument fails, and where the field’s better trial designs have begun to operate, is when sponsors design trials that control for expectancy effects through mechanisms other than blinding.
The four designs that actually work
Four trial design choices substantively address expectancy independent of blinding. Sponsors and trials that incorporate them have a much stronger methodological case for their results than those that do not.
The first is dose-response. If a trial includes multiple active doses and shows that the higher dose produces a larger effect than the lower dose, the result supports a true drug effect because expectancy alone would not vary systematically with dose. Compass’s COMP005 trial used 1 mg, 10 mg, and 25 mg arms. The 25 mg arm separated from the 10 mg arm, not just from the 1 mg arm. Both groups had similar expectancy at baseline, but the higher dose produced more improvement. This is methodologically powerful evidence that what is being measured is at least partly the drug’s pharmacological action.
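The dose-response logic can be made concrete with a toy simulation. The sketch below assumes, as the argument above does, that the expectancy component is the same in every arm, and then asks what slope of improvement on dose appears with and without a pharmacological effect. Only the arm labels (1, 10, 25 mg) come from the text; the effect sizes, noise level, sample size, and function names are assumptions chosen for illustration, not estimates from any trial.

```python
import random

DOSES_MG = [1, 10, 25]  # arm labels from the text; all other numbers are assumed


def simulate_arm(rng, dose_mg, n, expectancy_pts, drug_pts_per_mg, noise_sd):
    """Improvement = shared expectancy part + dose-dependent drug part + noise."""
    return [
        expectancy_pts + drug_pts_per_mg * dose_mg + rng.gauss(0, noise_sd)
        for _ in range(n)
    ]


def ols_slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def dose_slope(drug_pts_per_mg, seed=0, n_per_arm=200):
    """Simulate all three arms and return the fitted improvement-per-mg slope."""
    rng = random.Random(seed)
    xs, ys = [], []
    for dose in DOSES_MG:
        arm = simulate_arm(rng, dose, n_per_arm, expectancy_pts=8.0,
                           drug_pts_per_mg=drug_pts_per_mg, noise_sd=5.0)
        xs += [dose] * n_per_arm
        ys += arm
    return ols_slope(xs, ys)


slope_with_drug = dose_slope(drug_pts_per_mg=0.3)
slope_expectancy_only = dose_slope(drug_pts_per_mg=0.0)

print(f"slope with a drug effect:    {slope_with_drug:.3f} points per mg")
print(f"slope with expectancy alone: {slope_expectancy_only:.3f} points per mg")
```

When expectancy is the only driver and is equal across arms, the fitted slope hovers around zero. A clearly positive slope is the signature that expectancy alone does not predict. The caveat is the assumption itself: if participants on higher doses infer their assignment and expect more as a result, some of the slope could still be expectancy.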
The second is active comparator design. Compare the psychedelic to another active treatment for the same indication, ideally one with similar somatic effects but different pharmacology. If the psychedelic separates from the active comparator, the expectancy explanation weakens substantially, because both groups had reason to expect improvement and both received an active treatment.
The third is pre-registered expectancy measurement. At baseline, ask participants explicitly how much improvement they expect. Analyze whether expectancy predicts response. If response is independent of baseline expectancy, the expectancy explanation becomes harder to sustain. Some psychedelic trials have incorporated this; many have not. The Lykos program did not include systematic baseline expectancy measurement.
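A minimal sketch of what that analysis might look like follows, using simulated data. It correlates a hypothetical 0-10 baseline expectancy rating with improvement under two scenarios: one in which expectancy drives part of the response and one in which it does not. The rating scale, the weights, and the function names are assumptions; a real pre-registered analysis would fix the model (typically a covariate-adjusted one) before unblinding.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def expectancy_response_r(expectancy_weight, seed=1, n=300):
    """Simulate one arm and return corr(baseline expectancy, improvement)."""
    rng = random.Random(seed)
    # Hypothetical 0-10 rating of "how much do you expect this to help?"
    expectancy = [rng.uniform(0, 10) for _ in range(n)]
    improvement = [
        10.0 + expectancy_weight * e + rng.gauss(0, 4.0) for e in expectancy
    ]
    return pearson_r(expectancy, improvement)


r_expectancy_driven = expectancy_response_r(expectancy_weight=1.0)
r_expectancy_independent = expectancy_response_r(expectancy_weight=0.0)

print(f"expectancy-driven response:      r = {r_expectancy_driven:.2f}")
print(f"expectancy-independent response: r = {r_expectancy_independent:.2f}")
```

A correlation near zero does not prove the drug works, but it removes the simplest version of the expectancy explanation. A strong positive correlation does the opposite, which is exactly why the measurement has to be collected at baseline and the analysis specified in advance.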
The fourth is durability assessment. Expectancy effects in psychiatric trials tend to fade. The placebo response usually peaks early and erodes over weeks to months. A drug effect that persists at 12 weeks, 26 weeks, or longer is harder to explain by expectancy alone. Compass’s COMP006 reported 26-week durability data. The Lykos program’s durability data was less extensive, and the FDA’s CRL specifically noted concerns about whether the apparent benefits were durable enough to justify approval.
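The durability argument can be sketched with two toy curves for the between-arm gap. One is a gap produced purely by expectancy that erodes over time, the other a gap produced by a persistent drug effect. The exponential shape, the 8-week decay constant, and the 6-point gap are all assumptions made for illustration; the text only claims that placebo responses tend to peak early and fade.

```python
from math import exp


def expectancy_only_gap(week, peak_pts=6.0, decay_weeks=8.0):
    """Between-arm gap if it is pure expectancy that erodes over time (assumed shape)."""
    return peak_pts * exp(-week / decay_weeks)


def persistent_drug_gap(week, effect_pts=6.0):
    """Between-arm gap if the drug effect simply persists (assumed constant)."""
    return effect_pts


for week in (6, 12, 26):
    print(
        f"week {week:>2}: expectancy-only gap = {expectancy_only_gap(week):.2f}, "
        f"persistent gap = {persistent_drug_gap(week):.2f}"
    )
```

The two explanations are hard to tell apart at an early endpoint and easy to tell apart at a late one, which is the reason long follow-up carries evidentiary weight that a single early readout does not.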
A trial that incorporates several of these designs (dose-response separation, active comparator, expectancy measurement, durability) has built a methodological case that does not depend on the blinding working. A trial that has none of these has, in effect, said that the FDA should accept its results on the strength of the headline p-value alone.
The methylone trial as the contemporary contrast
The Transcend Therapeutics IMPACT-1 study, which was published in JAMA Psychiatry in February 2026 and supported the Phase 3 voucher subsequently awarded to the Otsuka-acquired program, is worth examining for what it did differently from the Lykos approach.
IMPACT-1 used four once-weekly oral doses of methylone, with placebo control. The study population was 65 adults with severe PTSD. The primary endpoint was reduction in CAPS-5 total severity score from baseline to day 64. The trial met its primary endpoint with statistical significance.
What made the trial methodologically interesting was its structure. The dosing schedule, four weekly doses, allowed the protocol to track response across multiple administrations rather than a single dosing event. The endpoint was measured at day 64, which is six weeks after the final dose, providing some durability data within the trial itself. The trial population was carefully characterized, and the adverse event reporting followed standard pharmaceutical practice. The trial did not eliminate the blinding problem; functional unblinding likely occurred with methylone as it does with MDMA. But the trial’s structure produced data that could be interpreted on dimensions other than pure blinding.
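The timeline arithmetic behind the "six weeks after the final dose" figure is easy to check. The sketch below assumes that study day 1 is the day of the first dose and that doses fall exactly seven days apart; the four-dose schedule and the day-64 endpoint come from the text, and the variable names are illustrative.

```python
N_DOSES = 4
DOSE_INTERVAL_DAYS = 7
FIRST_DOSE_DAY = 1  # assumption: study day numbering starts at the first dose
ENDPOINT_DAY = 64

# Study days on which each of the four weekly doses falls.
dose_days = [FIRST_DOSE_DAY + i * DOSE_INTERVAL_DAYS for i in range(N_DOSES)]

days_after_final_dose = ENDPOINT_DAY - dose_days[-1]
weeks_after_final_dose = days_after_final_dose / 7

print(dose_days)               # [1, 8, 15, 22]
print(weeks_after_final_dose)  # 6.0
```

Under those assumptions the final dose lands on day 22, and the primary endpoint is read 42 days later.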
The Compass COMP360 program goes further in the same direction. It combines a three-dose-arm design in COMP005, structured durability assessment in COMP006, blinded independent raters for the primary endpoint, and more than 1,000 participants across the two pivotal studies. The blinding problem is not solved; the methodological case is built on other dimensions.
This is the difference the FDA’s voucher choices signaled. The agency was not endorsing the proposition that blinding does not matter in psychedelic trials. It was endorsing programs that had done the work of building a case that did not depend on blinding working.
What the field should take from this
The implication for any psychedelic Phase 3 program currently in design is direct. The right response to functional unblinding is not to dispute its existence, and it is not to ask the FDA to overlook it. It is to design the trial so that the expectancy concerns blinding was meant to address are controlled through other mechanisms.
Sponsors who are still designing trials whose methodological defense rests on “the blinding mostly held” are designing trials that the agency will reject for the reasons documented at length in the Lykos CRL and the June 2024 advisory committee discussion. The agency has been clear about this for years, in published draft guidance and in public communications. The forthcoming finalization of the 2023 guidance is, by all available signals, going to make the requirements more explicit, not less.
The methodological lesson the field has been slow to learn is that the FDA does not expect blinding to work in psychedelic trials. The agency expects the trials to be designed for the world in which blinding does not work. That is what the better trials have begun to do. The trials that have not done it should expect outcomes that look like Lykos’s, regardless of the political weather in any given year.
Functional unblinding is not the question. The question is what the trial does about it.