Background

  • PoS framework of Hampson, Bornkamp, et al. (2022) is well established at Novartis
    • Quantification of program-level risk
    • Goals include: transparency with project- and portfolio-level stakeholders
  • Study-level predictions are based on expert prior elicitation (e.g. Holzhauer et al. (2022)) or on the predictive distribution from a Bayesian meta-analysis model

Today’s talk

  • Two examples where a deeper dive into study-level unknowns was beneficial
    • Quantification of study-level risk
    • Useful for project-level decision-making
  • Tailored modelling to answer the question “how likely is my study to succeed?”

How likely is my pivotal program to succeed?

Program-level PoS approach

Bringing in all relevant information

Steps to a program-level evaluation using the Hampson, Bornkamp, et al. (2022) approach

Program-level PoS approach

Outcome of an assessment

Probability of success to approval (top), and with the additional requirement that the Target Product Profile (TPP) is met (bottom). Paths show P(success) and hurdles show transition rates.

Program-level PoS approach

References

This PoS framework has been described in

  • Hampson, Bornkamp, et al. (2022)
  • Hampson, Holzhauer, et al. (2022)
  • Holzhauer et al. (2022)

and presented many times, including:

  • Hampson and Holzhauer (2021)
  • Lange (2021)
  • Lange (2022)
  • Hampson (2023)

How likely is my trial to succeed?

Flow of efficacy predictions

From a calibrated benchmark prior (for the mean effect)

$$p(\mu) = w_t \,\text{N}(m_t, c) + (1 - w_t)\,\text{N}(m_n, c)$$

… to the posterior distribution (for the mean effect),

$$p(\mu, \tau \mid \{ \hat\theta_i, s_i \}) \propto \prod_i f(\hat\theta_i \mid \theta_i)\, p(\theta_i \mid \mu, \tau) \cdot p(\mu)\, p(\tau)$$

… the meta-analytic predictive (MAP) prior (for a future study mean),

$$p(\theta^* \mid \{ \hat\theta_i, s_i \}) = \int \text{N}(\theta^* \mid \mu, \tau)\, dP(\mu, \tau \mid \{ \hat\theta_i, s_i \})$$

… and the predictive distribution (for a future study outcome),

$$p(\hat\theta^* \mid \{ \hat\theta_i, s_i \}) = \int \text{N}(\hat\theta^* \mid \theta^*, s^*)\, dP(\theta^* \mid \{ \hat\theta_i, s_i \})$$
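A minimal Monte Carlo sketch of this chain in Python is shown below. All inputs (historical estimates, standard errors, prior settings, the half-normal scale for τ) are hypothetical, and simple importance resampling stands in for the MCMC one would actually use; the point is only to trace the same flow from posterior to MAP prior to predictive distribution.

```python
# Minimal sketch of the prediction flow; all numeric inputs are assumed for
# illustration and importance resampling stands in for MCMC.
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical historical study estimates and standard errors
theta_hat = np.array([0.30, 0.45, 0.25])
s = np.array([0.15, 0.20, 0.10])
s_star = 0.12                                   # anticipated SE of the future study

# Calibrated benchmark prior for mu: mixture of "target" and "null" components
w_t, m_t, m_n, c = 0.5, 0.4, 0.0, 0.5

M = 200_000
use_target = rng.random(M) < w_t
mu = np.where(use_target, rng.normal(m_t, c, M), rng.normal(m_n, c, M))
tau = np.abs(rng.normal(0.0, 0.25, M))          # half-normal prior on tau (assumed scale)

# Marginal likelihood of each estimate: theta_hat_i | mu, tau ~ N(mu, tau^2 + s_i^2)
var = tau[:, None] ** 2 + s[None, :] ** 2
ll = (-0.5 * (np.log(2 * np.pi * var) + (theta_hat[None, :] - mu[:, None]) ** 2 / var)).sum(axis=1)
w = np.exp(ll - ll.max())
w /= w.sum()

# Resample (mu, tau) from the posterior, then draw the MAP prior and the predictive distribution
idx = rng.choice(M, size=M, p=w)
theta_star = rng.normal(mu[idx], tau[idx])       # MAP prior: mean of a future study
theta_hat_star = rng.normal(theta_star, s_star)  # predictive: estimate from a future study

print("P(future study estimate > 0) =", (theta_hat_star > 0).mean())
```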

Two studies

Study 1

Design: Phase 3b H2H non-inferiority study

flowchart LR
  A[Eligible patients] --> B(1:1 randomize)
  B --> C[Investigational drug]
  B --> D[Competitor drug]
  C --> E(Test non-inferiority)
  D --> E

Timing: two options

Options for timing of the H2H Phase 3b: (A) before the pivotal readout or (B) gated on a positive pivotal result

Study 1: Strategy


What is “success”?

  • Prerequisite: Pivotal success (superiority to SoC)
  • Required: H2H non-inferiority
  • Desired: H2H (numerical) superiority

What are the risks?

  • Commercial impact of failure
  • Challenging: choice of non-inferiority margin
  • Risks magnified for a concurrent Phase 3b due to incomplete information at the design stage

Study 1: Enabling smart risks


Strategic decision makers would need to know:

  • What are plausible head-to-head effects based on what we know now?
  • How much additional risk do we take on by starting the Phase 3b concurrently rather than waiting for a positive Phase 3 readout?

Study 1: Quantifying risks

Inadequacy of power alone

  • Consider a study concept with sample size chosen to ensure 90% power to declare non-inferiority (at a specified margin) given a true head-to-head effect of zero
  • Does this adequately capture the level of risk? NO!
  • Several inputs to this calculation are based on uncertain quantities (a standard calculation of this type is sketched after this list)
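For reference, a normal-approximation version of such a power/sample-size calculation is sketched below (the margin, standard deviation and alpha are illustrative assumptions, not the study's values); the last line shows how sharply power drops if the true head-to-head effect is even slightly unfavourable.

```python
# Illustrative normal-approximation power for a non-inferiority test of a
# difference in means (all numbers assumed; higher outcomes are better).
import numpy as np
from scipy.stats import norm

def ni_power(n_per_arm, margin, true_diff, sd, alpha=0.025):
    """P(declare non-inferiority) when testing H0: diff <= -margin."""
    se = sd * np.sqrt(2.0 / n_per_arm)
    return norm.cdf((true_diff + margin) / se - norm.ppf(1 - alpha))

# Smallest sample size (on a grid) giving ~90% power at a true effect of zero
for n in range(50, 2001, 10):
    if ni_power(n, margin=0.10, true_diff=0.0, sd=0.40) >= 0.90:
        print("n per arm ~", n)
        break

# The same design is badly underpowered if the true effect is slightly unfavourable
print("power at true diff -0.05:", round(ni_power(n, margin=0.10, true_diff=-0.05, sd=0.40), 2))
```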

Study 1: Quantifying risks

Uncertainty about H2H effect



Two versions of contrast-based network meta-analysis

  • Using what is known now (risk of concurrent trial)
  • Using what will be known after pivotal results (risk of gated trial); a toy sketch of the contrast-based logic follows this list
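The toy sketch below illustrates the contrast-based logic only (every distribution is invented and unrelated to the actual NMA outputs or to the probabilities quoted later): the H2H contrast is formed as (investigational vs SoC) minus (competitor vs SoC), and the two timing options differ in how much is known about the investigational-vs-SoC piece.

```python
# Toy contrast-based indirect comparison; all distributions are hypothetical.
import numpy as np

rng = np.random.default_rng(2)
margin = 0.10                                        # assumed non-inferiority margin
comp_vs_soc = rng.normal(0.25, 0.08, 100_000)        # competitor vs SoC, external data

# (A) concurrent: investigational vs SoC only available as a wide predictive distribution
# (B) gated: investigational vs SoC estimated by the (positive) pivotal trial
inv_vs_soc = {"concurrent": rng.normal(0.30, 0.15, 100_000),
              "gated":      rng.normal(0.30, 0.06, 100_000)}

for option, inv in inv_vs_soc.items():
    h2h = inv - comp_vs_soc                          # indirect head-to-head contrast
    print(option, "P(H2H effect better than -margin):", round(np.mean(h2h > -margin), 2))
```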

Study 1: Communicating risks



Option A: go now with current knowledge

| Pivotal outcome (random) | Phase 3b outcome | Total probability |
|---|---|---|
| Success (74%) | Success (83%) | 61% |
| Success (74%) | Failure (17%) | 13% |
| Failure (26%) | Unimportant | 26% |

Option B: await pivotal outcome

| Pivotal outcome (known) | Phase 3b outcome | Total probability |
|---|---|---|
| Success | Success (83%) | 83% |
| Success | Failure (17%) | 17% |
| Failure | Not run | N/A |
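The totals in the Option A table are just products along each path, e.g.

$$P(\text{pivotal success, H2H success}) = 0.74 \times 0.83 \approx 0.61, \qquad P(\text{pivotal success, H2H failure}) = 0.74 \times 0.17 \approx 0.13,$$

whereas under Option B the Phase 3b is only run after a pivotal success, so the 83% / 17% split applies directly.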

Study 2

Impact of nonresponder imputation

Design

flowchart LR
  A[Eligible patients] --> A1[Adult patients]
  A --> A2[Pediatric patients]
  A1 --> B1(1:1 randomize)
  A2 --> B2(1:1 randomize)
  B1 --> D[Investigational]
  B1 --> E[Standard-of-Care]
  B2 --> D
  B2 --> E
  D --> F(Test for superiority)
  E --> F

Key Consideration

  • Nonresponder imputation for patients dropping out due to lack of tolerability
  • Dropout rate anticipated to be higher on the investigational treatment, especially among pediatric patients

Study 2: Strategy


What is “success”?

  • Required: Superiority overall
  • Desired: Superiority in both adults and pediatric patients

What are the risks?

  • Attenuation of treatment effect due to differential dropout rates
  • Higher degree of uncertainty about pediatric effects (pre-pivotal data is from adults)

Study 2: Quantifying risks

Inadequacy of power

Typical presentation: power under various scenarios

| Scenario | Pediatric dropout (Active) | Pediatric dropout (Control) | Adult dropout (Active) | Adult dropout (Control) | Treatment effect among adherents (difference in response rates) | Power for overall comparison |
|---|---|---|---|---|---|---|
| 1 | 10% | 5% | 4% | 1% | 20% | 90% |
| 2 | 10% | 5% | 4% | 1% | 15% | 76% |
| 3 | 15% | 5% | 4% | 1% | 20% | 81% |
| 4 | 15% | 5% | 4% | 1% | 15% | 68% |


Shortcoming

What is the relative plausibility of these scenarios based on what we know?

Study 2: Quantifying risks

Modelling all unknowns

Separate meta-analytic modelling of historical data for (1) the treatment effect among completers and (2) dropout rates, supplemented by elicited beliefs about pediatric dropout, yields (3) a posterior distribution of the power implied under these models
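A minimal sketch of step (3) is shown below, assuming hypothetical posterior/elicited distributions for the unknowns (the Beta and Normal settings are placeholders, not the actual meta-analytic results): each draw of dropout rates and the adherent treatment effect is pushed through nonresponder imputation and a normal-approximation power formula, giving a distribution of power rather than a single number.

```python
# Sketch: propagate uncertainty about dropout and treatment effect into a
# distribution of power; every distribution below is an assumed placeholder.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
n_per_arm, n_sims, alpha = 300, 10_000, 0.025

drop_active  = rng.beta(8, 72, n_sims)           # active-arm dropout, ~10% on average
drop_control = rng.beta(4, 76, n_sims)           # control-arm dropout, ~5% on average
p_ctrl_adh   = rng.beta(40, 60, n_sims)          # control response rate among adherents
effect_adh   = rng.normal(0.18, 0.05, n_sims)    # adherent treatment effect (risk difference)

# Nonresponder imputation: dropouts are counted as non-responders
p_active  = (1 - drop_active)  * np.clip(p_ctrl_adh + effect_adh, 0, 1)
p_control = (1 - drop_control) * p_ctrl_adh

# Normal-approximation power of the two-proportion comparison, per draw
se = np.sqrt(p_active * (1 - p_active) / n_per_arm + p_control * (1 - p_control) / n_per_arm)
power = norm.cdf((p_active - p_control) / se - norm.ppf(1 - alpha))

print("mean power:", round(power.mean(), 2))
print("central 50% interval:", np.quantile(power, [0.25, 0.75]).round(2))
```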

Study 2: Communicating risks

Visual display: sensitivity of power

Example of a tornado plot illustrating systematic sensitivity analysis of power. Bounds for unknowns are calibrated based on degree of certainty (central 50% posterior intervals). Diamonds represent values assumed in sample-size calculation.
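As an aside, a bare-bones matplotlib version of such a tornado plot could be built as below; the parameter names, bounds, and assumed values are invented purely to show the construction.

```python
# Bare-bones tornado plot: horizontal bars for power over each unknown's
# central 50% interval, diamonds at the value assumed in the sample-size calc.
# All numbers are illustrative placeholders.
import matplotlib.pyplot as plt

params  = ["Pediatric dropout (active)", "Adult dropout (active)",
           "Adherent treatment effect", "Control response rate"]
low     = [0.70, 0.80, 0.72, 0.78]    # power at lower bound of the interval
high    = [0.92, 0.91, 0.96, 0.93]    # power at upper bound of the interval
assumed = [0.85, 0.88, 0.90, 0.87]    # power at the assumed value

fig, ax = plt.subplots()
for i, (lo, hi, mid) in enumerate(zip(low, high, assumed)):
    ax.barh(i, hi - lo, left=lo, height=0.5, color="steelblue")
    ax.plot(mid, i, marker="D", color="black")
ax.set_yticks(range(len(params)))
ax.set_yticklabels(params)
ax.set_xlabel("Power")
fig.tight_layout()
plt.show()
```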

Conclusions

Themes

  • Understanding, quantifying, and communicating about risks in study design
  • Power alone does not sufficiently characterize risk in these cases
  • A slightly different emphasis than program-level PoS: more granular attention to study-specific considerations
    • Especially necessary outside pivotal Phase 3
    • e.g. in Study 1, the calibrated benchmark prior is not directly relevant to the H2H effect

A PoS philosophy


  • Modelling when there is data
    • Potentially going beyond treatment effect: e.g. standard deviations, control response rates, overdispersion etc.
  • Eliciting to bridge data gaps
  • Sensitivity for controllables
    • Sample size, timing of assessment, estimands
  • Standardization as much as possible
    • Consistent communication AND methods

Acknowledgements

Quantitative PoS team, including

  • Markus Lange
  • Björn Holzhauer
  • Joseph Kahn
  • Alex Przybylski
  • Lou Whitehead
  • Steffen Ballerstedt

And other colleagues

  • Lisa Hampson
  • Jahangir Alam
  • David Ohlssen
  • Andrew Wright
  • Guoqin Su
  • Gen Zhu
  • Haoyi Fu

References

Hampson, Lisa V. 2023. “Role of Bayesian Methods in Evaluating and Communicating Risk.” In BAYES2023 Conference. Utrecht, Netherlands. https://www.bayes-pharma.org/wp-content/uploads/2023/11/22-HAMPSON-Quantitative-Decision-Making-for-drug-development-the-role-of-Bayesian-methods-in-evaluating-and-communicating-risk.pdf.
Hampson, Lisa V., Björn Bornkamp, Björn Holzhauer, Joseph Kahn, Markus R. Lange, Wen-Lin Luo, Giovanni Della Cioppa, Kelvin Stott, and Steffen Ballerstedt. 2022. “Improving the Assessment of the Probability of Success in Late Stage Drug Development.” Pharmaceutical Statistics 21 (2): 439–59. https://doi.org/10.1002/pst.2179.
Hampson, Lisa V., and Björn Holzhauer. 2021. “Strategies for Improving the Assessment of the Probability of Success in Late Stage Drug Development.” In Bayesian Scientific Working Group (BSWG) KOL Series 21. https://view.officeapps.live.com/op/view.aspx?src=http%3A%2F%2Fwww.bayesianscientific.org%2Fwp-content%2Fuploads%2F2021%2F02%2FDIA_Hampson_Holzhauer_Feb2021.pptx&wdOrigin=BROWSELINK.
Hampson, Lisa V., Björn Holzhauer, Björn Bornkamp, Joseph Kahn, Markus R. Lange, Wen-Lin Luo, Pritibha Singh, Steffen Ballerstedt, and Giovanni Della Cioppa. 2022. “A New Comprehensive Approach to Assess the Probability of Success of Development Programs Before Pivotal Trials.” Clinical Pharmacology & Therapeutics 111 (5): 1050–60. https://doi.org/10.1002/cpt.2488.
Holzhauer, Björn, Lisa V. Hampson, John Paul Gosling, Björn Bornkamp, Joseph Kahn, Markus R. Lange, Wen-Lin Luo, et al. 2022. “Eliciting Judgements about Dependent Quantities of Interest: The SHeffield ELicitation Framework Extension and Copula Methods Illustrated Using an Asthma Case Study.” Pharmaceutical Statistics 21 (5): 1005–21. https://doi.org/10.1002/pst.2212.
Lange, Markus. 2021. “Unraveling a Single Number: Using Graphics to Explain Probability of Success.” In Basel Biometrics Society (BBS) Seminar. https://baselbiometrics.github.io/home/docs/talks/20210308/5_Lange.pdf.
———. 2022. “Strategies for Improving the Assessment of Probability of Success (PoS) in Late Stage Drug Development.” In Joint DSBS/FMS Meeting 2022. Copenhagen, Denmark. https://statistikframjandet.se/wp-content/uploads/2022/12/Markus-Lange-Strategies-for-improving-the-assessment-of-probability-of-success-PoS-in-late-stage-drug-development.pdf.
Neuenschwander, Beat, Gorana Capkun-Niggli, Michael Branson, and David J Spiegelhalter. 2010. “Summarizing Historical Information on Controls in Clinical Trials.” Clinical Trials 7 (1): 5–18. https://doi.org/10.1177/1740774509356002.