Policy equipoise and ethical implementation experiments: Evidence of effectiveness, not merely efficacy
This is the second in a series of blogs on ethics in social science research. Read the introductory post here.
One ethical concern that researchers and implementation partners confront with the use of experiments to evaluate policy interventions is the withholding of an intervention or policy – e.g. a cash transfer or empowerment collective – from otherwise eligible people. This concern may be alleviated in cases where there is a scarcity of resources. It is also alleviated when the relevant community of experts is in a state of equipoise regarding the merits of the intervention under study and the status quo. In this post, I discuss some of the factors to be considered when making judgments regarding equipoise.
As Asiedu et al. put it, equipoise is satisfied when “reasonable and informed stakeholders disagree on the optimal policy.” When this condition holds in the context of a randomized controlled trial (RCT), no participant is ex ante worse off and so no one is treated unfairly. While the principle of policy equipoise is gaining traction in the policy research ethics literature, there remains the challenge of determining when it is satisfied. Researchers and implementation partners may struggle to decide if equipoise is satisfied in cases where there is a desire to scale up an intervention which has proven to be efficacious in a small-scale RCT or pilot project.
In clinical research, there is a useful distinction between explanatory RCTs, which test an intervention under ideal conditions with the aim of determining its efficacy, and pragmatic RCTs, which test an intervention under real-world conditions with the aim of determining its effectiveness. For example, in an explanatory RCT, a novel chemotherapy drug would be administered by expert practitioners to a highly monitored, restricted set of patients in accordance with strict instructions. In a pragmatic RCT, by contrast, the drug would be offered to the broader patient population in the context of usual clinical care. The aim of the former is to determine if the drug works under ideal conditions; the aim of the latter is to determine if it works in typical clinical settings.
While public policy research is different from clinical research, some policy scholars draw a similar distinction between small-scale proof-of-concept studies and large-scale implementation studies. Where an intervention has proven to be efficacious in the former, researchers and policymakers may wish to conduct an implementation experiment to see if the results hold under real-world conditions.
As Banerjee et al. argue, there are a number of obstacles to the replication of a proof-of-concept study’s results on a larger scale, including market equilibrium effects, spillover effects, political reactions, context dependence, randomization or site-selection bias, and the various implementation challenges government bureaucracies face in scaling up an intervention. These factors are all relevant when deciding whether policy equipoise is satisfied in the case of a particular implementation experiment.
The primary research question researchers and implementation partners wish to answer in an implementation RCT is whether a particular intervention is superior to the status quo under real-world conditions. This focus is legitimate, for the central purpose of policy research is to identify interventions which are effective, rather than merely efficacious. Therefore, for implementation experiments we are in a state of equipoise when there is reasonable disagreement regarding whether the intervention is superior to the status quo under real-world conditions.
The fact that the intervention in question has been proven to be efficacious under ideal conditions, through a proof-of-concept study for example, is relevant to determining whether there is uncertainty regarding the outcomes of the implementation experiment. But so are the factors identified by Banerjee et al. above. For example, doubts about the ability of the average teacher to faithfully implement an educational intervention proven to be efficacious in a proof-of-concept study may be serious enough to give strong reasons to doubt that the findings will replicate. In such a case, policymakers and researchers may occupy a state of equipoise regarding the two arms of an implementation RCT, even though the treatment arm has proven to be superior in a proof-of-concept RCT.
To determine if a particular implementation experiment satisfies the principle of policy equipoise, researchers and implementation partners need to consider evidence of effectiveness, not merely evidence of efficacy. Where an intervention has been proven to be superior to the status quo in a proof-of-concept study, researchers and policymakers must ask: are the obstacles to pure replication of the original study’s findings on a large scale severe enough that a reasonable and informed person, concerned to promote people’s interests, could sincerely doubt that the intervention will prove to be superior on a large scale? Answering this question will require consideration of factors beyond evidence of efficacy, including the quality of typical program providers, differences in study populations, and the inner workings of relevant government institutions, among others.
After reading this blog from Dr. MacKay, the 3ie team offers these reflections:
Research funders and producers should clearly articulate whether the research activity is a proof-of-concept or an implementation study. It is important to acknowledge when there is already evidence regarding efficacy and what contextual factors affect the need for further implementation studies. This clarity in defining the study motivation can strengthen how other stakeholders – particularly those outside the research system – understand the social value of the research.
Research teams and implementation partners should critically assess what is appropriate for the control group. Researchers should clearly document the state of equipoise, scarcity of resources, intervention details and priority research questions, and work with implementation partners to understand what level of resources should be withheld – or not – from the control group for the purpose of the study. In some cases, it may not be appropriate for the control group to remain at the status quo or receive ‘less than’ the treatment group. For some studies, the control group may receive the same level of resources as the treatment group, but through a different mechanism (example).
Research teams should elevate the discussion with implementation partners on how to prioritize allocating the intervention to the control group if equipoise and scarcity are resolved. Unlike the treatment group, the control group bears the burden of research without the potential benefits. The research team should work with implementation partners to determine the extent to which the control group can be prioritized to receive the treatment if (i) the treatment is determined to be superior to the status quo and (ii) scarcity of resources is resolved. This can be accomplished ex ante through randomized roll-out strategies using a stepped-wedge design (example) or ex post based on research findings.
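To make the randomized roll-out idea concrete, here is a minimal sketch of how a stepped-wedge schedule could be drawn. It is only an illustration under stated assumptions: the function names, the village labels, and the choice of eight clusters and four roll-out steps are hypothetical and do not come from any particular study. Every cluster starts in the control condition and crosses over to the intervention at a randomly assigned step, so all clusters have received it by the final period.

```python
import random


def stepped_wedge_schedule(clusters, n_steps, seed=0):
    """Randomly assign each cluster the step at which it starts treatment.

    Hypothetical helper for illustration: clusters are shuffled, then
    spread as evenly as possible across steps 1..n_steps.
    """
    rng = random.Random(seed)
    order = list(clusters)
    rng.shuffle(order)
    return {cluster: 1 + i % n_steps for i, cluster in enumerate(order)}


def is_treated(schedule, cluster, period):
    """A cluster is treated in every period at or after its start step.

    Period 0 is a baseline in which every cluster is still a control.
    """
    return period >= schedule[cluster]


# Assumed example: eight villages rolled out over four steps.
villages = [f"village_{i}" for i in range(1, 9)]
schedule = stepped_wedge_schedule(villages, n_steps=4, seed=42)

for period in range(0, 5):
    treated = [v for v in villages if is_treated(schedule, v, period)]
    print(f"period {period}: {len(treated)} of {len(villages)} villages treated")
```

Because the order of crossover is random, comparisons between clusters that have and have not yet crossed over in a given period retain an experimental interpretation, while no cluster is permanently denied the intervention.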
Systematic reviews and evidence gap maps are important tools for aggregating evidence and understanding the state of equipoise. Evidence synthesis products, such as systematic reviews and evidence gap maps, can articulate whether available evidence comes from proof-of-concept or implementation studies. Even when there is evidence of efficacy for the intervention, context matters, and the same results may not always replicate in other contexts. Understanding how much we know, or don’t know, about the evidence regarding an intervention’s efficacy and effectiveness is critical for understanding the state of equipoise. (For some examples, explore systematic reviews and evidence gap maps on our Development Evidence Portal.)
Coordinated research and replication efforts where the same intervention or policy is studied in a variety of contexts can be useful when the state of equipoise is understood. One study rarely provides all the answers regarding both efficacy and effectiveness that can generalize to many contexts. When there is a better understanding of evidence gaps and demands for specific evidence, coordinated research and replication efforts can support evidence production that addresses external validity constraints faced by individual studies. (One example is here.)
How does equipoise in this context incorporate costs? Some interventions would seem to clearly benefit the recipients (e.g., cash transfers), but without assessing their impacts and their opportunity costs it would seem hard to justify these policies as opposed to alternatives. But to do that would seem to require comparing beneficiaries' behaviors and outcomes to those of comparable controls, as well as comparing costs. So it would seem to me that in the absence of such knowledge, it would be ethical to randomly assign treatment even if it is known that those treated benefit.