How much evidence is enough for action?

04 March 2014

One of the most useful ways in which evidence from rigorous evaluations can be used is to help policymakers take decisions on going to scale. Notable recent examples of scaled-up interventions based on high-quality synthesised evidence are conditional cash transfer programmes and early child development (pre-school) programmes. Both types of programmes have been piloted and tested in many places and found to be effective, as shown in systematic reviews on cash transfers and early child development.

The global scaling up of development programmes needs to be based on systematic reviews and not evidence from single studies. This is because single studies – even rigorous studies like randomised controlled trials (RCTs) – can be biased. According to Dr Philip Davies, 3ie’s head of systematic reviews, “Single studies can misrepresent the balance of research evidence. They illuminate only one part of a policy issue, they are sample-specific, time-specific, context-specific and are often of poor methodological quality”. While high-quality single studies can still be useful to inform scaling up the specific programmes on which they are based, they give a biased picture when used to make more generalisable statements, such as on whether or not to go to scale elsewhere.

The value of systematic reviews has been fully embraced for a number of years now, including by major development funders such as the Department for International Development, the Bill and Melinda Gates Foundation and the US Agency for International Development. A number of global organisations are dedicated to producing rigorous reviews, including the Cochrane Collaboration, the Campbell Collaboration’s International Development Coordinating Group and 3ie. Importantly, the systematic review community, thanks to Archie Cochrane, embraces the philosophy that reviews need to be updated regularly, and that including new studies and doing new reviews are crucial to ensuring that the evidence is as robust, validated and useful as possible.

Economists at the Abdul Latif Jameel Poverty Action Lab at the Massachusetts Institute of Technology and its sibling organisation, Innovations for Poverty Action (IPA), have conducted hundreds of RCTs in areas such as agriculture, microfinance, governance and public health. Their focus is on using RCTs to provide rigorous causal evidence on what difference development programmes make. It is not an exaggeration to say that this group has changed the ways in which development micro-economists undertake field research and also how a good number of governments and NGOs conduct pilot programmes. They have made a positive difference to many single programmes based on these rigorous studies.

IPA has just announced a spin-off organisation called Evidence Action, launched to scale up programmes to reach millions of beneficiaries. According to the website, the principles of Evidence Action include ‘Only scale interventions whose efficacy is backed by substantial rigorous evidence’. So they are currently focusing on two initiatives they believe to have already been proven effective: Chlorine Dispensers for Safe Water and Deworm the World.

The website states that ‘regular deworming treatment reduces school absenteeism by 25 per cent’. This is an impressive figure, until one realises it is from a single study conducted in Kenya. They also state that ‘chlorination has been shown to reduce diarrhoea by 40 per cent’. The figure is based on results from a systematic review of household water treatment from 2007. Very soon after this review was first published, evidence came to light suggesting that the child diarrhoea impacts reported in such studies were severely biased, as also recently discussed by GiveWell.

But IPA researchers are scaling up chlorine dispensers in neighbouring Uganda, and have received funding from the Gates Foundation to scale up in Haiti and elsewhere. School-based deworming programmes are being scaled up in Ethiopia and the Gambia, and across India. The World Food Programme is partnering with Deworm the World in 12 countries where it operates school feeding programmes.

IPA states that it bases its scaling-up campaigns on substantial evidence, which we take to mean more than single studies, and that is good. In these cases, however, it does not appear that they are drawing on the full body of rigorous synthesised evidence for either intervention.

In fact, the balance of evidence does not support scaling up either one.

The impacts of deworming are context-specific, and sustaining impacts means treating children every year. As the Cochrane Collaboration stated, “it is probably misleading to justify contemporary deworming programmes based on evidence of consistent benefit on nutrition, haemoglobin, school attendance or school performance as there is simply insufficient reliable information to know whether this is so.” IPA and the World Bank have each published comments on that review.

Health impacts of water treatment are likely to be smaller than claimed, and may not exist. And sustaining adoption (which is necessary for sustaining impacts) is problematic. People frequently don’t like the taste of chlorinated water. It is also very difficult to get children’s carers to change behaviour when the main benefits of a new technology, such as reducing a child’s disease rate, are hard for them to observe. Moreover, while a single study in Kenya may have suggested impacts, the balance of systematic evidence suggests that adoption and health impacts from water treatment are typically not sustained.

One of the problems we face in international development is that the results of effective interventions are not shared in useful ways with the right decision-makers, in the right places at the right times. So, the idea of an organisation getting the word out about proven interventions and building support for scaling up is a welcome one.

Our concern, based on Evidence Action’s initial campaigns, is that the evidence demands, firstly, that they consider more than single studies and, secondly, that they use the most up-to-date synthesised evidence. We hope IPA and Evidence Action take context into account when campaigning for particular interventions, do sufficient further research on single studies (not just from their own work) and use the latest systematic reviews before deciding that they have enough evidence to go to scale.
