Impact Evaluation

Impact evaluation is structured to answer the question: “how would outcomes such as the participants' well-being have changed if the intervention had not been undertaken?” This involves counter-factual analysis, that is, “a comparison between what actually happened and what would have happened in the absence of the intervention.”

A secondary question that normally comes immediately after is “why?”. Which behaviours did we change that explain the observed impact of the intervention?

The key challenge in impact evaluation is that the counter-factual cannot be directly observed and must be approximated with reference to a comparison group. There is a range of accepted approaches to determining an appropriate comparison group for counter-factual analysis, using either:

Five things that can contaminate the measured impact: confounding, selection bias, spillover, contamination, and impact heterogeneity.

(ante) Randomized field experiments are the strongest research designs for assessing program impact. This design is generally considered the design of choice when it is feasible, as it allows for a fair and accurate estimate of the program’s actual effects.
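As a rough illustration of why randomization helps, the sketch below assigns units to treatment and control at random and then estimates the effect as a simple difference in group means. The function names (`randomize`, `difference_in_means`) and the outcome numbers are made up for this example; they do not come from any particular study or library.

```python
import random
from math import sqrt
from statistics import mean, variance


def randomize(units, seed=0):
    """Split units into two random halves: (treatment, control)."""
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def difference_in_means(treated, control):
    """Return (estimated effect, standard error) for two independent groups."""
    effect = mean(treated) - mean(control)
    # Standard error of a difference between two independent sample means.
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return effect, se


# Hypothetical unit ids 0..9, split at random into two groups of five.
treatment_ids, control_ids = randomize(range(10))

# Made-up outcome scores measured after the intervention (illustration only).
treated_outcomes = [12.0, 15.0, 14.0, 16.0, 13.0]
control_outcomes = [10.0, 11.0, 12.0, 9.0, 13.0]

effect, se = difference_in_means(treated_outcomes, control_outcomes)
print(f"estimated effect: {effect:.2f} (standard error {se:.2f})")
```

Because assignment is random, the control group's average outcome stands in for the counter-factual of the treated group, which is what justifies reading the difference as the program's effect.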

Non-experimental design

Non-experimental designs are the weakest evaluation designs because, to show a causal relationship between intervention and outcomes convincingly, the evaluation must demonstrate that any likely alternative explanations for the outcomes are irrelevant.

However, there remain applications to which this design is relevant, for example:

On product launch

How do we then quantify the final real impact?

In general we can look at the top-level KPIs and see whether their trends change, to get an idea of whether the launch really changed things or not.

To quantify it somewhat, one approach is to compare the time periods before and after deployment and measure the difference.

The caveat with this approach is that external confounding factors can influence the result: seasonality, holidays, weekends versus weekdays, other deployments happening at the same time, and so on. So this approach has to be planned carefully, and in general we should try to exclude as many potentially influencing external variables as possible.
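One simple way to take the weekend versus weekday effect out of a before/after comparison is to use windows made of whole weeks, so every weekday appears equally often on both sides. The sketch below assumes a hypothetical `daily_kpi` mapping from date to KPI value and a hypothetical helper `before_after_difference`; the data is synthetic, with a weekend bump and a lift of 5 added by hand after the deployment day.

```python
from datetime import date, timedelta
from statistics import mean


def before_after_difference(daily_kpi, deploy_day, weeks=2):
    """Mean KPI after deployment minus mean KPI before, over whole weeks.

    Raises KeyError if any day in either window is missing from daily_kpi.
    """
    span = 7 * weeks
    before = [daily_kpi[deploy_day - timedelta(days=i)] for i in range(1, span + 1)]
    after = [daily_kpi[deploy_day + timedelta(days=i)] for i in range(span)]
    return mean(after) - mean(before)


# Synthetic data: 4 weeks starting on a Monday, deployment at the start of week 3.
start = date(2024, 1, 1)
deploy_day = start + timedelta(days=14)
daily_kpi = {}
for offset in range(28):
    day = start + timedelta(days=offset)
    value = 100.0
    if day.weekday() >= 5:   # Saturday or Sunday: weekend bump
        value += 20.0
    if day >= deploy_day:    # lift added by hand, for illustration only
        value += 5.0
    daily_kpi[day] = value

difference = before_after_difference(daily_kpi, deploy_day, weeks=2)
print(f"before/after difference: {difference:.2f}")
```

Whole-week windows only handle the weekly cycle. Holidays, longer seasonality, and other deployments in the same period still have to be checked separately.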

When the results show only a small difference, we are less confident in them, because external influences could be biasing the estimate in either direction.
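To get a feel for whether a difference is "small" relative to the day-to-day noise, one option is to pair each day after deployment with the same weekday before it and look at the spread of those paired differences. This is only a sketch: `paired_change` is a hypothetical helper, the numbers are invented, and the "two standard errors" threshold in the usage note is a common rule of thumb rather than a guarantee.

```python
from math import sqrt
from statistics import mean, stdev


def paired_change(before, after):
    """Return (mean change, standard error) for day-by-day paired KPI values.

    before[i] and after[i] are assumed to be the same weekday, for example
    exactly one or more whole weeks apart.
    """
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally long series with at least 2 days")
    diffs = [a - b for a, b in zip(after, before)]
    change = mean(diffs)
    se = stdev(diffs) / sqrt(len(diffs))
    return change, se


# Invented KPI values for one week before and one week after (Mon..Sun).
before = [100.0, 102.0, 98.0, 101.0, 99.0, 120.0, 118.0]
after = [101.0, 102.0, 100.0, 101.0, 101.0, 121.0, 120.0]

change, se = paired_change(before, after)
print(f"mean change: {change:.2f}, standard error: {se:.2f}")
```

If the mean change is smaller than roughly two standard errors, it is hard to tell apart from ordinary fluctuation. Even a larger change can still be driven by external factors that hit the whole "after" period, which this check cannot detect.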

Methods

Estimation methods

References

