
DEMYSTIFYING THEORIES OF CHANGE AND PROGRAM LOGICS

Do theories of change and program logics excite you or bore you?  Are you overwhelmed by them, or maybe don’t see what all the fuss is about? 

My observations of the people I have worked with have been quite varied.  Some people love a clearly articulated theory of change, and can’t wait to see the outcome of its implementation; whilst others are a bit confused, and might slink down in their chair when asked to describe their program or intervention’s theory.

For those in the first group… me too!!  I love a good theory, and get excited about the prospect of testing it, learning from the results, making some adjustments, and then testing again.  Monitoring and evaluation in action – beautiful!!

For those in the second group… chances are you understand parts, if not all, of the theory that underpins your program or intervention – but perhaps only implicitly… and maybe you’ve just never been asked the right questions to make it explicit.

A theory of change – the theory or theories that underpin your program or intervention – helps to explain why the program has been designed and is delivered in a certain way, and how the program is thought to bring about the desired outcomes.

The design of a program is informed by someone’s knowledge (hopefully).  This knowledge may have come from formalised study, reviewing the literature, previous experiences, cultural beliefs… and may be a mix of fact, opinion or assumptions.

Guided by this knowledge, we make decisions about how to design and implement our programs, with the strong hope that they lead to our desired outcomes. 

  • Maybe a weight-loss program, smoking-cessation program or anger-management program is informed by social learning theory, or behaviour modification theories. 
  • Maybe the way we design and implement training programs to improve employment opportunities is based largely on adult learning theories, which may have been tweaked or adapted based on our learnings from previous experiences. 
  • Maybe we design our communication and engagement activities to take advantage of social network theories. 
  • Maybe we ensure that our clients always have a choice about the gender of their clinician, or the location or modality of their sessions, because we know that strong rapport between client and clinician is a good predictor of positive outcomes.

Some of the knowledge we use to inform our programs, activities and interventions may be based on long-standing, strongly held, widely acknowledged theories, such as attachment theory or social learning theory, whereas other knowledge may have emerged very recently from our own observations, reflective practice, or careful monitoring of our programs.  Regardless of whether your theories have emerged from formalised training, a review of the literature, your previous experiences and observations, or your cultural beliefs – they are all still theories that guide and explain the assumptions behind why you think your program will achieve the desired outcomes.

The implementation of your program is really just testing to see if the theory is correct.  Hopefully you’ve employed some rigour, and consulted broadly to ensure you’ve made good decisions in your program design, to optimise the desired outcomes.  As service providers, we really are obligated to ensure our theories make sense, as they inform or underpin the services we offer our clients.

Program logics are really just the plan to operationalise our theories of change.  Program logics usually map out (graphically or in table form) the inputs and activities necessary, and the outputs and outcomes we expect to occur, when our theory is brought to life. 
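
To make that a little more concrete, here is a minimal sketch – in Python, with entirely invented program details – of a program logic laid out as a simple data structure, reading from inputs through to outcomes:

    # A purely illustrative program logic captured as a simple data structure.
    # The program details below are invented, not drawn from any real program.
    program_logic = {
        "inputs": ["funding", "two trained facilitators", "a venue"],
        "activities": ["weekly skill-building workshops", "one-on-one coaching"],
        "outputs": ["20 participants complete eight workshops"],
        "short_term_outcomes": ["participants report improved skills and confidence"],
        "long_term_outcomes": ["participants gain and retain employment"],
    }

    # Each adjacent pair of columns is a link in the chain, and every link
    # rests on an assumption drawn from our theory of change.
    stages = list(program_logic.keys())
    for earlier, later in zip(stages, stages[1:]):
        print(f"{earlier} -> {later}")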

Donna Podems, in her fabulous book “Being an Evaluator – Your Practical Guide to Evaluation”, presents a hybrid model (see example below) that incorporates elements of the theory into the program logic. She highlights the prompting question ‘what makes you think this leads to that?’ as helping to make the implicit explicit as we move from left to right.

When was the last time your team spoke openly about your theory of change?  Is everyone on the same page?  Does everyone understand what underpins what you do, and how you do it? Does everyone have a consistent view of what makes them think this… leads to that? 

It’s critically important that they do… otherwise some of your team may be deviating from your program, without even realising it. 

Reach out if you’d like to chat more about theories of change and program logics, or if you’d like to more clearly and explicitly articulate your theories and logics – they really are core to the success of your program.

NECESSARY AND SUFFICIENT – LET’S THINK ABOUT THOSE TERMS FOR A MOMENT

We use the words necessary and sufficient almost every day – but they have a specific meaning in evaluation, and play an important role in Impact Evaluation.

According to Dictionary.com:

  • Necessary:  being essential, indispensable or requisite; and
  • Sufficient:  adequate for the purpose, enough.

These absolutely hold true in evaluation nomenclature as well… but let’s take a closer look.

When we undertake an Impact Evaluation, we are looking to verify causality.  We want to know the extent to which the program caused the impacts or outcomes we observed.  The determination of causality is the essential element to all Impact Evaluations, as they not only measure or describe changes, but seek to understand the mechanisms that produced the changes.

This is where the words necessary and sufficient play an important role.

Imagine a scenario where your organisation delivers a skill-building program, and the participants who successfully complete your program have demonstrably improved their skills.  Amazing – that’s the outcome we want!

But, can we assume that the program delivered by your organisation caused the improvement in skills? 

Some members of the team are very confident – ‘yep, our program is great, we’ve received lots of comments from participants that they couldn’t have done it without the program.  It was the only thing that helped’.  Let’s call them Group 1.

Others in the team think that the program definitely had something to do with the observed success, but they think it also had something to do with the confidence-building program the organisation ran last year, and that the two build on each other.  We’ll call them Group 2.

Some others in the team think the program definitely helped people build their skills, but they’re also aware of other programs, delivered by other organisations, that have achieved similar outcomes.  Let’s call them Group 3.

Who is correct?  The particular strategies deployed within an Impact Evaluation will help determine this for us, but hopefully you can start to see an important role for the words necessary and sufficient.

  • Group 1 would assert that the program is necessary and sufficient to produce the outcome.  Their program, and only their program, can produce the outcome.
  • Group 2 would assert that the program is necessary, but not sufficient on its own, to cause the outcome.  Paired with the confidence-building program, the two together might be considered the cause of the impact.
  • Group 3 would claim that their program isn’t necessary, but is sufficient to cause the outcome.  It would seem there could be a few programs that could achieve the same results, so whilst their program might be effective, others are too.
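
As a rough illustration only – in Python, with the three positions reduced to booleans rather than anything a real Impact Evaluation would measure – the three claims can be written out like this:

    # A hypothetical sketch of the three causal claims.  The booleans stand for
    # "did the participant receive this program?" and are invented for illustration.

    def outcome_group_1(our_program: bool) -> bool:
        # Group 1: our program is necessary and sufficient.
        # The outcome occurs if, and only if, our program was delivered.
        return our_program

    def outcome_group_2(our_program: bool, confidence_program: bool) -> bool:
        # Group 2: our program is necessary, but not sufficient on its own.
        # The outcome needs both our program and the confidence-building program.
        return our_program and confidence_program

    def outcome_group_3(our_program: bool, other_orgs_program: bool) -> bool:
        # Group 3: our program is sufficient, but not necessary.
        # Our program produces the outcome, but so would another organisation's program.
        return our_program or other_orgs_program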

Patricia Rogers has developed a simple graphic depicting the different types of causality – sole causal, joint causal and multiple causal attribution. 

Sole causal attribution is pretty rare, and wouldn’t usually be the model we would propose is at play.  More often, a joint causal or multiple causal model explains what we observe.

Do you think about the terms necessary and sufficient a little differently now? Whilst we use them almost every day, when talking causality, they are very carefully and purposefully selected words – they really do mean what they mean.

CLARITY OF PURPOSE IS SO IMPORTANT

Everything always comes back to purpose.

Have you been part of evaluations, where 6-12 months in, you’re starting to uncover some really important learnings… but you can’t quite recall exactly what you set out to explore when you started, and now you’re overwhelmed with choices about what to do with what you’ve learned… and sometimes you don’t end up doing anything with the learnings?

Or perhaps the opposite, where 6-12 months in, the learnings that are starting to emerge are really not meeting your expectations, and you’re wondering if this whole evaluation thing was a waste of time and resources?

Whilst there are many types of evaluations, one evaluation cannot evaluate everything.  A good evaluation is purposely designed to answer the specific questions of the intended users, to ensure it can be utilised for its intended use.  It’s critically important to ensure the evaluation, and all those involved in it, remain clear about its intended use by intended users.

A simple taxonomy that I find helpful is one proposed by Huey T. Chen (originally presented in 1996, but later adapted in his 2015 Practical Program Evaluation).

Chen’s framework acknowledges that evaluations tend to have two main purposes or functions – a constructive function, with a view to making improvements to a program; and a conclusive function, where an overall judgement of the program’s merit is formed.  He also noted that evaluations can be split across program stages – the process phase, where the focus is on implementation; and the outcome phase, where the focus is on the impact the program has had.

The four domains are shown below:

  • Constructive process evaluation – provides information about the relative strengths and weaknesses of the program’s implementation, with the purpose of program improvement.
  • Conclusive process evaluation – judges the success of program implementation, e.g. whether the target population was reached, or whether the program has been accepted or embedded as business as usual (BAU).
  • Constructive outcome evaluation – explores various program elements in an effort to understand if and how they are contributing to outcomes, with the purpose of program improvement.
  • Conclusive outcome evaluation – provides an overall assessment of the merit or worth of the program.

This simple matrix can serve to remind us of the purpose of the particular evaluation work we are doing at any given time.  There are of course nuances – an evaluation may span neighbouring domains, or transition from one domain to another – but despite its simplicity, I have found it a useful tool for reminding me of the focus of the current piece of work or line of enquiry.

ARE WE NATURAL EVALUATORS?

Think about the last time you bought a car, chose where to live or decided which breakfast cereal to throw in the trolley (or in the online cart if you’re isolating like we have been for the past few days) … without overtly realising it, you quite possibly followed the logic of evaluation.

For those interested, you can read more about Michael Scriven’s work on the logic of evaluation here, but to paraphrase, evaluative thinking has four key steps:

  1. Establishing criteria of merit
  2. Constructing standards
  3. Assessing performance
  4. Evaluative judgement

Let’s apply this to buying your next car. 

Establishing criteria of merit

What criteria are important to consider in buying a car?  Maybe we have a firm budget, so any car we’re going to consider will need to perform well against that criterion.  Maybe we’re climate conscious, so we will place more value on a car that has a better emission rating.  Maybe our family has just expanded, so we need to find a car that will accommodate a minimum number of people.  Maybe we don’t like white cars, maybe we’re looking for a car with low kms, maybe we’re only interested in cars made by a certain company, because we believe them to be a safe and trustworthy company.

The list of possible meritorious criteria is extensive, and quite dependent on who needs to make the evaluative judgement.  Two people shopping for a car at the same time may have quite different criteria.

Constructing standards

This is where we define how well any potential car needs to perform against the various criteria we identified earlier.  If our budget is a firm $10,000, then any car that comes in over budget is not going to score well against that criterion.  Some of us may even write a list of our must haves and nice to haves, and include our standards in that list.  Maybe our next car must be within our budget, must be able to carry five people, and must be no more than 10 years old.  We’d also, if possible, like to find a purple car that’s located within 50km of where we live, so we don’t have to travel too far to collect it, and that has had fewer than two previous owners.  We now have a list of criteria that are important to us, and we have set standards against each of those criteria.

Assessing performance

Now’s the point at which we assess how any contenders stack up.  Regardless of whether we’re scrolling through an online marketplace for cars, browsing various websites, or physically walking through car yards – we are assessing how each car we review compares to our defined criteria.  Maybe we find a purple five-seater car within budget, but it’s 200km away.  Maybe we find a blue five-seater car within budget and within 50km, but it’s 15 years old.  In the same way that our programs, policies, products and people don’t perform ideally against all criteria – our potential cars are the same.  We will take note in our heads, or some of us will actually take notes in a notebook or prepare a spreadsheet, where we track how each car performs.  We may find that one or two of the criteria we determined were important to us were simply unable to be assessed because data wasn’t available.  We should learn from this, and perhaps rethink our criteria and/or standards for next time.

Evaluative judgement

Now we need to make our evaluative judgement, which will inform our decision.  Which car performed the best?  Often not all criteria are equally important, and we apply different weightings.  Maybe we determine that blue is pretty close to purple, so despite the car not being the right colour, it’s still a contender.  We need to determine how we will synthesise all we have learned from our evaluation activities, and make an evaluative judgement. (For those interested in more about how to integrate or synthesise multiple criteria, Jane Davidson has a nice chapter on Synthesis Methodology in her book Evaluation Methodology Basics).
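
As a very rough sketch of that synthesis step – in Python, with made-up criteria, weights and scores, and a simple weighted sum standing in for the more careful synthesis approaches Davidson describes – it might look something like this:

    # A hypothetical weighted-sum synthesis for the car example.  Criteria, weights
    # and scores are invented; in practice, "must have" criteria would normally rule
    # a car out entirely rather than just lower its score.

    weights = {"within_budget": 0.4, "seats_five": 0.3, "colour": 0.1, "distance": 0.1, "age": 0.1}

    cars = {
        "purple five-seater, 200km away": {
            "within_budget": 1.0, "seats_five": 1.0, "colour": 1.0, "distance": 0.2, "age": 0.8,
        },
        "blue five-seater, 15 years old": {
            "within_budget": 1.0, "seats_five": 1.0, "colour": 0.7, "distance": 1.0, "age": 0.2,
        },
    }

    # Each car's overall judgement is the weighted sum of its scores against the criteria.
    for car, scores in cars.items():
        total = sum(weights[criterion] * scores[criterion] for criterion in weights)
        print(f"{car}: {total:.2f}")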

We often work through these steps fairly quickly in our heads, especially if the decision is about what cereal to buy, or what to have for dinner.  Are all the kids going to be home? Does one child have a particular allergy or dietary preference? Is the budget tighter this week because we needed to spend extra on fuel?  We might not go to the effort of making a spreadsheet, or getting evaluation support to make these decisions… but we do employ some logical evaluative thinking more often than we realise.