01 August 2022

Problems requiring cross-agency work are complex, and these initiatives need to adapt and improve over time.

Evaluation supports this by telling us what is working, what is not, and what opportunities there are for improvement.

Find details for references in this guidance: Evaluating collaborative initiatives — further resources

  • Start early

    Evaluation planning should begin at the outset of the initiative, or before. Shared measures need to be agreed early in the initiative’s life, so that progress can be tracked consistently across the participating agencies.

  • Establish roles and resources

    In evaluations of cross-agency initiatives, it can be challenging to manage the different demands of the involved agencies. Up-front clarity about who does what and how decisions will be made can prevent misunderstandings later on.

    Understand stakeholders and determine how to evaluate jointly across the involved agencies

    Evaluations of cross-agency initiatives require shared decisions about the evaluation planning, management, scope, team composition, methodology, and reporting. There are various ways of doing this. The United Nations Evaluation Group Task Force on Joint Evaluation (2013) identifies 3 main types of joint evaluations.

    • Classic: all agencies participate and contribute on equal terms.
    • Qualified: participation only by a core group of agencies.
    • Framework: agencies agree on a common evaluation framework, and each agency does its own evaluation, followed by a synthesis across the evaluations.

    Establish decision-making processes

    Governance and management responsibilities should be established at the outset, including responsibilities for day-to-day management, and feedback on and sign-off of deliverables. At a minimum, the evaluation should have a steering committee and an evaluation management group.

    Decide who will undertake the evaluation

    You may use a team that is external to the initiative, or a team made up of people from the involved agencies. Consider:

    • where sufficient evaluation capability is present
    • whether the evaluation will be seen as more trustworthy if it is carried out independently
    • what contracting arrangements will be manageable given the complexity and longer timeframes associated with evaluations of joint initiatives.

    Secure resources

    Decide which agencies will provide cash or in-kind contributions, and how costs will be apportioned. Will one agency administer the funds on behalf of all, and if so, according to what rules, and with what kind of control and audit?

  • Define what is to be evaluated and why

    Evaluation purpose

    The evaluation purpose should describe how the evaluation findings will be used. Evaluations often have one or more of the following purposes.

    • to inform decision-making aimed at improvement
    • to provide lessons for the wider sector
    • to inform decision-making aimed at selection, continuation or termination of initiatives.

    Describe the initiative

    Develop an initial description of the cross-agency initiative, including the activities it involves, the needs it addresses, and the involved parties.

    Cross-agency initiatives are often implemented in different locations, by different organisations, using different interventions. These are referred to as ‘multi-project programmes’ (see, for example, Buffardi and Hearn, 2015).

    You should describe the extent of the devolved delivery, and the expected coordination.

    Use an intervention logic

    You should develop an intervention logic diagram of the inputs, activities, outputs and intended outcomes. You can start to develop it by filling in a table like this.

    Inputs

    What you invest, for example:

    • time
    • money
    • equipment.

    Activities

    What you do to plan and implement the initiative, for example:

    • policy development and planning
    • training
    • delivery of services
    • development of resources.

    Outputs

    What you produce, for example, tangible products and services such as:

    • policy papers, guidance documents, other published deliverables
    • new, improved or expanded services delivered to clients
    • established cross-agency working processes.

    Outcomes (short–medium term)

    Change expected to be achieved in 1–4 years, for example:

    • improved awareness of the issue
    • greater access to services among the target population
    • improved staff capabilities
    • improved short-term health, social, economic, cultural outcomes for service users.

    Outcomes (long term)

    The national or community-level changes expected to be achieved in 5+ years, related to the cross-agency initiative’s mission.

    For multi-component programmes, developing an intervention logic can be challenging, but with careful thought it’s usually possible to identify common features, grouping by type of activity, type of delivery organisation, or type of output or outcome. This was done for the evaluation of the Prime Minister’s Youth Mental Health Project (Example 1).

    The intervention logic covers intended outcomes. You should supplement it by working with stakeholders to identify possible unintended results (positive and negative). 

  • Example 1: Evaluation of the Prime Minister’s Youth Mental Health Project (YMHP), New Zealand

    This evaluation assessed short- to medium-term progress towards outcomes for the YMHP, which was a multi-component programme. Its intervention logic integrates across the programme’s components, and groups features of the initiative into 3 areas:

    • policy and governance by central government
    • service delivery by schools, health or social services
    • information or resources by various entities.

    Find the YMHP Logic Model on page 99 of The Prime Minister’s Youth Mental Health Project Summative Report 2016 — Social Wellbeing Agency

  • Frame the boundaries of the evaluation

    Key evaluation questions

    The evaluation questions need to take into account the length of time that the initiative has been running and the purpose of the evaluation. Some common questions are:

    • how well has the cross-agency work been implemented?
    • what improvements can be made to the implementation?
    • what outcomes have resulted from the cross-agency work?
    • what value for money has the cross-agency work provided?
    • what aspects worked well or less well, for whom, and under what circumstances?
    • what can be done to improve effectiveness or value for money?
    • does the cross-agency work address a demonstrable need?

    Cross-agency initiatives tend to be complex, and it takes time to generate outcomes. The summary below identifies 3 phases in initiative development, and describes the evaluation needs at each (adapted from Preskill, Parkhurst and Splansky Juster, 2014).

    Early–middle years

    • Stage of development: The initiative is in development.
    • What’s happening? Partners are assembling key elements of the initiative, developing action plans, and exploring strategies and activities. There is some uncertainty about what will work and how. New questions, challenges, and opportunities are emerging.
    • Strategic question: What needs to happen?
    • Evaluation focus: Help agencies understand how the initiative is developing and what context it is working in. Develop shared performance measures to track progress.

    Middle years

    • Stage of development: The initiative is being refined.
    • What’s happening? Key elements are in place and partners are implementing agreed strategies and activities. Outcomes are becoming more predictable. The context is increasingly well-understood.
    • Strategic question: How well is it working?
    • Evaluation focus: Understand whether the initiative has achieved its intermediate outcomes (changes in practice, behaviour, or how systems operate), using the shared performance measures.

    Late years

    • Stage of development: The initiative is stable and well established.
    • What’s happening? Activities are well established and not changing. Implementers have significant experience and increasing certainty about ‘what works’. The initiative is ready for a determination of impact, merit, value, or significance.
    • Strategic question: What difference did it make?
    • Evaluation focus: Understand whether the initiative has achieved its long-term outcomes.

    Criteria for measuring success

    For each evaluation question, you need to determine what level of achievement constitutes success and how you will judge whether it has been achieved.

    For example, if one of the objectives of the initiative is to improve health outcomes and increase health equity for Māori:

    • what health outcomes are important (for example, in a particular locality where cross-agency work is being focused) and can be measured reliably? Because health outcomes can be slow to change, access to health-promoting initiatives (such as healthy housing), preventative services (such as GPs and community health services) or treatment services may need to be used as a proxy in the short term.
    • what level of improvement constitutes ‘better’? For example, is a 5% reduction in the incidence of a health problem a good outcome, or is a bigger improvement desired?
    • has there been an increase in access to services, improvement in experience of those receiving services, and better outcomes, including for Māori and groups with poorer health outcomes?
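
One way to make a success criterion explicit before data collection is to write it down as a simple calculation. The Python sketch below is illustrative only: the baseline and follow-up rates and the agreed 5% threshold are hypothetical figures, not results from any real initiative, and the function names are made up for this example.

```python
def relative_reduction(baseline: float, follow_up: float) -> float:
    """Return the relative reduction from baseline to follow-up (0.05 means 5%)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - follow_up) / baseline


def meets_criterion(baseline: float, follow_up: float, threshold: float) -> bool:
    """Return True if the relative reduction is at least the agreed threshold."""
    return relative_reduction(baseline, follow_up) >= threshold


# Hypothetical figures: incidence per 1,000 people before and after the initiative.
baseline_rate = 40.0
follow_up_rate = 37.0
agreed_threshold = 0.05  # assumption: the agencies agreed that 5% counts as success

reduction = relative_reduction(baseline_rate, follow_up_rate)
print(f"Relative reduction: {reduction:.1%}")
print("Criterion met:", meets_criterion(baseline_rate, follow_up_rate, agreed_threshold))
```

Writing the threshold down in this way does not settle whether 5% is the right level. That remains a judgement for the involved agencies and the population the initiative supports.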

    The evaluation needs to be informed by the kaupapa of the initiative, including values and frameworks developed by the population that the initiative supports. Examples of these are the Enabling Good Lives principles for disability and the strengths-based Māori intervention logic applied to E Tū Whānau.

    Another objective might be to improve how agencies work together, share resources and support each other to achieve mutual goals. Different agencies may have different views on what success looks like and how it should be measured. Shared measures of success should be developed and agreed by the involved agencies. Shared measurement is covered in more detail below.

  • Describe what has happened

    What kinds of data to collect

    Cross-agency initiatives are diverse, and there is no universal recipe for what data you should collect and how. Decisions about data should consider the evaluation’s purpose and key questions, and the nature of the initiative. Data sources can include documents, existing datasets (such as national statistics or administrative data), and interviews or surveys with people involved in the initiative. Preskill, Parkhurst and Splansky Juster (2014) suggest some indicators of effective cross-agency work.

    Many cross-agency initiatives now use anonymised service and outcomes data from Stats NZ’s Integrated Data Infrastructure to support initiative design (for example by developing case histories of individuals’ interactions with different agencies and their outcomes) and track results. It is important to disaggregate data, where possible, to understand the difference an initiative is making for different groups, and to bring those views into the evaluation data to inform findings and to identify more nuanced areas for improvement. 
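
To show why disaggregation matters, the Python sketch below compares an overall service access rate with the rate for each group. The records and the labels ‘Group A’ and ‘Group B’ are hypothetical and stand in for whatever anonymised data and population groups an evaluation actually uses.

```python
from collections import defaultdict

# Hypothetical anonymised records: (population group, whether the person accessed the service).
records = [
    ("Group A", True), ("Group A", True), ("Group A", False), ("Group A", True),
    ("Group B", True), ("Group B", False), ("Group B", False), ("Group B", False),
]


def overall_access_rate(rows):
    """Return the share of all records where the service was accessed."""
    return sum(1 for _, had_access in rows if had_access) / len(rows)


def access_rate_by_group(rows):
    """Return {group: share of that group's records where the service was accessed}."""
    totals = defaultdict(int)
    accessed = defaultdict(int)
    for group, had_access in rows:
        totals[group] += 1
        if had_access:
            accessed[group] += 1
    return {group: accessed[group] / totals[group] for group in totals}


overall = overall_access_rate(records)
by_group = access_rate_by_group(records)
print(f"Overall: {overall:.0%}")
for group, rate in sorted(by_group.items()):
    print(f"{group}: {rate:.0%}")
```

In this made-up data the overall rate hides a large gap between the two groups, which is the kind of difference that disaggregated reporting is meant to surface.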

    Shared measurement

    A set of shared measures should be set up early in the life of the initiative. Shared measures are crucial to allow the evaluation to draw overall conclusions and assess overall value.

    Shared measurement involves the use of a common set of measures to monitor performance, track progress towards outcomes, and learn what is and is not working in the group’s collective approach. The issues in doing this are discussed by Kania, J. and Kramer, M. in Collaboration for Impact (2011).

    Shared measures can include information on inputs, outputs, outcomes, and context. A shared measurement system should include the following features.

    • Agreement on shared outcomes: the agencies should have shared goals and should measure progress towards them using the same tools. This can still allow some flexibility for agencies to choose the measures that are most relevant to them.
    • Consistent methods: use consistent methods when measuring. This means consistent ways of collecting, analysing, and reporting data. Confidentiality and transparency should be addressed when these methods are developed.
    • An ability to compare: agencies should be able to compare their results to those of similar agencies. This helps them to understand context and to learn which approaches are more effective. It may involve technology such as platforms for data aggregation.

    Case study approaches for multi-component programmes

    For some multi-component programmes, there may be so many individual interventions that not all can be evaluated in depth within budget. You can use a case study approach, selecting a subset of the initiatives for in-depth examination. There are 2 main approaches to selecting case studies, depending on what you want to learn.

    • You can choose cases that are typical of the wider programme, to understand what the wider programme is achieving.
    • You can choose cases that are not typical (that did things differently, or that were more or less successful than average) to find factors associated with success or new ways of working that the wider programme can learn from.
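
The 2 selection approaches above can be sketched in Python. This is a simplified illustration, assuming each project has a single hypothetical outcome score and that ‘typical’ means close to the median score. In practice, selection would also weigh qualitative factors, such as whether a project did things differently.

```python
from statistics import median

# Hypothetical composite outcome scores for each local project.
project_scores = {
    "Project 1": 52, "Project 2": 47, "Project 3": 90,
    "Project 4": 50, "Project 5": 15, "Project 6": 55,
}


def typical_cases(scores, n):
    """Pick the n projects whose scores are closest to the median (ties broken by name)."""
    mid = median(scores.values())
    return sorted(scores, key=lambda name: (abs(scores[name] - mid), name))[:n]


def atypical_cases(scores, n):
    """Pick the n projects whose scores are furthest from the median (ties broken by name)."""
    mid = median(scores.values())
    return sorted(scores, key=lambda name: (-abs(scores[name] - mid), name))[:n]


print("Typical:", typical_cases(project_scores, 2))
print("Atypical:", atypical_cases(project_scores, 2))
```
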

  • Understand attribution of outcomes

    When analysing outcomes from the cross-agency initiative, you need to look for evidence that the outcomes were caused by the initiative. For example, if offending decreased in regions where cross-agency initiatives worked to reduce offending, did the cross-agency initiatives contribute, or would the decrease have happened anyway?

    There are 3 main approaches to assessing attribution:

    • Randomly assign people or organisations to participate in or not participate in the initiative, and compare outcomes between those who did and did not participate.
    • Compare the outcomes of participants to the outcomes of a group of non-participants who are as similar as possible.
    • Compare what you observe with what you would expect if the initiative contributed to the outcomes.
      • Look at timing and intermediate steps. For example, did the reduction in offending occur after establishment of local initiatives that could have influenced that?
      • Investigate alternative explanations. For example, is the reduction in offending better explained by wider socio-economic changes, or by other initiatives in the regions?
      • Explore the perceptions of participants and well-informed observers (for example, community leaders) about the extent to which the initiative contributed.

    For aspects of some cross-agency initiatives, you may be able to find a comparison group.

    • An intervention that delivers services to clients can compare clients’ outcomes with outcomes for a similar group of people who were not clients.
    • When interventions are rolled out only in some regions, you can compare outcomes between participating and non-participating regions, with careful accounting for other regional differences.
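
One common way to compare participating and non-participating regions is a difference-in-differences calculation. This guidance does not prescribe that technique, and it is shown here only as an illustration. The Python sketch below uses hypothetical offending rates and assumes the 2 sets of regions would otherwise have followed similar trends, which an evaluation would need to check.

```python
from statistics import mean

# Hypothetical offending rates per 1,000 people, before and after rollout.
participating = {"before": [30.0, 28.0, 32.0], "after": [24.0, 23.0, 25.0]}
comparison = {"before": [29.0, 31.0, 30.0], "after": [27.0, 29.0, 28.0]}


def change(group):
    """Return the change in the average rate from before to after."""
    return mean(group["after"]) - mean(group["before"])


def difference_in_differences(treated, untreated):
    """Return the change in treated regions minus the change in untreated regions."""
    return change(treated) - change(untreated)


estimate = difference_in_differences(participating, comparison)
print(f"Change in participating regions: {change(participating):+.1f}")
print(f"Change in comparison regions: {change(comparison):+.1f}")
print(f"Difference-in-differences estimate: {estimate:+.1f}")
```

A negative estimate here means the rate fell further in participating regions than in comparison regions. It does not by itself rule out other regional differences, which still need careful accounting.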

    But the comparison group approach is not usually feasible for outcomes that can’t be counted, or for initiatives that are rolled out nationally. And it will not tell you what the cross-agency work contributed, over and above a less coordinated approach. You need to investigate whether the evidence is consistent with what you would expect if the initiative contributed to outcomes. This approach was used in an evaluation of the New Zealand Results Programme (Example 2). You can also use the ‘contribution analysis’ framework to investigate attribution without a comparison group (Better Evaluation, no date; Mayne, 2008).

    Sometimes the partners in cross-agency work want to identify which outcomes are attributable to which agencies. This is rarely possible, but the issue should be raised early, so that the agencies can agree on an approach. Less integrated components of the initiative may be able to use the methods described above to look at attribution to individual agencies. But for work that is fully integrated across agencies, you can only state which agencies contributed and what their inputs were. 

    For example, where health issues prevent people from being employed, health agencies may work with the Ministry of Social Development (MSD) to assist that part of the population. While it is possible to measure what contribution the health agencies have made, other factors such as the skills of the MSD case managers will affect whether mitigating the health issues leads to employment for those people.

  • Example 2. Case Study — New Zealand’s Results Programme

    This is an evaluation that assessed the effectiveness of a new approach to cross-agency work. It provides a good example of investigating attribution by assessing the evidence for alternative explanations.

    In 2012, the New Zealand Government created interagency performance targets. Ministers chose 10 cross-cutting problems, set a 5-year target for each, and then held the leaders of relevant agencies collectively responsible for achieving the targets.

    A synthesis of the findings from 4 evaluations of the Results Programme found that positive results were evident for each of the targets. But this alone did not show that the Results Programme was responsible. The evaluators considered 6 alternative explanations for the improvements and assessed the strength of the evidence for each:

    • Improvements represent some sort of natural progression.
    • Improvements resulted from increased attention (government can only focus on a limited number of problems, and the 10 may have improved purely because they were selected).
    • Improvements were due to the delayed effects of prior actions.
    • Measures are imperfect and may not be good reflections of the underlying problems.
    • Measures were ‘gamed’, so that they improved but the underlying problem did not.
    • Measures improved due to additional resources provided.

    They found that the alternative explanations could account for part of the observed changes in some of the results, but that none accounted for all of the improvements. They concluded that the Results Programme contributed to the improvement in the 10 areas.

    Report: Scott & Boyd (2017)

  • Synthesise findings

    Your synthesis should be shaped by the evaluation purpose, collating and interpreting findings so that decisions can be made. Synthesis should involve:
    • assessment of effectiveness or implementation against the success criteria
    • assessment of value for money, cost benefit, or cost effectiveness (this could use tools such as CBAx, from The Treasury)
    • describing what has worked and why, and what can be improved
    • generalisation of findings into lessons for the wider sector.

    Weighing up the overall benefits across the interventions can be difficult, especially for multi-component programmes that encompass diverse interventions. The evaluation of the YMHP (Example 1) demonstrates how to synthesise findings across multiple interventions. Remember that some multi-component programmes include the goal of experimenting with different approaches to find out what works; less successful projects can be beneficial if they provide lessons for the wider programme on what works and what does not.

  • Report on and support the use of findings

    Reporting and dissemination

    The involvement of multiple agencies can add complexity to reporting. Different agencies may have different views about what should be reported and how, and gaining support from all stakeholders can sometimes be at odds with providing free and frank advice. There are strategies that can help with navigating these competing demands.

    • Agree in advance who will sign off the report (for example, this could be the steering committee), and develop principles of transparency in reporting.
    • Develop a consultation strategy that identifies who to consult, about what, and what obligation the evaluation has to respond to their feedback. Some agencies may be consulted for fact-checking only.
    • Decide whether there will be a combined report, or separate reports for each agency.
    • Build in time to deal with comments on the draft report that are wide ranging and contradictory in nature.

    Supporting use of the findings

    Interaction with users throughout the evaluation can focus both on helping them to understand the findings and on supporting them to decide how to respond.

    Evaluation findings are harder to use when they do not fit easily within the remit of individual agencies. This issue can arise in evaluations of cross-agency initiatives. Support the use of the findings by working with the agencies to develop a joint follow-up action plan.

    Publish the report

    We strongly encourage publication of the evaluation findings as a way to propagate lessons across the public sector, supporting the development of a better, stronger system.