At the 2024 meeting of the American Evaluation Association, I had the opportunity to spend a whole day in a workshop on cost-inclusive evaluation taught by Dr. Brian Yates. The main argument of the workshop was that pretty much every evaluation should strongly consider including data on costs. Yates has argued over the years that there are several compelling reasons for this in program evaluation.
Information about costs is critical for securing funding and ensuring the long-term sustainability of programs. In an environment of limited resources, we should prioritize interventions that provide the most value for money. Cost-inclusive evaluation provides decision-makers with the evidence they need to allocate funds effectively. This is not confined to “go, no-go” decisions about programs - it also helps optimize resource allocation within programs. Yates provided examples of programs that were optimized substantially by simply changing their method of delivery – think of the cost savings of telehealth versus in-person office visits. We want to understand the full impact of interventions, including both direct and indirect costs, as well as monetary and non-monetary benefits. Finally, perhaps the most obvious reason for cost-inclusive evaluation is that it promotes transparency and accountability. This transparency increases trust in evaluation findings and promotes use.
All evaluators have their own version of the history of evaluation. We are a young field (the American Evaluation Association is only 39 years old!) so this history is still very contentious. You can learn a lot by listening to how seasoned evaluators like Brian Yates tell this story. In Yates’ version, evaluation started out mainly measuring the costs of setting up programs. Over the years this gave way to a broader view that sought to measure activities, processes, outcomes, and impacts. Today most evaluations focus on these latter parts of logic models. Many program heads and funders want to establish that their program has any positive outcomes at all before they will consider costs. However, Yates argues that it is time for evaluators to spend more time thinking about costs again, without dropping the measurement of activities and outcomes. In fact, measuring activities and outcomes along with costs is what Yates recommends to get a complete picture of what is going on.
At AEA this year, one could hear several presenters lamenting that the main output of cost-benefit analyses, the benefit-cost ratio (BCR), is too simplistic because it boils the entire evaluation down to a single number. Add up all the monetary benefits, add up all the monetary costs, apply discounting for monetary transactions that will occur in the future, and divide benefits by costs. If the number is greater than one, the program was “worth it.” Hidden in the BCR (or ROI), Yates explained, was a considerable amount of valuable data that could be used to improve the evaluand. How much particular activities or particular outcomes cost to produce is the lever that planners need. Yates explained that many programs turn out not to be cost-effective because they require too much administrator time. The time of highly paid staff can be reduced with improvements to workflows. For example, if your program can devolve authority for certain decisions to the lowest level possible, empowering frontline workers, this could make the difference between a program that pays for itself and one that is a net drain on resources. This is why “train the trainer” models are such a good value.
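To make that arithmetic concrete, here is a minimal sketch of a BCR calculation in Python. The yearly figures and the 3% discount rate are invented for illustration; they are not taken from Yates or from any real program.

```python
# Toy benefit-cost ratio (BCR) calculation with discounting.
# All dollar figures and the 3% discount rate are hypothetical.

def present_value(amounts_by_year, rate=0.03):
    """Discount a stream of yearly amounts back to present value."""
    return sum(amount / (1 + rate) ** year
               for year, amount in enumerate(amounts_by_year))

benefits = [0, 40_000, 55_000, 60_000]      # benefits arrive in later years
costs = [90_000, 20_000, 20_000, 20_000]    # large setup cost up front

bcr = present_value(benefits) / present_value(costs)
print(f"Benefit-cost ratio: {bcr:.2f}")     # greater than 1 means "worth it"
```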
At the workshop I began to wonder why I had been reluctant at times to include cost data in my own evaluations. Listening to Brian Yates, I realized that I had been putting too much emphasis on uncovering 100% of the monetary costs and benefits of the evaluand. Yates encouraged us to look for the primary drivers of cost rather than to obsess about counting every minor cost. In human services programs, the main driver of cost is staff time.
“It’s not the dime, it’s the time,” Yates said.
Yates cautioned us to be skeptical of doing cost accounting according to public budgets, since these are always wrong. Instead, we should be asking for real expenditures of the program. By digging into the costs of the program, we are bound to learn things that never would have come up otherwise. For example, when mental healthcare providers conduct home visits, do they go alone (as they would when seeing patients in the office) or do they bring an assistant along to babysit children while the visit is ongoing? Using two staff instead of one for the same task will come up when looking at hourly time logs, but it may never come to our attention if we are focused primarily on measuring outcomes.
If we have cost data, we can quantify the cost of each unit of an outcome, even if we don’t want to monetize those benefits. This is called cost-effectiveness analysis. For example, it can be useful to quantify the cost of “one opiate-free day.” Going in the other direction and quantifying the outcome in terms of monetary benefits is known as cost-benefit analysis, for example “$100 saved per participant.” The fact that these outcomes may sound more concrete than typical outcomes evaluation findings is both exciting and a little dangerous. In his published work, Yates (2023) cautions:
Although preceding a number with a currency symbol seems to instantly validate that number for too many readers, for example, “an average $5,255 was saved per patient following treatment,” these remain values of variables for which reliability and validity can be assessed. Accurate measurement of benefits, and costs, is especially important because errors in benefits and costs can affect intervention funding more than other forms of evidence, for example, effect size.
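The arithmetic behind such numbers is simple, which is part of what makes them feel so authoritative. Here is a toy cost-effectiveness calculation with entirely invented figures:

```python
# Toy cost-effectiveness calculation: cost per unit of a non-monetized outcome.
# Both totals are invented; real figures would come from expenditure records
# and outcome measurement, each of which carries its own measurement error.

total_program_cost = 250_000    # dollars actually spent on the program
opiate_free_days = 12_500       # outcome units produced across all participants

cost_per_outcome_unit = total_program_cost / opiate_free_days
print(f"Cost per opiate-free day: ${cost_per_outcome_unit:.2f}")   # $20.00
```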
Modeling uncertainty about our estimates of cost, for example by calculating the standard error of cost estimates, reveals that there are often much bigger variations in cost per program or costs per participant than we might expect. This uncertainty exists alongside our uncertainty about the program’s outcomes. A sensitivity analysis can explore the various combinations of best and worst case scenarios for the relationship between outcomes per person and costs per person. We might find that we end up paying more than we thought we would for less of an outcome we want, depending on how the chips fall. To drive home this inherent uncertainty, Yates advocates using a data visualization that I had never seen before, in which the x-axis shows costs and our uncertainty about them and the y-axis shows outcomes and our uncertainty about them. This data representation uses two dimensions to show our uncertainty, allowing us to see what costs cannot be distinguished from one another and what outcomes cannot be distinguished from one another.
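One way to get a feel for this is a simple Monte Carlo sensitivity analysis over cost per participant and outcome per participant. The means and standard errors below are hypothetical placeholders, not Yates’ data or method:

```python
# Toy Monte Carlo sensitivity analysis for cost per unit of outcome.
# Means and standard errors are hypothetical placeholders.
import random

random.seed(0)

COST_MEAN, COST_SE = 1_200.0, 200.0     # cost per participant (dollars)
OUTCOME_MEAN, OUTCOME_SE = 30.0, 8.0    # opiate-free days per participant

ratios = []
for _ in range(10_000):
    cost = random.gauss(COST_MEAN, COST_SE)
    outcome = max(random.gauss(OUTCOME_MEAN, OUTCOME_SE), 0.1)  # guard against division by ~0
    ratios.append(cost / outcome)

ratios.sort()
low = ratios[int(0.05 * len(ratios))]    # roughly the best-case end
high = ratios[int(0.95 * len(ratios))]   # roughly the worst-case end
print(f"Cost per opiate-free day, 5th to 95th percentile: ${low:.0f} to ${high:.0f}")
```

Even with modest uncertainty on each input, the plausible range of cost per outcome unit turns out to be wide, which is exactly why showing both dimensions of uncertainty matters.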
The next part is how Brian Yates officially knocked my socks off: he divided the outcome space to show the minimum acceptable outcome and the maximum allowable cost. This bit of logic accords with Scriven’s idea that there are certain criteria of merit that, if they are not met, cannot be compensated for by other criteria. Exceeding maximum available resources is always one such criterion and failing to generate certain minimum outcomes can (and should) be another. Here is what such a graphic looks like:
In this data visualization, until we add the lines representing our criteria of merit, Option A is what we choose if we are trying to minimize cost, Option C is what we choose if we are trying to maximize outcomes, and Option B appears as a compromise between the two. Once we add our criteria to the graph, we see that Option B is the only feasible option, since it is below our maximum allowable cost (x-axis) and above our minimum acceptable outcome (y-axis).
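A rough sketch of what such a graphic might look like, with invented positions, uncertainties, and criterion lines (real numbers would come from the cost and outcome estimates above):

```python
# Rough sketch of a cost-outcome plot with criteria of merit.
# Option positions, error bars, and the two criterion lines are invented.
import matplotlib.pyplot as plt

options = {
    # name: (cost, outcome, cost uncertainty, outcome uncertainty)
    "A": (400, 10, 80, 4),
    "B": (900, 25, 120, 5),
    "C": (1_600, 34, 200, 6),
}

fig, ax = plt.subplots()
for name, (cost, outcome, cost_err, outcome_err) in options.items():
    ax.errorbar(cost, outcome, xerr=cost_err, yerr=outcome_err, fmt="o")
    ax.annotate(f"Option {name}", (cost, outcome),
                textcoords="offset points", xytext=(8, 8))

ax.axvline(1_200, linestyle="--", label="Maximum allowable cost")
ax.axhline(20, linestyle=":", label="Minimum acceptable outcome")
ax.set_xlabel("Cost per participant ($)")
ax.set_ylabel("Outcome per participant (e.g., opiate-free days)")
ax.legend()
plt.show()
```

With these made-up numbers, Option A falls below the minimum acceptable outcome and Option C exceeds the maximum allowable cost, leaving Option B as the only feasible choice, mirroring the logic described above.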
The world according to Brian Yates currently looks like this: we have a lot of expensive, poorly scalable, but pretty effective interventions (e.g., most individual-level interventions like counseling) and a lot of inexpensive, scalable, but less effective interventions (e.g., mass media public health campaigns). When we focus mainly on outcomes, we favor the first kind of intervention. When we focus mainly on costs, we favor the second kind of intervention. We will never have enough resources to solve society’s problems using either of these kinds of interventions, because the first kind runs out of money and the second kind doesn’t move the needle far enough. By focusing on the relationship between costs and outcomes, he hopes, we can see more of a third kind of intervention: the inexpensive, effective, and scalable kind.
Yates, B. T. (2023). Cost-inclusive research on health psychology interventions: Why, how, and what next. Health Psychology, 42(3), 139.
Your assertion that we all want more benefit for less cost sounds very reasonable, yet is somehow not universal.
Some of your points are reminiscent of those made by Effective Altruists. Does utilitarianism play any role in your beliefs about the importance of analyzing program cost?