Realist evaluation is a form of program evaluation introduced by Pawson and Tilley in their 1997 book Realistic Evaluation (Sage). It focuses on describing the context, mechanism, and outcome (CMO) configurations of programs. It is a fairly popular approach, with hundreds of realist evaluations published in peer-reviewed journals since its introduction.
In 2020, Sebastian Lemire and colleagues published an important paper alleging that the center of the CMO configuration, mechanism, was poorly defined, with several incompatible definitions circulating across multiple papers by Pawson and Tilley, whose conception of mechanism allegedly changed during the decade following their original book. Lemire and colleagues also showed, through a review of published realist evaluations, that the semantic confusion wasn't limited to Pawson and Tilley but had spread to the literature as well. In a word, the evaluators who care most about mechanisms appeared to be deeply divided, or confused, about what a mechanism actually is. This would be bad news for realist evaluation and perhaps for evaluation theory as a whole.
In their review of Pawson and Tilley, Lemire and colleagues identify three separate definitions of "mechanism." Let's walk through each definition briefly with an example.
Program Components as Mechanisms
In their 1997 book, Pawson and Tilley define mechanisms as program components, the "ideas and opportunities" presented to participants. These are program components in the sense that they are the sort of things that would be listed in a logic model as an activity or process. For example, in an opioid treatment program, this sort of mechanism might include regular counseling and Suboxone.
Participant Reactions as Mechanisms
In a later publication in 2004, the authors define mechanisms as the ways that participants respond to the program, that is, the "reasoning and responses" of participants to program activities. In the opioid treatment program example, this version of mechanism would correspond to the cognitive changes that participants undergo while engaging in counseling and taking Suboxone as directed.
Explanatory Accounts as Mechanisms
Then, in 2008, Pawson and Tilley stated that mechanisms are not variables, but accounts of the interrelated processes that lead to outcomes. Mechanisms are causal stories, such as the story "When people who are addicted to opiates are provided with Suboxone and counseling, they are more likely to quit using drugs."
Plausibility and spread
The appeal of the first definition of mechanism, as program component, is that it fits easily into our existing representations of programs, such as logic models. This version of mechanism often corresponds to the things in the "Actions" or "Processes" column of the logic model. The mechanism is an identifiable verb phrase like "Provide counseling." The second definition of mechanism seems right because it looks more specific. In programs that seek to change human behavior and that depend on the agency of participants, the mind is the locus of the change. It would seem strange not to identify it as the mechanism of the change as well. The appeal of the third definition of mechanism (as an account) is that it takes some of the pressure off of the task of identifying a precise mechanism and allows us to describe processes more generally. It also makes fewer philosophical demands, such as the demand that our causal account be of a real process. Whether or not we have certainty about mechanisms, it is always safe to venture an "account."

It turns out that about a third of the evaluations studied by Lemire and colleagues operated according to the first definition of mechanism and about another third operated according to the second definition of mechanism. About a quarter employed the idea of a general "underlying" latent cause. Sometimes realist evaluators also invoked the third definition of mechanism or another definition, but these were relatively rare. The field seems to be divided among mechanisms as components, participant reactions, and general unobserved causes. Lemire and colleagues conclude with the admonition that realist evaluators should think more carefully about mechanisms. They don't say much about what they think mechanisms actually are.
How should we define mechanism?
It would be easy to deliberately blur the lines around mechanisms and marvel at the difficulty of locating just one process (or entity) that produces outcomes in programs. Causal chains can get very complicated. In hospitals, patients are constantly dying of pneumonia because their immune system was weakened by another condition, which was acquired due to lifestyle factors, to which they were socially predisposed by virtue of political events that occurred before they were born. Cause of death: pneumonia? This is a major problem for the CMO configuration, since no sharp distinction can be drawn between "context" and an "event" such as an intervention. Most consequential programs work by reshaping whole networks of humans and nonhumans. But just because causation is complicated doesn't mean we can give up on the idea of mechanism.
Mechanisms are one kind of cause. In particular, they are the causes that, if different, would have prevented the effect from occurring (at a particular magnitude) under exchangeable conditions. If we discover that, for a specific population, counseling with Suboxone is just as effective as counseling without Suboxone, then Suboxone is not the mechanism of the program (see the note at the end). This definition of mechanism requires counterfactual reasoning, that is, reasoning about what would have happened if we had done something different under exchangeable conditions. When we compare the first two definitions of mechanism in the realist account to this counterfactual definition, we notice immediately that they both suffer from the same problem: they are insufficiently specific. Only some program components will actually turn out to be necessary when tested under exchangeable conditions. Only some participant reactions will turn out to matter relative to others. (Think of the difference between a program that works by changing a participant's mindset and one that works by getting participants into a routine.) The definition of mechanisms as accounts is not even in the right ontological ballpark, since our explanatory accounts can change dramatically without having any effect on the outcome of the program; black box programs work just fine all the time.
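To make the counterfactual test concrete, here is a minimal Python sketch of the Suboxone comparison. Every number in it is invented for illustration: in this toy world, counseling alone drives cessation and Suboxone adds nothing, so the two arms come out roughly equal and, by the counterfactual definition, Suboxone is not a mechanism for this hypothetical population.

```python
import random

random.seed(0)

def simulate_participant(receives_suboxone: bool) -> bool:
    """Hypothetical cessation model for one participant (made-up numbers)."""
    base_rate = 0.35          # hypothetical effect of counseling alone
    suboxone_bonus = 0.0      # set > 0 to make Suboxone a genuine mechanism
    p = base_rate + (suboxone_bonus if receives_suboxone else 0.0)
    return random.random() < p

def cessation_rate(receives_suboxone: bool, n: int = 10_000) -> float:
    """Simulate n participants under exchangeable conditions."""
    return sum(simulate_participant(receives_suboxone) for _ in range(n)) / n

print(f"counseling + Suboxone: {cessation_rate(True):.3f}")
print(f"counseling alone:      {cessation_rate(False):.3f}")
# Roughly equal rates: by the counterfactual definition, Suboxone is not
# the mechanism in this hypothetical population.
```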
One might say that the "exchangeable conditions" I mentioned above are doing a lot of work in my definition of mechanism, and indeed they are. What counts as a mechanism is partly a matter of the framing of the causal story. One way to decide which conditions are the "background" exchangeable ones is to notice which features of the situation are unlikely to change if we don't intervene. In their 1997 book, Pawson and Tilley give us some helpful conceptual machinery in their notation for changing context. If the context remains the same between T1 and T2 (i.e., pre and post) but the desired outcomes occur, then it is more straightforward to attribute the outcome to the firing of the mechanism. However, if the context changes as well, such that T1 occurs under C1 and T2 occurs under C2, then we are going to have a hard time distinguishing the effect of the mechanism from the causal influences of the new context. For example, imagine the substance abuse treatment program above trying to test the importance of Suboxone if, during the experimental period, the price of street drugs climbs sharply due to supply chain issues. The context has changed from T1 to T2, and we can't be sure whether people are decreasing their drug use because of treatment or because they can't afford fentanyl anymore. If we are concerned about blocking this possible alternative mechanism, we can check to see whether the price of street fentanyl has changed and, if it hasn't, we can rule out this alternative cause (a question which, I admit, I am glad isn't my job to personally investigate).
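As a rough illustration of that confounding, the sketch below simulates a pre/post comparison in which the context shifts between T1 and T2. The effect sizes are again made up: the naive pre/post drop bundles the hypothetical treatment effect together with the price shock, and only a comparison that holds the new context fixed recovers the mechanism's contribution on its own.

```python
import random

random.seed(1)

def still_using(treated: bool, street_price_high: bool) -> bool:
    """Toy model of continued drug use; effect sizes are made up."""
    p = 0.70
    if treated:
        p -= 0.15             # hypothetical effect of the treatment mechanism
    if street_price_high:
        p -= 0.20             # hypothetical effect of the changed context (C2)
    return random.random() < p

def use_rate(treated: bool, price_high: bool, n: int = 10_000) -> float:
    return sum(still_using(treated, price_high) for _ in range(n)) / n

t1 = use_rate(treated=False, price_high=False)    # T1 under context C1
t2 = use_rate(treated=True, price_high=True)      # T2 under context C2
naive_drop = t1 - t2                              # mixes mechanism and context
mechanism_only = use_rate(False, True) - t2       # holds the new context fixed

print(f"naive pre/post drop in use: {naive_drop:.3f}")      # roughly 0.35
print(f"mechanism effect alone:     {mechanism_only:.3f}")  # roughly 0.15
```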
The question of whether realist evaluation is truly lost because there are different definitions of "mechanism" circulating in the literature is an interesting one. Pawson and Tilley certainly appear to know what a mechanism is in their 1997 book; they produce dozens of examples of mechanisms that fit my definition above (a kind of cause that, if different, would have prevented the effect from occurring under exchangeable conditions). In their extended discussion of mechanisms in their original work, they spill substantial ink explaining how to go about hypothesizing and then testing different mechanisms for programs. The main point is about grain size: the "program" is not a mechanism; the program contains mechanisms. This is a good point, and evaluators should take note to the extent that they aspire to explain how any program actually works.
Taking a pragmatic (in the philosophical sense) stance, we might say that semantic confusion doesn't matter as long as evaluators point to the correct mechanisms. However, we shouldn't be surprised if these definitions lead evaluators to disagree about what counts as a mechanism. In the long run, these disagreements could substantially water down the precision of realist evaluation until it resembles an exercise in listing program activities, speculating about the mental states of participants, or assembling plausible causal stories. If the program's stakeholders assert in interviews that the mechanism at work is Suboxone treatment, or if the participants taking Suboxone tell us that "it helps," should these findings take precedence over a controlled trial in which we find no difference between the Suboxone and no-Suboxone conditions for a specific population? An understanding of mechanisms as "ideas and opportunities" or as the mental states of participants could lead us to put more weight on these findings, while my definition of mechanism would lead us to conclude that Suboxone is not the mechanism by which participants quit drugs. How should we treat a plausible causal story that coheres nicely and appears to "explain" the outcome when evidence is lacking? If we think that mechanisms are accounts, we might end up calling the story the mechanism, but if we think mechanisms are specifically those causal processes without which the outcome would not occur (at a given magnitude) under exchangeable conditions, then a story is never enough. While I think that Pawson and Tilley would be more likely to endorse my definition of mechanism than any of these pretenders, I've learned never to underestimate the drift that can occur as definitions become shorthand.
In other words, there is no practice that can't be undermined by its own theories.
Note: Suboxone is an evidence-based treatment that has been shown to increase the likelihood of opiate cessation. I'm just using it as a hypothetical example for a hypothetical population, such as "women between 16 and 20 years old."