Low-Accountability Behavior
Evaluation's Adversary and a Public Menace in General
There is a term I’ve been using a lot recently that will probably be understood instantly by many readers: low-accountability behaviors. I’m writing a piece on this idea since I find it so useful and scalable, and because I want to clarify it quite a bit more. In a brief definition, behaviors can be characterized as “low-accountability” when they proactively prevent, deflect, or reject accountability.
Consider how the debate tactic of “playing the devil’s advocate” can produce propositionally valid arguments but is irritating because it is a low-accountability behavior: the disputant does not have to actually accept the position for which they are arguing, and any ethical troubles they get into during the course of argument can be easily waved away. Likewise, institutions that elicit feedback from stakeholders but then don’t make a report available describing the data in detail can indeed benefit internally from this practice and make genuine progress, but this is irritating to stakeholders because it is a low-accountability behavior. Societies that excuse present immoral conduct by pointing to past immoralities visited upon them may find themselves able to effectively rebuff some criticism, but poison their own futures. These are examples of low-accountability behaviors, but for this idea to be clear and distinct, we need to do some definitional work.
Once we’ve done this, I think you’ll see why low-accountability behavior is a key concept in the ethos of evaluation and arguably evaluation’s main adversary.
Defining Accountability
Before we can name this adversary, let’s establish what accountability actually means. My definition of accountability is that the correct actors provide answerability and accept consequences. To be accountable means you must explain and justify your decisions (answerability) and face appropriate consequences if those explanations reveal wrongdoing or negligence. Both elements are necessary. Answerability without consequences gets you performative transparency, that is, plenty of explaining with nothing changing. Consequences without answerability means that people can be punished but they never have to explain or justify their actions (which, incidentally, is why the American criminal justice system is not a form of accountability).
We need to specify that accountability only obtains when the correct actors are its object because when accountability lands on the wrong people it becomes scapegoating. The American “No Child Left Behind” policy is a cautionary example: the policy aimed to hold schools accountable for student outcomes, but consequences routinely fell on individual teachers rather than the decision-makers who created poor educational conditions. The result was that many educators experienced “accountability” not as a legitimate professional obligation but as a stick used to beat them for institutional failures beyond their control.
This is part of why accountability became a dirty word in education policy circles and why we need to handle the term carefully while still insisting on its importance. The damage to the mere idea of accountability lasted into the 2010s. Then, just when one might be forgiven for thinking that the concept of accountability was about to diminish in American culture, the 2010s saw a surge of accountability discourse in which institutions and individuals were increasingly called to answer for their behavior. The pendulum has of course swung, and we now appear to be moving through a new low-accountability moment, where the evasion of answerability and consequences has become normalized across many domains of public and organizational life. Resistance to accountability means that we are seeing less answerability and fewer consequences, that is, low-accountability behavior.
Trans-historically, every family, village, and society is built on a foundation of accountability. If the word makes you cringe, I’m betting that it’s probably because you’ve played the role of scapegoat at some point in your life or been constantly threatened with it. The grand view is that accountability is a cultural and civilizational technology that different groups try to distribute and tune to achieve optimal outcomes. Do we need a law for x or will social stigma be enough to prevent people from doing the wrong thing? Should people always tell their partner when a mutual friend tries to flirt with them? Whose fault is it if you slip and fall at work? Can the people dissolve the government? Every culture appears to have answers to questions about who can be accountable to whom, for what, under what conditions.

What Is Low-Accountability Behavior?
If accountability is answerability plus consequences, then low-accountability behavior is any strategy, conscious or not, that disrupts the accountability equation. It can operate by avoiding answerability, escaping consequences, or displacing accountability onto the wrong actors, and it typically does this by preserving the form of accountability while severing its function.
Let’s think about how this happens at interpersonal, institutional, and ideological scales.
At the interpersonal level, you will recognize it in familiar patterns of discourse. I state an opinion with confidence, then when contradictory evidence emerges, I retreat — not by saying “I was wrong” but by suddenly discovering I don’t really know much about the topic after all or by pretending that I was making a much more limited claim (a rhetorical move known as the Motte and Bailey). I demand the rights of participation in collective decision-making while refusing to stand behind my own judgments within that process. I assert entitlement to influence while rejecting the obligation to be answerable for how that influence was exercised.
These interpersonal patterns scale with remarkable consistency to institutional and systemic levels. Since the currency of bureaucracy is information, institutions operating with low-accountability behaviors mostly try to control it via embargoes of one sort or another. We could hope for no better example than the US Democratic Party’s recent postmortem report on the 2024 election, which leadership initially promised to release, then refused to release, then reluctantly released. While the party has already faced consequences, it was still trying to avoid real accountability by shrugging off answerability.
At the ideological level, we see versions of the same sort of evasions, most notably DARVO (Deny, Attack, Reverse Victim and Offender) and the familiar cascade of “it didn’t happen but if it did happen it was fine.” No-true-Scotsmans of the “true [political system] has never been tried” variety and conspiracy thinking about well-established facts pervade the discourse of every mainstream political movement and most of the minor ones to boot.
Here is my growing list of low-accountability behaviors, grouped by scale:
Interpersonal LABs
Ambiguous Commitments: making promises vague enough that any outcome can be retrofitted as fulfillment (“I’ll look into it”).
Generalized Detachment: treating sincere engagement as gauche and thus deflecting answerability.
Reap the Vacuum: refusing to make a decision and then blaming the outcome on whoever filled the void. “Well, you decided, so it’s on you.” The non-decision was itself a decision but is treated as a non-event for accountability purposes.
Litigating Feedback: soliciting critique, then turning the conversation into a defense of each point, such that the feedback-giver ends up doing more accountability work than the recipient.
“I’m just asking questions”: advancing a claim through interrogative form so that, if challenged, one can disclaim ever having asserted it.
Preemptive Self-deprecation: “I’m probably wrong about this, but…” or “I’m no expert, but…” used not as genuine epistemic humility but as a rhetorical airbag that absorbs any subsequent challenge before it lands.
Retroactive Intent: when an action lands badly, claiming the goal was always something different from what was originally signaled.
“After everything I’ve done for you”: taking on extra work or favors in ways that make subsequent accountability claims feel ungracious. This operates pre-emptively, by building up a ledger of generosity that can be cashed in to deflect.
Strategic Vulnerability: disclosing personal hardship at moments when accountability is approaching, such that pressing the point now feels cruel. This is distinct from genuine self-disclosure because the timing is connected with impending challenge rather than with the relationship’s natural rhythm.
Flip-out: emotional overreaction to any attempt at being held accountable, putting the interlocutor in the position of calming the mood.
“Let’s agree to disagree”: invoked when the conversation has reached a point where one party would have to either concede or defend further, and used to make continued pursuit of the question feel like incivility.
Performative Listening: “I hear you,” “That’s a really good point,” followed by no behavioral change and no engagement with the substance of the issue.
Absent Authority: “My partner / my boss / my doctor wouldn’t want me to do that.” Sometimes true and reasonable; the low-accountability use is when the absent party is invoked specifically to foreclose negotiation while preserving the speaker’s likeability.
Institutional LABs
“We’re working on it”: invoked about long-standing patterns that show no change. The claim of effort substitutes for evidence of effort. Difficult to challenge because falsifiability requires institutions to be very transparent.
Strategic Timing of Disclosures: releasing damaging information during a major news cycle or scheduling controversial decisions for periods when stakeholders are predictably unavailable (summer, holidays). Answerability is technically satisfied while practically nullified.
Manufactured Complexity: generating documentation, processes, or technical apparatus so dense that meaningful external review becomes infeasible. A 400-page environmental impact statement with the critical exception buried in Appendix C.
Consultation as Legitimation: running stakeholder engagement processes whose outputs are decoupled from decisions already made, then citing the consultation as evidence of accountability. (This is adjacent to evaluation theater but operates pre-decision rather than post.)
Responsibility Hot Potato: restructuring or reassigning personnel frequently enough that no individual holds a portfolio long enough to be answerable for its outcomes. Common in agencies facing recurrent scandals.
Outsourcing: contracting out functions to vendors, then citing contractual confidentiality or vendor independence when asked to answer for outcomes. The accountability gap between client and contractor becomes a structural feature.
Metric Swap: changing the indicator being reported (e.g., from “outcomes” to “outputs,” or from “served” to “contacted”) so that historical comparisons become meaningless or year-over-year performance appears stable or improved.
Process as Outcome: reporting on the activities undertaken (“we held 47 community meetings”) rather than what they accomplished. Especially common in DEI work, community engagement, and philanthropy.
Rule-shaping: intervening in the process of rule-making in ways that benefit the institution but not the public, as in regulatory capture. Until 2019, when the 737 MAX started falling out of the sky, the government allowed Boeing employees to conduct safety inspections and issue airworthiness certifications on behalf of the FAA.
Pre-emptive Policy Revision: when a failure is imminent, changing the policy or standard such that the failure is retroactively reclassified as compliance.
Systemic Blamelessness: claiming that mysterious systemic factors should bear the blame when the choices of particular individuals are clearly at play.
“We take this seriously”: a genre of institutional communication that performs the affect of accountability (concern, gravity) while making no specific commitments and naming no specific actors.
Classify Everything: locking down all information by default so that outsiders and subordinates can’t use any of it to pin down the institution’s intentions, commitments, or reasoning.
Ideological LABs
Whataboutism: deflecting accountability by pointing to the unaccounted-for sins of an opposing group, treating moral discourse as zero-sum.
Kafkatrapping: treating denial of an accusation as confirmation of it (“the fact that you deny being X is itself evidence of X”). This forecloses answerability by making the very attempt to answer into further indictment.
Excommunication: when a movement faces a credible internal critique, redefining the critic as never having been a true member, thereby preserving the movement’s claims from accountability. Adjacent to no-true-Scotsman but with a social enforcement mechanism.
Complexity and Common Sense: symmetric ideological moves that either inflate or deflate the epistemic difficulty of a question to avoid being pinned to a claim. “It’s more complicated than that” and “It’s really very simple” can both be accountability-evading depending on context.
Historical Inevitability: locating responsibility for current outcomes in impersonal forces (the market, history, technology, demographics, God) such that no actor can meaningfully be held to answer.
Comfort to our Enemies: refusing to acknowledge in-group wrongdoing on the grounds that doing so would benefit adversaries. The accountability claim is acknowledged in principle (yes, we’re guilty) but indefinitely deferred on tactical grounds.
Aestheticization of Politics: converting political commitments into matters of taste, style, or identity expression, such that critique of consequences feels like critique of self. It’s hard to be answerable for a policy outcome when the policy is a vibe.
There are some commonalities to note among these strategies. Many of them exploit the asymmetric costs of pursuing accountability. The complainant must spend social capital, emotional energy, and time to press the point while the evader expends nothing by deflecting. LABs are self-reinforcing at the population level because once enough actors observe that low-accountability moves go unpunished, people update their beliefs about how trustworthy other actors are and everyone shifts toward more defensive, cynical play in our assorted Prisoner’s Dilemmas.
Many low-accountability behaviors also parasitize cooperative norms like Gricean maxims, charitable interpretation, the presumption that people mean what they say, such that naming them is effectively calling your interlocutor a liar – try saying “No you’re not” next time someone says “I’m just joking” and see how they react (time for a Flip-out?). Nearly every item on my list has a plausible innocent reading available, which is key to the mechanism of low-accountability behaviors.
Another pattern is that several of the behaviors operate by altering when accountability can be claimed. Preemptive Self-deprecation moves the airbag in front of the impact while Retroactive Intent moves the goalposts after. “Let’s agree to disagree” and “We’re working on it” defer accountability indefinitely. Accountability has a natural temporal structure (claim, answer, consequence) which low-accountability behaviors try to disrupt wherever possible.
Agency and Accountability
If we cross-tabulate accountability and agency, it turns out that we can describe quite a lot of behavioral patterns at different scales. The term “high-agency”, a buzzword on Substack in the last couple of years, amounts to taking initiative, owning outcomes, acting boldly rather than waiting for permission. I think this discourse is valuable, but there are some important counterpoints. Frankly, a lot of the high-agency discourse is toxic sludge. I submit that the easiest way to tell the difference between the good and the bad here is how high-agency behavior interacts with accountability.
Consider one exemplar of high-agency behavior: Sam Altman of OpenAI, who dropped out of Stanford to found a social media company at 19, raised a bunch of VC funding, sold the company for tens of millions, became a partner then president of Y Combinator, and the rest is history – very high-agency stuff. Altman was recently in a San Francisco courtroom discovering how damaging discovery can be. According to testimony from former OpenAI board members, when faced with scrutiny over allegations of running the company through deception, his response was “I can’t change my personality.” Could God possess agency so great that even he cannot be held accountable for it?
And the pattern scales. The broader AI discourse exhibits a similar moral logic: AI systems should be granted significant decision-making power, but when those systems cause harm, accountability is diffuse and displaced. High agency, low accountability.
This gives us four quadrants. Imagine people, institutions, and ideologies occupying each one. For the sake of clarity, let’s think about organizations as the unit of analysis and how these organizations might behave towards evaluators.
Low agency, low accountability organizations are the classic targets of evaluation because everyone can acknowledge they are broken. They lack clear direction and simultaneously hide the consequences of their decisions behind opacity and diffusion. These are what most people mean when they talk about broken bureaucracies. The Veterans Administration is a good example of this quadrant – a bureaucracy that is going nowhere fast and shows complete disinterest in owning its dysfunctions. When we do evaluations in organizations like this, we know in advance that evaluation will probably not result in any meaningful improvements since the organization evinces so little agency, but we also doubt that the evaluation will even be taken seriously because of low accountability. The low-accountability behaviors we commonly see are: avoiding evaluation, diffusion of responsibility, and finger-pointing.
High agency, low accountability organizations are considerably more sophisticated in their evasions. They commission studies, announce commitments, and deploy the language of accountability while systematically preventing consequences from taking hold. They loudly create a safety team for the product, then sneakily disband it. Answerability is usually performed but logical consequences are avoided. The low-accountability behaviors of this quadrant are evaluation theater, embargoes of completed evaluations, the “guided tour” inquiry (you only see what you’re supposed to see), and the edited-to-ribbons public report that is rewritten line by line by the PR Department.
Low agency, high accountability organizations are perhaps the most tragic to work with. They tend to be positioned to absorb blame for failures that originate far above them in a hierarchy. Every evaluation becomes another document of inadequacy. This may in fact be their social function within a larger system seeking to avoid its own accountability. They are scapegoats. The evaluator is the butcher. Don’t be a butcher.
High agency, high accountability is what evaluators want organizations to be and we want to help them get there. These institutions act boldly and answer fully for what they do. This is rare but there are examples. Consider Anthropic’s regular releases of safety studies on their own products, even when these studies show alarming results. Rather than immediately release the dangerous Mythos system, they are currently making the rounds to help major institutions adapt to the new reality it heralds. Anthropic has the agency to create such a product and the accountability not to let it wreck society even if doing so would make a profit – this is truly a unique event in my lifetime. This approach would not be possible absent the safety evaluations the company is conducting internally.
Evaluation’s Work
The work of evaluation is to help people, organizations, and cultures move toward high agency, high accountability from wherever they currently stand. I would argue that this is key to the ethos of evaluation. Low-accountability behavior, in all its forms, is evaluation’s fundamental adversary. It operates before evaluations happen, undermines our work while it is underway, and deflects the consequences once findings are delivered.
Naming low-accountability behavior, recognizing its patterns across scales, and developing strategies to address it are central to evaluation practice.
So what can we do? First, I think evaluators all probably need some practice handling interpersonal low-accountability behaviors. Many of these are more frequent in our private conversations, but they can all find their way into our conversations with stakeholders. The most common ones that evaluators hear are probably: Retroactive Intent, Flip-out, Ambiguous Commitments, and Absent Authority. If you call out each of these every time they arise, you are going to become the most unpopular person you know and your evaluation practice will crumble to dust. In other words, you should not allow the use of LABs to be a grim trigger that ends cooperation forever. However, I think you do need to understand these behaviors for what they are so that you can prevent your judgment from being affected by them. A big problem with these behaviors is that many of them work really well, which reinforces them among the users. We learn most of these when we are children.
At the broadest, ideological level, evaluation has a very important role to play as well. When the ideological winds are blowing hardest, political factions are tempted, usually successfully, to bend their principles so that they can win and start acting like nice, normal people again. This tempest is the favorite environment of low-accountability behaviors. By insisting on facts, explicit reasoning, and non-arbitrary standards, evaluations are one of the only technologies available to us to calm the storm and return to accountability.
Society needs evaluation because evaluation reduces the costs of switching from ineffective strategies to effective ones by providing greater information about both. Evaluation sorts through the universe of possible strategies and reveals their costs. When society is doing something ineffective, we may not know this until we do an evaluation. Then, even after we do it, there will be friction costs associated with switching to anything else, and these costs will be considerably higher when we don’t know what the effective alternatives are. Low-accountability behaviors make these friction costs into a shelter for the parties refusing accountability; that is, even while they aggravate the public with their LABs, incumbents continue to profit from the status quo because switching strategies is expensive. This is the economic relationship that makes evaluation disadvantageous for players that currently benefit from ineffective strategies but advantageous for new players and those with better strategies.
It’s very important not to get too caught up in the reality of these scales - there is no actual place called “society” and you’re in all of the scales virtually all the time. All of them are made up of finite relationships between people and things.



A dysfunctional culture might refer to the above LABs as:
-emotional intelligence
-social skills
-savvy
-being a team player
-reading the room
-flexibility/adaptability
But my preferred umbrella acronym for LABs is just BS.
Thank you. Very valuable, Anthony.
I would add a third element in defining accountability: the moral repair of relationships necessary to sustain accountable relationships over time. Without this, the mutual trust needed to sustain high accountability and high agency is likely to prove elusive. Robust theories of scaling emphasise moral recognition of such reality. For example, see Judge, M., Bouman, T., Steg, L., & Bolderdijk, J. W. (2024). Accelerating social tipping points in sustainable behaviors: Insights from a dynamic model of moralized social change. One Earth, 7(5), 759-770. doi:https://doi.org/10.1016/j.oneear.2024.04.004