# Anti-Rationalization Guide

Techniques for bulletproofing skills against agent rationalization.

## The Problem

Discipline-enforcing skills (like TDD) face a unique challenge: smart agents under pressure will find loopholes.

**Example**: Skill says "Write test first". Agent under deadline thinks:

- "This is too simple to test"
- "I'll test after, same result"
- "It's the spirit that matters, not ritual"

## Psychology Foundation

Understanding WHY persuasion works helps apply it systematically.

**Research basis**: Cialdini (2021), Meincke et al. (2025)

**Core principles**:

- **Authority**: "The TDD community agrees..."
- **Commitment**: "You already said you follow TDD..."
- **Scarcity**: "Missing tests now = bugs later"
- **Social Proof**: "All tested code passing CI proves value"
- **Unity**: "We're engineers who value quality"

## Technique 1: Close Every Loophole Explicitly

Don't just state the rule - forbid specific workarounds.

### Bad Example

```markdown
Write code before test? Delete it.
```

### Good Example

```markdown
Write code before test? Delete it. Start over.

**No exceptions**:

- Don't keep it as "reference"
- Don't "adapt" it while writing tests
- Don't look at it
- Delete means delete
```

**Why it works**: Agents try specific workarounds. Counter each explicitly.

## Technique 2: Address "Spirit vs Letter" Arguments

Add foundational principle early:

```markdown
**Violating the letter of the rules is violating the spirit of the rules.**
```

**Why it works**: Cuts off entire class of "I'm following the spirit" rationalizations.

## Technique 3: Build Rationalization Table

Capture excuses from baseline testing. Every rationalization goes in table:

```markdown
| Excuse                           | Reality                                                                 |
| -------------------------------- | ----------------------------------------------------------------------- |
| "Too simple to test"             | Simple code breaks. Test takes 30 seconds.                              |
| "I'll test after"                | Tests passing immediately prove nothing.                                |
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
| "It's about spirit not ritual"   | The letter IS the spirit. TDD's value comes from the specific sequence. |
```

**Why it works**: Agents read table, recognize their own thinking, see the counter-argument.

## Technique 4: Create Red Flags List

Make it easy for agents to self-check when rationalizing:

```markdown
## Red Flags - STOP and Start Over

- Code before test
- "I already manually tested it"
- "Tests after achieve the same purpose"
- "It's about spirit not ritual"
- "This is different because..."

**All of these mean**: Delete code. Start over with TDD.
```

**Why it works**: Simple checklist, clear action (delete & restart).

## Technique 5: Update Description for Violation Symptoms

Add to description: symptoms of when you're ABOUT to violate:

```yaml
# ❌ BAD: Only describes what skill does
description: TDD methodology for writing code

# ✅ GOOD: Includes pre-violation symptoms
description: Use when implementing any feature or bugfix, before writing implementation code
metadata:
  triggers: new feature, bug fix, code change
```

**Why it works**: Triggers skill BEFORE violation, not after.

## Technique 6: Use Strong Language

Weak language invites rationalization:

```markdown
# Weak

You should write tests first.
Generally, test before code.
It's better to test first.

# Strong

ALWAYS write test first.
NEVER write code before test.
Test-first is MANDATORY.
```

**Why it works**: No ambiguity, no wiggle room.

## Technique 7: Invoke Commitment & Consistency

Reference agent's own standards:

```markdown
You claimed to follow TDD.
TDD means test-first.
Code-first is NOT TDD.

**Either**:

- Follow TDD (test-first), or
- Admit you're not doing TDD

Don't redefine TDD to fit what you already did.
```

**Why it works**: Agents resist cognitive dissonance (Festinger, 1957).

## Technique 8: Provide Escape Hatch for Legitimate Cases

If there ARE valid exceptions, state them explicitly:

```markdown
## When NOT to Use TDD

- Spike solutions (throwaway exploratory code)
- One-time scripts deleting in 1 hour
- Generated boilerplate (verified via other means)

**Everything else**: Use TDD. No exceptions.
```

**Why it works**: Removes "but this is different" argument for non-exception cases.

## Complete Bulletproofing Checklist

For discipline-enforcing skills:

**Loophole Closing**:

- [ ] Forbidden each specific workaround explicitly?
- [ ] Added "spirit vs letter" principle?
- [ ] Built rationalization table from baseline tests?
- [ ] Created red flags list?

**Strength**:

- [ ] Used strong language (ALWAYS/NEVER)?
- [ ] Invoked commitment & consistency?
- [ ] Provided explicit escape hatch?

**Discovery**:

- [ ] Description includes pre-violation symptoms?
- [ ] Keywords target moment BEFORE violation?

**Testing**:

- [ ] Tested with combined pressures?
- [ ] Agent complied under maximum pressure?
- [ ] No new rationalizations found?

## Real-World Example: TDD Skill

### Baseline Rationalizations Found

1. "Too simple to test"
2. "I'll test after"
3. "Spirit not ritual"
4. "Already manually tested"
5. "This is different because..."

### Counters Applied

**Rationalization table**:

```markdown
| Excuse               | Reality                                                        |
| -------------------- | -------------------------------------------------------------- |
| "Too simple to test" | Simple code breaks. Test takes 30 seconds.                     |
| "I'll test after"    | Tests passing immediately prove nothing.                       |
| "Spirit not ritual"  | The letter IS the spirit. TDD's value comes from the sequence. |
| "Manually tested"    | Manual tests don't run automatically. They rot.                |
```

**Red flags**:

```markdown
## Red Flags - STOP

- Code before test
- "I already tested manually"
- "Spirit not ritual"
- "This is different..."

All mean: Delete code. Start over.
```

**Result**: Agent compliance under combined time + sunk cost pressure.

## Common Mistakes

| Mistake                                | Fix                                                            |
| -------------------------------------- | -------------------------------------------------------------- |
| Trust agents will "get the spirit"     | Close explicit loopholes. Agents are smart at rationalization. |
| Use weak language ("should", "better") | Use ALWAYS/NEVER for discipline rules.                         |
| Skip rationalization table             | Every excuse needs explicit counter.                           |
| No red flags list                      | Make self-checking easy.                                       |
| Generic description                    | Add pre-violation symptoms to trigger skill earlier.           |

## Meta-Strategy

**For each new rationalization**:

1. Document it verbatim (from failed test)
2. Add to rationalization table
3. Update red flags list
4. Re-test

**Iterate until**: Agent can't find ANY rationalization that works.

That's bulletproof.