256 lines
7.1 KiB
Markdown
256 lines
7.1 KiB
Markdown
# Anti-Rationalization Guide
|
|
|
|
Techniques for bulletproofing skills against agent rationalization.
|
|
|
|
## The Problem
|
|
|
|
Discipline-enforcing skills (like TDD) face a unique challenge: smart agents under pressure will find loopholes.
|
|
|
|
**Example**: Skill says "Write test first". Agent under deadline thinks:
|
|
|
|
- "This is too simple to test"
|
|
- "I'll test after, same result"
|
|
- "It's the spirit that matters, not ritual"
|
|
|
|
## Psychology Foundation
|
|
|
|
Understanding WHY persuasion works helps apply it systematically.
|
|
|
|
**Research basis**: Cialdini (2021), Meincke et al. (2025)
|
|
|
|
**Core principles**:
|
|
|
|
- **Authority**: "The TDD community agrees..."
|
|
- **Commitment**: "You already said you follow TDD..."
|
|
- **Scarcity**: "Missing tests now = bugs later"
|
|
- **Social Proof**: "All tested code passing CI proves value"
|
|
- **Unity**: "We're engineers who value quality"
|
|
|
|
## Technique 1: Close Every Loophole Explicitly
|
|
|
|
Don't just state the rule - forbid specific workarounds.
|
|
|
|
### Bad Example
|
|
|
|
```markdown
|
|
Write code before test? Delete it.
|
|
```
|
|
|
|
### Good Example
|
|
|
|
```markdown
|
|
Write code before test? Delete it. Start over.
|
|
|
|
**No exceptions**:
|
|
|
|
- Don't keep it as "reference"
|
|
- Don't "adapt" it while writing tests
|
|
- Don't look at it
|
|
- Delete means delete
|
|
```
|
|
|
|
**Why it works**: Agents try specific workarounds. Counter each explicitly.
|
|
|
|
## Technique 2: Address "Spirit vs Letter" Arguments
|
|
|
|
Add foundational principle early:
|
|
|
|
```markdown
|
|
**Violating the letter of the rules is violating the spirit of the rules.**
|
|
```
|
|
|
|
**Why it works**: Cuts off entire class of "I'm following the spirit" rationalizations.
|
|
|
|
## Technique 3: Build Rationalization Table
|
|
|
|
Capture excuses from baseline testing. Every rationalization goes in table:
|
|
|
|
```markdown
|
|
| Excuse | Reality |
|
|
| -------------------------------- | ----------------------------------------------------------------------- |
|
|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
|
|
| "I'll test after" | Tests passing immediately prove nothing. |
|
|
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
|
|
| "It's about spirit not ritual" | The letter IS the spirit. TDD's value comes from the specific sequence. |
|
|
```
|
|
|
|
**Why it works**: Agents read table, recognize their own thinking, see the counter-argument.
|
|
|
|
## Technique 4: Create Red Flags List
|
|
|
|
Make it easy for agents to self-check when rationalizing:
|
|
|
|
```markdown
|
|
## Red Flags - STOP and Start Over
|
|
|
|
- Code before test
|
|
- "I already manually tested it"
|
|
- "Tests after achieve the same purpose"
|
|
- "It's about spirit not ritual"
|
|
- "This is different because..."
|
|
|
|
**All of these mean**: Delete code. Start over with TDD.
|
|
```
|
|
|
|
**Why it works**: Simple checklist, clear action (delete & restart).
|
|
|
|
## Technique 5: Update Description for Violation Symptoms
|
|
|
|
Add to description: symptoms of when you're ABOUT to violate:
|
|
|
|
```yaml
|
|
# ❌ BAD: Only describes what skill does
|
|
description: TDD methodology for writing code
|
|
|
|
# ✅ GOOD: Includes pre-violation symptoms
|
|
description: Use when implementing any feature or bugfix, before writing implementation code
|
|
metadata:
|
|
triggers: new feature, bug fix, code change
|
|
```
|
|
|
|
**Why it works**: Triggers skill BEFORE violation, not after.
|
|
|
|
## Technique 6: Use Strong Language
|
|
|
|
Weak language invites rationalization:
|
|
|
|
```markdown
|
|
# Weak
|
|
|
|
You should write tests first.
|
|
Generally, test before code.
|
|
It's better to test first.
|
|
|
|
# Strong
|
|
|
|
ALWAYS write test first.
|
|
NEVER write code before test.
|
|
Test-first is MANDATORY.
|
|
```
|
|
|
|
**Why it works**: No ambiguity, no wiggle room.
|
|
|
|
## Technique 7: Invoke Commitment & Consistency
|
|
|
|
Reference agent's own standards:
|
|
|
|
```markdown
|
|
You claimed to follow TDD.
|
|
TDD means test-first.
|
|
Code-first is NOT TDD.
|
|
|
|
**Either**:
|
|
|
|
- Follow TDD (test-first), or
|
|
- Admit you're not doing TDD
|
|
|
|
Don't redefine TDD to fit what you already did.
|
|
```
|
|
|
|
**Why it works**: Agents resist cognitive dissonance (Festinger, 1957).
|
|
|
|
## Technique 8: Provide Escape Hatch for Legitimate Cases
|
|
|
|
If there ARE valid exceptions, state them explicitly:
|
|
|
|
```markdown
|
|
## When NOT to Use TDD
|
|
|
|
- Spike solutions (throwaway exploratory code)
|
|
- One-time scripts deleting in 1 hour
|
|
- Generated boilerplate (verified via other means)
|
|
|
|
**Everything else**: Use TDD. No exceptions.
|
|
```
|
|
|
|
**Why it works**: Removes "but this is different" argument for non-exception cases.
|
|
|
|
## Complete Bulletproofing Checklist
|
|
|
|
For discipline-enforcing skills:
|
|
|
|
**Loophole Closing**:
|
|
|
|
- [ ] Forbidden each specific workaround explicitly?
|
|
- [ ] Added "spirit vs letter" principle?
|
|
- [ ] Built rationalization table from baseline tests?
|
|
- [ ] Created red flags list?
|
|
|
|
**Strength**:
|
|
|
|
- [ ] Used strong language (ALWAYS/NEVER)?
|
|
- [ ] Invoked commitment & consistency?
|
|
- [ ] Provided explicit escape hatch?
|
|
|
|
**Discovery**:
|
|
|
|
- [ ] Description includes pre-violation symptoms?
|
|
- [ ] Keywords target moment BEFORE violation?
|
|
|
|
**Testing**:
|
|
|
|
- [ ] Tested with combined pressures?
|
|
- [ ] Agent complied under maximum pressure?
|
|
- [ ] No new rationalizations found?
|
|
|
|
## Real-World Example: TDD Skill
|
|
|
|
### Baseline Rationalizations Found
|
|
|
|
1. "Too simple to test"
|
|
2. "I'll test after"
|
|
3. "Spirit not ritual"
|
|
4. "Already manually tested"
|
|
5. "This is different because..."
|
|
|
|
### Counters Applied
|
|
|
|
**Rationalization table**:
|
|
|
|
```markdown
|
|
| Excuse | Reality |
|
|
| -------------------- | -------------------------------------------------------------- |
|
|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
|
|
| "I'll test after" | Tests passing immediately prove nothing. |
|
|
| "Spirit not ritual" | The letter IS the spirit. TDD's value comes from the sequence. |
|
|
| "Manually tested" | Manual tests don't run automatically. They rot. |
|
|
```
|
|
|
|
**Red flags**:
|
|
|
|
```markdown
|
|
## Red Flags - STOP
|
|
|
|
- Code before test
|
|
- "I already tested manually"
|
|
- "Spirit not ritual"
|
|
- "This is different..."
|
|
|
|
All mean: Delete code. Start over.
|
|
```
|
|
|
|
**Result**: Agent compliance under combined time + sunk cost pressure.
|
|
|
|
## Common Mistakes
|
|
|
|
| Mistake | Fix |
|
|
| -------------------------------------- | -------------------------------------------------------------- |
|
|
| Trust agents will "get the spirit" | Close explicit loopholes. Agents are smart at rationalization. |
|
|
| Use weak language ("should", "better") | Use ALWAYS/NEVER for discipline rules. |
|
|
| Skip rationalization table | Every excuse needs explicit counter. |
|
|
| No red flags list | Make self-checking easy. |
|
|
| Generic description | Add pre-violation symptoms to trigger skill earlier. |
|
|
|
|
## Meta-Strategy
|
|
|
|
**For each new rationalization**:
|
|
|
|
1. Document it verbatim (from failed test)
|
|
2. Add to rationalization table
|
|
3. Update red flags list
|
|
4. Re-test
|
|
|
|
**Iterate until**: Agent can't find ANY rationalization that works.
|
|
|
|
That's bulletproof.
|