feat(writing-skills): refactor skill to gold standard architecture
This commit is contained in:
255
skills/writing-skills/references/anti-rationalization/README.md
Normal file
255
skills/writing-skills/references/anti-rationalization/README.md
Normal file
@@ -0,0 +1,255 @@
|
||||
# Anti-Rationalization Guide
|
||||
|
||||
Techniques for bulletproofing skills against agent rationalization.
|
||||
|
||||
## The Problem
|
||||
|
||||
Discipline-enforcing skills (like TDD) face a unique challenge: smart agents under pressure will find loopholes.
|
||||
|
||||
**Example**: Skill says "Write test first". Agent under deadline thinks:
|
||||
|
||||
- "This is too simple to test"
|
||||
- "I'll test after, same result"
|
||||
- "It's the spirit that matters, not ritual"
|
||||
|
||||
## Psychology Foundation
|
||||
|
||||
Understanding WHY persuasion works helps apply it systematically.
|
||||
|
||||
**Research basis**: Cialdini (2021), Meincke et al. (2025)
|
||||
|
||||
**Core principles**:
|
||||
|
||||
- **Authority**: "The TDD community agrees..."
|
||||
- **Commitment**: "You already said you follow TDD..."
|
||||
- **Scarcity**: "Missing tests now = bugs later"
|
||||
- **Social Proof**: "All tested code passing CI proves value"
|
||||
- **Unity**: "We're engineers who value quality"
|
||||
|
||||
## Technique 1: Close Every Loophole Explicitly
|
||||
|
||||
Don't just state the rule - forbid specific workarounds.
|
||||
|
||||
### Bad Example
|
||||
|
||||
```markdown
|
||||
Write code before test? Delete it.
|
||||
```
|
||||
|
||||
### Good Example
|
||||
|
||||
```markdown
|
||||
Write code before test? Delete it. Start over.
|
||||
|
||||
**No exceptions**:
|
||||
|
||||
- Don't keep it as "reference"
|
||||
- Don't "adapt" it while writing tests
|
||||
- Don't look at it
|
||||
- Delete means delete
|
||||
```
|
||||
|
||||
**Why it works**: Agents try specific workarounds. Counter each explicitly.
|
||||
|
||||
## Technique 2: Address "Spirit vs Letter" Arguments
|
||||
|
||||
Add foundational principle early:
|
||||
|
||||
```markdown
|
||||
**Violating the letter of the rules is violating the spirit of the rules.**
|
||||
```
|
||||
|
||||
**Why it works**: Cuts off entire class of "I'm following the spirit" rationalizations.
|
||||
|
||||
## Technique 3: Build Rationalization Table
|
||||
|
||||
Capture excuses from baseline testing. Every rationalization goes in table:
|
||||
|
||||
```markdown
|
||||
| Excuse | Reality |
|
||||
| -------------------------------- | ----------------------------------------------------------------------- |
|
||||
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
|
||||
| "I'll test after" | Tests passing immediately prove nothing. |
|
||||
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
|
||||
| "It's about spirit not ritual" | The letter IS the spirit. TDD's value comes from the specific sequence. |
|
||||
```
|
||||
|
||||
**Why it works**: Agents read table, recognize their own thinking, see the counter-argument.
|
||||
|
||||
## Technique 4: Create Red Flags List
|
||||
|
||||
Make it easy for agents to self-check when rationalizing:
|
||||
|
||||
```markdown
|
||||
## Red Flags - STOP and Start Over
|
||||
|
||||
- Code before test
|
||||
- "I already manually tested it"
|
||||
- "Tests after achieve the same purpose"
|
||||
- "It's about spirit not ritual"
|
||||
- "This is different because..."
|
||||
|
||||
**All of these mean**: Delete code. Start over with TDD.
|
||||
```
|
||||
|
||||
**Why it works**: Simple checklist, clear action (delete & restart).
|
||||
|
||||
## Technique 5: Update Description for Violation Symptoms
|
||||
|
||||
Add to description: symptoms of when you're ABOUT to violate:
|
||||
|
||||
```yaml
|
||||
# ❌ BAD: Only describes what skill does
|
||||
description: TDD methodology for writing code
|
||||
|
||||
# ✅ GOOD: Includes pre-violation symptoms
|
||||
description: Use when implementing any feature or bugfix, before writing implementation code
|
||||
metadata:
|
||||
triggers: new feature, bug fix, code change
|
||||
```
|
||||
|
||||
**Why it works**: Triggers skill BEFORE violation, not after.
|
||||
|
||||
## Technique 6: Use Strong Language
|
||||
|
||||
Weak language invites rationalization:
|
||||
|
||||
```markdown
|
||||
# Weak
|
||||
|
||||
You should write tests first.
|
||||
Generally, test before code.
|
||||
It's better to test first.
|
||||
|
||||
# Strong
|
||||
|
||||
ALWAYS write test first.
|
||||
NEVER write code before test.
|
||||
Test-first is MANDATORY.
|
||||
```
|
||||
|
||||
**Why it works**: No ambiguity, no wiggle room.
|
||||
|
||||
## Technique 7: Invoke Commitment & Consistency
|
||||
|
||||
Reference agent's own standards:
|
||||
|
||||
```markdown
|
||||
You claimed to follow TDD.
|
||||
TDD means test-first.
|
||||
Code-first is NOT TDD.
|
||||
|
||||
**Either**:
|
||||
|
||||
- Follow TDD (test-first), or
|
||||
- Admit you're not doing TDD
|
||||
|
||||
Don't redefine TDD to fit what you already did.
|
||||
```
|
||||
|
||||
**Why it works**: Agents resist cognitive dissonance (Festinger, 1957).
|
||||
|
||||
## Technique 8: Provide Escape Hatch for Legitimate Cases
|
||||
|
||||
If there ARE valid exceptions, state them explicitly:
|
||||
|
||||
```markdown
|
||||
## When NOT to Use TDD
|
||||
|
||||
- Spike solutions (throwaway exploratory code)
|
||||
- One-time scripts deleting in 1 hour
|
||||
- Generated boilerplate (verified via other means)
|
||||
|
||||
**Everything else**: Use TDD. No exceptions.
|
||||
```
|
||||
|
||||
**Why it works**: Removes "but this is different" argument for non-exception cases.
|
||||
|
||||
## Complete Bulletproofing Checklist
|
||||
|
||||
For discipline-enforcing skills:
|
||||
|
||||
**Loophole Closing**:
|
||||
|
||||
- [ ] Forbidden each specific workaround explicitly?
|
||||
- [ ] Added "spirit vs letter" principle?
|
||||
- [ ] Built rationalization table from baseline tests?
|
||||
- [ ] Created red flags list?
|
||||
|
||||
**Strength**:
|
||||
|
||||
- [ ] Used strong language (ALWAYS/NEVER)?
|
||||
- [ ] Invoked commitment & consistency?
|
||||
- [ ] Provided explicit escape hatch?
|
||||
|
||||
**Discovery**:
|
||||
|
||||
- [ ] Description includes pre-violation symptoms?
|
||||
- [ ] Keywords target moment BEFORE violation?
|
||||
|
||||
**Testing**:
|
||||
|
||||
- [ ] Tested with combined pressures?
|
||||
- [ ] Agent complied under maximum pressure?
|
||||
- [ ] No new rationalizations found?
|
||||
|
||||
## Real-World Example: TDD Skill
|
||||
|
||||
### Baseline Rationalizations Found
|
||||
|
||||
1. "Too simple to test"
|
||||
2. "I'll test after"
|
||||
3. "Spirit not ritual"
|
||||
4. "Already manually tested"
|
||||
5. "This is different because..."
|
||||
|
||||
### Counters Applied
|
||||
|
||||
**Rationalization table**:
|
||||
|
||||
```markdown
|
||||
| Excuse | Reality |
|
||||
| -------------------- | -------------------------------------------------------------- |
|
||||
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
|
||||
| "I'll test after" | Tests passing immediately prove nothing. |
|
||||
| "Spirit not ritual" | The letter IS the spirit. TDD's value comes from the sequence. |
|
||||
| "Manually tested" | Manual tests don't run automatically. They rot. |
|
||||
```
|
||||
|
||||
**Red flags**:
|
||||
|
||||
```markdown
|
||||
## Red Flags - STOP
|
||||
|
||||
- Code before test
|
||||
- "I already tested manually"
|
||||
- "Spirit not ritual"
|
||||
- "This is different..."
|
||||
|
||||
All mean: Delete code. Start over.
|
||||
```
|
||||
|
||||
**Result**: Agent compliance under combined time + sunk cost pressure.
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
| Mistake | Fix |
|
||||
| -------------------------------------- | -------------------------------------------------------------- |
|
||||
| Trust agents will "get the spirit" | Close explicit loopholes. Agents are smart at rationalization. |
|
||||
| Use weak language ("should", "better") | Use ALWAYS/NEVER for discipline rules. |
|
||||
| Skip rationalization table | Every excuse needs explicit counter. |
|
||||
| No red flags list | Make self-checking easy. |
|
||||
| Generic description | Add pre-violation symptoms to trigger skill earlier. |
|
||||
|
||||
## Meta-Strategy
|
||||
|
||||
**For each new rationalization**:
|
||||
|
||||
1. Document it verbatim (from failed test)
|
||||
2. Add to rationalization table
|
||||
3. Update red flags list
|
||||
4. Re-test
|
||||
|
||||
**Iterate until**: Agent can't find ANY rationalization that works.
|
||||
|
||||
That's bulletproof.
|
||||
Reference in New Issue
Block a user