feat(writing-skills): refactor skill to gold standard architecture

2026-01-29 15:12:41 -03:00
parent f2c3e783b5
commit 10ca106b79
16 changed files with 1966 additions and 698 deletions
--- a/skills/writing-skills/references/anti-rationalization/README.md
+++ b/skills/writing-skills/references/anti-rationalization/README.md
@@ -0,0 +1,255 @@
+# Anti-Rationalization Guide
+
+Techniques for bulletproofing skills against agent rationalization.
+
+## The Problem
+
+Discipline-enforcing skills (like TDD) face a unique challenge: smart agents under pressure will find loopholes.
+
+**Example**: Skill says "Write test first". Agent under deadline thinks:
+
+- "This is too simple to test"
+- "I'll test after, same result"
+- "It's the spirit that matters, not ritual"
+
+## Psychology Foundation
+
+Understanding WHY persuasion works helps apply it systematically.
+
+**Research basis**: Cialdini (2021), Meincke et al. (2025)
+
+**Core principles**:
+
+- **Authority**: "The TDD community agrees..."
+- **Commitment**: "You already said you follow TDD..."
+- **Scarcity**: "Missing tests now = bugs later"
+- **Social Proof**: "All tested code passing CI proves value"
+- **Unity**: "We're engineers who value quality"
+
+## Technique 1: Close Every Loophole Explicitly
+
+Don't just state the rule - forbid specific workarounds.
+
+### Bad Example
+
+```markdown
+Write code before test? Delete it.
+```
+
+### Good Example
+
+```markdown
+Write code before test? Delete it. Start over.
+
+**No exceptions**:
+
+- Don't keep it as "reference"
+- Don't "adapt" it while writing tests
+- Don't look at it
+- Delete means delete
+```
+
+**Why it works**: Agents try specific workarounds. Counter each explicitly.
+
+## Technique 2: Address "Spirit vs Letter" Arguments
+
+Add foundational principle early:
+
+```markdown
+**Violating the letter of the rules is violating the spirit of the rules.**
+```
+
+**Why it works**: Cuts off entire class of "I'm following the spirit" rationalizations.
+
+## Technique 3: Build Rationalization Table
+
+Capture excuses from baseline testing. Every rationalization goes in table:
+
+```markdown
+| Excuse                           | Reality                                                                 |
+| -------------------------------- | ----------------------------------------------------------------------- |
+| "Too simple to test"             | Simple code breaks. Test takes 30 seconds.                              |
+| "I'll test after"                | Tests passing immediately prove nothing.                                |
+| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
+| "It's about spirit not ritual"   | The letter IS the spirit. TDD's value comes from the specific sequence. |
+```
+
+**Why it works**: Agents read table, recognize their own thinking, see the counter-argument.
+
+## Technique 4: Create Red Flags List
+
+Make it easy for agents to self-check when rationalizing:
+
+```markdown
+## Red Flags - STOP and Start Over
+
+- Code before test
+- "I already manually tested it"
+- "Tests after achieve the same purpose"
+- "It's about spirit not ritual"
+- "This is different because..."
+
+**All of these mean**: Delete code. Start over with TDD.
+```
+
+**Why it works**: Simple checklist, clear action (delete & restart).
+
+## Technique 5: Update Description for Violation Symptoms
+
+Add to description: symptoms of when you're ABOUT to violate:
+
+```yaml
+# ❌ BAD: Only describes what skill does
+description: TDD methodology for writing code
+
+# ✅ GOOD: Includes pre-violation symptoms
+description: Use when implementing any feature or bugfix, before writing implementation code
+metadata:
+  triggers: new feature, bug fix, code change
+```
+
+**Why it works**: Triggers skill BEFORE violation, not after.
+
+## Technique 6: Use Strong Language
+
+Weak language invites rationalization:
+
+```markdown
+# Weak
+
+You should write tests first.
+Generally, test before code.
+It's better to test first.
+
+# Strong
+
+ALWAYS write test first.
+NEVER write code before test.
+Test-first is MANDATORY.
+```
+
+**Why it works**: No ambiguity, no wiggle room.
+
+## Technique 7: Invoke Commitment & Consistency
+
+Reference agent's own standards:
+
+```markdown
+You claimed to follow TDD.
+TDD means test-first.
+Code-first is NOT TDD.
+
+**Either**:
+
+- Follow TDD (test-first), or
+- Admit you're not doing TDD
+
+Don't redefine TDD to fit what you already did.
+```
+
+**Why it works**: Agents resist cognitive dissonance (Festinger, 1957).
+
+## Technique 8: Provide Escape Hatch for Legitimate Cases
+
+If there ARE valid exceptions, state them explicitly:
+
+```markdown
+## When NOT to Use TDD
+
+- Spike solutions (throwaway exploratory code)
+- One-time scripts deleting in 1 hour
+- Generated boilerplate (verified via other means)
+
+**Everything else**: Use TDD. No exceptions.
+```
+
+**Why it works**: Removes "but this is different" argument for non-exception cases.
+
+## Complete Bulletproofing Checklist
+
+For discipline-enforcing skills:
+
+**Loophole Closing**:
+
+- [ ] Forbidden each specific workaround explicitly?
+- [ ] Added "spirit vs letter" principle?
+- [ ] Built rationalization table from baseline tests?
+- [ ] Created red flags list?
+
+**Strength**:
+
+- [ ] Used strong language (ALWAYS/NEVER)?
+- [ ] Invoked commitment & consistency?
+- [ ] Provided explicit escape hatch?
+
+**Discovery**:
+
+- [ ] Description includes pre-violation symptoms?
+- [ ] Keywords target moment BEFORE violation?
+
+**Testing**:
+
+- [ ] Tested with combined pressures?
+- [ ] Agent complied under maximum pressure?
+- [ ] No new rationalizations found?
+
+## Real-World Example: TDD Skill
+
+### Baseline Rationalizations Found
+
+1. "Too simple to test"
+2. "I'll test after"
+3. "Spirit not ritual"
+4. "Already manually tested"
+5. "This is different because..."
+
+### Counters Applied
+
+**Rationalization table**:
+
+```markdown
+| Excuse               | Reality                                                        |
+| -------------------- | -------------------------------------------------------------- |
+| "Too simple to test" | Simple code breaks. Test takes 30 seconds.                     |
+| "I'll test after"    | Tests passing immediately prove nothing.                       |
+| "Spirit not ritual"  | The letter IS the spirit. TDD's value comes from the sequence. |
+| "Manually tested"    | Manual tests don't run automatically. They rot.                |
+```
+
+**Red flags**:
+
+```markdown
+## Red Flags - STOP
+
+- Code before test
+- "I already tested manually"
+- "Spirit not ritual"
+- "This is different..."
+
+All mean: Delete code. Start over.
+```
+
+**Result**: Agent compliance under combined time + sunk cost pressure.
+
+## Common Mistakes
+
+| Mistake                                | Fix                                                            |
+| -------------------------------------- | -------------------------------------------------------------- |
+| Trust agents will "get the spirit"     | Close explicit loopholes. Agents are smart at rationalization. |
+| Use weak language ("should", "better") | Use ALWAYS/NEVER for discipline rules.                         |
+| Skip rationalization table             | Every excuse needs explicit counter.                           |
+| No red flags list                      | Make self-checking easy.                                       |
+| Generic description                    | Add pre-violation symptoms to trigger skill earlier.           |
+
+## Meta-Strategy
+
+**For each new rationalization**:
+
+1. Document it verbatim (from failed test)
+2. Add to rationalization table
+3. Update red flags list
+4. Re-test
+
+**Iterate until**: Agent can't find ANY rationalization that works.
+
+That's bulletproof.