Compare commits

...

11 Commits

Author SHA1 Message Date
sck_0
3dded33731 chore: finalize v3.4.0 versioning in README 2026-01-27 09:20:39 +01:00
sck_0
cc0f4a2ec4 chore: sync generated files (fix CI drift in v3.4.0) 2026-01-27 09:18:26 +01:00
sck_0
1b606d851d chore(release): v3.4.0 - Voice AI & Categorization 2026-01-27 09:16:41 +01:00
sck_0
7e2f243bfa chore: fixes CI drift (sync generated files +1 skill) 2026-01-27 09:11:06 +01:00
sck_0
3688425884 chore: sync generated files and apply categorization 2026-01-27 08:59:59 +01:00
sickn33
8801592bd2 Merge pull request #33 from taksrules/feat/voice-ai-engine-development
feat: add voice-ai-engine-development skill for building real-time co…
2026-01-27 08:54:35 +01:00
taksrules
d972c4fa3a feat: add voice-ai-engine-development skill for building real-time conversational AI 2026-01-27 07:24:06 +02:00
sck_0
e9783892c1 chore: fixes ci drift in generated files 2026-01-26 20:06:02 +01:00
sck_0
d8d8e70ebb chore: standardize maintenance rules and sync docs (v3.3.0 audit) 2026-01-26 20:04:54 +01:00
sck_0
deafaa6e77 chore: fix drift in generated files (CI sync) 2026-01-26 19:28:32 +01:00
sck_0
790573472c chore: sync generated files and stats 2026-01-26 19:25:20 +01:00
15 changed files with 3768 additions and 293 deletions

View File

@@ -7,6 +7,41 @@ It covers the **Quality Bar**, **Documentation Consistency**, and **Release Work
---
## 0. 🤖 Agent Protocol (THE BIBLE)
**AGENTS MUST READ AND FOLLOW THIS SECTION BEFORE MARKING ANY TASK AS COMPLETE.**
There are 3 things that usually fail/get forgotten. **DO NOT FORGET THEM:**
### 1. 📤 ALWAYS PUSH (Non-Negotiable)
Committing is NOT enough. You must PUSH to the remote.
- **BAD**: `git commit -m "feat: new skill"` (User sees nothing)
- **GOOD**: `git commit -m "..." && git push origin main`
### 2. 🔄 SYNC GENERATED FILES (Avoid CI Drift)
If you touch `skills/`, you **MUST** run the validation chain.
- Running `validate_skills.py` is NOT optional.
- Running `generate_index.py` is NOT optional.
- Running `update_readme.py` is NOT optional.
_If CI fails because of drift, you have failed._
### 3. 📝 EVIDENCE OF WORK
- You must create/update `walkthrough.md` or `RELEASE_NOTES.md` to document what changed.
- If you made something new, **link it** in the artifacts.
### 4. 🚫 NO BRANCHES
- **ALWAYS use the `main` branch.**
- NEVER create feature branches (e.g., `feat/new-skill`).
- We commit directly to `main` to keep history linear and simple.
---
## 1. 🚦 Daily Maintenance Routine
### A. Validation Chain
@@ -84,15 +119,25 @@ If you update installation instructions or tool compatibility, you MUST update a
_Common pitfall: Updating the clone URL in README but leaving an old one in FAQ._
### C. Statistics
### C. Statistics Consistency (CRITICAL)
If you add skills, update the counts:
If you add/remove skills, you **MUST** ensure the total count is identical in ALL locations.
**Do not allow drift** (e.g., 356 in title, 354 in header).
- Title of `README.md`: "253+ Agentic Skills..."
- `## Full Skill Registry (253/253)` header.
- `GETTING_STARTED.md` intro.
Locations to check:
### D. Badges & Links
1. **Title of `README.md`**: "356+ Agentic Skills..."
2. **`## Full Skill Registry (356/356)` header**.
3. **`GETTING_STARTED.md` intro**.
### D. Credits Policy (Who goes where?)
- **Credits & Sources**: Use this for **External Repos**.
- _Rule_: "I extracted skills from this link you sent me." -> Add to `## Credits & Sources`.
- **Repo Contributors**: Use this for **Pull Requests**.
- _Rule_: "This user sent a PR." -> Add to `## Repo Contributors`.
### E. Badges & Links
- **Antigravity Badge**: Must point to `https://github.com/sickn33/antigravity-awesome-skills`, NOT `anthropics/antigravity`.
- **License**: Ensure the link points to `LICENSE` file.
@@ -132,6 +177,48 @@ When cutting a new version (e.g., V4):
git push origin v3.0.0
```
### 📋 Release Note Template
All changeslogs/release notes MUST follow this structure to ensure professionalism and quality:
```markdown
# Release vX.Y.Z: [Theme Name]
> **[One-line catchy summary of the release]**
[Brief 2-3 sentence intro about the release's impact]
## 🚀 New Skills
### [Emoji] [Skill Name](skills/skill-name/)
**[Bold high-level benefit]**
[Description of what it does]
- **Key Feature 1**: [Detail]
- **Key Feature 2**: [Detail]
> **Try it:** `(User Prompt) ...`
---
## 📦 Improvements
- **Registry Update**: Now tracking [N] skills.
- **[Component]**: [Change detail]
## 👥 Credits
A huge shoutout to our community contributors:
- **@username** for `skill-name`
- **@username** for `fix-name`
---
_Upgrade now: `git pull origin main` to fetch the latest skills._
```
---
## 5. 🚨 Emergency Fixes

View File

@@ -11,6 +11,23 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
---
## [3.4.0] - 2026-01-27 - "Voice Intelligence & Categorization"
### Added
- **New Skill**: `voice-ai-engine-development` - Complete toolkit for building real-time voice agents (OpenAI Realtime, Vapi, Deepgram, ElevenLabs).
- **Categorization**: Major README update introducing a concise "Features & Categories" summary table.
### Changed
- **README**: Replaced text-heavy category lists with a high-level summary table.
- **Registry**: Synced generic skill count (256) across documentation.
### Contributors
- [@sickn33](https://github.com/sickn33) - Voice AI Engine (PR #33)
- [@community](https://github.com/community) - Categorization Initiative (PR #32)
## [3.3.0] - 2026-01-26 - "News & Research"
### Added

2
FAQ.md
View File

@@ -116,7 +116,7 @@ Use the `@` symbol followed by the skill name:
### How do I know which skill to use?
1. **Browse the README**: Check the [Full Skill Registry](README.md#full-skill-registry-253253).
1. **Browse the README**: Check the [Full Skill Registry](README.md#full-skill-registry-255255).
2. **Search**: `ls skills/ | grep "keyword"`
3. **Ask your AI**: "What skills do you have for testing?"

View File

@@ -15,7 +15,7 @@ AI Agents (like **Claude Code**, **Gemini**, **Cursor**) are smart, but they lac
## ⚡️ Quick Start: The "Starter Packs"
Don't panic about the 253+ skills. You don't need them all at once.
Don't panic about the 255+ skills. You don't need them all at once.
We have curated **Starter Packs** to get you running immediately.
### 1. Install the Repo
@@ -84,7 +84,7 @@ We classify skills so you know what you're running:
- 🔵 **Safe**: Community skills that are non-destructive (Read-only/Planning).
- 🔴 **Risk**: Skills that modify systems or perform security tests (Authorized Use Only).
_Check the [Full Registry](README.md#full-skill-registry-253253) for risk labels._
_Check the [Full Registry](README.md#full-skill-registry-255255) for risk labels._
---
@@ -103,6 +103,6 @@ A: Yes, MIT License. Open Source forever.
## ⏭️ Next Steps
1. [Browse the Bundles](docs/BUNDLES.md)
2. [See Real-World Examples](docs/EXAMPLES.md)
3. [Contribute a Skill](CONTRIBUTING.md)
1. [Browse the Bundles](docs/BUNDLES.md)
2. [See Real-World Examples](docs/EXAMPLES.md)
3. [Contribute a Skill](CONTRIBUTING.md)

564
README.md
View File

@@ -1,6 +1,6 @@
# 🌌 Antigravity Awesome Skills: 255+ Agentic Skills for Claude Code, Gemini CLI, Cursor, Copilot & More
# 🌌 Antigravity Awesome Skills: 256+ Agentic Skills for Claude Code, Gemini CLI, Cursor, Copilot & More
> **The Ultimate Collection of 255+ Universal Agentic Skills for AI Coding Assistants — Claude Code, Gemini CLI, Codex CLI, Antigravity IDE, GitHub Copilot, Cursor, OpenCode**
> **The Ultimate Collection of 256+ Universal Agentic Skills for AI Coding Assistants — Claude Code, Gemini CLI, Codex CLI, Antigravity IDE, GitHub Copilot, Cursor, OpenCode**
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Claude Code](https://img.shields.io/badge/Claude%20Code-Anthropic-purple)](https://claude.ai)
@@ -11,7 +11,7 @@
[![OpenCode](https://img.shields.io/badge/OpenCode-CLI-gray)](https://github.com/opencode-ai/opencode)
[![Antigravity](https://img.shields.io/badge/Antigravity-DeepMind-red)](https://github.com/sickn33/antigravity-awesome-skills)
**Antigravity Awesome Skills** is a curated, battle-tested library of **255 high-performance agentic skills** designed to work seamlessly across all major AI coding assistants:
**Antigravity Awesome Skills** is a curated, battle-tested library of **256 high-performance agentic skills** designed to work seamlessly across all major AI coding assistants:
- 🟣 **Claude Code** (Anthropic CLI)
- 🔵 **Gemini CLI** (Google DeepMind)
@@ -29,7 +29,7 @@ This repository provides essential skills to transform your AI assistant into a
- [🔌 Compatibility & Invocation](#compatibility--invocation)
- [📦 Features & Categories](#features--categories)
- [🎁 Curated Collections (Bundles)](#curated-collections)
- [📜 Full Skill Registry](#full-skill-registry-253253)
- [📜 Full Skill Registry](#full-skill-registry-256256)
- [🛠️ Installation](#installation)
- [🤝 How to Contribute](#how-to-contribute)
- [👥 Contributors & Credits](#credits--sources)
@@ -104,25 +104,19 @@ The repository is organized into several key areas of expertise:
| Category | Skills Count | Key Skills Included |
| :-------------------------- | :----------- | :--------------------------------------------------------------------------------------------------------------------------- |
| **🛸 Autonomous & Agentic** | **~8** | Loki Mode (Startup-in-a-box), Subagent Driven Dev, Dispatching Parallel Agents, Planning With Files, Skill Creator/Developer |
| **🔌 Integrations & APIs** | **~25** | Stripe, Firebase, Supabase, Vercel, Clerk Auth, Twilio, Discord Bot, Slack Bot, GraphQL, AWS Serverless |
| **🛡️ Cybersecurity** | **~51** | Ethical Hacking, Metasploit, Burp Suite, SQLMap, Active Directory, AWS/Cloud Pentesting, OWASP Top 100, Red Team Tools |
| **🎨 Creative & Design** | **~10** | UI/UX Pro Max, Frontend Design, Canvas, Algorithmic Art, Theme Factory, D3 Viz, Web Artifacts |
| **🛠️ Development** | **~33** | TDD, Systematic Debugging, React Patterns, Backend/Frontend Guidelines, Senior Fullstack, Software Architecture |
| **🏗️ Infrastructure & Git** | **~8** | Linux Shell Scripting, Git Worktrees, Git Pushing, Conventional Commits, File Organization, GitHub Workflow Automation |
| **🤖 AI Agents & LLM** | **~31** | LangGraph, CrewAI, Langfuse, RAG Engineer, Prompt Engineer, Voice Agents, Browser Automation, Agent Memory Systems |
| **🔄 Workflow & Planning** | **~6** | Writing Plans, Executing Plans, Concise Planning, Verification Before Completion, Code Review (Requesting/Receiving) |
| **📄 Document Processing** | **~4** | DOCX (Official), PDF (Official), PPTX (Official), XLSX (Official) |
| **🧪 Testing & QA** | **~4** | Webapp Testing, Playwright Automation, Test Fixing, Testing Patterns |
| **📈 Product & Strategy** | **~8** | Product Manager Toolkit, Content Creator, ASO, Doc Co-authoring, Brainstorming, Internal Comms |
| **📣 Marketing & Growth** | **~23** | Page CRO, Copywriting, SEO Audit, Paid Ads, Email Sequence, Pricing Strategy, Referral Program, Launch Strategy |
| **🚀 Maker Tools** | **~11** | Micro-SaaS Launcher, Browser Extension Builder, Telegram Bot, AI Wrapper Product, Viral Generator, 3D Web Experience |
---
## Curated Collections
[Check out our Starter Packs in docs/BUNDLES.md](docs/BUNDLES.md) to find the perfect toolkit for your role.
| **🛸 Autonomous & Agentic** | **(13)** | Loki Mode (Startup-in-a-box), Subagent Driven Dev, Dispatching Parallel Agents, Planning With Files, Skill Creator/Developer |
| **🔌 Integrations & APIs** | **(35)** | Stripe, Firebase, Supabase, Vercel, Clerk Auth, Twilio, Discord Bot, Slack Bot, GraphQL, AWS Serverless |
| **🛡️ Cybersecurity** | **(32)** | Ethical Hacking, Metasploit, Burp Suite, SQLMap, Active Directory, AWS/Cloud Pentesting, OWASP Top 100, Red Team Tools |
| **🎨 Creative & Design** | **(21)** | UI/UX Pro Max, Frontend Design, Canvas, Algorithmic Art, Theme Factory, D3 Viz, Web Artifacts |
| **🛠️ Development** | **(44)** | TDD, Systematic Debugging, React Patterns, Backend/Frontend Guidelines, Senior Fullstack, Software Architecture |
| **🏗️ Infrastructure & Git** | **(13)** | Linux Shell Scripting, Git Worktrees, Git Pushing, Conventional Commits, File Organization, GitHub Workflow Automation |
| **🤖 AI Agents & LLM** | **(27)** | Voice AI Engine, LangGraph, CrewAI, Langfuse, RAG Engineer, Prompt Engineer, Browser Automation, Agent Memory Systems |
| **🔄 Workflow & Planning** | **(19)** | Writing Plans, Executing Plans, Concise Planning, Verification Before Completion, Code Review (Requesting/Receiving) |
| **📄 Document Processing** | **(5)** | DOCX (Official), PDF (Official), PPTX (Official), XLSX (Official) |
| **🧪 Testing & QA** | **(8)** | Webapp Testing, Playwright Automation, Test Fixing, Testing Patterns |
| **📈 Product & Strategy** | **(4)** | Product Manager Toolkit, Content Creator, ASO, Doc Co-authoring, Brainstorming, Internal Comms |
| **📣 Marketing & Growth** | **(26)** | Page CRO, Copywriting, SEO Audit, Paid Ads, Email Sequence, Pricing Strategy, Referral Program, Launch Strategy |
| **🚀 Maker Tools** | **(8)** | Micro-SaaS Launcher, Browser Extension Builder, Telegram Bot, AI Wrapper Product, Viral Generator, 3D Web Experience |
## Curated Collections
@@ -132,267 +126,268 @@ The repository is organized into several key areas of expertise:
[Check out our Starter Packs in docs/BUNDLES.md](docs/BUNDLES.md) to find the perfect toolkit for your role.
## Full Skill Registry (255/255)
## Full Skill Registry (256/256)
> [!NOTE] > **Document Skills**: We provide both **community** and **official Anthropic** versions for DOCX, PDF, PPTX, and XLSX. Locally, the official versions are used by default (via symlinks). In the repository, both versions are available for flexibility.
| Skill Name | Risk | Description | Path |
| :-------------------------------------------------- | :--- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------------------------------------- |
| **2d-games** | ⚪ | 2D game development principles. Sprites, tilemaps, physics, camera. | `skills/game-development/2d-games` |
| **3d-games** | ⚪ | 3D game development principles. Rendering, shaders, physics, cameras. | `skills/game-development/3d-games` |
| **3d-web-experience** | ⚪ | Expert in building 3D experiences for the web - Three.js, React Three Fiber, Spline, WebGL, and interactive 3D scenes. Covers product configurators, 3D portfolios, immersive websites, and bringing depth to web experiences. Use when: 3D website, three.js, WebGL, react three fiber, 3D experience. | `skills/3d-web-experience` |
| **ab-test-setup** | ⚪ | Structured guide for setting up A/B tests with mandatory gates for hypothesis, metrics, and execution readiness. | `skills/ab-test-setup` |
| **Active Directory Attacks** | ⚪ | This skill should be used when the user asks to "attack Active Directory", "exploit AD", "Kerberoasting", "DCSync", "pass-the-hash", "BloodHound enumeration", "Golden Ticket", "Silver Ticket", "AS-REP roasting", "NTLM relay", or needs guidance on Windows domain penetration testing. | `skills/active-directory-attacks` |
| **address-github-comments** | ⚪ | Use when you need to address review or issue comments on an open GitHub Pull Request using the gh CLI. | `skills/address-github-comments` |
| **agent-evaluation** | ⚪ | Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on real-world benchmarks Use when: agent testing, agent evaluation, benchmark agents, agent reliability, test agent. | `skills/agent-evaluation` |
| **agent-manager-skill** | ⚪ | Manage multiple local CLI agents via tmux sessions (start/stop/monitor/assign) with cron-friendly scheduling. | `skills/agent-manager-skill` |
| **agent-memory-mcp** | ⚪ | A hybrid memory system that provides persistent, searchable knowledge management for AI agents (Architecture, Patterns, Decisions). | `skills/agent-memory-mcp` |
| **agent-memory-systems** | ⚪ | Memory is the cornerstone of intelligent agents. Without it, every interaction starts from zero. This skill covers the architecture of agent memory: short-term (context window), long-term (vector stores), and the cognitive architectures that organize them. Key insight: Memory isn't just storage - it's retrieval. A million stored facts mean nothing if you can't find the right one. Chunking, embedding, and retrieval strategies determine whether your agent remembers or forgets. The field is fragm | `skills/agent-memory-systems` |
| **agent-tool-builder** | ⚪ | Tools are how AI agents interact with the world. A well-designed tool is the difference between an agent that works and one that hallucinates, fails silently, or costs 10x more tokens than necessary. This skill covers tool design from schema to error handling. JSON Schema best practices, description writing that actually helps the LLM, validation, and the emerging MCP standard that's becoming the lingua franca for AI tools. Key insight: Tool descriptions are more important than tool implementa | `skills/agent-tool-builder` |
| **ai-agents-architect** | ⚪ | Expert in designing and building autonomous AI agents. Masters tool use, memory systems, planning strategies, and multi-agent orchestration. Use when: build agent, AI agent, autonomous agent, tool use, function calling. | `skills/ai-agents-architect` |
| **ai-product** | ⚪ | Every product will be AI-powered. The question is whether you'll build it right or ship a demo that falls apart in production. This skill covers LLM integration patterns, RAG architecture, prompt engineering that scales, AI UX that users trust, and cost optimization that doesn't bankrupt you. Use when: keywords, file_patterns, code_patterns. | `skills/ai-product` |
| **ai-wrapper-product** | ⚪ | Expert in building products that wrap AI APIs (OpenAI, Anthropic, etc.) into focused tools people will pay for. Not just 'ChatGPT but different' - products that solve specific problems with AI. Covers prompt engineering for products, cost management, rate limiting, and building defensible AI businesses. Use when: AI wrapper, GPT product, AI tool, wrap AI, AI SaaS. | `skills/ai-wrapper-product` |
| **algolia-search** | ⚪ | Expert patterns for Algolia search implementation, indexing strategies, React InstantSearch, and relevance tuning Use when: adding search to, algolia, instantsearch, search api, search functionality. | `skills/algolia-search` |
| **algorithmic-art** | ⚪ | Creating algorithmic art using p5.js with seeded randomness and interactive parameter exploration. Use this when users request creating art using code, generative art, algorithmic art, flow fields, or particle systems. Create original algorithmic art rather than copying existing artists' work to avoid copyright violations. | `skills/algorithmic-art` |
| **analytics-tracking** | ⚪ | Design, audit, and improve analytics tracking systems that produce reliable, decision-ready data. Use when the user wants to set up, fix, or evaluate analytics tracking (GA4, GTM, product analytics, events, conversions, UTMs). This skill focuses on measurement strategy, signal quality, and validation— not just firing events. | `skills/analytics-tracking` |
| **API Fuzzing for Bug Bounty** | ⚪ | This skill should be used when the user asks to "test API security", "fuzz APIs", "find IDOR vulnerabilities", "test REST API", "test GraphQL", "API penetration testing", "bug bounty API testing", or needs guidance on API security assessment techniques. | `skills/api-fuzzing-bug-bounty` |
| **api-documentation-generator** | ⚪ | Generate comprehensive, developer-friendly API documentation from code, including endpoints, parameters, examples, and best practices | `skills/api-documentation-generator` |
| **api-patterns** | ⚪ | API design principles and decision-making. REST vs GraphQL vs tRPC selection, response formats, versioning, pagination. | `skills/api-patterns` |
| **api-security-best-practices** | ⚪ | Implement secure API design patterns including authentication, authorization, input validation, rate limiting, and protection against common API vulnerabilities | `skills/api-security-best-practices` |
| **app-builder** | ⚪ | Main application building orchestrator. Creates full-stack applications from natural language requests. Determines project type, selects tech stack, coordinates agents. | `skills/app-builder` |
| **app-store-optimization** | ⚪ | Complete App Store Optimization (ASO) toolkit for researching, optimizing, and tracking mobile app performance on Apple App Store and Google Play Store | `skills/app-store-optimization` |
| **architecture** | ⚪ | Architectural decision-making framework. Requirements analysis, trade-off evaluation, ADR documentation. Use when making architecture decisions or analyzing system design. | `skills/architecture` |
| **autonomous-agent-patterns** | ⚪ | Design patterns for building autonomous coding agents. Covers tool integration, permission systems, browser automation, and human-in-the-loop workflows. Use when building AI agents, designing tool APIs, implementing permission systems, or creating autonomous coding assistants. | `skills/autonomous-agent-patterns` |
| **autonomous-agents** | ⚪ | Autonomous agents are AI systems that can independently decompose goals, plan actions, execute tools, and self-correct without constant human guidance. The challenge isn't making them capable - it's making them reliable. Every extra decision multiplies failure probability. This skill covers agent loops (ReAct, Plan-Execute), goal decomposition, reflection patterns, and production reliability. Key insight: compounding error rates kill autonomous agents. A 95% success rate per step drops to 60% b | `skills/autonomous-agents` |
| **avalonia-layout-zafiro** | ⚪ | Guidelines for modern Avalonia UI layout using Zafiro.Avalonia, emphasizing shared styles, generic components, and avoiding XAML redundancy. | `skills/avalonia-layout-zafiro` |
| **avalonia-viewmodels-zafiro** | ⚪ | Optimal ViewModel and Wizard creation patterns for Avalonia using Zafiro and ReactiveUI. | `skills/avalonia-viewmodels-zafiro` |
| **avalonia-zafiro-development** | ⚪ | Mandatory skills, conventions, and behavioral rules for Avalonia UI development using the Zafiro toolkit. | `skills/avalonia-zafiro-development` |
| **AWS Penetration Testing** | ⚪ | This skill should be used when the user asks to "pentest AWS", "test AWS security", "enumerate IAM", "exploit cloud infrastructure", "AWS privilege escalation", "S3 bucket testing", "metadata SSRF", "Lambda exploitation", or needs guidance on Amazon Web Services security assessment. | `skills/aws-penetration-testing` |
| **aws-serverless** | ⚪ | Specialized skill for building production-ready serverless applications on AWS. Covers Lambda functions, API Gateway, DynamoDB, SQS/SNS event-driven patterns, SAM/CDK deployment, and cold start optimization. | `skills/aws-serverless` |
| **azure-functions** | ⚪ | Expert patterns for Azure Functions development including isolated worker model, Durable Functions orchestration, cold start optimization, and production patterns. Covers .NET, Python, and Node.js programming models. Use when: azure function, azure functions, durable functions, azure serverless, function app. | `skills/azure-functions` |
| **backend-dev-guidelines** | ⚪ | Opinionated backend development standards for Node.js + Express + TypeScript microservices. Covers layered architecture, BaseController pattern, dependency injection, Prisma repositories, Zod validation, unifiedConfig, Sentry error tracking, async safety, and testing discipline. | `skills/backend-dev-guidelines` |
| **backend-patterns** | ⚪ | Backend architecture patterns, API design, database optimization, and server-side best practices for Node.js, Express, and Next.js API routes. | `skills/cc-skill-backend-patterns` |
| **bash-linux** | ⚪ | Bash/Linux terminal patterns. Critical commands, piping, error handling, scripting. Use when working on macOS or Linux systems. | `skills/bash-linux` |
| **behavioral-modes** | ⚪ | AI operational modes (brainstorm, implement, debug, review, teach, ship, orchestrate). Use to adapt behavior based on task type. | `skills/behavioral-modes` |
| **blockrun** | ⚪ | Use when user needs capabilities Claude lacks (image generation, real-time X/Twitter data) or explicitly requests external models ("blockrun", "use grok", "use gpt", "dall-e", "deepseek") | `skills/blockrun` |
| **brainstorming** | ⚪ | Use this skill before any creative or constructive work (features, components, architecture, behavior changes, or functionality). This skill transforms vague ideas into validated designs through disciplined, incremental reasoning and collaboration. | `skills/brainstorming` |
| **brand-guidelines** | ⚪ | Applies Anthropic's official brand colors and typography to any sort of artifact that may benefit from having Anthropic's look-and-feel. Use it when brand colors or style guidelines, visual formatting, or company design standards apply. | `skills/brand-guidelines-anthropic` |
| **brand-guidelines** | ⚪ | Applies Anthropic's official brand colors and typography to any sort of artifact that may benefit from having Anthropic's look-and-feel. Use it when brand colors or style guidelines, visual formatting, or company design standards apply. | `skills/brand-guidelines-community` |
| **Broken Authentication Testing** | ⚪ | This skill should be used when the user asks to "test for broken authentication vulnerabilities", "assess session management security", "perform credential stuffing tests", "evaluate password policies", "test for session fixation", or "identify authentication bypass flaws". It provides comprehensive techniques for identifying authentication and session management weaknesses in web applications. | `skills/broken-authentication` |
| **browser-automation** | ⚪ | Browser automation powers web testing, scraping, and AI agent interactions. The difference between a flaky script and a reliable system comes down to understanding selectors, waiting strategies, and anti-detection patterns. This skill covers Playwright (recommended) and Puppeteer, with patterns for testing, scraping, and agentic browser control. Key insight: Playwright won the framework war. Unless you need Puppeteer's stealth ecosystem or are Chrome-only, Playwright is the better choice in 202 | `skills/browser-automation` |
| **browser-extension-builder** | ⚪ | Expert in building browser extensions that solve real problems - Chrome, Firefox, and cross-browser extensions. Covers extension architecture, manifest v3, content scripts, popup UIs, monetization strategies, and Chrome Web Store publishing. Use when: browser extension, chrome extension, firefox addon, extension, manifest v3. | `skills/browser-extension-builder` |
| **bullmq-specialist** | ⚪ | BullMQ expert for Redis-backed job queues, background processing, and reliable async execution in Node.js/TypeScript applications. Use when: bullmq, bull queue, redis queue, background job, job queue. | `skills/bullmq-specialist` |
| **bun-development** | ⚪ | Modern JavaScript/TypeScript development with Bun runtime. Covers package management, bundling, testing, and migration from Node.js. Use when working with Bun, optimizing JS/TS development speed, or migrating from Node.js to Bun. | `skills/bun-development` |
| **Burp Suite Web Application Testing** | ⚪ | This skill should be used when the user asks to "intercept HTTP traffic", "modify web requests", "use Burp Suite for testing", "perform web vulnerability scanning", "test with Burp Repeater", "analyze HTTP history", or "configure proxy for web testing". It provides comprehensive guidance for using Burp Suite's core features for web application security testing. | `skills/burp-suite-testing` |
| **busybox-on-windows** | ⚪ | How to use a Win32 build of BusyBox to run many of the standard UNIX command line tools on Windows. | `skills/busybox-on-windows` |
| **canvas-design** | ⚪ | Create beautiful visual art in .png and .pdf documents using design philosophy. You should use this skill when the user asks to create a poster, piece of art, design, or other static piece. Create original visual designs, never copying existing artists' work to avoid copyright violations. | `skills/canvas-design` |
| **cc-skill-continuous-learning** | ⚪ | Development skill from everything-claude-code | `skills/cc-skill-continuous-learning` |
| **cc-skill-project-guidelines-example** | ⚪ | Project Guidelines Skill (Example) | `skills/cc-skill-project-guidelines-example` |
| **cc-skill-strategic-compact** | ⚪ | Development skill from everything-claude-code | `skills/cc-skill-strategic-compact` |
| **Claude Code Guide** | ⚪ | Master guide for using Claude Code effectively. Includes configuration templates, prompting strategies "Thinking" keywords, debugging techniques, and best practices for interacting with the agent. | `skills/claude-code-guide` |
| **clean-code** | ⚪ | Pragmatic coding standards - concise, direct, no over-engineering, no unnecessary comments | `skills/clean-code` |
| **clerk-auth** | ⚪ | Expert patterns for Clerk auth implementation, middleware, organizations, webhooks, and user sync Use when: adding authentication, clerk auth, user authentication, sign in, sign up. | `skills/clerk-auth` |
| **clickhouse-io** | ⚪ | ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads. | `skills/cc-skill-clickhouse-io` |
| **Cloud Penetration Testing** | ⚪ | This skill should be used when the user asks to "perform cloud penetration testing", "assess Azure or AWS or GCP security", "enumerate cloud resources", "exploit cloud misconfigurations", "test O365 security", "extract secrets from cloud environments", or "audit cloud infrastructure". It provides comprehensive techniques for security assessment across major cloud platforms. | `skills/cloud-penetration-testing` |
| **code-review-checklist** | ⚪ | Comprehensive checklist for conducting thorough code reviews covering functionality, security, performance, and maintainability | `skills/code-review-checklist` |
| **codex-review** | ⚪ | Professional code review with auto CHANGELOG generation, integrated with Codex AI | `skills/codex-review` |
| **coding-standards** | ⚪ | Universal coding standards, best practices, and patterns for TypeScript, JavaScript, React, and Node.js development. | `skills/cc-skill-coding-standards` |
| **competitor-alternatives** | ⚪ | When the user wants to create competitor comparison or alternative pages for SEO and sales enablement. Also use when the user mentions 'alternative page,' 'vs page,' 'competitor comparison,' 'comparison page,' '[Product] vs [Product],' '[Product] alternative,' or 'competitive landing pages.' Covers four formats: singular alternative, plural alternatives, you vs competitor, and competitor vs competitor. Emphasizes deep research, modular content architecture, and varied section types beyond feature tables. | `skills/competitor-alternatives` |
| **computer-use-agents** | ⚪ | Build AI agents that interact with computers like humans do - viewing screens, moving cursors, clicking buttons, and typing text. Covers Anthropic's Computer Use, OpenAI's Operator/CUA, and open-source alternatives. Critical focus on sandboxing, security, and handling the unique challenges of vision-based control. Use when: computer use, desktop automation agent, screen control AI, vision-based agent, GUI automation. | `skills/computer-use-agents` |
| **concise-planning** | ⚪ | Use when a user asks for a plan for a coding task, to generate a clear, actionable, and atomic checklist. | `skills/concise-planning` |
| **content-creator** | ⚪ | Create SEO-optimized marketing content with consistent brand voice. Includes brand voice analyzer, SEO optimizer, content frameworks, and social media templates. Use when writing blog posts, creating social media content, analyzing brand voice, optimizing SEO, planning content calendars, or when user mentions content creation, brand voice, SEO optimization, social media marketing, or content strategy. | `skills/content-creator` |
| **context-window-management** | ⚪ | Strategies for managing LLM context windows including summarization, trimming, routing, and avoiding context rot Use when: context window, token limit, context management, context engineering, long context. | `skills/context-window-management` |
| **context7-auto-research** | ⚪ | Automatically fetch latest library/framework documentation for Claude Code via Context7 API | `skills/context7-auto-research` |
| **conversation-memory** | ⚪ | Persistent memory systems for LLM conversations including short-term, long-term, and entity-based memory Use when: conversation memory, remember, memory persistence, long-term memory, chat history. | `skills/conversation-memory` |
| **copy-editing** | ⚪ | When the user wants to edit, review, or improve existing marketing copy. Also use when the user mentions 'edit this copy,' 'review my copy,' 'copy feedback,' 'proofread,' 'polish this,' 'make this better,' or 'copy sweep.' This skill provides a systematic approach to editing marketing copy through multiple focused passes. | `skills/copy-editing` |
| **copywriting** | ⚪ | Use this skill when writing, rewriting, or improving marketing copy for any page (homepage, landing page, pricing, feature, product, or about page). This skill produces clear, compelling, and testable copy while enforcing alignment, honesty, and conversion best practices. | `skills/copywriting` |
| **core-components** | ⚪ | Core component library and design system patterns. Use when building UI, using design tokens, or working with the component library. | `skills/core-components` |
| **crewai** | ⚪ | Expert in CrewAI - the leading role-based multi-agent framework used by 60% of Fortune 500 companies. Covers agent design with roles and goals, task definition, crew orchestration, process types (sequential, hierarchical, parallel), memory systems, and flows for complex workflows. Essential for building collaborative AI agent teams. Use when: crewai, multi-agent team, agent roles, crew of agents, role-based agents. | `skills/crewai` |
| **Cross-Site Scripting and HTML Injection Testing** | ⚪ | This skill should be used when the user asks to "test for XSS vulnerabilities", "perform cross-site scripting attacks", "identify HTML injection flaws", "exploit client-side injection vulnerabilities", "steal cookies via XSS", or "bypass content security policies". It provides comprehensive techniques for detecting, exploiting, and understanding XSS and HTML injection attack vectors in web applications. | `skills/xss-html-injection` |
| **d3-viz** | ⚪ | Creating interactive data visualisations using d3.js. This skill should be used when creating custom charts, graphs, network diagrams, geographic visualisations, or any complex SVG-based data visualisation that requires fine-grained control over visual elements, transitions, or interactions. Use this for bespoke visualisations beyond standard charting libraries, whether in React, Vue, Svelte, vanilla JavaScript, or any other environment. | `skills/claude-d3js-skill` |
| **daily-news-report** | ⚪ | 基于预设 URL 列表抓取内容,筛选高质量技术信息并生成每日 Markdown 报告。 | `skills/daily-news-report` |
| **database-design** | ⚪ | Database design principles and decision-making. Schema design, indexing strategy, ORM selection, serverless databases. | `skills/database-design` |
| **deployment-procedures** | ⚪ | Production deployment principles and decision-making. Safe deployment workflows, rollback strategies, and verification. Teaches thinking, not scripts. | `skills/deployment-procedures` |
| **design-orchestration** | ⚪ | Orchestrates design workflows by routing work through brainstorming, multi-agent review, and execution readiness in the correct order. Prevents premature implementation, skipped validation, and unreviewed high-risk designs. | `skills/design-orchestration` |
| **discord-bot-architect** | ⚪ | Specialized skill for building production-ready Discord bots. Covers Discord.js (JavaScript) and Pycord (Python), gateway intents, slash commands, interactive components, rate limiting, and sharding. | `skills/discord-bot-architect` |
| **dispatching-parallel-agents** | ⚪ | Use when facing 2+ independent tasks that can be worked on without shared state or sequential dependencies | `skills/dispatching-parallel-agents` |
| **doc-coauthoring** | ⚪ | Guide users through a structured workflow for co-authoring documentation. Use when user wants to write documentation, proposals, technical specs, decision docs, or similar structured content. This workflow helps users efficiently transfer context, refine content through iteration, and verify the doc works for readers. Trigger when user mentions writing docs, creating proposals, drafting specs, or similar documentation tasks. | `skills/doc-coauthoring` |
| **docker-expert** | ⚪ | Docker containerization expert with deep knowledge of multi-stage builds, image optimization, container security, Docker Compose orchestration, and production deployment patterns. Use PROACTIVELY for Dockerfile optimization, container issues, image size problems, security hardening, networking, and orchestration challenges. | `skills/docker-expert` |
| **documentation-templates** | ⚪ | Documentation templates and structure guidelines. README, API docs, code comments, and AI-friendly documentation. | `skills/documentation-templates` |
| **docx** | ⚪ | Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. When Claude needs to work with professional documents (.docx files) for: (1) Creating new documents, (2) Modifying or editing content, (3) Working with tracked changes, (4) Adding comments, or any other document tasks | `skills/docx-official` |
| **email-sequence** | ⚪ | When the user wants to create or optimize an email sequence, drip campaign, automated email flow, or lifecycle email program. Also use when the user mentions "email sequence," "drip campaign," "nurture sequence," "onboarding emails," "welcome sequence," "re-engagement emails," "email automation," or "lifecycle emails." For in-app onboarding, see onboarding-cro. | `skills/email-sequence` |
| **email-systems** | ⚪ | Email has the highest ROI of any marketing channel. $36 for every $1 spent. Yet most startups treat it as an afterthought - bulk blasts, no personalization, landing in spam folders. This skill covers transactional email that works, marketing automation that converts, deliverability that reaches inboxes, and the infrastructure decisions that scale. Use when: keywords, file_patterns, code_patterns. | `skills/email-systems` |
| **environment-setup-guide** | ⚪ | Guide developers through setting up development environments with proper tools, dependencies, and configurations | `skills/environment-setup-guide` |
| **Ethical Hacking Methodology** | ⚪ | This skill should be used when the user asks to "learn ethical hacking", "understand penetration testing lifecycle", "perform reconnaissance", "conduct security scanning", "exploit vulnerabilities", or "write penetration test reports". It provides comprehensive ethical hacking methodology and techniques. | `skills/ethical-hacking-methodology` |
| **exa-search** | ⚪ | Semantic search, similar content discovery, and structured research using Exa API | `skills/exa-search` |
| **executing-plans** | ⚪ | Use when you have a written implementation plan to execute in a separate session with review checkpoints | `skills/executing-plans` |
| **File Path Traversal Testing** | ⚪ | This skill should be used when the user asks to "test for directory traversal", "exploit path traversal vulnerabilities", "read arbitrary files through web applications", "find LFI vulnerabilities", or "access files outside web root". It provides comprehensive file path traversal attack and testing methodologies. | `skills/file-path-traversal` |
| **file-organizer** | ⚪ | Intelligently organizes files and folders by understanding context, finding duplicates, and suggesting better organizational structures. Use when user wants to clean up directories, organize downloads, remove duplicates, or restructure projects. | `skills/file-organizer` |
| **file-uploads** | ⚪ | Expert at handling file uploads and cloud storage. Covers S3, Cloudflare R2, presigned URLs, multipart uploads, and image optimization. Knows how to handle large files without blocking. Use when: file upload, S3, R2, presigned URL, multipart. | `skills/file-uploads` |
| **finishing-a-development-branch** | ⚪ | Use when implementation is complete, all tests pass, and you need to decide how to integrate the work - guides completion of development work by presenting structured options for merge, PR, or cleanup | `skills/finishing-a-development-branch` |
| **firebase** | ⚪ | Firebase gives you a complete backend in minutes - auth, database, storage, functions, hosting. But the ease of setup hides real complexity. Security rules are your last line of defense, and they're often wrong. Firestore queries are limited, and you learn this after you've designed your data model. This skill covers Firebase Authentication, Firestore, Realtime Database, Cloud Functions, Cloud Storage, and Firebase Hosting. Key insight: Firebase is optimized for read-heavy, denormalized data. I | `skills/firebase` |
| **firecrawl-scraper** | ⚪ | Deep web scraping, screenshots, PDF parsing, and website crawling using Firecrawl API | `skills/firecrawl-scraper` |
| **form-cro** | ⚪ | Optimize any form that is NOT signup or account registration — including lead capture, contact, demo request, application, survey, quote, and checkout forms. Use when the goal is to increase form completion rate, reduce friction, or improve lead quality without breaking compliance or downstream workflows. | `skills/form-cro` |
| **free-tool-strategy** | ⚪ | When the user wants to plan, evaluate, or build a free tool for marketing purposes — lead generation, SEO value, or brand awareness. Also use when the user mentions "engineering as marketing," "free tool," "marketing tool," "calculator," "generator," "interactive tool," "lead gen tool," "build a tool for leads," or "free resource." This skill bridges engineering and marketing — useful for founders and technical marketers. | `skills/free-tool-strategy` |
| **frontend-design** | ⚪ | Create distinctive, production-grade frontend interfaces with intentional aesthetics, high craft, and non-generic visual identity. Use when building or styling web UIs, components, pages, dashboards, or frontend applications. | `skills/frontend-design` |
| **frontend-dev-guidelines** | ⚪ | Opinionated frontend development standards for modern React + TypeScript applications. Covers Suspense-first data fetching, lazy loading, feature-based architecture, MUI v7 styling, TanStack Router, performance optimization, and strict TypeScript practices. | `skills/frontend-dev-guidelines` |
| **frontend-patterns** | ⚪ | Frontend development patterns for React, Next.js, state management, performance optimization, and UI best practices. | `skills/cc-skill-frontend-patterns` |
| **game-art** | ⚪ | Game art principles. Visual style selection, asset pipeline, animation workflow. | `skills/game-development/game-art` |
| **game-audio** | ⚪ | Game audio principles. Sound design, music integration, adaptive audio systems. | `skills/game-development/game-audio` |
| **game-design** | ⚪ | Game design principles. GDD structure, balancing, player psychology, progression. | `skills/game-development/game-design` |
| **game-development** | ⚪ | Game development orchestrator. Routes to platform-specific skills based on project needs. | `skills/game-development` |
| **gcp-cloud-run** | ⚪ | Specialized skill for building production-ready serverless applications on GCP. Covers Cloud Run services (containerized), Cloud Run Functions (event-driven), cold start optimization, and event-driven architecture with Pub/Sub. | `skills/gcp-cloud-run` |
| **geo-fundamentals** | ⚪ | Generative Engine Optimization for AI search engines (ChatGPT, Claude, Perplexity). | `skills/geo-fundamentals` |
| **git-pushing** | ⚪ | Stage, commit, and push git changes with conventional commit messages. Use when user wants to commit and push changes, mentions pushing to remote, or asks to save and push their work. Also activates when user says "push changes", "commit and push", "push this", "push to github", or similar git workflow requests. | `skills/git-pushing` |
| **github-workflow-automation** | ⚪ | Automate GitHub workflows with AI assistance. Includes PR reviews, issue triage, CI/CD integration, and Git operations. Use when automating GitHub workflows, setting up PR review automation, creating GitHub Actions, or triaging issues. | `skills/github-workflow-automation` |
| **graphql** | ⚪ | GraphQL gives clients exactly the data they need - no more, no less. One endpoint, typed schema, introspection. But the flexibility that makes it powerful also makes it dangerous. Without proper controls, clients can craft queries that bring down your server. This skill covers schema design, resolvers, DataLoader for N+1 prevention, federation for microservices, and client integration with Apollo/urql. Key insight: GraphQL is a contract. The schema is the API documentation. Design it carefully. | `skills/graphql` |
| **HTML Injection Testing** | ⚪ | This skill should be used when the user asks to "test for HTML injection", "inject HTML into web pages", "perform HTML injection attacks", "deface web applications", or "test content injection vulnerabilities". It provides comprehensive HTML injection attack techniques and testing methodologies. | `skills/html-injection-testing` |
| **hubspot-integration** | ⚪ | Expert patterns for HubSpot CRM integration including OAuth authentication, CRM objects, associations, batch operations, webhooks, and custom objects. Covers Node.js and Python SDKs. Use when: hubspot, hubspot api, hubspot crm, hubspot integration, contacts api. | `skills/hubspot-integration` |
| **i18n-localization** | ⚪ | Internationalization and localization patterns. Detecting hardcoded strings, managing translations, locale files, RTL support. | `skills/i18n-localization` |
| **IDOR Vulnerability Testing** | ⚪ | This skill should be used when the user asks to "test for insecure direct object references," "find IDOR vulnerabilities," "exploit broken access control," "enumerate user IDs or object references," or "bypass authorization to access other users' data." It provides comprehensive guidance for detecting, exploiting, and remediating IDOR vulnerabilities in web applications. | `skills/idor-testing` |
| **inngest** | ⚪ | Inngest expert for serverless-first background jobs, event-driven workflows, and durable execution without managing queues or workers. Use when: inngest, serverless background job, event-driven workflow, step function, durable execution. | `skills/inngest` |
| **interactive-portfolio** | ⚪ | Expert in building portfolios that actually land jobs and clients - not just showing work, but creating memorable experiences. Covers developer portfolios, designer portfolios, creative portfolios, and portfolios that convert visitors into opportunities. Use when: portfolio, personal website, showcase work, developer portfolio, designer portfolio. | `skills/interactive-portfolio` |
| **internal-comms** | ⚪ | A set of resources to help me write all kinds of internal communications, using the formats that my company likes to use. Claude should use this skill whenever asked to write some sort of internal communications (status reports, leadership updates, 3P updates, company newsletters, FAQs, incident reports, project updates, etc.). | `skills/internal-comms-anthropic` |
| **internal-comms** | ⚪ | A set of resources to help me write all kinds of internal communications, using the formats that my company likes to use. Claude should use this skill whenever asked to write some sort of internal communications (status reports, leadership updates, 3P updates, company newsletters, FAQs, incident reports, project updates, etc.). | `skills/internal-comms-community` |
| **javascript-mastery** | ⚪ | Comprehensive JavaScript reference covering 33+ essential concepts every developer should know. From fundamentals like primitives and closures to advanced patterns like async/await and functional programming. Use when explaining JS concepts, debugging JavaScript issues, or teaching JavaScript fundamentals. | `skills/javascript-mastery` |
| **kaizen** | ⚪ | Guide for continuous improvement, error proofing, and standardization. Use this skill when the user wants to improve code quality, refactor, or discuss process improvements. | `skills/kaizen` |
| **langfuse** | ⚪ | Expert in Langfuse - the open-source LLM observability platform. Covers tracing, prompt management, evaluation, datasets, and integration with LangChain, LlamaIndex, and OpenAI. Essential for debugging, monitoring, and improving LLM applications in production. Use when: langfuse, llm observability, llm tracing, prompt management, llm evaluation. | `skills/langfuse` |
| **langgraph** | ⚪ | Expert in LangGraph - the production-grade framework for building stateful, multi-actor AI applications. Covers graph construction, state management, cycles and branches, persistence with checkpointers, human-in-the-loop patterns, and the ReAct agent pattern. Used in production at LinkedIn, Uber, and 400+ companies. This is LangChain's recommended approach for building agents. Use when: langgraph, langchain agent, stateful agent, agent graph, react agent. | `skills/langgraph` |
| **last30days** | ⚪ | Research a topic from the last 30 days on Reddit + X + Web, become an expert, and write copy-paste-ready prompts for the user's target tool. | `skills/last30days` |
| **launch-strategy** | ⚪ | When the user wants to plan a product launch, feature announcement, or release strategy. Also use when the user mentions 'launch,' 'Product Hunt,' 'feature release,' 'announcement,' 'go-to-market,' 'beta launch,' 'early access,' 'waitlist,' or 'product update.' This skill covers phased launches, channel strategy, and ongoing launch momentum. | `skills/launch-strategy` |
| **lint-and-validate** | ⚪ | Automatic quality control, linting, and static analysis procedures. Use after every code modification to ensure syntax correctness and project standards. Triggers onKeywords: lint, format, check, validate, types, static analysis. | `skills/lint-and-validate` |
| **Linux Privilege Escalation** | ⚪ | This skill should be used when the user asks to "escalate privileges on Linux", "find privesc vectors on Linux systems", "exploit sudo misconfigurations", "abuse SUID binaries", "exploit cron jobs for root access", "enumerate Linux systems for privilege escalation", or "gain root access from low-privilege shell". It provides comprehensive techniques for identifying and exploiting privilege escalation paths on Linux systems. | `skills/linux-privilege-escalation` |
| **Linux Production Shell Scripts** | ⚪ | This skill should be used when the user asks to "create bash scripts", "automate Linux tasks", "monitor system resources", "backup files", "manage users", or "write production shell scripts". It provides ready-to-use shell script templates for system administration. | `skills/linux-shell-scripting` |
| **llm-app-patterns** | ⚪ | Production-ready patterns for building LLM applications. Covers RAG pipelines, agent architectures, prompt IDEs, and LLMOps monitoring. Use when designing AI applications, implementing RAG, building agents, or setting up LLM observability. | `skills/llm-app-patterns` |
| **loki-mode** | ⚪ | Multi-agent autonomous startup system for Claude Code. Triggers on "Loki Mode". Orchestrates 100+ specialized agents across engineering, QA, DevOps, security, data/ML, business operations, marketing, HR, and customer success. Takes PRD to fully deployed, revenue-generating product with zero human intervention. Features Task tool for subagent dispatch, parallel code review with 3 specialized reviewers, severity-based issue triage, distributed task queue with dead letter handling, automatic deployment to cloud providers, A/B testing, customer feedback loops, incident response, circuit breakers, and self-healing. Handles rate limits via distributed state checkpoints and auto-resume with exponential backoff. Requires --dangerously-skip-permissions flag. | `skills/loki-mode` |
| **marketing-ideas** | ⚪ | Provide proven marketing strategies and growth ideas for SaaS and software products, prioritized using a marketing feasibility scoring system. | `skills/marketing-ideas` |
| **marketing-psychology** | ⚪ | Apply behavioral science and mental models to marketing decisions, prioritized using a psychological leverage and feasibility scoring system. | `skills/marketing-psychology` |
| **mcp-builder** | ⚪ | Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK). | `skills/mcp-builder` |
| **Metasploit Framework** | ⚪ | This skill should be used when the user asks to "use Metasploit for penetration testing", "exploit vulnerabilities with msfconsole", "create payloads with msfvenom", "perform post-exploitation", "use auxiliary modules for scanning", or "develop custom exploits". It provides comprehensive guidance for leveraging the Metasploit Framework in security assessments. | `skills/metasploit-framework` |
| **micro-saas-launcher** | ⚪ | Expert in launching small, focused SaaS products fast - the indie hacker approach to building profitable software. Covers idea validation, MVP development, pricing, launch strategies, and growing to sustainable revenue. Ship in weeks, not months. Use when: micro saas, indie hacker, small saas, side project, saas mvp. | `skills/micro-saas-launcher` |
| **mobile-design** | ⚪ | Mobile-first design and engineering doctrine for iOS and Android apps. Covers touch interaction, performance, platform conventions, offline behavior, and mobile-specific decision-making. Teaches principles and constraints, not fixed layouts. Use for React Native, Flutter, or native mobile apps. | `skills/mobile-design` |
| **mobile-games** | ⚪ | Mobile game development principles. Touch input, battery, performance, app stores. | `skills/game-development/mobile-games` |
| **moodle-external-api-development** | ⚪ | Create custom external web service APIs for Moodle LMS. Use when implementing web services for course management, user tracking, quiz operations, or custom plugin functionality. Covers parameter validation, database operations, error handling, service registration, and Moodle coding standards. | `skills/moodle-external-api-development` |
| **multi-agent-brainstorming** | ⚪ | Use this skill when a design or idea requires higher confidence, risk reduction, or formal review. This skill orchestrates a structured, sequential multi-agent design review where each agent has a strict, non-overlapping role. It prevents blind spots, false confidence, and premature convergence. | `skills/multi-agent-brainstorming` |
| **multiplayer** | ⚪ | Multiplayer game development principles. Architecture, networking, synchronization. | `skills/game-development/multiplayer` |
| **neon-postgres** | ⚪ | Expert patterns for Neon serverless Postgres, branching, connection pooling, and Prisma/Drizzle integration Use when: neon database, serverless postgres, database branching, neon postgres, postgres serverless. | `skills/neon-postgres` |
| **nestjs-expert** | ⚪ | Nest.js framework expert specializing in module architecture, dependency injection, middleware, guards, interceptors, testing with Jest/Supertest, TypeORM/Mongoose integration, and Passport.js authentication. Use PROACTIVELY for any Nest.js application issues including architecture decisions, testing strategies, performance optimization, or debugging complex dependency injection problems. If a specialized expert is a better fit, I will recommend switching and stop. | `skills/nestjs-expert` |
| **Network 101** | ⚪ | This skill should be used when the user asks to "set up a web server", "configure HTTP or HTTPS", "perform SNMP enumeration", "configure SMB shares", "test network services", or needs guidance on configuring and testing network services for penetration testing labs. | `skills/network-101` |
| **nextjs-best-practices** | ⚪ | Next.js App Router principles. Server Components, data fetching, routing patterns. | `skills/nextjs-best-practices` |
| **nextjs-supabase-auth** | ⚪ | Expert integration of Supabase Auth with Next.js App Router Use when: supabase auth next, authentication next.js, login supabase, auth middleware, protected route. | `skills/nextjs-supabase-auth` |
| **nodejs-best-practices** | ⚪ | Node.js development principles and decision-making. Framework selection, async patterns, security, and architecture. Teaches thinking, not copying. | `skills/nodejs-best-practices` |
| **nosql-expert** | ⚪ | Expert guidance for distributed NoSQL databases (Cassandra, DynamoDB). Focuses on mental models, query-first modeling, single-table design, and avoiding hot partitions in high-scale systems. | `skills/nosql-expert` |
| **notebooklm** | ⚪ | Use this skill to query your Google NotebookLM notebooks directly from Claude Code for source-grounded, citation-backed answers from Gemini. Browser automation, library management, persistent auth. Drastically reduced hallucinations through document-only responses. | `skills/notebooklm` |
| **notion-template-business** | ⚪ | Expert in building and selling Notion templates as a business - not just making templates, but building a sustainable digital product business. Covers template design, pricing, marketplaces, marketing, and scaling to real revenue. Use when: notion template, sell templates, digital product, notion business, gumroad. | `skills/notion-template-business` |
| **obsidian-clipper-template-creator** | ⚪ | Guide for creating templates for the Obsidian Web Clipper. Use when you want to create a new clipping template, understand available variables, or format clipped content. | `skills/obsidian-clipper-template-creator` |
| **onboarding-cro** | ⚪ | When the user wants to optimize post-signup onboarding, user activation, first-run experience, or time-to-value. Also use when the user mentions "onboarding flow," "activation rate," "user activation," "first-run experience," "empty states," "onboarding checklist," "aha moment," or "new user experience." For signup/registration optimization, see signup-flow-cro. For ongoing email sequences, see email-sequence. | `skills/onboarding-cro` |
| **page-cro** | ⚪ | Analyze and optimize individual pages for conversion performance. Use when the user wants to improve conversion rates, diagnose why a page is underperforming, or increase the effectiveness of marketing pages (homepage, landing pages, pricing, feature pages, or blog posts). This skill focuses on diagnosis, prioritization, and testable recommendations— not blind optimization. | `skills/page-cro` |
| **paid-ads** | ⚪ | When the user wants help with paid advertising campaigns on Google Ads, Meta (Facebook/Instagram), LinkedIn, Twitter/X, or other ad platforms. Also use when the user mentions 'PPC,' 'paid media,' 'ad copy,' 'ad creative,' 'ROAS,' 'CPA,' 'ad campaign,' 'retargeting,' or 'audience targeting.' This skill covers campaign strategy, ad creation, audience targeting, and optimization. | `skills/paid-ads` |
| **parallel-agents** | ⚪ | Multi-agent orchestration patterns. Use when multiple independent tasks can run with different domain expertise or when comprehensive analysis requires multiple perspectives. | `skills/parallel-agents` |
| **paywall-upgrade-cro** | ⚪ | When the user wants to create or optimize in-app paywalls, upgrade screens, upsell modals, or feature gates. Also use when the user mentions "paywall," "upgrade screen," "upgrade modal," "upsell," "feature gate," "convert free to paid," "freemium conversion," "trial expiration screen," "limit reached screen," "plan upgrade prompt," or "in-app pricing." Distinct from public pricing pages (see page-cro) — this skill focuses on in-product upgrade moments where the user has already experienced value. | `skills/paywall-upgrade-cro` |
| **pc-games** | ⚪ | PC and console game development principles. Engine selection, platform features, optimization strategies. | `skills/game-development/pc-games` |
| **pdf** | ⚪ | Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. When Claude needs to fill in a PDF form or programmatically process, generate, or analyze PDF documents at scale. | `skills/pdf-official` |
| **Pentest Checklist** | ⚪ | This skill should be used when the user asks to "plan a penetration test", "create a security assessment checklist", "prepare for penetration testing", "define pentest scope", "follow security testing best practices", or needs a structured methodology for penetration testing engagements. | `skills/pentest-checklist` |
| **Pentest Commands** | ⚪ | This skill should be used when the user asks to "run pentest commands", "scan with nmap", "use metasploit exploits", "crack passwords with hydra or john", "scan web vulnerabilities with nikto", "enumerate networks", or needs essential penetration testing command references. | `skills/pentest-commands` |
| **performance-profiling** | ⚪ | Performance profiling principles. Measurement, analysis, and optimization techniques. | `skills/performance-profiling` |
| **personal-tool-builder** | ⚪ | Expert in building custom tools that solve your own problems first. The best products often start as personal tools - scratch your own itch, build for yourself, then discover others have the same itch. Covers rapid prototyping, local-first apps, CLI tools, scripts that grow into products, and the art of dogfooding. Use when: build a tool, personal tool, scratch my itch, solve my problem, CLI tool. | `skills/personal-tool-builder` |
| **plaid-fintech** | ⚪ | Expert patterns for Plaid API integration including Link token flows, transactions sync, identity verification, Auth for ACH, balance checks, webhook handling, and fintech compliance best practices. Use when: plaid, bank account linking, bank connection, ach, account aggregation. | `skills/plaid-fintech` |
| **plan-writing** | ⚪ | Structured task planning with clear breakdowns, dependencies, and verification criteria. Use when implementing features, refactoring, or any multi-step work. | `skills/plan-writing` |
| **planning-with-files** | ⚪ | Implements Manus-style file-based planning for complex tasks. Creates task_plan.md, findings.md, and progress.md. Use when starting complex multi-step tasks, research projects, or any task requiring >5 tool calls. | `skills/planning-with-files` |
| **playwright-skill** | ⚪ | Complete browser automation with Playwright. Auto-detects dev servers, writes clean test scripts to /tmp. Test pages, fill forms, take screenshots, check responsive design, validate UX, test login flows, check links, automate any browser task. Use when user wants to test websites, automate browser interactions, validate web functionality, or perform any browser-based testing. | `skills/playwright-skill` |
| **popup-cro** | ⚪ | Create and optimize popups, modals, overlays, slide-ins, and banners to increase conversions without harming user experience or brand trust. | `skills/popup-cro` |
| **powershell-windows** | ⚪ | PowerShell Windows patterns. Critical pitfalls, operator syntax, error handling. | `skills/powershell-windows` |
| **pptx** | ⚪ | Presentation creation, editing, and analysis. When Claude needs to work with presentations (.pptx files) for: (1) Creating new presentations, (2) Modifying or editing content, (3) Working with layouts, (4) Adding comments or speaker notes, or any other presentation tasks | `skills/pptx-official` |
| **pricing-strategy** | ⚪ | Design pricing, packaging, and monetization strategies based on value, customer willingness to pay, and growth objectives. | `skills/pricing-strategy` |
| **prisma-expert** | ⚪ | Prisma ORM expert for schema design, migrations, query optimization, relations modeling, and database operations. Use PROACTIVELY for Prisma schema issues, migration problems, query performance, relation design, or database connection issues. | `skills/prisma-expert` |
| **Privilege Escalation Methods** | ⚪ | This skill should be used when the user asks to "escalate privileges", "get root access", "become administrator", "privesc techniques", "abuse sudo", "exploit SUID binaries", "Kerberoasting", "pass-the-ticket", "token impersonation", or needs guidance on post-exploitation privilege escalation for Linux or Windows systems. | `skills/privilege-escalation-methods` |
| **product-manager-toolkit** | ⚪ | Comprehensive toolkit for product managers including RICE prioritization, customer interview analysis, PRD templates, discovery frameworks, and go-to-market strategies. Use for feature prioritization, user research synthesis, requirement documentation, and product strategy development. | `skills/product-manager-toolkit` |
| **production-code-audit** | ⚪ | Autonomously deep-scan entire codebase line-by-line, understand architecture and patterns, then systematically transform it to production-grade, corporate-level professional quality with optimizations | `skills/production-code-audit` |
| **programmatic-seo** | ⚪ | Design and evaluate programmatic SEO strategies for creating SEO-driven pages at scale using templates and structured data. Use when the user mentions programmatic SEO, pages at scale, template pages, directory pages, location pages, comparison pages, integration pages, or keyword-pattern page generation. This skill focuses on feasibility, strategy, and page system design—not execution unless explicitly requested. | `skills/programmatic-seo` |
| **prompt-caching** | ⚪ | Caching strategies for LLM prompts including Anthropic prompt caching, response caching, and CAG (Cache Augmented Generation) Use when: prompt caching, cache prompt, response cache, cag, cache augmented. | `skills/prompt-caching` |
| **prompt-engineer** | ⚪ | Expert in designing effective prompts for LLM-powered applications. Masters prompt structure, context management, output formatting, and prompt evaluation. Use when: prompt engineering, system prompt, few-shot, chain of thought, prompt design. | `skills/prompt-engineer` |
| **prompt-engineering** | ⚪ | Expert guide on prompt engineering patterns, best practices, and optimization techniques. Use when user wants to improve prompts, learn prompting strategies, or debug agent behavior. | `skills/prompt-engineering` |
| **prompt-library** | ⚪ | Curated collection of high-quality prompts for various use cases. Includes role-based prompts, task-specific templates, and prompt refinement techniques. Use when user needs prompt templates, role-play prompts, or ready-to-use prompt examples for coding, writing, analysis, or creative tasks. | `skills/prompt-library` |
| **python-patterns** | ⚪ | Python development principles and decision-making. Framework selection, async patterns, type hints, project structure. Teaches thinking, not copying. | `skills/python-patterns` |
| **rag-engineer** | ⚪ | Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications. Use when: building RAG, vector search, embeddings, semantic search, document retrieval. | `skills/rag-engineer` |
| **rag-implementation** | ⚪ | Retrieval-Augmented Generation patterns including chunking, embeddings, vector stores, and retrieval optimization Use when: rag, retrieval augmented, vector search, embeddings, semantic search. | `skills/rag-implementation` |
| **react-patterns** | ⚪ | Modern React patterns and principles. Hooks, composition, performance, TypeScript best practices. | `skills/react-patterns` |
| **react-ui-patterns** | ⚪ | Modern React UI patterns for loading states, error handling, and data fetching. Use when building UI components, handling async data, or managing UI states. | `skills/react-ui-patterns` |
| **receiving-code-review** | ⚪ | Use when receiving code review feedback, before implementing suggestions, especially if feedback seems unclear or technically questionable - requires technical rigor and verification, not performative agreement or blind implementation | `skills/receiving-code-review` |
| **Red Team Tools and Methodology** | ⚪ | This skill should be used when the user asks to "follow red team methodology", "perform bug bounty hunting", "automate reconnaissance", "hunt for XSS vulnerabilities", "enumerate subdomains", or needs security researcher techniques and tool configurations from top bug bounty hunters. | `skills/red-team-tools` |
| **red-team-tactics** | ⚪ | Red team tactics principles based on MITRE ATT&CK. Attack phases, detection evasion, reporting. | `skills/red-team-tactics` |
| **referral-program** | ⚪ | When the user wants to create, optimize, or analyze a referral program, affiliate program, or word-of-mouth strategy. Also use when the user mentions 'referral,' 'affiliate,' 'ambassador,' 'word of mouth,' 'viral loop,' 'refer a friend,' or 'partner program.' This skill covers program design, incentive structure, and growth optimization. | `skills/referral-program` |
| **remotion-best-practices** | ⚪ | Best practices for Remotion - Video creation in React | `skills/remotion-best-practices` |
| **requesting-code-review** | ⚪ | Use when completing tasks, implementing major features, or before merging to verify work meets requirements | `skills/requesting-code-review` |
| **research-engineer** | ⚪ | An uncompromising Academic Research Engineer. Operates with absolute scientific rigor, objective criticism, and zero flair. Focuses on theoretical correctness, formal verification, and optimal implementation across any required technology. | `skills/research-engineer` |
| **salesforce-development** | ⚪ | Expert patterns for Salesforce platform development including Lightning Web Components (LWC), Apex triggers and classes, REST/Bulk APIs, Connected Apps, and Salesforce DX with scratch orgs and 2nd generation packages (2GP). Use when: salesforce, sfdc, apex, lwc, lightning web components. | `skills/salesforce-development` |
| **schema-markup** | ⚪ | Design, validate, and optimize schema.org structured data for eligibility, correctness, and measurable SEO impact. Use when the user wants to add, fix, audit, or scale schema markup (JSON-LD) for rich results. This skill evaluates whether schema should be implemented, what types are valid, and how to deploy safely according to Google guidelines. | `skills/schema-markup` |
| **scroll-experience** | ⚪ | Expert in building immersive scroll-driven experiences - parallax storytelling, scroll animations, interactive narratives, and cinematic web experiences. Like NY Times interactives, Apple product pages, and award-winning web experiences. Makes websites feel like experiences, not just pages. Use when: scroll animation, parallax, scroll storytelling, interactive story, cinematic website. | `skills/scroll-experience` |
| **Security Scanning Tools** | ⚪ | This skill should be used when the user asks to "perform vulnerability scanning", "scan networks for open ports", "assess web application security", "scan wireless networks", "detect malware", "check cloud security", or "evaluate system compliance". It provides comprehensive guidance on security scanning tools and methodologies. | `skills/scanning-tools` |
| **security-review** | ⚪ | Use this skill when adding authentication, handling user input, working with secrets, creating API endpoints, or implementing payment/sensitive features. Provides comprehensive security checklist and patterns. | `skills/cc-skill-security-review` |
| **segment-cdp** | ⚪ | Expert patterns for Segment Customer Data Platform including Analytics.js, server-side tracking, tracking plans with Protocols, identity resolution, destinations configuration, and data governance best practices. Use when: segment, analytics.js, customer data platform, cdp, tracking plan. | `skills/segment-cdp` |
| **senior-architect** | ⚪ | Comprehensive software architecture skill for designing scalable, maintainable systems using ReactJS, NextJS, NodeJS, Express, React Native, Swift, Kotlin, Flutter, Postgres, GraphQL, Go, Python. Includes architecture diagram generation, system design patterns, tech stack decision frameworks, and dependency analysis. Use when designing system architecture, making technical decisions, creating architecture diagrams, evaluating trade-offs, or defining integration patterns. | `skills/senior-architect` |
| **senior-fullstack** | ⚪ | Comprehensive fullstack development skill for building complete web applications with React, Next.js, Node.js, GraphQL, and PostgreSQL. Includes project scaffolding, code quality analysis, architecture patterns, and complete tech stack guidance. Use when building new projects, analyzing code quality, implementing design patterns, or setting up development workflows. | `skills/senior-fullstack` |
| **seo-audit** | ⚪ | Diagnose and audit SEO issues affecting crawlability, indexation, rankings, and organic performance. Use when the user asks for an SEO audit, technical SEO review, ranking diagnosis, on-page SEO review, meta tag audit, or SEO health check. This skill identifies issues and prioritizes actions but does not execute changes. For large-scale page creation, use programmatic-seo. For structured data, use schema-markup. | `skills/seo-audit` |
| **seo-fundamentals** | ⚪ | Core principles of SEO including E-E-A-T, Core Web Vitals, technical foundations, content quality, and how modern search engines evaluate pages. This skill explains _why_ SEO works, not how to execute specific optimizations. | `skills/seo-fundamentals` |
| **server-management** | ⚪ | Server management principles and decision-making. Process management, monitoring strategy, and scaling decisions. Teaches thinking, not commands. | `skills/server-management` |
| **Shodan Reconnaissance and Pentesting** | ⚪ | This skill should be used when the user asks to "search for exposed devices on the internet," "perform Shodan reconnaissance," "find vulnerable services using Shodan," "scan IP ranges with Shodan," or "discover IoT devices and open ports." It provides comprehensive guidance for using Shodan's search engine, CLI, and API for penetration testing reconnaissance. | `skills/shodan-reconnaissance` |
| **shopify-apps** | ⚪ | Expert patterns for Shopify app development including Remix/React Router apps, embedded apps with App Bridge, webhook handling, GraphQL Admin API, Polaris components, billing, and app extensions. Use when: shopify app, shopify, embedded app, polaris, app bridge. | `skills/shopify-apps` |
| **shopify-development** | ⚪ | Build Shopify apps, extensions, themes using GraphQL Admin API, Shopify CLI, Polaris UI, and Liquid. TRIGGER: "shopify", "shopify app", "checkout extension", "admin extension", "POS extension", "shopify theme", "liquid template", "polaris", "shopify graphql", "shopify webhook", "shopify billing", "app subscription", "metafields", "shopify functions" | `skills/shopify-development` |
| **signup-flow-cro** | ⚪ | When the user wants to optimize signup, registration, account creation, or trial activation flows. Also use when the user mentions "signup conversions," "registration friction," "signup form optimization," "free trial signup," "reduce signup dropoff," or "account creation flow." For post-signup onboarding, see onboarding-cro. For lead capture forms (not account creation), see form-cro. | `skills/signup-flow-cro` |
| **skill-creator** | ⚪ | Guide for creating effective skills. This skill should be used when users want to create a new skill (or update an existing skill) that extends Claude's capabilities with specialized knowledge, workflows, or tool integrations. | `skills/skill-creator` |
| **skill-developer** | ⚪ | Create and manage Claude Code skills following Anthropic best practices. Use when creating new skills, modifying skill-rules.json, understanding trigger patterns, working with hooks, debugging skill activation, or implementing progressive disclosure. Covers skill structure, YAML frontmatter, trigger types (keywords, intent patterns, file paths, content patterns), enforcement levels (block, suggest, warn), hook mechanisms (UserPromptSubmit, PreToolUse), session tracking, and the 500-line rule. | `skills/skill-developer` |
| **slack-bot-builder** | ⚪ | Build Slack apps using the Bolt framework across Python, JavaScript, and Java. Covers Block Kit for rich UIs, interactive components, slash commands, event handling, OAuth installation flows, and Workflow Builder integration. Focus on best practices for production-ready Slack apps. Use when: slack bot, slack app, bolt framework, block kit, slash command. | `skills/slack-bot-builder` |
| **slack-gif-creator** | ⚪ | Knowledge and utilities for creating animated GIFs optimized for Slack. Provides constraints, validation tools, and animation concepts. Use when users request animated GIFs for Slack like "make me a GIF of X doing Y for Slack." | `skills/slack-gif-creator` |
| **SMTP Penetration Testing** | ⚪ | This skill should be used when the user asks to "perform SMTP penetration testing", "enumerate email users", "test for open mail relays", "grab SMTP banners", "brute force email credentials", or "assess mail server security". It provides comprehensive techniques for testing SMTP server security. | `skills/smtp-penetration-testing` |
| **social-content** | ⚪ | When the user wants help creating, scheduling, or optimizing social media content for LinkedIn, Twitter/X, Instagram, TikTok, Facebook, or other platforms. Also use when the user mentions 'LinkedIn post,' 'Twitter thread,' 'social media,' 'content calendar,' 'social scheduling,' 'engagement,' or 'viral content.' This skill covers content creation, repurposing, and platform-specific strategies. | `skills/social-content` |
| **software-architecture** | ⚪ | Guide for quality focused software architecture. This skill should be used when users want to write code, design architecture, analyze code, in any case that relates to software development. | `skills/software-architecture` |
| **SQL Injection Testing** | ⚪ | This skill should be used when the user asks to "test for SQL injection vulnerabilities", "perform SQLi attacks", "bypass authentication using SQL injection", "extract database information through injection", "detect SQL injection flaws", or "exploit database query vulnerabilities". It provides comprehensive techniques for identifying, exploiting, and understanding SQL injection attack vectors across different database systems. | `skills/sql-injection-testing` |
| **SQLMap Database Penetration Testing** | ⚪ | This skill should be used when the user asks to "automate SQL injection testing," "enumerate database structure," "extract database credentials using sqlmap," "dump tables and columns from a vulnerable database," or "perform automated database penetration testing." It provides comprehensive guidance for using SQLMap to detect and exploit SQL injection vulnerabilities. | `skills/sqlmap-database-pentesting` |
| **SSH Penetration Testing** | ⚪ | This skill should be used when the user asks to "pentest SSH services", "enumerate SSH configurations", "brute force SSH credentials", "exploit SSH vulnerabilities", "perform SSH tunneling", or "audit SSH security". It provides comprehensive SSH penetration testing methodologies and techniques. | `skills/ssh-penetration-testing` |
| **stripe-integration** | ⚪ | Get paid from day one. Payments, subscriptions, billing portal, webhooks, metered billing, Stripe Connect. The complete guide to implementing Stripe correctly, including all the edge cases that will bite you at 3am. This isn't just API calls - it's the full payment system: handling failures, managing subscriptions, dealing with dunning, and keeping revenue flowing. Use when: stripe, payments, subscription, billing, checkout. | `skills/stripe-integration` |
| **subagent-driven-development** | ⚪ | Use when executing implementation plans with independent tasks in the current session | `skills/subagent-driven-development` |
| **supabase-postgres-best-practices** | ⚪ | Postgres performance optimization and best practices from Supabase. Use this skill when writing, reviewing, or optimizing Postgres queries, schema designs, or database configurations. | `skills/postgres-best-practices` |
| **systematic-debugging** | ⚪ | Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes | `skills/systematic-debugging` |
| **tailwind-patterns** | ⚪ | Tailwind CSS v4 principles. CSS-first configuration, container queries, modern patterns, design token architecture. | `skills/tailwind-patterns` |
| **tavily-web** | ⚪ | Web search, content extraction, crawling, and research capabilities using Tavily API | `skills/tavily-web` |
| **tdd-workflow** | ⚪ | Test-Driven Development workflow principles. RED-GREEN-REFACTOR cycle. | `skills/tdd-workflow` |
| **telegram-bot-builder** | ⚪ | Expert in building Telegram bots that solve real problems - from simple automation to complex AI-powered bots. Covers bot architecture, the Telegram Bot API, user experience, monetization strategies, and scaling bots to thousands of users. Use when: telegram bot, bot api, telegram automation, chat bot telegram, tg bot. | `skills/telegram-bot-builder` |
| **telegram-mini-app** | ⚪ | Expert in building Telegram Mini Apps (TWA) - web apps that run inside Telegram with native-like experience. Covers the TON ecosystem, Telegram Web App API, payments, user authentication, and building viral mini apps that monetize. Use when: telegram mini app, TWA, telegram web app, TON app, mini app. | `skills/telegram-mini-app` |
| **templates** | ⚪ | Project scaffolding templates for new applications. Use when creating new projects from scratch. Contains 12 templates for various tech stacks. | `skills/app-builder/templates` |
| **test-driven-development** | ⚪ | Use when implementing any feature or bugfix, before writing implementation code | `skills/test-driven-development` |
| **test-fixing** | ⚪ | Run tests and systematically fix all failing tests using smart error grouping. Use when user asks to fix failing tests, mentions test failures, runs test suite and failures occur, or requests to make tests pass. | `skills/test-fixing` |
| **testing-patterns** | ⚪ | Jest testing patterns, factory functions, mocking strategies, and TDD workflow. Use when writing unit tests, creating test factories, or following TDD red-green-refactor cycle. | `skills/testing-patterns` |
| **theme-factory** | ⚪ | Toolkit for styling artifacts with a theme. These artifacts can be slides, docs, reportings, HTML landing pages, etc. There are 10 pre-set themes with colors/fonts that you can apply to any artifact that has been creating, or can generate a new theme on-the-fly. | `skills/theme-factory` |
| **Top 100 Web Vulnerabilities Reference** | ⚪ | This skill should be used when the user asks to "identify web application vulnerabilities", "explain common security flaws", "understand vulnerability categories", "learn about injection attacks", "review access control weaknesses", "analyze API security issues", "assess security misconfigurations", "understand client-side vulnerabilities", "examine mobile and IoT security flaws", or "reference the OWASP-aligned vulnerability taxonomy". Use this skill to provide comprehensive vulnerability definitions, root causes, impacts, and mitigation strategies across all major web security categories. | `skills/top-web-vulnerabilities` |
| **trigger-dev** | ⚪ | Trigger.dev expert for background jobs, AI workflows, and reliable async execution with excellent developer experience and TypeScript-first design. Use when: trigger.dev, trigger dev, background task, ai background job, long running task. | `skills/trigger-dev` |
| **twilio-communications** | ⚪ | Build communication features with Twilio: SMS messaging, voice calls, WhatsApp Business API, and user verification (2FA). Covers the full spectrum from simple notifications to complex IVR systems and multi-channel authentication. Critical focus on compliance, rate limits, and error handling. Use when: twilio, send SMS, text message, voice call, phone verification. | `skills/twilio-communications` |
| **typescript-expert** | ⚪ | TypeScript and JavaScript expert with deep knowledge of type-level programming, performance optimization, monorepo management, migration strategies, and modern tooling. Use PROACTIVELY for any TypeScript/JavaScript issues including complex type gymnastics, build performance, debugging, and architectural decisions. If a specialized expert is a better fit, I will recommend switching and stop. | `skills/typescript-expert` |
| **ui-ux-pro-max** | ⚪ | UI/UX design intelligence. 50 styles, 21 palettes, 50 font pairings, 20 charts, 9 stacks (React, Next.js, Vue, Svelte, SwiftUI, React Native, Flutter, Tailwind, shadcn/ui). Actions: plan, build, create, design, implement, review, fix, improve, optimize, enhance, refactor, check UI/UX code. Projects: website, landing page, dashboard, admin panel, e-commerce, SaaS, portfolio, blog, mobile app, .html, .tsx, .vue, .svelte. Elements: button, modal, navbar, sidebar, card, table, form, chart. Styles: glassmorphism, claymorphism, minimalism, brutalism, neumorphism, bento grid, dark mode, responsive, skeuomorphism, flat design. Topics: color palette, accessibility, animation, layout, typography, font pairing, spacing, hover, shadow, gradient. Integrations: shadcn/ui MCP for component search and examples. | `skills/ui-ux-pro-max` |
| **upstash-qstash** | ⚪ | Upstash QStash expert for serverless message queues, scheduled jobs, and reliable HTTP-based task delivery without managing infrastructure. Use when: qstash, upstash queue, serverless cron, scheduled http, message queue serverless. | `skills/upstash-qstash` |
| **using-git-worktrees** | ⚪ | Use when starting feature work that needs isolation from current workspace or before executing implementation plans - creates isolated git worktrees with smart directory selection and safety verification | `skills/using-git-worktrees` |
| **using-superpowers** | ⚪ | Use when starting any conversation - establishes how to find and use skills, requiring Skill tool invocation before ANY response including clarifying questions | `skills/using-superpowers` |
| **vercel-deployment** | ⚪ | Expert knowledge for deploying to Vercel with Next.js Use when: vercel, deploy, deployment, hosting, production. | `skills/vercel-deployment` |
| **vercel-react-best-practices** | ⚪ | React and Next.js performance optimization guidelines from Vercel Engineering. This skill should be used when writing, reviewing, or refactoring React/Next.js code to ensure optimal performance patterns. Triggers on tasks involving React components, Next.js pages, data fetching, bundle optimization, or performance improvements. | `skills/react-best-practices` |
| **verification-before-completion** | ⚪ | Use when about to claim work is complete, fixed, or passing, before committing or creating PRs - requires running verification commands and confirming output before making any success claims; evidence before assertions always | `skills/verification-before-completion` |
| **viral-generator-builder** | ⚪ | Expert in building shareable generator tools that go viral - name generators, quiz makers, avatar creators, personality tests, and calculator tools. Covers the psychology of sharing, viral mechanics, and building tools people can't resist sharing with friends. Use when: generator tool, quiz maker, name generator, avatar creator, viral tool. | `skills/viral-generator-builder` |
| **voice-agents** | ⚪ | Voice agents represent the frontier of AI interaction - humans speaking naturally with AI systems. The challenge isn't just speech recognition and synthesis, it's achieving natural conversation flow with sub-800ms latency while handling interruptions, background noise, and emotional nuance. This skill covers two architectures: speech-to-speech (OpenAI Realtime API, lowest latency, most natural) and pipeline (STT→LLM→TTS, more control, easier to debug). Key insight: latency is the constraint. Hu | `skills/voice-agents` |
| **voice-ai-development** | ⚪ | Expert in building voice AI applications - from real-time voice agents to voice-enabled apps. Covers OpenAI Realtime API, Vapi for voice agents, Deepgram for transcription, ElevenLabs for synthesis, LiveKit for real-time infrastructure, and WebRTC fundamentals. Knows how to build low-latency, production-ready voice experiences. Use when: voice ai, voice agent, speech to text, text to speech, realtime voice. | `skills/voice-ai-development` |
| **vr-ar** | ⚪ | VR/AR development principles. Comfort, interaction, performance requirements. | `skills/game-development/vr-ar` |
| **vulnerability-scanner** | ⚪ | Advanced vulnerability analysis principles. OWASP 2025, Supply Chain Security, attack surface mapping, risk prioritization. | `skills/vulnerability-scanner` |
| **web-artifacts-builder** | ⚪ | Suite of tools for creating elaborate, multi-component claude.ai HTML artifacts using modern frontend web technologies (React, Tailwind CSS, shadcn/ui). Use for complex artifacts requiring state management, routing, or shadcn/ui components - not for simple single-file HTML/JSX artifacts. | `skills/web-artifacts-builder` |
| **web-design-guidelines** | ⚪ | Review UI code for Web Interface Guidelines compliance. Use when asked to "review my UI", "check accessibility", "audit design", "review UX", or "check my site against best practices". | `skills/web-design-guidelines` |
| **web-games** | ⚪ | Web browser game development principles. Framework selection, WebGPU, optimization, PWA. | `skills/game-development/web-games` |
| **web-performance-optimization** | ⚪ | Optimize website and web application performance including loading speed, Core Web Vitals, bundle size, caching strategies, and runtime performance | `skills/web-performance-optimization` |
| **webapp-testing** | ⚪ | Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs. | `skills/webapp-testing` |
| **Windows Privilege Escalation** | ⚪ | This skill should be used when the user asks to "escalate privileges on Windows," "find Windows privesc vectors," "enumerate Windows for privilege escalation," "exploit Windows misconfigurations," or "perform post-exploitation privilege escalation." It provides comprehensive guidance for discovering and exploiting privilege escalation vulnerabilities in Windows environments. | `skills/windows-privilege-escalation` |
| **Wireshark Network Traffic Analysis** | ⚪ | This skill should be used when the user asks to "analyze network traffic with Wireshark", "capture packets for troubleshooting", "filter PCAP files", "follow TCP/UDP streams", "detect network anomalies", "investigate suspicious traffic", or "perform protocol analysis". It provides comprehensive techniques for network packet capture, filtering, and analysis using Wireshark. | `skills/wireshark-analysis` |
| **WordPress Penetration Testing** | ⚪ | This skill should be used when the user asks to "pentest WordPress sites", "scan WordPress for vulnerabilities", "enumerate WordPress users, themes, or plugins", "exploit WordPress vulnerabilities", or "use WPScan". It provides comprehensive WordPress security assessment methodologies. | `skills/wordpress-penetration-testing` |
| **workflow-automation** | ⚪ | Workflow automation is the infrastructure that makes AI agents reliable. Without durable execution, a network hiccup during a 10-step payment flow means lost money and angry customers. With it, workflows resume exactly where they left off. This skill covers the platforms (n8n, Temporal, Inngest) and patterns (sequential, parallel, orchestrator-worker) that turn brittle scripts into production-grade automation. Key insight: The platforms make different tradeoffs. n8n optimizes for accessibility | `skills/workflow-automation` |
| **writing-plans** | ⚪ | Use when you have a spec or requirements for a multi-step task, before touching code | `skills/writing-plans` |
| **writing-skills** | ⚪ | Use when creating new skills, editing existing skills, or verifying skills work before deployment | `skills/writing-skills` |
| **xlsx** | ⚪ | Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas | `skills/xlsx-official` |
| **zapier-make-patterns** | ⚪ | No-code automation democratizes workflow building. Zapier and Make (formerly Integromat) let non-developers automate business processes without writing code. But no-code doesn't mean no-complexity - these platforms have their own patterns, pitfalls, and breaking points. This skill covers when to use which platform, how to build reliable automations, and when to graduate to code-based solutions. Key insight: Zapier optimizes for simplicity and integrations (7000+ apps), Make optimizes for power | `skills/zapier-make-patterns` |
| Skill Name | Risk | Description | Path |
| :--- | :--- | :--- | :--- |
| **2d-games** | ⚪ | 2D game development principles. Sprites, tilemaps, physics, camera. | `skills/game-development/2d-games` |
| **3d-games** | ⚪ | 3D game development principles. Rendering, shaders, physics, cameras. | `skills/game-development/3d-games` |
| **3d-web-experience** | ⚪ | Expert in building 3D experiences for the web - Three.js, React Three Fiber, Spline, WebGL, and interactive 3D scenes. Covers product configurators, 3D portfolios, immersive websites, and bringing depth to web experiences. Use when: 3D website, three.js, WebGL, react three fiber, 3D experience. | `skills/3d-web-experience` |
| **ab-test-setup** | ⚪ | Structured guide for setting up A/B tests with mandatory gates for hypothesis, metrics, and execution readiness. | `skills/ab-test-setup` |
| **Active Directory Attacks** | ⚪ | This skill should be used when the user asks to "attack Active Directory", "exploit AD", "Kerberoasting", "DCSync", "pass-the-hash", "BloodHound enumeration", "Golden Ticket", "Silver Ticket", "AS-REP roasting", "NTLM relay", or needs guidance on Windows domain penetration testing. | `skills/active-directory-attacks` |
| **address-github-comments** | ⚪ | Use when you need to address review or issue comments on an open GitHub Pull Request using the gh CLI. | `skills/address-github-comments` |
| **agent-evaluation** | ⚪ | Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on real-world benchmarks Use when: agent testing, agent evaluation, benchmark agents, agent reliability, test agent. | `skills/agent-evaluation` |
| **agent-manager-skill** | ⚪ | Manage multiple local CLI agents via tmux sessions (start/stop/monitor/assign) with cron-friendly scheduling. | `skills/agent-manager-skill` |
| **agent-memory-mcp** | ⚪ | A hybrid memory system that provides persistent, searchable knowledge management for AI agents (Architecture, Patterns, Decisions). | `skills/agent-memory-mcp` |
| **agent-memory-systems** | ⚪ | Memory is the cornerstone of intelligent agents. Without it, every interaction starts from zero. This skill covers the architecture of agent memory: short-term (context window), long-term (vector stores), and the cognitive architectures that organize them. Key insight: Memory isn't just storage - it's retrieval. A million stored facts mean nothing if you can't find the right one. Chunking, embedding, and retrieval strategies determine whether your agent remembers or forgets. The field is fragm | `skills/agent-memory-systems` |
| **agent-tool-builder** | ⚪ | Tools are how AI agents interact with the world. A well-designed tool is the difference between an agent that works and one that hallucinates, fails silently, or costs 10x more tokens than necessary. This skill covers tool design from schema to error handling. JSON Schema best practices, description writing that actually helps the LLM, validation, and the emerging MCP standard that's becoming the lingua franca for AI tools. Key insight: Tool descriptions are more important than tool implementa | `skills/agent-tool-builder` |
| **ai-agents-architect** | ⚪ | Expert in designing and building autonomous AI agents. Masters tool use, memory systems, planning strategies, and multi-agent orchestration. Use when: build agent, AI agent, autonomous agent, tool use, function calling. | `skills/ai-agents-architect` |
| **ai-product** | ⚪ | Every product will be AI-powered. The question is whether you'll build it right or ship a demo that falls apart in production. This skill covers LLM integration patterns, RAG architecture, prompt engineering that scales, AI UX that users trust, and cost optimization that doesn't bankrupt you. Use when: keywords, file_patterns, code_patterns. | `skills/ai-product` |
| **ai-wrapper-product** | ⚪ | Expert in building products that wrap AI APIs (OpenAI, Anthropic, etc.) into focused tools people will pay for. Not just 'ChatGPT but different' - products that solve specific problems with AI. Covers prompt engineering for products, cost management, rate limiting, and building defensible AI businesses. Use when: AI wrapper, GPT product, AI tool, wrap AI, AI SaaS. | `skills/ai-wrapper-product` |
| **algolia-search** | ⚪ | Expert patterns for Algolia search implementation, indexing strategies, React InstantSearch, and relevance tuning Use when: adding search to, algolia, instantsearch, search api, search functionality. | `skills/algolia-search` |
| **algorithmic-art** | ⚪ | Creating algorithmic art using p5.js with seeded randomness and interactive parameter exploration. Use this when users request creating art using code, generative art, algorithmic art, flow fields, or particle systems. Create original algorithmic art rather than copying existing artists' work to avoid copyright violations. | `skills/algorithmic-art` |
| **analytics-tracking** | ⚪ | Design, audit, and improve analytics tracking systems that produce reliable, decision-ready data. Use when the user wants to set up, fix, or evaluate analytics tracking (GA4, GTM, product analytics, events, conversions, UTMs). This skill focuses on measurement strategy, signal quality, and validation— not just firing events. | `skills/analytics-tracking` |
| **API Fuzzing for Bug Bounty** | ⚪ | This skill should be used when the user asks to "test API security", "fuzz APIs", "find IDOR vulnerabilities", "test REST API", "test GraphQL", "API penetration testing", "bug bounty API testing", or needs guidance on API security assessment techniques. | `skills/api-fuzzing-bug-bounty` |
| **api-documentation-generator** | ⚪ | Generate comprehensive, developer-friendly API documentation from code, including endpoints, parameters, examples, and best practices | `skills/api-documentation-generator` |
| **api-patterns** | ⚪ | API design principles and decision-making. REST vs GraphQL vs tRPC selection, response formats, versioning, pagination. | `skills/api-patterns` |
| **api-security-best-practices** | ⚪ | Implement secure API design patterns including authentication, authorization, input validation, rate limiting, and protection against common API vulnerabilities | `skills/api-security-best-practices` |
| **app-builder** | ⚪ | Main application building orchestrator. Creates full-stack applications from natural language requests. Determines project type, selects tech stack, coordinates agents. | `skills/app-builder` |
| **app-store-optimization** | ⚪ | Complete App Store Optimization (ASO) toolkit for researching, optimizing, and tracking mobile app performance on Apple App Store and Google Play Store | `skills/app-store-optimization` |
| **architecture** | ⚪ | Architectural decision-making framework. Requirements analysis, trade-off evaluation, ADR documentation. Use when making architecture decisions or analyzing system design. | `skills/architecture` |
| **autonomous-agent-patterns** | ⚪ | Design patterns for building autonomous coding agents. Covers tool integration, permission systems, browser automation, and human-in-the-loop workflows. Use when building AI agents, designing tool APIs, implementing permission systems, or creating autonomous coding assistants. | `skills/autonomous-agent-patterns` |
| **autonomous-agents** | ⚪ | Autonomous agents are AI systems that can independently decompose goals, plan actions, execute tools, and self-correct without constant human guidance. The challenge isn't making them capable - it's making them reliable. Every extra decision multiplies failure probability. This skill covers agent loops (ReAct, Plan-Execute), goal decomposition, reflection patterns, and production reliability. Key insight: compounding error rates kill autonomous agents. A 95% success rate per step drops to 60% b | `skills/autonomous-agents` |
| **avalonia-layout-zafiro** | ⚪ | Guidelines for modern Avalonia UI layout using Zafiro.Avalonia, emphasizing shared styles, generic components, and avoiding XAML redundancy. | `skills/avalonia-layout-zafiro` |
| **avalonia-viewmodels-zafiro** | ⚪ | Optimal ViewModel and Wizard creation patterns for Avalonia using Zafiro and ReactiveUI. | `skills/avalonia-viewmodels-zafiro` |
| **avalonia-zafiro-development** | ⚪ | Mandatory skills, conventions, and behavioral rules for Avalonia UI development using the Zafiro toolkit. | `skills/avalonia-zafiro-development` |
| **AWS Penetration Testing** | ⚪ | This skill should be used when the user asks to "pentest AWS", "test AWS security", "enumerate IAM", "exploit cloud infrastructure", "AWS privilege escalation", "S3 bucket testing", "metadata SSRF", "Lambda exploitation", or needs guidance on Amazon Web Services security assessment. | `skills/aws-penetration-testing` |
| **aws-serverless** | ⚪ | Specialized skill for building production-ready serverless applications on AWS. Covers Lambda functions, API Gateway, DynamoDB, SQS/SNS event-driven patterns, SAM/CDK deployment, and cold start optimization. | `skills/aws-serverless` |
| **azure-functions** | ⚪ | Expert patterns for Azure Functions development including isolated worker model, Durable Functions orchestration, cold start optimization, and production patterns. Covers .NET, Python, and Node.js programming models. Use when: azure function, azure functions, durable functions, azure serverless, function app. | `skills/azure-functions` |
| **backend-dev-guidelines** | ⚪ | Opinionated backend development standards for Node.js + Express + TypeScript microservices. Covers layered architecture, BaseController pattern, dependency injection, Prisma repositories, Zod validation, unifiedConfig, Sentry error tracking, async safety, and testing discipline. | `skills/backend-dev-guidelines` |
| **backend-patterns** | ⚪ | Backend architecture patterns, API design, database optimization, and server-side best practices for Node.js, Express, and Next.js API routes. | `skills/cc-skill-backend-patterns` |
| **bash-linux** | ⚪ | Bash/Linux terminal patterns. Critical commands, piping, error handling, scripting. Use when working on macOS or Linux systems. | `skills/bash-linux` |
| **behavioral-modes** | ⚪ | AI operational modes (brainstorm, implement, debug, review, teach, ship, orchestrate). Use to adapt behavior based on task type. | `skills/behavioral-modes` |
| **blockrun** | ⚪ | Use when user needs capabilities Claude lacks (image generation, real-time X/Twitter data) or explicitly requests external models ("blockrun", "use grok", "use gpt", "dall-e", "deepseek") | `skills/blockrun` |
| **brainstorming** | ⚪ | Use this skill before any creative or constructive work (features, components, architecture, behavior changes, or functionality). This skill transforms vague ideas into validated designs through disciplined, incremental reasoning and collaboration. | `skills/brainstorming` |
| **brand-guidelines** | ⚪ | Applies Anthropic's official brand colors and typography to any sort of artifact that may benefit from having Anthropic's look-and-feel. Use it when brand colors or style guidelines, visual formatting, or company design standards apply. | `skills/brand-guidelines-anthropic` |
| **brand-guidelines** | ⚪ | Applies Anthropic's official brand colors and typography to any sort of artifact that may benefit from having Anthropic's look-and-feel. Use it when brand colors or style guidelines, visual formatting, or company design standards apply. | `skills/brand-guidelines-community` |
| **Broken Authentication Testing** | ⚪ | This skill should be used when the user asks to "test for broken authentication vulnerabilities", "assess session management security", "perform credential stuffing tests", "evaluate password policies", "test for session fixation", or "identify authentication bypass flaws". It provides comprehensive techniques for identifying authentication and session management weaknesses in web applications. | `skills/broken-authentication` |
| **browser-automation** | ⚪ | Browser automation powers web testing, scraping, and AI agent interactions. The difference between a flaky script and a reliable system comes down to understanding selectors, waiting strategies, and anti-detection patterns. This skill covers Playwright (recommended) and Puppeteer, with patterns for testing, scraping, and agentic browser control. Key insight: Playwright won the framework war. Unless you need Puppeteer's stealth ecosystem or are Chrome-only, Playwright is the better choice in 202 | `skills/browser-automation` |
| **browser-extension-builder** | ⚪ | Expert in building browser extensions that solve real problems - Chrome, Firefox, and cross-browser extensions. Covers extension architecture, manifest v3, content scripts, popup UIs, monetization strategies, and Chrome Web Store publishing. Use when: browser extension, chrome extension, firefox addon, extension, manifest v3. | `skills/browser-extension-builder` |
| **bullmq-specialist** | ⚪ | BullMQ expert for Redis-backed job queues, background processing, and reliable async execution in Node.js/TypeScript applications. Use when: bullmq, bull queue, redis queue, background job, job queue. | `skills/bullmq-specialist` |
| **bun-development** | ⚪ | Modern JavaScript/TypeScript development with Bun runtime. Covers package management, bundling, testing, and migration from Node.js. Use when working with Bun, optimizing JS/TS development speed, or migrating from Node.js to Bun. | `skills/bun-development` |
| **Burp Suite Web Application Testing** | ⚪ | This skill should be used when the user asks to "intercept HTTP traffic", "modify web requests", "use Burp Suite for testing", "perform web vulnerability scanning", "test with Burp Repeater", "analyze HTTP history", or "configure proxy for web testing". It provides comprehensive guidance for using Burp Suite's core features for web application security testing. | `skills/burp-suite-testing` |
| **busybox-on-windows** | ⚪ | How to use a Win32 build of BusyBox to run many of the standard UNIX command line tools on Windows. | `skills/busybox-on-windows` |
| **canvas-design** | ⚪ | Create beautiful visual art in .png and .pdf documents using design philosophy. You should use this skill when the user asks to create a poster, piece of art, design, or other static piece. Create original visual designs, never copying existing artists' work to avoid copyright violations. | `skills/canvas-design` |
| **cc-skill-continuous-learning** | ⚪ | Development skill from everything-claude-code | `skills/cc-skill-continuous-learning` |
| **cc-skill-project-guidelines-example** | ⚪ | Project Guidelines Skill (Example) | `skills/cc-skill-project-guidelines-example` |
| **cc-skill-strategic-compact** | ⚪ | Development skill from everything-claude-code | `skills/cc-skill-strategic-compact` |
| **Claude Code Guide** | ⚪ | Master guide for using Claude Code effectively. Includes configuration templates, prompting strategies "Thinking" keywords, debugging techniques, and best practices for interacting with the agent. | `skills/claude-code-guide` |
| **clean-code** | ⚪ | Pragmatic coding standards - concise, direct, no over-engineering, no unnecessary comments | `skills/clean-code` |
| **clerk-auth** | ⚪ | Expert patterns for Clerk auth implementation, middleware, organizations, webhooks, and user sync Use when: adding authentication, clerk auth, user authentication, sign in, sign up. | `skills/clerk-auth` |
| **clickhouse-io** | ⚪ | ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads. | `skills/cc-skill-clickhouse-io` |
| **Cloud Penetration Testing** | ⚪ | This skill should be used when the user asks to "perform cloud penetration testing", "assess Azure or AWS or GCP security", "enumerate cloud resources", "exploit cloud misconfigurations", "test O365 security", "extract secrets from cloud environments", or "audit cloud infrastructure". It provides comprehensive techniques for security assessment across major cloud platforms. | `skills/cloud-penetration-testing` |
| **code-review-checklist** | ⚪ | Comprehensive checklist for conducting thorough code reviews covering functionality, security, performance, and maintainability | `skills/code-review-checklist` |
| **codex-review** | ⚪ | Professional code review with auto CHANGELOG generation, integrated with Codex AI | `skills/codex-review` |
| **coding-standards** | ⚪ | Universal coding standards, best practices, and patterns for TypeScript, JavaScript, React, and Node.js development. | `skills/cc-skill-coding-standards` |
| **competitor-alternatives** | ⚪ | When the user wants to create competitor comparison or alternative pages for SEO and sales enablement. Also use when the user mentions 'alternative page,' 'vs page,' 'competitor comparison,' 'comparison page,' '[Product] vs [Product],' '[Product] alternative,' or 'competitive landing pages.' Covers four formats: singular alternative, plural alternatives, you vs competitor, and competitor vs competitor. Emphasizes deep research, modular content architecture, and varied section types beyond feature tables. | `skills/competitor-alternatives` |
| **computer-use-agents** | ⚪ | Build AI agents that interact with computers like humans do - viewing screens, moving cursors, clicking buttons, and typing text. Covers Anthropic's Computer Use, OpenAI's Operator/CUA, and open-source alternatives. Critical focus on sandboxing, security, and handling the unique challenges of vision-based control. Use when: computer use, desktop automation agent, screen control AI, vision-based agent, GUI automation. | `skills/computer-use-agents` |
| **concise-planning** | ⚪ | Use when a user asks for a plan for a coding task, to generate a clear, actionable, and atomic checklist. | `skills/concise-planning` |
| **content-creator** | ⚪ | Create SEO-optimized marketing content with consistent brand voice. Includes brand voice analyzer, SEO optimizer, content frameworks, and social media templates. Use when writing blog posts, creating social media content, analyzing brand voice, optimizing SEO, planning content calendars, or when user mentions content creation, brand voice, SEO optimization, social media marketing, or content strategy. | `skills/content-creator` |
| **context-window-management** | ⚪ | Strategies for managing LLM context windows including summarization, trimming, routing, and avoiding context rot Use when: context window, token limit, context management, context engineering, long context. | `skills/context-window-management` |
| **context7-auto-research** | ⚪ | Automatically fetch latest library/framework documentation for Claude Code via Context7 API | `skills/context7-auto-research` |
| **conversation-memory** | ⚪ | Persistent memory systems for LLM conversations including short-term, long-term, and entity-based memory Use when: conversation memory, remember, memory persistence, long-term memory, chat history. | `skills/conversation-memory` |
| **copy-editing** | ⚪ | When the user wants to edit, review, or improve existing marketing copy. Also use when the user mentions 'edit this copy,' 'review my copy,' 'copy feedback,' 'proofread,' 'polish this,' 'make this better,' or 'copy sweep.' This skill provides a systematic approach to editing marketing copy through multiple focused passes. | `skills/copy-editing` |
| **copywriting** | ⚪ | Use this skill when writing, rewriting, or improving marketing copy for any page (homepage, landing page, pricing, feature, product, or about page). This skill produces clear, compelling, and testable copy while enforcing alignment, honesty, and conversion best practices. | `skills/copywriting` |
| **core-components** | ⚪ | Core component library and design system patterns. Use when building UI, using design tokens, or working with the component library. | `skills/core-components` |
| **crewai** | ⚪ | Expert in CrewAI - the leading role-based multi-agent framework used by 60% of Fortune 500 companies. Covers agent design with roles and goals, task definition, crew orchestration, process types (sequential, hierarchical, parallel), memory systems, and flows for complex workflows. Essential for building collaborative AI agent teams. Use when: crewai, multi-agent team, agent roles, crew of agents, role-based agents. | `skills/crewai` |
| **Cross-Site Scripting and HTML Injection Testing** | ⚪ | This skill should be used when the user asks to "test for XSS vulnerabilities", "perform cross-site scripting attacks", "identify HTML injection flaws", "exploit client-side injection vulnerabilities", "steal cookies via XSS", or "bypass content security policies". It provides comprehensive techniques for detecting, exploiting, and understanding XSS and HTML injection attack vectors in web applications. | `skills/xss-html-injection` |
| **d3-viz** | ⚪ | Creating interactive data visualisations using d3.js. This skill should be used when creating custom charts, graphs, network diagrams, geographic visualisations, or any complex SVG-based data visualisation that requires fine-grained control over visual elements, transitions, or interactions. Use this for bespoke visualisations beyond standard charting libraries, whether in React, Vue, Svelte, vanilla JavaScript, or any other environment. | `skills/claude-d3js-skill` |
| **daily-news-report** | ⚪ | 基于预设 URL 列表抓取内容,筛选高质量技术信息并生成每日 Markdown 报告。 | `skills/daily-news-report` |
| **database-design** | ⚪ | Database design principles and decision-making. Schema design, indexing strategy, ORM selection, serverless databases. | `skills/database-design` |
| **deployment-procedures** | ⚪ | Production deployment principles and decision-making. Safe deployment workflows, rollback strategies, and verification. Teaches thinking, not scripts. | `skills/deployment-procedures` |
| **design-orchestration** | ⚪ | Orchestrates design workflows by routing work through brainstorming, multi-agent review, and execution readiness in the correct order. Prevents premature implementation, skipped validation, and unreviewed high-risk designs. | `skills/design-orchestration` |
| **discord-bot-architect** | ⚪ | Specialized skill for building production-ready Discord bots. Covers Discord.js (JavaScript) and Pycord (Python), gateway intents, slash commands, interactive components, rate limiting, and sharding. | `skills/discord-bot-architect` |
| **dispatching-parallel-agents** | ⚪ | Use when facing 2+ independent tasks that can be worked on without shared state or sequential dependencies | `skills/dispatching-parallel-agents` |
| **doc-coauthoring** | ⚪ | Guide users through a structured workflow for co-authoring documentation. Use when user wants to write documentation, proposals, technical specs, decision docs, or similar structured content. This workflow helps users efficiently transfer context, refine content through iteration, and verify the doc works for readers. Trigger when user mentions writing docs, creating proposals, drafting specs, or similar documentation tasks. | `skills/doc-coauthoring` |
| **docker-expert** | ⚪ | Docker containerization expert with deep knowledge of multi-stage builds, image optimization, container security, Docker Compose orchestration, and production deployment patterns. Use PROACTIVELY for Dockerfile optimization, container issues, image size problems, security hardening, networking, and orchestration challenges. | `skills/docker-expert` |
| **documentation-templates** | ⚪ | Documentation templates and structure guidelines. README, API docs, code comments, and AI-friendly documentation. | `skills/documentation-templates` |
| **docx** | ⚪ | Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. When Claude needs to work with professional documents (.docx files) for: (1) Creating new documents, (2) Modifying or editing content, (3) Working with tracked changes, (4) Adding comments, or any other document tasks | `skills/docx-official` |
| **email-sequence** | ⚪ | When the user wants to create or optimize an email sequence, drip campaign, automated email flow, or lifecycle email program. Also use when the user mentions "email sequence," "drip campaign," "nurture sequence," "onboarding emails," "welcome sequence," "re-engagement emails," "email automation," or "lifecycle emails." For in-app onboarding, see onboarding-cro. | `skills/email-sequence` |
| **email-systems** | ⚪ | Email has the highest ROI of any marketing channel. $36 for every $1 spent. Yet most startups treat it as an afterthought - bulk blasts, no personalization, landing in spam folders. This skill covers transactional email that works, marketing automation that converts, deliverability that reaches inboxes, and the infrastructure decisions that scale. Use when: keywords, file_patterns, code_patterns. | `skills/email-systems` |
| **environment-setup-guide** | ⚪ | Guide developers through setting up development environments with proper tools, dependencies, and configurations | `skills/environment-setup-guide` |
| **Ethical Hacking Methodology** | ⚪ | This skill should be used when the user asks to "learn ethical hacking", "understand penetration testing lifecycle", "perform reconnaissance", "conduct security scanning", "exploit vulnerabilities", or "write penetration test reports". It provides comprehensive ethical hacking methodology and techniques. | `skills/ethical-hacking-methodology` |
| **exa-search** | ⚪ | Semantic search, similar content discovery, and structured research using Exa API | `skills/exa-search` |
| **executing-plans** | ⚪ | Use when you have a written implementation plan to execute in a separate session with review checkpoints | `skills/executing-plans` |
| **File Path Traversal Testing** | ⚪ | This skill should be used when the user asks to "test for directory traversal", "exploit path traversal vulnerabilities", "read arbitrary files through web applications", "find LFI vulnerabilities", or "access files outside web root". It provides comprehensive file path traversal attack and testing methodologies. | `skills/file-path-traversal` |
| **file-organizer** | ⚪ | Intelligently organizes files and folders by understanding context, finding duplicates, and suggesting better organizational structures. Use when user wants to clean up directories, organize downloads, remove duplicates, or restructure projects. | `skills/file-organizer` |
| **file-uploads** | ⚪ | Expert at handling file uploads and cloud storage. Covers S3, Cloudflare R2, presigned URLs, multipart uploads, and image optimization. Knows how to handle large files without blocking. Use when: file upload, S3, R2, presigned URL, multipart. | `skills/file-uploads` |
| **finishing-a-development-branch** | ⚪ | Use when implementation is complete, all tests pass, and you need to decide how to integrate the work - guides completion of development work by presenting structured options for merge, PR, or cleanup | `skills/finishing-a-development-branch` |
| **firebase** | ⚪ | Firebase gives you a complete backend in minutes - auth, database, storage, functions, hosting. But the ease of setup hides real complexity. Security rules are your last line of defense, and they're often wrong. Firestore queries are limited, and you learn this after you've designed your data model. This skill covers Firebase Authentication, Firestore, Realtime Database, Cloud Functions, Cloud Storage, and Firebase Hosting. Key insight: Firebase is optimized for read-heavy, denormalized data. I | `skills/firebase` |
| **firecrawl-scraper** | ⚪ | Deep web scraping, screenshots, PDF parsing, and website crawling using Firecrawl API | `skills/firecrawl-scraper` |
| **form-cro** | ⚪ | Optimize any form that is NOT signup or account registration — including lead capture, contact, demo request, application, survey, quote, and checkout forms. Use when the goal is to increase form completion rate, reduce friction, or improve lead quality without breaking compliance or downstream workflows. | `skills/form-cro` |
| **free-tool-strategy** | ⚪ | When the user wants to plan, evaluate, or build a free tool for marketing purposes — lead generation, SEO value, or brand awareness. Also use when the user mentions "engineering as marketing," "free tool," "marketing tool," "calculator," "generator," "interactive tool," "lead gen tool," "build a tool for leads," or "free resource." This skill bridges engineering and marketing — useful for founders and technical marketers. | `skills/free-tool-strategy` |
| **frontend-design** | ⚪ | Create distinctive, production-grade frontend interfaces with intentional aesthetics, high craft, and non-generic visual identity. Use when building or styling web UIs, components, pages, dashboards, or frontend applications. | `skills/frontend-design` |
| **frontend-dev-guidelines** | ⚪ | Opinionated frontend development standards for modern React + TypeScript applications. Covers Suspense-first data fetching, lazy loading, feature-based architecture, MUI v7 styling, TanStack Router, performance optimization, and strict TypeScript practices. | `skills/frontend-dev-guidelines` |
| **frontend-patterns** | ⚪ | Frontend development patterns for React, Next.js, state management, performance optimization, and UI best practices. | `skills/cc-skill-frontend-patterns` |
| **game-art** | ⚪ | Game art principles. Visual style selection, asset pipeline, animation workflow. | `skills/game-development/game-art` |
| **game-audio** | ⚪ | Game audio principles. Sound design, music integration, adaptive audio systems. | `skills/game-development/game-audio` |
| **game-design** | ⚪ | Game design principles. GDD structure, balancing, player psychology, progression. | `skills/game-development/game-design` |
| **game-development** | ⚪ | Game development orchestrator. Routes to platform-specific skills based on project needs. | `skills/game-development` |
| **gcp-cloud-run** | ⚪ | Specialized skill for building production-ready serverless applications on GCP. Covers Cloud Run services (containerized), Cloud Run Functions (event-driven), cold start optimization, and event-driven architecture with Pub/Sub. | `skills/gcp-cloud-run` |
| **geo-fundamentals** | ⚪ | Generative Engine Optimization for AI search engines (ChatGPT, Claude, Perplexity). | `skills/geo-fundamentals` |
| **git-pushing** | ⚪ | Stage, commit, and push git changes with conventional commit messages. Use when user wants to commit and push changes, mentions pushing to remote, or asks to save and push their work. Also activates when user says "push changes", "commit and push", "push this", "push to github", or similar git workflow requests. | `skills/git-pushing` |
| **github-workflow-automation** | ⚪ | Automate GitHub workflows with AI assistance. Includes PR reviews, issue triage, CI/CD integration, and Git operations. Use when automating GitHub workflows, setting up PR review automation, creating GitHub Actions, or triaging issues. | `skills/github-workflow-automation` |
| **graphql** | ⚪ | GraphQL gives clients exactly the data they need - no more, no less. One endpoint, typed schema, introspection. But the flexibility that makes it powerful also makes it dangerous. Without proper controls, clients can craft queries that bring down your server. This skill covers schema design, resolvers, DataLoader for N+1 prevention, federation for microservices, and client integration with Apollo/urql. Key insight: GraphQL is a contract. The schema is the API documentation. Design it carefully. | `skills/graphql` |
| **HTML Injection Testing** | ⚪ | This skill should be used when the user asks to "test for HTML injection", "inject HTML into web pages", "perform HTML injection attacks", "deface web applications", or "test content injection vulnerabilities". It provides comprehensive HTML injection attack techniques and testing methodologies. | `skills/html-injection-testing` |
| **hubspot-integration** | ⚪ | Expert patterns for HubSpot CRM integration including OAuth authentication, CRM objects, associations, batch operations, webhooks, and custom objects. Covers Node.js and Python SDKs. Use when: hubspot, hubspot api, hubspot crm, hubspot integration, contacts api. | `skills/hubspot-integration` |
| **i18n-localization** | ⚪ | Internationalization and localization patterns. Detecting hardcoded strings, managing translations, locale files, RTL support. | `skills/i18n-localization` |
| **IDOR Vulnerability Testing** | ⚪ | This skill should be used when the user asks to "test for insecure direct object references," "find IDOR vulnerabilities," "exploit broken access control," "enumerate user IDs or object references," or "bypass authorization to access other users' data." It provides comprehensive guidance for detecting, exploiting, and remediating IDOR vulnerabilities in web applications. | `skills/idor-testing` |
| **inngest** | ⚪ | Inngest expert for serverless-first background jobs, event-driven workflows, and durable execution without managing queues or workers. Use when: inngest, serverless background job, event-driven workflow, step function, durable execution. | `skills/inngest` |
| **interactive-portfolio** | ⚪ | Expert in building portfolios that actually land jobs and clients - not just showing work, but creating memorable experiences. Covers developer portfolios, designer portfolios, creative portfolios, and portfolios that convert visitors into opportunities. Use when: portfolio, personal website, showcase work, developer portfolio, designer portfolio. | `skills/interactive-portfolio` |
| **internal-comms** | ⚪ | A set of resources to help me write all kinds of internal communications, using the formats that my company likes to use. Claude should use this skill whenever asked to write some sort of internal communications (status reports, leadership updates, 3P updates, company newsletters, FAQs, incident reports, project updates, etc.). | `skills/internal-comms-anthropic` |
| **internal-comms** | ⚪ | A set of resources to help me write all kinds of internal communications, using the formats that my company likes to use. Claude should use this skill whenever asked to write some sort of internal communications (status reports, leadership updates, 3P updates, company newsletters, FAQs, incident reports, project updates, etc.). | `skills/internal-comms-community` |
| **javascript-mastery** | ⚪ | Comprehensive JavaScript reference covering 33+ essential concepts every developer should know. From fundamentals like primitives and closures to advanced patterns like async/await and functional programming. Use when explaining JS concepts, debugging JavaScript issues, or teaching JavaScript fundamentals. | `skills/javascript-mastery` |
| **kaizen** | ⚪ | Guide for continuous improvement, error proofing, and standardization. Use this skill when the user wants to improve code quality, refactor, or discuss process improvements. | `skills/kaizen` |
| **langfuse** | ⚪ | Expert in Langfuse - the open-source LLM observability platform. Covers tracing, prompt management, evaluation, datasets, and integration with LangChain, LlamaIndex, and OpenAI. Essential for debugging, monitoring, and improving LLM applications in production. Use when: langfuse, llm observability, llm tracing, prompt management, llm evaluation. | `skills/langfuse` |
| **langgraph** | ⚪ | Expert in LangGraph - the production-grade framework for building stateful, multi-actor AI applications. Covers graph construction, state management, cycles and branches, persistence with checkpointers, human-in-the-loop patterns, and the ReAct agent pattern. Used in production at LinkedIn, Uber, and 400+ companies. This is LangChain's recommended approach for building agents. Use when: langgraph, langchain agent, stateful agent, agent graph, react agent. | `skills/langgraph` |
| **last30days** | ⚪ | Research a topic from the last 30 days on Reddit + X + Web, become an expert, and write copy-paste-ready prompts for the user's target tool. | `skills/last30days` |
| **launch-strategy** | ⚪ | When the user wants to plan a product launch, feature announcement, or release strategy. Also use when the user mentions 'launch,' 'Product Hunt,' 'feature release,' 'announcement,' 'go-to-market,' 'beta launch,' 'early access,' 'waitlist,' or 'product update.' This skill covers phased launches, channel strategy, and ongoing launch momentum. | `skills/launch-strategy` |
| **lint-and-validate** | ⚪ | Automatic quality control, linting, and static analysis procedures. Use after every code modification to ensure syntax correctness and project standards. Triggers onKeywords: lint, format, check, validate, types, static analysis. | `skills/lint-and-validate` |
| **Linux Privilege Escalation** | ⚪ | This skill should be used when the user asks to "escalate privileges on Linux", "find privesc vectors on Linux systems", "exploit sudo misconfigurations", "abuse SUID binaries", "exploit cron jobs for root access", "enumerate Linux systems for privilege escalation", or "gain root access from low-privilege shell". It provides comprehensive techniques for identifying and exploiting privilege escalation paths on Linux systems. | `skills/linux-privilege-escalation` |
| **Linux Production Shell Scripts** | ⚪ | This skill should be used when the user asks to "create bash scripts", "automate Linux tasks", "monitor system resources", "backup files", "manage users", or "write production shell scripts". It provides ready-to-use shell script templates for system administration. | `skills/linux-shell-scripting` |
| **llm-app-patterns** | ⚪ | Production-ready patterns for building LLM applications. Covers RAG pipelines, agent architectures, prompt IDEs, and LLMOps monitoring. Use when designing AI applications, implementing RAG, building agents, or setting up LLM observability. | `skills/llm-app-patterns` |
| **loki-mode** | ⚪ | Multi-agent autonomous startup system for Claude Code. Triggers on "Loki Mode". Orchestrates 100+ specialized agents across engineering, QA, DevOps, security, data/ML, business operations, marketing, HR, and customer success. Takes PRD to fully deployed, revenue-generating product with zero human intervention. Features Task tool for subagent dispatch, parallel code review with 3 specialized reviewers, severity-based issue triage, distributed task queue with dead letter handling, automatic deployment to cloud providers, A/B testing, customer feedback loops, incident response, circuit breakers, and self-healing. Handles rate limits via distributed state checkpoints and auto-resume with exponential backoff. Requires --dangerously-skip-permissions flag. | `skills/loki-mode` |
| **marketing-ideas** | ⚪ | Provide proven marketing strategies and growth ideas for SaaS and software products, prioritized using a marketing feasibility scoring system. | `skills/marketing-ideas` |
| **marketing-psychology** | ⚪ | Apply behavioral science and mental models to marketing decisions, prioritized using a psychological leverage and feasibility scoring system. | `skills/marketing-psychology` |
| **mcp-builder** | ⚪ | Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK). | `skills/mcp-builder` |
| **Metasploit Framework** | ⚪ | This skill should be used when the user asks to "use Metasploit for penetration testing", "exploit vulnerabilities with msfconsole", "create payloads with msfvenom", "perform post-exploitation", "use auxiliary modules for scanning", or "develop custom exploits". It provides comprehensive guidance for leveraging the Metasploit Framework in security assessments. | `skills/metasploit-framework` |
| **micro-saas-launcher** | ⚪ | Expert in launching small, focused SaaS products fast - the indie hacker approach to building profitable software. Covers idea validation, MVP development, pricing, launch strategies, and growing to sustainable revenue. Ship in weeks, not months. Use when: micro saas, indie hacker, small saas, side project, saas mvp. | `skills/micro-saas-launcher` |
| **mobile-design** | ⚪ | Mobile-first design and engineering doctrine for iOS and Android apps. Covers touch interaction, performance, platform conventions, offline behavior, and mobile-specific decision-making. Teaches principles and constraints, not fixed layouts. Use for React Native, Flutter, or native mobile apps. | `skills/mobile-design` |
| **mobile-games** | ⚪ | Mobile game development principles. Touch input, battery, performance, app stores. | `skills/game-development/mobile-games` |
| **moodle-external-api-development** | ⚪ | Create custom external web service APIs for Moodle LMS. Use when implementing web services for course management, user tracking, quiz operations, or custom plugin functionality. Covers parameter validation, database operations, error handling, service registration, and Moodle coding standards. | `skills/moodle-external-api-development` |
| **multi-agent-brainstorming** | ⚪ | Use this skill when a design or idea requires higher confidence, risk reduction, or formal review. This skill orchestrates a structured, sequential multi-agent design review where each agent has a strict, non-overlapping role. It prevents blind spots, false confidence, and premature convergence. | `skills/multi-agent-brainstorming` |
| **multiplayer** | ⚪ | Multiplayer game development principles. Architecture, networking, synchronization. | `skills/game-development/multiplayer` |
| **neon-postgres** | ⚪ | Expert patterns for Neon serverless Postgres, branching, connection pooling, and Prisma/Drizzle integration Use when: neon database, serverless postgres, database branching, neon postgres, postgres serverless. | `skills/neon-postgres` |
| **nestjs-expert** | ⚪ | Nest.js framework expert specializing in module architecture, dependency injection, middleware, guards, interceptors, testing with Jest/Supertest, TypeORM/Mongoose integration, and Passport.js authentication. Use PROACTIVELY for any Nest.js application issues including architecture decisions, testing strategies, performance optimization, or debugging complex dependency injection problems. If a specialized expert is a better fit, I will recommend switching and stop. | `skills/nestjs-expert` |
| **Network 101** | ⚪ | This skill should be used when the user asks to "set up a web server", "configure HTTP or HTTPS", "perform SNMP enumeration", "configure SMB shares", "test network services", or needs guidance on configuring and testing network services for penetration testing labs. | `skills/network-101` |
| **nextjs-best-practices** | ⚪ | Next.js App Router principles. Server Components, data fetching, routing patterns. | `skills/nextjs-best-practices` |
| **nextjs-supabase-auth** | ⚪ | Expert integration of Supabase Auth with Next.js App Router Use when: supabase auth next, authentication next.js, login supabase, auth middleware, protected route. | `skills/nextjs-supabase-auth` |
| **nodejs-best-practices** | ⚪ | Node.js development principles and decision-making. Framework selection, async patterns, security, and architecture. Teaches thinking, not copying. | `skills/nodejs-best-practices` |
| **nosql-expert** | ⚪ | Expert guidance for distributed NoSQL databases (Cassandra, DynamoDB). Focuses on mental models, query-first modeling, single-table design, and avoiding hot partitions in high-scale systems. | `skills/nosql-expert` |
| **notebooklm** | ⚪ | Use this skill to query your Google NotebookLM notebooks directly from Claude Code for source-grounded, citation-backed answers from Gemini. Browser automation, library management, persistent auth. Drastically reduced hallucinations through document-only responses. | `skills/notebooklm` |
| **notion-template-business** | ⚪ | Expert in building and selling Notion templates as a business - not just making templates, but building a sustainable digital product business. Covers template design, pricing, marketplaces, marketing, and scaling to real revenue. Use when: notion template, sell templates, digital product, notion business, gumroad. | `skills/notion-template-business` |
| **obsidian-clipper-template-creator** | ⚪ | Guide for creating templates for the Obsidian Web Clipper. Use when you want to create a new clipping template, understand available variables, or format clipped content. | `skills/obsidian-clipper-template-creator` |
| **onboarding-cro** | ⚪ | When the user wants to optimize post-signup onboarding, user activation, first-run experience, or time-to-value. Also use when the user mentions "onboarding flow," "activation rate," "user activation," "first-run experience," "empty states," "onboarding checklist," "aha moment," or "new user experience." For signup/registration optimization, see signup-flow-cro. For ongoing email sequences, see email-sequence. | `skills/onboarding-cro` |
| **page-cro** | ⚪ | Analyze and optimize individual pages for conversion performance. Use when the user wants to improve conversion rates, diagnose why a page is underperforming, or increase the effectiveness of marketing pages (homepage, landing pages, pricing, feature pages, or blog posts). This skill focuses on diagnosis, prioritization, and testable recommendations— not blind optimization. | `skills/page-cro` |
| **paid-ads** | ⚪ | When the user wants help with paid advertising campaigns on Google Ads, Meta (Facebook/Instagram), LinkedIn, Twitter/X, or other ad platforms. Also use when the user mentions 'PPC,' 'paid media,' 'ad copy,' 'ad creative,' 'ROAS,' 'CPA,' 'ad campaign,' 'retargeting,' or 'audience targeting.' This skill covers campaign strategy, ad creation, audience targeting, and optimization. | `skills/paid-ads` |
| **parallel-agents** | ⚪ | Multi-agent orchestration patterns. Use when multiple independent tasks can run with different domain expertise or when comprehensive analysis requires multiple perspectives. | `skills/parallel-agents` |
| **paywall-upgrade-cro** | ⚪ | When the user wants to create or optimize in-app paywalls, upgrade screens, upsell modals, or feature gates. Also use when the user mentions "paywall," "upgrade screen," "upgrade modal," "upsell," "feature gate," "convert free to paid," "freemium conversion," "trial expiration screen," "limit reached screen," "plan upgrade prompt," or "in-app pricing." Distinct from public pricing pages (see page-cro) — this skill focuses on in-product upgrade moments where the user has already experienced value. | `skills/paywall-upgrade-cro` |
| **pc-games** | ⚪ | PC and console game development principles. Engine selection, platform features, optimization strategies. | `skills/game-development/pc-games` |
| **pdf** | ⚪ | Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. When Claude needs to fill in a PDF form or programmatically process, generate, or analyze PDF documents at scale. | `skills/pdf-official` |
| **Pentest Checklist** | ⚪ | This skill should be used when the user asks to "plan a penetration test", "create a security assessment checklist", "prepare for penetration testing", "define pentest scope", "follow security testing best practices", or needs a structured methodology for penetration testing engagements. | `skills/pentest-checklist` |
| **Pentest Commands** | ⚪ | This skill should be used when the user asks to "run pentest commands", "scan with nmap", "use metasploit exploits", "crack passwords with hydra or john", "scan web vulnerabilities with nikto", "enumerate networks", or needs essential penetration testing command references. | `skills/pentest-commands` |
| **performance-profiling** | ⚪ | Performance profiling principles. Measurement, analysis, and optimization techniques. | `skills/performance-profiling` |
| **personal-tool-builder** | ⚪ | Expert in building custom tools that solve your own problems first. The best products often start as personal tools - scratch your own itch, build for yourself, then discover others have the same itch. Covers rapid prototyping, local-first apps, CLI tools, scripts that grow into products, and the art of dogfooding. Use when: build a tool, personal tool, scratch my itch, solve my problem, CLI tool. | `skills/personal-tool-builder` |
| **plaid-fintech** | ⚪ | Expert patterns for Plaid API integration including Link token flows, transactions sync, identity verification, Auth for ACH, balance checks, webhook handling, and fintech compliance best practices. Use when: plaid, bank account linking, bank connection, ach, account aggregation. | `skills/plaid-fintech` |
| **plan-writing** | ⚪ | Structured task planning with clear breakdowns, dependencies, and verification criteria. Use when implementing features, refactoring, or any multi-step work. | `skills/plan-writing` |
| **planning-with-files** | ⚪ | Implements Manus-style file-based planning for complex tasks. Creates task_plan.md, findings.md, and progress.md. Use when starting complex multi-step tasks, research projects, or any task requiring >5 tool calls. | `skills/planning-with-files` |
| **playwright-skill** | ⚪ | Complete browser automation with Playwright. Auto-detects dev servers, writes clean test scripts to /tmp. Test pages, fill forms, take screenshots, check responsive design, validate UX, test login flows, check links, automate any browser task. Use when user wants to test websites, automate browser interactions, validate web functionality, or perform any browser-based testing. | `skills/playwright-skill` |
| **popup-cro** | ⚪ | Create and optimize popups, modals, overlays, slide-ins, and banners to increase conversions without harming user experience or brand trust. | `skills/popup-cro` |
| **powershell-windows** | ⚪ | PowerShell Windows patterns. Critical pitfalls, operator syntax, error handling. | `skills/powershell-windows` |
| **pptx** | ⚪ | Presentation creation, editing, and analysis. When Claude needs to work with presentations (.pptx files) for: (1) Creating new presentations, (2) Modifying or editing content, (3) Working with layouts, (4) Adding comments or speaker notes, or any other presentation tasks | `skills/pptx-official` |
| **pricing-strategy** | ⚪ | Design pricing, packaging, and monetization strategies based on value, customer willingness to pay, and growth objectives. | `skills/pricing-strategy` |
| **prisma-expert** | ⚪ | Prisma ORM expert for schema design, migrations, query optimization, relations modeling, and database operations. Use PROACTIVELY for Prisma schema issues, migration problems, query performance, relation design, or database connection issues. | `skills/prisma-expert` |
| **Privilege Escalation Methods** | ⚪ | This skill should be used when the user asks to "escalate privileges", "get root access", "become administrator", "privesc techniques", "abuse sudo", "exploit SUID binaries", "Kerberoasting", "pass-the-ticket", "token impersonation", or needs guidance on post-exploitation privilege escalation for Linux or Windows systems. | `skills/privilege-escalation-methods` |
| **product-manager-toolkit** | ⚪ | Comprehensive toolkit for product managers including RICE prioritization, customer interview analysis, PRD templates, discovery frameworks, and go-to-market strategies. Use for feature prioritization, user research synthesis, requirement documentation, and product strategy development. | `skills/product-manager-toolkit` |
| **production-code-audit** | ⚪ | Autonomously deep-scan entire codebase line-by-line, understand architecture and patterns, then systematically transform it to production-grade, corporate-level professional quality with optimizations | `skills/production-code-audit` |
| **programmatic-seo** | ⚪ | Design and evaluate programmatic SEO strategies for creating SEO-driven pages at scale using templates and structured data. Use when the user mentions programmatic SEO, pages at scale, template pages, directory pages, location pages, comparison pages, integration pages, or keyword-pattern page generation. This skill focuses on feasibility, strategy, and page system design—not execution unless explicitly requested. | `skills/programmatic-seo` |
| **prompt-caching** | ⚪ | Caching strategies for LLM prompts including Anthropic prompt caching, response caching, and CAG (Cache Augmented Generation) Use when: prompt caching, cache prompt, response cache, cag, cache augmented. | `skills/prompt-caching` |
| **prompt-engineer** | ⚪ | Expert in designing effective prompts for LLM-powered applications. Masters prompt structure, context management, output formatting, and prompt evaluation. Use when: prompt engineering, system prompt, few-shot, chain of thought, prompt design. | `skills/prompt-engineer` |
| **prompt-engineering** | ⚪ | Expert guide on prompt engineering patterns, best practices, and optimization techniques. Use when user wants to improve prompts, learn prompting strategies, or debug agent behavior. | `skills/prompt-engineering` |
| **prompt-library** | ⚪ | Curated collection of high-quality prompts for various use cases. Includes role-based prompts, task-specific templates, and prompt refinement techniques. Use when user needs prompt templates, role-play prompts, or ready-to-use prompt examples for coding, writing, analysis, or creative tasks. | `skills/prompt-library` |
| **python-patterns** | ⚪ | Python development principles and decision-making. Framework selection, async patterns, type hints, project structure. Teaches thinking, not copying. | `skills/python-patterns` |
| **rag-engineer** | ⚪ | Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications. Use when: building RAG, vector search, embeddings, semantic search, document retrieval. | `skills/rag-engineer` |
| **rag-implementation** | ⚪ | Retrieval-Augmented Generation patterns including chunking, embeddings, vector stores, and retrieval optimization Use when: rag, retrieval augmented, vector search, embeddings, semantic search. | `skills/rag-implementation` |
| **react-patterns** | ⚪ | Modern React patterns and principles. Hooks, composition, performance, TypeScript best practices. | `skills/react-patterns` |
| **react-ui-patterns** | ⚪ | Modern React UI patterns for loading states, error handling, and data fetching. Use when building UI components, handling async data, or managing UI states. | `skills/react-ui-patterns` |
| **receiving-code-review** | ⚪ | Use when receiving code review feedback, before implementing suggestions, especially if feedback seems unclear or technically questionable - requires technical rigor and verification, not performative agreement or blind implementation | `skills/receiving-code-review` |
| **Red Team Tools and Methodology** | ⚪ | This skill should be used when the user asks to "follow red team methodology", "perform bug bounty hunting", "automate reconnaissance", "hunt for XSS vulnerabilities", "enumerate subdomains", or needs security researcher techniques and tool configurations from top bug bounty hunters. | `skills/red-team-tools` |
| **red-team-tactics** | ⚪ | Red team tactics principles based on MITRE ATT&CK. Attack phases, detection evasion, reporting. | `skills/red-team-tactics` |
| **referral-program** | ⚪ | When the user wants to create, optimize, or analyze a referral program, affiliate program, or word-of-mouth strategy. Also use when the user mentions 'referral,' 'affiliate,' 'ambassador,' 'word of mouth,' 'viral loop,' 'refer a friend,' or 'partner program.' This skill covers program design, incentive structure, and growth optimization. | `skills/referral-program` |
| **remotion-best-practices** | ⚪ | Best practices for Remotion - Video creation in React | `skills/remotion-best-practices` |
| **requesting-code-review** | ⚪ | Use when completing tasks, implementing major features, or before merging to verify work meets requirements | `skills/requesting-code-review` |
| **research-engineer** | ⚪ | An uncompromising Academic Research Engineer. Operates with absolute scientific rigor, objective criticism, and zero flair. Focuses on theoretical correctness, formal verification, and optimal implementation across any required technology. | `skills/research-engineer` |
| **salesforce-development** | ⚪ | Expert patterns for Salesforce platform development including Lightning Web Components (LWC), Apex triggers and classes, REST/Bulk APIs, Connected Apps, and Salesforce DX with scratch orgs and 2nd generation packages (2GP). Use when: salesforce, sfdc, apex, lwc, lightning web components. | `skills/salesforce-development` |
| **schema-markup** | ⚪ | Design, validate, and optimize schema.org structured data for eligibility, correctness, and measurable SEO impact. Use when the user wants to add, fix, audit, or scale schema markup (JSON-LD) for rich results. This skill evaluates whether schema should be implemented, what types are valid, and how to deploy safely according to Google guidelines. | `skills/schema-markup` |
| **scroll-experience** | ⚪ | Expert in building immersive scroll-driven experiences - parallax storytelling, scroll animations, interactive narratives, and cinematic web experiences. Like NY Times interactives, Apple product pages, and award-winning web experiences. Makes websites feel like experiences, not just pages. Use when: scroll animation, parallax, scroll storytelling, interactive story, cinematic website. | `skills/scroll-experience` |
| **Security Scanning Tools** | ⚪ | This skill should be used when the user asks to "perform vulnerability scanning", "scan networks for open ports", "assess web application security", "scan wireless networks", "detect malware", "check cloud security", or "evaluate system compliance". It provides comprehensive guidance on security scanning tools and methodologies. | `skills/scanning-tools` |
| **security-review** | ⚪ | Use this skill when adding authentication, handling user input, working with secrets, creating API endpoints, or implementing payment/sensitive features. Provides comprehensive security checklist and patterns. | `skills/cc-skill-security-review` |
| **segment-cdp** | ⚪ | Expert patterns for Segment Customer Data Platform including Analytics.js, server-side tracking, tracking plans with Protocols, identity resolution, destinations configuration, and data governance best practices. Use when: segment, analytics.js, customer data platform, cdp, tracking plan. | `skills/segment-cdp` |
| **senior-architect** | ⚪ | Comprehensive software architecture skill for designing scalable, maintainable systems using ReactJS, NextJS, NodeJS, Express, React Native, Swift, Kotlin, Flutter, Postgres, GraphQL, Go, Python. Includes architecture diagram generation, system design patterns, tech stack decision frameworks, and dependency analysis. Use when designing system architecture, making technical decisions, creating architecture diagrams, evaluating trade-offs, or defining integration patterns. | `skills/senior-architect` |
| **senior-fullstack** | ⚪ | Comprehensive fullstack development skill for building complete web applications with React, Next.js, Node.js, GraphQL, and PostgreSQL. Includes project scaffolding, code quality analysis, architecture patterns, and complete tech stack guidance. Use when building new projects, analyzing code quality, implementing design patterns, or setting up development workflows. | `skills/senior-fullstack` |
| **seo-audit** | ⚪ | Diagnose and audit SEO issues affecting crawlability, indexation, rankings, and organic performance. Use when the user asks for an SEO audit, technical SEO review, ranking diagnosis, on-page SEO review, meta tag audit, or SEO health check. This skill identifies issues and prioritizes actions but does not execute changes. For large-scale page creation, use programmatic-seo. For structured data, use schema-markup. | `skills/seo-audit` |
| **seo-fundamentals** | ⚪ | Core principles of SEO including E-E-A-T, Core Web Vitals, technical foundations, content quality, and how modern search engines evaluate pages. This skill explains *why* SEO works, not how to execute specific optimizations. | `skills/seo-fundamentals` |
| **server-management** | ⚪ | Server management principles and decision-making. Process management, monitoring strategy, and scaling decisions. Teaches thinking, not commands. | `skills/server-management` |
| **Shodan Reconnaissance and Pentesting** | ⚪ | This skill should be used when the user asks to "search for exposed devices on the internet," "perform Shodan reconnaissance," "find vulnerable services using Shodan," "scan IP ranges with Shodan," or "discover IoT devices and open ports." It provides comprehensive guidance for using Shodan's search engine, CLI, and API for penetration testing reconnaissance. | `skills/shodan-reconnaissance` |
| **shopify-apps** | ⚪ | Expert patterns for Shopify app development including Remix/React Router apps, embedded apps with App Bridge, webhook handling, GraphQL Admin API, Polaris components, billing, and app extensions. Use when: shopify app, shopify, embedded app, polaris, app bridge. | `skills/shopify-apps` |
| **shopify-development** | ⚪ | Build Shopify apps, extensions, themes using GraphQL Admin API, Shopify CLI, Polaris UI, and Liquid. TRIGGER: "shopify", "shopify app", "checkout extension", "admin extension", "POS extension", "shopify theme", "liquid template", "polaris", "shopify graphql", "shopify webhook", "shopify billing", "app subscription", "metafields", "shopify functions" | `skills/shopify-development` |
| **signup-flow-cro** | ⚪ | When the user wants to optimize signup, registration, account creation, or trial activation flows. Also use when the user mentions "signup conversions," "registration friction," "signup form optimization," "free trial signup," "reduce signup dropoff," or "account creation flow." For post-signup onboarding, see onboarding-cro. For lead capture forms (not account creation), see form-cro. | `skills/signup-flow-cro` |
| **skill-creator** | ⚪ | Guide for creating effective skills. This skill should be used when users want to create a new skill (or update an existing skill) that extends Claude's capabilities with specialized knowledge, workflows, or tool integrations. | `skills/skill-creator` |
| **skill-developer** | ⚪ | Create and manage Claude Code skills following Anthropic best practices. Use when creating new skills, modifying skill-rules.json, understanding trigger patterns, working with hooks, debugging skill activation, or implementing progressive disclosure. Covers skill structure, YAML frontmatter, trigger types (keywords, intent patterns, file paths, content patterns), enforcement levels (block, suggest, warn), hook mechanisms (UserPromptSubmit, PreToolUse), session tracking, and the 500-line rule. | `skills/skill-developer` |
| **slack-bot-builder** | ⚪ | Build Slack apps using the Bolt framework across Python, JavaScript, and Java. Covers Block Kit for rich UIs, interactive components, slash commands, event handling, OAuth installation flows, and Workflow Builder integration. Focus on best practices for production-ready Slack apps. Use when: slack bot, slack app, bolt framework, block kit, slash command. | `skills/slack-bot-builder` |
| **slack-gif-creator** | ⚪ | Knowledge and utilities for creating animated GIFs optimized for Slack. Provides constraints, validation tools, and animation concepts. Use when users request animated GIFs for Slack like "make me a GIF of X doing Y for Slack." | `skills/slack-gif-creator` |
| **SMTP Penetration Testing** | ⚪ | This skill should be used when the user asks to "perform SMTP penetration testing", "enumerate email users", "test for open mail relays", "grab SMTP banners", "brute force email credentials", or "assess mail server security". It provides comprehensive techniques for testing SMTP server security. | `skills/smtp-penetration-testing` |
| **social-content** | ⚪ | When the user wants help creating, scheduling, or optimizing social media content for LinkedIn, Twitter/X, Instagram, TikTok, Facebook, or other platforms. Also use when the user mentions 'LinkedIn post,' 'Twitter thread,' 'social media,' 'content calendar,' 'social scheduling,' 'engagement,' or 'viral content.' This skill covers content creation, repurposing, and platform-specific strategies. | `skills/social-content` |
| **software-architecture** | ⚪ | Guide for quality focused software architecture. This skill should be used when users want to write code, design architecture, analyze code, in any case that relates to software development. | `skills/software-architecture` |
| **SQL Injection Testing** | ⚪ | This skill should be used when the user asks to "test for SQL injection vulnerabilities", "perform SQLi attacks", "bypass authentication using SQL injection", "extract database information through injection", "detect SQL injection flaws", or "exploit database query vulnerabilities". It provides comprehensive techniques for identifying, exploiting, and understanding SQL injection attack vectors across different database systems. | `skills/sql-injection-testing` |
| **SQLMap Database Penetration Testing** | ⚪ | This skill should be used when the user asks to "automate SQL injection testing," "enumerate database structure," "extract database credentials using sqlmap," "dump tables and columns from a vulnerable database," or "perform automated database penetration testing." It provides comprehensive guidance for using SQLMap to detect and exploit SQL injection vulnerabilities. | `skills/sqlmap-database-pentesting` |
| **SSH Penetration Testing** | ⚪ | This skill should be used when the user asks to "pentest SSH services", "enumerate SSH configurations", "brute force SSH credentials", "exploit SSH vulnerabilities", "perform SSH tunneling", or "audit SSH security". It provides comprehensive SSH penetration testing methodologies and techniques. | `skills/ssh-penetration-testing` |
| **stripe-integration** | ⚪ | Get paid from day one. Payments, subscriptions, billing portal, webhooks, metered billing, Stripe Connect. The complete guide to implementing Stripe correctly, including all the edge cases that will bite you at 3am. This isn't just API calls - it's the full payment system: handling failures, managing subscriptions, dealing with dunning, and keeping revenue flowing. Use when: stripe, payments, subscription, billing, checkout. | `skills/stripe-integration` |
| **subagent-driven-development** | ⚪ | Use when executing implementation plans with independent tasks in the current session | `skills/subagent-driven-development` |
| **supabase-postgres-best-practices** | ⚪ | Postgres performance optimization and best practices from Supabase. Use this skill when writing, reviewing, or optimizing Postgres queries, schema designs, or database configurations. | `skills/postgres-best-practices` |
| **systematic-debugging** | ⚪ | Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes | `skills/systematic-debugging` |
| **tailwind-patterns** | ⚪ | Tailwind CSS v4 principles. CSS-first configuration, container queries, modern patterns, design token architecture. | `skills/tailwind-patterns` |
| **tavily-web** | ⚪ | Web search, content extraction, crawling, and research capabilities using Tavily API | `skills/tavily-web` |
| **tdd-workflow** | ⚪ | Test-Driven Development workflow principles. RED-GREEN-REFACTOR cycle. | `skills/tdd-workflow` |
| **telegram-bot-builder** | ⚪ | Expert in building Telegram bots that solve real problems - from simple automation to complex AI-powered bots. Covers bot architecture, the Telegram Bot API, user experience, monetization strategies, and scaling bots to thousands of users. Use when: telegram bot, bot api, telegram automation, chat bot telegram, tg bot. | `skills/telegram-bot-builder` |
| **telegram-mini-app** | ⚪ | Expert in building Telegram Mini Apps (TWA) - web apps that run inside Telegram with native-like experience. Covers the TON ecosystem, Telegram Web App API, payments, user authentication, and building viral mini apps that monetize. Use when: telegram mini app, TWA, telegram web app, TON app, mini app. | `skills/telegram-mini-app` |
| **templates** | ⚪ | Project scaffolding templates for new applications. Use when creating new projects from scratch. Contains 12 templates for various tech stacks. | `skills/app-builder/templates` |
| **test-driven-development** | ⚪ | Use when implementing any feature or bugfix, before writing implementation code | `skills/test-driven-development` |
| **test-fixing** | ⚪ | Run tests and systematically fix all failing tests using smart error grouping. Use when user asks to fix failing tests, mentions test failures, runs test suite and failures occur, or requests to make tests pass. | `skills/test-fixing` |
| **testing-patterns** | ⚪ | Jest testing patterns, factory functions, mocking strategies, and TDD workflow. Use when writing unit tests, creating test factories, or following TDD red-green-refactor cycle. | `skills/testing-patterns` |
| **theme-factory** | ⚪ | Toolkit for styling artifacts with a theme. These artifacts can be slides, docs, reportings, HTML landing pages, etc. There are 10 pre-set themes with colors/fonts that you can apply to any artifact that has been creating, or can generate a new theme on-the-fly. | `skills/theme-factory` |
| **Top 100 Web Vulnerabilities Reference** | ⚪ | This skill should be used when the user asks to "identify web application vulnerabilities", "explain common security flaws", "understand vulnerability categories", "learn about injection attacks", "review access control weaknesses", "analyze API security issues", "assess security misconfigurations", "understand client-side vulnerabilities", "examine mobile and IoT security flaws", or "reference the OWASP-aligned vulnerability taxonomy". Use this skill to provide comprehensive vulnerability definitions, root causes, impacts, and mitigation strategies across all major web security categories. | `skills/top-web-vulnerabilities` |
| **trigger-dev** | ⚪ | Trigger.dev expert for background jobs, AI workflows, and reliable async execution with excellent developer experience and TypeScript-first design. Use when: trigger.dev, trigger dev, background task, ai background job, long running task. | `skills/trigger-dev` |
| **twilio-communications** | ⚪ | Build communication features with Twilio: SMS messaging, voice calls, WhatsApp Business API, and user verification (2FA). Covers the full spectrum from simple notifications to complex IVR systems and multi-channel authentication. Critical focus on compliance, rate limits, and error handling. Use when: twilio, send SMS, text message, voice call, phone verification. | `skills/twilio-communications` |
| **typescript-expert** | ⚪ | TypeScript and JavaScript expert with deep knowledge of type-level programming, performance optimization, monorepo management, migration strategies, and modern tooling. Use PROACTIVELY for any TypeScript/JavaScript issues including complex type gymnastics, build performance, debugging, and architectural decisions. If a specialized expert is a better fit, I will recommend switching and stop. | `skills/typescript-expert` |
| **ui-ux-pro-max** | ⚪ | UI/UX design intelligence. 50 styles, 21 palettes, 50 font pairings, 20 charts, 9 stacks (React, Next.js, Vue, Svelte, SwiftUI, React Native, Flutter, Tailwind, shadcn/ui). Actions: plan, build, create, design, implement, review, fix, improve, optimize, enhance, refactor, check UI/UX code. Projects: website, landing page, dashboard, admin panel, e-commerce, SaaS, portfolio, blog, mobile app, .html, .tsx, .vue, .svelte. Elements: button, modal, navbar, sidebar, card, table, form, chart. Styles: glassmorphism, claymorphism, minimalism, brutalism, neumorphism, bento grid, dark mode, responsive, skeuomorphism, flat design. Topics: color palette, accessibility, animation, layout, typography, font pairing, spacing, hover, shadow, gradient. Integrations: shadcn/ui MCP for component search and examples. | `skills/ui-ux-pro-max` |
| **upstash-qstash** | ⚪ | Upstash QStash expert for serverless message queues, scheduled jobs, and reliable HTTP-based task delivery without managing infrastructure. Use when: qstash, upstash queue, serverless cron, scheduled http, message queue serverless. | `skills/upstash-qstash` |
| **using-git-worktrees** | ⚪ | Use when starting feature work that needs isolation from current workspace or before executing implementation plans - creates isolated git worktrees with smart directory selection and safety verification | `skills/using-git-worktrees` |
| **using-superpowers** | ⚪ | Use when starting any conversation - establishes how to find and use skills, requiring Skill tool invocation before ANY response including clarifying questions | `skills/using-superpowers` |
| **vercel-deployment** | ⚪ | Expert knowledge for deploying to Vercel with Next.js Use when: vercel, deploy, deployment, hosting, production. | `skills/vercel-deployment` |
| **vercel-react-best-practices** | ⚪ | React and Next.js performance optimization guidelines from Vercel Engineering. This skill should be used when writing, reviewing, or refactoring React/Next.js code to ensure optimal performance patterns. Triggers on tasks involving React components, Next.js pages, data fetching, bundle optimization, or performance improvements. | `skills/react-best-practices` |
| **verification-before-completion** | ⚪ | Use when about to claim work is complete, fixed, or passing, before committing or creating PRs - requires running verification commands and confirming output before making any success claims; evidence before assertions always | `skills/verification-before-completion` |
| **viral-generator-builder** | ⚪ | Expert in building shareable generator tools that go viral - name generators, quiz makers, avatar creators, personality tests, and calculator tools. Covers the psychology of sharing, viral mechanics, and building tools people can't resist sharing with friends. Use when: generator tool, quiz maker, name generator, avatar creator, viral tool. | `skills/viral-generator-builder` |
| **voice-agents** | ⚪ | Voice agents represent the frontier of AI interaction - humans speaking naturally with AI systems. The challenge isn't just speech recognition and synthesis, it's achieving natural conversation flow with sub-800ms latency while handling interruptions, background noise, and emotional nuance. This skill covers two architectures: speech-to-speech (OpenAI Realtime API, lowest latency, most natural) and pipeline (STT→LLM→TTS, more control, easier to debug). Key insight: latency is the constraint. Hu | `skills/voice-agents` |
| **voice-ai-development** | ⚪ | Expert in building voice AI applications - from real-time voice agents to voice-enabled apps. Covers OpenAI Realtime API, Vapi for voice agents, Deepgram for transcription, ElevenLabs for synthesis, LiveKit for real-time infrastructure, and WebRTC fundamentals. Knows how to build low-latency, production-ready voice experiences. Use when: voice ai, voice agent, speech to text, text to speech, realtime voice. | `skills/voice-ai-development` |
| **voice-ai-engine-development** | ⚪ | Build real-time conversational AI voice engines using async worker pipelines, streaming transcription, LLM agents, and TTS synthesis with interrupt handling and multi-provider support | `skills/voice-ai-engine-development` |
| **vr-ar** | ⚪ | VR/AR development principles. Comfort, interaction, performance requirements. | `skills/game-development/vr-ar` |
| **vulnerability-scanner** | ⚪ | Advanced vulnerability analysis principles. OWASP 2025, Supply Chain Security, attack surface mapping, risk prioritization. | `skills/vulnerability-scanner` |
| **web-artifacts-builder** | ⚪ | Suite of tools for creating elaborate, multi-component claude.ai HTML artifacts using modern frontend web technologies (React, Tailwind CSS, shadcn/ui). Use for complex artifacts requiring state management, routing, or shadcn/ui components - not for simple single-file HTML/JSX artifacts. | `skills/web-artifacts-builder` |
| **web-design-guidelines** | ⚪ | Review UI code for Web Interface Guidelines compliance. Use when asked to "review my UI", "check accessibility", "audit design", "review UX", or "check my site against best practices". | `skills/web-design-guidelines` |
| **web-games** | ⚪ | Web browser game development principles. Framework selection, WebGPU, optimization, PWA. | `skills/game-development/web-games` |
| **web-performance-optimization** | ⚪ | Optimize website and web application performance including loading speed, Core Web Vitals, bundle size, caching strategies, and runtime performance | `skills/web-performance-optimization` |
| **webapp-testing** | ⚪ | Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs. | `skills/webapp-testing` |
| **Windows Privilege Escalation** | ⚪ | This skill should be used when the user asks to "escalate privileges on Windows," "find Windows privesc vectors," "enumerate Windows for privilege escalation," "exploit Windows misconfigurations," or "perform post-exploitation privilege escalation." It provides comprehensive guidance for discovering and exploiting privilege escalation vulnerabilities in Windows environments. | `skills/windows-privilege-escalation` |
| **Wireshark Network Traffic Analysis** | ⚪ | This skill should be used when the user asks to "analyze network traffic with Wireshark", "capture packets for troubleshooting", "filter PCAP files", "follow TCP/UDP streams", "detect network anomalies", "investigate suspicious traffic", or "perform protocol analysis". It provides comprehensive techniques for network packet capture, filtering, and analysis using Wireshark. | `skills/wireshark-analysis` |
| **WordPress Penetration Testing** | ⚪ | This skill should be used when the user asks to "pentest WordPress sites", "scan WordPress for vulnerabilities", "enumerate WordPress users, themes, or plugins", "exploit WordPress vulnerabilities", or "use WPScan". It provides comprehensive WordPress security assessment methodologies. | `skills/wordpress-penetration-testing` |
| **workflow-automation** | ⚪ | Workflow automation is the infrastructure that makes AI agents reliable. Without durable execution, a network hiccup during a 10-step payment flow means lost money and angry customers. With it, workflows resume exactly where they left off. This skill covers the platforms (n8n, Temporal, Inngest) and patterns (sequential, parallel, orchestrator-worker) that turn brittle scripts into production-grade automation. Key insight: The platforms make different tradeoffs. n8n optimizes for accessibility | `skills/workflow-automation` |
| **writing-plans** | ⚪ | Use when you have a spec or requirements for a multi-step task, before touching code | `skills/writing-plans` |
| **writing-skills** | ⚪ | Use when creating new skills, editing existing skills, or verifying skills work before deployment | `skills/writing-skills` |
| **xlsx** | ⚪ | Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas | `skills/xlsx-official` |
| **zapier-make-patterns** | ⚪ | No-code automation democratizes workflow building. Zapier and Make (formerly Integromat) let non-developers automate business processes without writing code. But no-code doesn't mean no-complexity - these platforms have their own patterns, pitfalls, and breaking points. This skill covers when to use which platform, how to build reliable automations, and when to graduate to code-based solutions. Key insight: Zapier optimizes for simplicity and integrations (7000+ apps), Make optimizes for power | `skills/zapier-make-patterns` |
## Installation
@@ -526,6 +521,13 @@ We officially thank the following contributors for their help in making this rep
- [zebbern](https://github.com/zebbern)
- [vuth-dogo](https://github.com/vuth-dogo)
## 👥 Repo Contributors
Thanks to the people who have contributed to this project:
- [@sickn33](https://github.com/sickn33) - Voice AI Engine (PR #33)
- [@community](https://github.com/community) - Categorization Initiative (PR #32)
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=sickn33/antigravity-awesome-skills&type=date&legend=top-left)](https://www.star-history.com/#sickn33/antigravity-awesome-skills&type=date&legend=top-left)

View File

@@ -0,0 +1,175 @@
# Voice AI Engine Development Skill
Build production-ready real-time conversational AI voice engines with async worker pipelines, streaming transcription, LLM agents, and TTS synthesis.
## Overview
This skill provides comprehensive guidance for building voice AI engines that enable natural, bidirectional conversations between users and AI agents. It covers the complete architecture from audio input to audio output, including:
- **Async Worker Pipeline Pattern** - Concurrent processing with queue-based communication
- **Streaming Transcription** - Real-time speech-to-text conversion
- **LLM-Powered Agents** - Conversational AI with context awareness
- **Text-to-Speech Synthesis** - Natural voice generation
- **Interrupt Handling** - Users can interrupt the bot mid-sentence
- **Multi-Provider Support** - Swap between different service providers easily
## Quick Start
```python
# Use the skill in your AI assistant
@voice-ai-engine-development I need to build a voice assistant that can handle real-time conversations with interrupts
```
## What's Included
### Main Skill File
- `SKILL.md` - Comprehensive guide to voice AI engine development
### Examples
- `complete_voice_engine.py` - Full working implementation
- `gemini_agent_example.py` - LLM agent with proper response buffering
- `interrupt_system_example.py` - Interrupt handling demonstration
### Templates
- `base_worker_template.py` - Template for creating new workers
- `multi_provider_factory_template.py` - Multi-provider factory pattern
### References
- `common_pitfalls.md` - Common issues and solutions
- `provider_comparison.md` - Comparison of transcription, LLM, and TTS providers
## Key Concepts
### The Worker Pipeline Pattern
Every voice AI engine follows this pipeline:
```
Audio In → Transcriber → Agent → Synthesizer → Audio Out
(Worker 1) (Worker 2) (Worker 3)
```
Each worker:
- Runs independently via asyncio
- Communicates through asyncio.Queue objects
- Can be stopped mid-stream for interrupts
- Handles errors gracefully
### Critical Implementation Details
1. **Buffer LLM Responses** - Always buffer the entire LLM response before sending to synthesizer to prevent audio jumping
2. **Mute Transcriber** - Mute the transcriber when bot speaks to prevent echo/feedback loops
3. **Rate-Limit Audio** - Send audio chunks at real-time speed to enable interrupts
4. **Proper Cleanup** - Always cleanup resources in finally blocks to prevent memory leaks
## Supported Providers
### Transcription
- Deepgram (fastest, best for real-time)
- AssemblyAI (highest accuracy)
- Azure Speech (enterprise-grade)
- Google Cloud Speech (multi-language)
### LLM
- OpenAI GPT-4 (highest quality)
- Google Gemini (cost-effective)
- Anthropic Claude (safety-focused)
### TTS
- ElevenLabs (most natural voices)
- Azure TTS (enterprise-grade)
- Google Cloud TTS (cost-effective)
- Amazon Polly (AWS integration)
- Play.ht (voice cloning)
## Common Use Cases
- Customer service voice bots
- Voice assistants
- Phone automation systems
- Voice-enabled applications
- Interactive voice response (IVR) systems
- Voice-based tutoring systems
## Architecture Highlights
### Async Worker Pattern
```python
class BaseWorker:
async def _run_loop(self):
while self.active:
item = await self.input_queue.get()
await self.process(item)
```
### Interrupt System
```python
# User interrupts bot mid-sentence
if stop_event.is_set():
partial_message = get_message_up_to(seconds_spoken)
return partial_message, True # cut_off = True
```
### Multi-Provider Factory
```python
factory = VoiceComponentFactory()
transcriber = factory.create_transcriber(config) # Deepgram, AssemblyAI, etc.
agent = factory.create_agent(config) # OpenAI, Gemini, etc.
synthesizer = factory.create_synthesizer(config) # ElevenLabs, Azure, etc.
```
## Testing
The skill includes examples for:
- Unit testing workers in isolation
- Integration testing the full pipeline
- Testing interrupt functionality
- Testing with different providers
## Best Practices
1. ✅ Always stream at every stage (transcription, LLM, synthesis)
2. ✅ Buffer entire LLM responses before synthesis
3. ✅ Mute transcriber during bot speech
4. ✅ Rate-limit audio chunks for interrupts
5. ✅ Maintain conversation history for context
6. ✅ Use proper error handling in worker loops
7. ✅ Cleanup resources in finally blocks
8. ✅ Use LINEAR16 PCM at 16kHz for audio
## Common Pitfalls
See `references/common_pitfalls.md` for detailed solutions to:
- Audio jumping/cutting off
- Echo/feedback loops
- Interrupts not working
- Memory leaks
- Lost conversation context
- High latency
- Poor audio quality
## Contributing
This skill is part of the Antigravity Awesome Skills repository. Contributions are welcome!
## Related Skills
- `@websocket-patterns` - WebSocket implementation
- `@async-python` - Asyncio patterns
- `@streaming-apis` - Streaming API integration
- `@audio-processing` - Audio format conversion
## License
MIT License - See repository LICENSE file
## Resources
- [Vocode Documentation](https://docs.vocode.dev/)
- [Deepgram API](https://developers.deepgram.com/)
- [OpenAI API](https://platform.openai.com/docs/)
- [ElevenLabs API](https://elevenlabs.io/docs/)
---
**Built with ❤️ for the Antigravity community**

View File

@@ -0,0 +1,721 @@
---
name: voice-ai-engine-development
description: "Build real-time conversational AI voice engines using async worker pipelines, streaming transcription, LLM agents, and TTS synthesis with interrupt handling and multi-provider support"
---
# Voice AI Engine Development
## Overview
This skill guides you through building production-ready voice AI engines with real-time conversation capabilities. Voice AI engines enable natural, bidirectional conversations between users and AI agents through streaming audio processing, speech-to-text transcription, LLM-powered responses, and text-to-speech synthesis.
The core architecture uses an async queue-based worker pipeline where each component runs independently and communicates via `asyncio.Queue` objects, enabling concurrent processing, interrupt handling, and real-time streaming at every stage.
## When to Use This Skill
Use this skill when:
- Building real-time voice conversation systems
- Implementing voice assistants or chatbots
- Creating voice-enabled customer service agents
- Developing voice AI applications with interrupt capabilities
- Integrating multiple transcription, LLM, or TTS providers
- Working with streaming audio processing pipelines
- The user mentions Vocode, voice engines, or conversational AI
## Core Architecture Principles
### The Worker Pipeline Pattern
Every voice AI engine follows this pipeline:
```
Audio In → Transcriber → Agent → Synthesizer → Audio Out
(Worker 1) (Worker 2) (Worker 3)
```
**Key Benefits:**
- **Decoupling**: Workers only know about their input/output queues
- **Concurrency**: All workers run simultaneously via asyncio
- **Backpressure**: Queues automatically handle rate differences
- **Interruptibility**: Everything can be stopped mid-stream
### Base Worker Pattern
Every worker follows this pattern:
```python
class BaseWorker:
def __init__(self, input_queue, output_queue):
self.input_queue = input_queue # asyncio.Queue to consume from
self.output_queue = output_queue # asyncio.Queue to produce to
self.active = False
def start(self):
"""Start the worker's processing loop"""
self.active = True
asyncio.create_task(self._run_loop())
async def _run_loop(self):
"""Main processing loop - runs forever until terminated"""
while self.active:
item = await self.input_queue.get() # Block until item arrives
await self.process(item) # Process the item
async def process(self, item):
"""Override this - does the actual work"""
raise NotImplementedError
def terminate(self):
"""Stop the worker"""
self.active = False
```
## Component Implementation Guide
### 1. Transcriber (Audio → Text)
**Purpose**: Converts incoming audio chunks to text transcriptions
**Interface Requirements**:
```python
class BaseTranscriber:
def __init__(self, transcriber_config):
self.input_queue = asyncio.Queue() # Audio chunks (bytes)
self.output_queue = asyncio.Queue() # Transcriptions
self.is_muted = False
def send_audio(self, chunk: bytes):
"""Client calls this to send audio"""
if not self.is_muted:
self.input_queue.put_nowait(chunk)
else:
# Send silence instead (prevents echo during bot speech)
self.input_queue.put_nowait(self.create_silent_chunk(len(chunk)))
def mute(self):
"""Called when bot starts speaking (prevents echo)"""
self.is_muted = True
def unmute(self):
"""Called when bot stops speaking"""
self.is_muted = False
```
**Output Format**:
```python
class Transcription:
message: str # "Hello, how are you?"
confidence: float # 0.95
is_final: bool # True = complete sentence, False = partial
is_interrupt: bool # Set by TranscriptionsWorker
```
**Supported Providers**:
- **Deepgram** - Fast, accurate, streaming
- **AssemblyAI** - High accuracy, good for accents
- **Azure Speech** - Enterprise-grade
- **Google Cloud Speech** - Multi-language support
**Critical Implementation Details**:
- Use WebSocket for bidirectional streaming
- Run sender and receiver tasks concurrently with `asyncio.gather()`
- Mute transcriber when bot speaks to prevent echo/feedback loops
- Handle both final and partial transcriptions
### 2. Agent (Text → Response)
**Purpose**: Processes user input and generates conversational responses
**Interface Requirements**:
```python
class BaseAgent:
def __init__(self, agent_config):
self.input_queue = asyncio.Queue() # TranscriptionAgentInput
self.output_queue = asyncio.Queue() # AgentResponse
self.transcript = None # Conversation history
async def generate_response(self, human_input, is_interrupt, conversation_id):
"""Override this - returns AsyncGenerator of responses"""
raise NotImplementedError
```
**Why Streaming Responses?**
- **Lower latency**: Start speaking as soon as first sentence is ready
- **Better interrupts**: Can stop mid-response
- **Sentence-by-sentence**: More natural conversation flow
**Supported Providers**:
- **OpenAI** (GPT-4, GPT-3.5) - High quality, fast
- **Google Gemini** - Multimodal, cost-effective
- **Anthropic Claude** - Long context, nuanced responses
**Critical Implementation Details**:
- Maintain conversation history in `Transcript` object
- Stream responses using `AsyncGenerator`
- **IMPORTANT**: Buffer entire LLM response before yielding to synthesizer (prevents audio jumping)
- Handle interrupts by canceling current generation task
- Update conversation history with partial messages on interrupt
### 3. Synthesizer (Text → Audio)
**Purpose**: Converts agent text responses to speech audio
**Interface Requirements**:
```python
class BaseSynthesizer:
async def create_speech(self, message: BaseMessage, chunk_size: int) -> SynthesisResult:
"""
Returns a SynthesisResult containing:
- chunk_generator: AsyncGenerator that yields audio chunks
- get_message_up_to: Function to get partial text (for interrupts)
"""
raise NotImplementedError
```
**SynthesisResult Structure**:
```python
class SynthesisResult:
chunk_generator: AsyncGenerator[ChunkResult, None]
get_message_up_to: Callable[[float], str] # seconds → partial text
class ChunkResult:
chunk: bytes # Raw PCM audio
is_last_chunk: bool
```
**Supported Providers**:
- **ElevenLabs** - Most natural voices, streaming
- **Azure TTS** - Enterprise-grade, many languages
- **Google Cloud TTS** - Cost-effective, good quality
- **Amazon Polly** - AWS integration
- **Play.ht** - Voice cloning
**Critical Implementation Details**:
- Stream audio chunks as they're generated
- Convert audio to LINEAR16 PCM format (16kHz sample rate)
- Implement `get_message_up_to()` for interrupt handling
- Handle audio format conversion (MP3 → PCM)
### 4. Output Device (Audio → Client)
**Purpose**: Sends synthesized audio back to the client
**CRITICAL: Rate Limiting for Interrupts**
```python
async def send_speech_to_output(self, message, synthesis_result,
stop_event, seconds_per_chunk):
chunk_idx = 0
async for chunk_result in synthesis_result.chunk_generator:
# Check for interrupt
if stop_event.is_set():
logger.debug(f"Interrupted after {chunk_idx} chunks")
message_sent = synthesis_result.get_message_up_to(
chunk_idx * seconds_per_chunk
)
return message_sent, True # cut_off = True
start_time = time.time()
# Send chunk to output device
self.output_device.consume_nonblocking(chunk_result.chunk)
# CRITICAL: Wait for chunk to play before sending next one
# This is what makes interrupts work!
speech_length = seconds_per_chunk
processing_time = time.time() - start_time
await asyncio.sleep(max(speech_length - processing_time, 0))
chunk_idx += 1
return message, False # cut_off = False
```
**Why Rate Limiting?**
Without rate limiting, all audio chunks would be sent immediately, which would:
- Buffer entire message on client side
- Make interrupts impossible (all audio already sent)
- Cause timing issues
By sending one chunk every N seconds:
- Real-time playback is maintained
- Interrupts can stop mid-sentence
- Natural conversation flow is preserved
## The Interrupt System
The interrupt system is critical for natural conversations.
### How Interrupts Work
**Scenario**: Bot is saying "I think the weather will be nice today and tomorrow and—" when user interrupts with "Stop".
**Step 1: User starts speaking**
```python
# TranscriptionsWorker detects new transcription while bot speaking
async def process(self, transcription):
if not self.conversation.is_human_speaking: # Bot was speaking!
# Broadcast interrupt to all in-flight events
interrupted = self.conversation.broadcast_interrupt()
transcription.is_interrupt = interrupted
```
**Step 2: broadcast_interrupt() stops everything**
```python
def broadcast_interrupt(self):
num_interrupts = 0
# Interrupt all queued events
while True:
try:
interruptible_event = self.interruptible_events.get_nowait()
if interruptible_event.interrupt(): # Sets interruption_event
num_interrupts += 1
except queue.Empty:
break
# Cancel current tasks
self.agent.cancel_current_task() # Stop generating text
self.agent_responses_worker.cancel_current_task() # Stop synthesizing
return num_interrupts > 0
```
**Step 3: SynthesisResultsWorker detects interrupt**
```python
async def send_speech_to_output(self, synthesis_result, stop_event, ...):
async for chunk_result in synthesis_result.chunk_generator:
# Check stop_event (this is the interruption_event)
if stop_event.is_set():
logger.debug("Interrupted! Stopping speech.")
# Calculate what was actually spoken
seconds_spoken = chunk_idx * seconds_per_chunk
partial_message = synthesis_result.get_message_up_to(seconds_spoken)
# e.g., "I think the weather will be nice today"
return partial_message, True # cut_off = True
```
**Step 4: Agent updates history**
```python
if cut_off:
# Update conversation history with partial message
self.agent.update_last_bot_message_on_cut_off(message_sent)
# History now shows:
# Bot: "I think the weather will be nice today" (incomplete)
```
### InterruptibleEvent Pattern
Every event in the pipeline is wrapped in an `InterruptibleEvent`:
```python
class InterruptibleEvent:
def __init__(self, payload, is_interruptible=True):
self.payload = payload
self.is_interruptible = is_interruptible
self.interruption_event = threading.Event() # Initially not set
self.interrupted = False
def interrupt(self) -> bool:
"""Interrupt this event"""
if not self.is_interruptible:
return False
if not self.interrupted:
self.interruption_event.set() # Signal to stop!
self.interrupted = True
return True
return False
def is_interrupted(self) -> bool:
return self.interruption_event.is_set()
```
## Multi-Provider Factory Pattern
Support multiple providers with a factory pattern:
```python
class VoiceHandler:
"""Multi-provider factory for voice components"""
def create_transcriber(self, agent_config: Dict):
"""Create transcriber based on transcriberProvider"""
provider = agent_config.get("transcriberProvider", "deepgram")
if provider == "deepgram":
return self._create_deepgram_transcriber(agent_config)
elif provider == "assemblyai":
return self._create_assemblyai_transcriber(agent_config)
elif provider == "azure":
return self._create_azure_transcriber(agent_config)
elif provider == "google":
return self._create_google_transcriber(agent_config)
else:
raise ValueError(f"Unknown transcriber provider: {provider}")
def create_agent(self, agent_config: Dict):
"""Create LLM agent based on llmProvider"""
provider = agent_config.get("llmProvider", "openai")
if provider == "openai":
return self._create_openai_agent(agent_config)
elif provider == "gemini":
return self._create_gemini_agent(agent_config)
else:
raise ValueError(f"Unknown LLM provider: {provider}")
def create_synthesizer(self, agent_config: Dict):
"""Create voice synthesizer based on voiceProvider"""
provider = agent_config.get("voiceProvider", "elevenlabs")
if provider == "elevenlabs":
return self._create_elevenlabs_synthesizer(agent_config)
elif provider == "azure":
return self._create_azure_synthesizer(agent_config)
elif provider == "google":
return self._create_google_synthesizer(agent_config)
elif provider == "polly":
return self._create_polly_synthesizer(agent_config)
elif provider == "playht":
return self._create_playht_synthesizer(agent_config)
else:
raise ValueError(f"Unknown voice provider: {provider}")
```
## WebSocket Integration
Voice AI engines typically use WebSocket for bidirectional audio streaming:
```python
@app.websocket("/conversation")
async def websocket_endpoint(websocket: WebSocket):
await websocket.accept()
# Create voice components
voice_handler = VoiceHandler()
transcriber = voice_handler.create_transcriber(agent_config)
agent = voice_handler.create_agent(agent_config)
synthesizer = voice_handler.create_synthesizer(agent_config)
# Create output device
output_device = WebsocketOutputDevice(
ws=websocket,
sampling_rate=16000,
audio_encoding=AudioEncoding.LINEAR16
)
# Create conversation orchestrator
conversation = StreamingConversation(
output_device=output_device,
transcriber=transcriber,
agent=agent,
synthesizer=synthesizer
)
# Start all workers
await conversation.start()
try:
# Receive audio from client
async for message in websocket.iter_bytes():
conversation.receive_audio(message)
except WebSocketDisconnect:
logger.info("Client disconnected")
finally:
await conversation.terminate()
```
## Common Pitfalls and Solutions
### 1. Audio Jumping/Cutting Off
**Problem**: Bot's audio jumps or cuts off mid-response.
**Cause**: Sending text to synthesizer in small chunks causes multiple TTS calls.
**Solution**: Buffer the entire LLM response before sending to synthesizer:
```python
# ❌ Bad: Yields sentence-by-sentence
async for sentence in llm_stream:
yield GeneratedResponse(message=BaseMessage(text=sentence))
# ✅ Good: Buffer entire response
full_response = ""
async for chunk in llm_stream:
full_response += chunk
yield GeneratedResponse(message=BaseMessage(text=full_response))
```
### 2. Echo/Feedback Loop
**Problem**: Bot hears itself speaking and responds to its own audio.
**Cause**: Transcriber not muted during bot speech.
**Solution**: Mute transcriber when bot starts speaking:
```python
# Before sending audio to output
self.transcriber.mute()
# After audio playback complete
self.transcriber.unmute()
```
### 3. Interrupts Not Working
**Problem**: User can't interrupt bot mid-sentence.
**Cause**: All audio chunks sent at once instead of rate-limited.
**Solution**: Rate-limit audio chunks to match real-time playback:
```python
async for chunk in synthesis_result.chunk_generator:
start_time = time.time()
# Send chunk
output_device.consume_nonblocking(chunk)
# Wait for chunk duration before sending next
processing_time = time.time() - start_time
await asyncio.sleep(max(seconds_per_chunk - processing_time, 0))
```
### 4. Memory Leaks from Unclosed Streams
**Problem**: Memory usage grows over time.
**Cause**: WebSocket connections or API streams not properly closed.
**Solution**: Always use context managers and cleanup:
```python
try:
async with websockets.connect(url) as ws:
# Use websocket
pass
finally:
# Cleanup
await conversation.terminate()
await transcriber.terminate()
```
## Production Considerations
### 1. Error Handling
```python
async def _run_loop(self):
while self.active:
try:
item = await self.input_queue.get()
await self.process(item)
except Exception as e:
logger.error(f"Worker error: {e}", exc_info=True)
# Don't crash the worker, continue processing
```
### 2. Graceful Shutdown
```python
async def terminate(self):
"""Gracefully shut down all workers"""
self.active = False
# Stop all workers
self.transcriber.terminate()
self.agent.terminate()
self.synthesizer.terminate()
# Wait for queues to drain
await asyncio.sleep(0.5)
# Close connections
if self.websocket:
await self.websocket.close()
```
### 3. Monitoring and Logging
```python
# Log key events
logger.info(f"🎤 [TRANSCRIBER] Received: '{transcription.message}'")
logger.info(f"🤖 [AGENT] Generating response...")
logger.info(f"🔊 [SYNTHESIZER] Synthesizing {len(text)} characters")
logger.info(f"⚠️ [INTERRUPT] User interrupted bot")
# Track metrics
metrics.increment("transcriptions.count")
metrics.timing("agent.response_time", duration)
metrics.gauge("active_conversations", count)
```
### 4. Rate Limiting and Quotas
```python
# Implement rate limiting for API calls
from aiolimiter import AsyncLimiter
rate_limiter = AsyncLimiter(max_rate=10, time_period=1) # 10 calls/second
async def call_api(self, data):
async with rate_limiter:
return await self.client.post(data)
```
## Key Design Patterns
### 1. Producer-Consumer with Queues
```python
# Producer
async def producer(queue):
while True:
item = await generate_item()
queue.put_nowait(item)
# Consumer
async def consumer(queue):
while True:
item = await queue.get()
await process_item(item)
```
### 2. Streaming Generators
Instead of returning complete results:
```python
# ❌ Bad: Wait for entire response
async def generate_response(prompt):
response = await openai.complete(prompt) # 5 seconds
return response
# ✅ Good: Stream chunks as they arrive
async def generate_response(prompt):
async for chunk in openai.complete(prompt, stream=True):
yield chunk # Yield after 0.1s, 0.2s, etc.
```
### 3. Conversation State Management
Maintain conversation history for context:
```python
class Transcript:
event_logs: List[Message] = []
def add_human_message(self, text):
self.event_logs.append(Message(sender=Sender.HUMAN, text=text))
def add_bot_message(self, text):
self.event_logs.append(Message(sender=Sender.BOT, text=text))
def to_openai_messages(self):
return [
{"role": "user" if msg.sender == Sender.HUMAN else "assistant",
"content": msg.text}
for msg in self.event_logs
]
```
## Testing Strategies
### 1. Unit Test Workers in Isolation
```python
async def test_transcriber():
transcriber = DeepgramTranscriber(config)
# Mock audio input
audio_chunk = b'\x00\x01\x02...'
transcriber.send_audio(audio_chunk)
# Check output
transcription = await transcriber.output_queue.get()
assert transcription.message == "expected text"
```
### 2. Integration Test Pipeline
```python
async def test_full_pipeline():
# Create all components
conversation = create_test_conversation()
# Send test audio
conversation.receive_audio(test_audio_chunk)
# Wait for response
response = await wait_for_audio_output(timeout=5)
assert response is not None
```
### 3. Test Interrupts
```python
async def test_interrupt():
conversation = create_test_conversation()
# Start bot speaking
await conversation.agent.generate_response("Tell me a long story")
# Interrupt mid-response
await asyncio.sleep(1) # Let it speak for 1 second
conversation.broadcast_interrupt()
# Verify partial message in transcript
last_message = conversation.transcript.event_logs[-1]
assert last_message.text != full_expected_message
```
## Implementation Workflow
When implementing a voice AI engine:
1. **Start with Base Workers**: Implement the base worker pattern first
2. **Add Transcriber**: Choose a provider and implement streaming transcription
3. **Add Agent**: Implement LLM integration with streaming responses
4. **Add Synthesizer**: Implement TTS with audio streaming
5. **Connect Pipeline**: Wire all workers together with queues
6. **Add Interrupts**: Implement the interrupt system
7. **Add WebSocket**: Create WebSocket endpoint for client communication
8. **Test Components**: Unit test each worker in isolation
9. **Test Integration**: Test the full pipeline end-to-end
10. **Add Error Handling**: Implement robust error handling and logging
11. **Optimize**: Add rate limiting, monitoring, and performance optimizations
## Related Skills
- `@websocket-patterns` - For WebSocket implementation details
- `@async-python` - For asyncio and async patterns
- `@streaming-apis` - For streaming API integration
- `@audio-processing` - For audio format conversion and processing
- `@systematic-debugging` - For debugging complex async pipelines
## Resources
**Libraries**:
- `asyncio` - Async programming
- `websockets` - WebSocket client/server
- `FastAPI` - WebSocket server framework
- `pydub` - Audio manipulation
- `numpy` - Audio data processing
**API Providers**:
- Transcription: Deepgram, AssemblyAI, Azure Speech, Google Cloud Speech
- LLM: OpenAI, Google Gemini, Anthropic Claude
- TTS: ElevenLabs, Azure TTS, Google Cloud TTS, Amazon Polly, Play.ht
## Summary
Building a voice AI engine requires:
- ✅ Async worker pipeline for concurrent processing
- ✅ Queue-based communication between components
- ✅ Streaming at every stage (transcription, LLM, synthesis)
- ✅ Interrupt system for natural conversations
- ✅ Rate limiting for real-time audio playback
- ✅ Multi-provider support for flexibility
- ✅ Proper error handling and graceful shutdown
**The key insight**: Everything must stream and everything must be interruptible for natural, real-time conversations.

View File

@@ -0,0 +1,423 @@
"""
Example: Complete Voice AI Engine Implementation
This example demonstrates a minimal but complete voice AI engine
with all core components: Transcriber, Agent, Synthesizer, and WebSocket integration.
"""
import asyncio
from typing import Dict, AsyncGenerator
from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from dataclasses import dataclass
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
app = FastAPI()
# ============================================================================
# Data Models
# ============================================================================
@dataclass
class Transcription:
message: str
confidence: float
is_final: bool
is_interrupt: bool = False
@dataclass
class AgentResponse:
message: str
is_interruptible: bool = True
@dataclass
class SynthesisResult:
chunk_generator: AsyncGenerator[bytes, None]
get_message_up_to: callable
# ============================================================================
# Base Worker Pattern
# ============================================================================
class BaseWorker:
"""Base class for all workers in the pipeline"""
def __init__(self, input_queue: asyncio.Queue, output_queue: asyncio.Queue):
self.input_queue = input_queue
self.output_queue = output_queue
self.active = False
self._task = None
def start(self):
"""Start the worker's processing loop"""
self.active = True
self._task = asyncio.create_task(self._run_loop())
async def _run_loop(self):
"""Main processing loop - runs forever until terminated"""
while self.active:
try:
item = await self.input_queue.get()
await self.process(item)
except Exception as e:
logger.error(f"Worker error: {e}", exc_info=True)
async def process(self, item):
"""Override this - does the actual work"""
raise NotImplementedError
def terminate(self):
"""Stop the worker"""
self.active = False
if self._task:
self._task.cancel()
# ============================================================================
# Transcriber Component
# ============================================================================
class DeepgramTranscriber(BaseWorker):
"""Converts audio chunks to text transcriptions using Deepgram"""
def __init__(self, config: Dict):
super().__init__(asyncio.Queue(), asyncio.Queue())
self.config = config
self.is_muted = False
def send_audio(self, chunk: bytes):
"""Client calls this to send audio"""
if not self.is_muted:
self.input_queue.put_nowait(chunk)
else:
# Send silence instead (prevents echo during bot speech)
self.input_queue.put_nowait(self.create_silent_chunk(len(chunk)))
def create_silent_chunk(self, size: int) -> bytes:
"""Create a silent audio chunk"""
return b'\x00' * size
def mute(self):
"""Called when bot starts speaking (prevents echo)"""
self.is_muted = True
logger.info("🔇 [TRANSCRIBER] Muted")
def unmute(self):
"""Called when bot stops speaking"""
self.is_muted = False
logger.info("🔊 [TRANSCRIBER] Unmuted")
async def process(self, audio_chunk: bytes):
"""Process audio chunk and generate transcription"""
# In a real implementation, this would call Deepgram API
# For this example, we'll simulate a transcription
# Simulate API call delay
await asyncio.sleep(0.1)
# Mock transcription
transcription = Transcription(
message="Hello, how can I help you?",
confidence=0.95,
is_final=True
)
logger.info(f"🎤 [TRANSCRIBER] Received: '{transcription.message}'")
self.output_queue.put_nowait(transcription)
# ============================================================================
# Agent Component
# ============================================================================
class GeminiAgent(BaseWorker):
"""LLM-powered conversational agent using Google Gemini"""
def __init__(self, config: Dict):
super().__init__(asyncio.Queue(), asyncio.Queue())
self.config = config
self.conversation_history = []
async def process(self, transcription: Transcription):
"""Process transcription and generate response"""
# Add user message to history
self.conversation_history.append({
"role": "user",
"content": transcription.message
})
logger.info(f"🤖 [AGENT] Generating response for: '{transcription.message}'")
# Generate response (streaming)
async for response in self.generate_response(transcription.message):
self.output_queue.put_nowait(response)
async def generate_response(self, user_input: str) -> AsyncGenerator[AgentResponse, None]:
"""Generate streaming response from LLM"""
# In a real implementation, this would call Gemini API
# For this example, we'll simulate a streaming response
# Simulate streaming delay
await asyncio.sleep(0.5)
# IMPORTANT: Buffer entire response before yielding
# This prevents audio jumping/cutting off
full_response = f"I understand you said: {user_input}. How can I assist you further?"
# Add to conversation history
self.conversation_history.append({
"role": "assistant",
"content": full_response
})
logger.info(f"🤖 [AGENT] Generated: '{full_response}'")
# Yield complete response
yield AgentResponse(
message=full_response,
is_interruptible=True
)
# ============================================================================
# Synthesizer Component
# ============================================================================
class ElevenLabsSynthesizer:
"""Converts text to speech using ElevenLabs"""
def __init__(self, config: Dict):
self.config = config
async def create_speech(self, message: str, chunk_size: int = 1024) -> SynthesisResult:
"""
Generate speech audio from text
Returns SynthesisResult with:
- chunk_generator: AsyncGenerator yielding audio chunks
- get_message_up_to: Function to get partial text for interrupts
"""
# In a real implementation, this would call ElevenLabs API
# For this example, we'll simulate audio generation
logger.info(f"🔊 [SYNTHESIZER] Synthesizing {len(message)} characters")
async def chunk_generator():
# Simulate streaming audio chunks
num_chunks = len(message) // 10 + 1
for i in range(num_chunks):
# Simulate API delay
await asyncio.sleep(0.1)
# Mock audio chunk (in reality, this would be PCM audio)
chunk = b'\x00' * chunk_size
yield chunk
def get_message_up_to(seconds: float) -> str:
"""Calculate partial message based on playback time"""
# Estimate: ~150 words per minute = ~2.5 words per second
# Rough estimate: 5 characters per word
chars_per_second = 12.5
char_index = int(seconds * chars_per_second)
return message[:char_index]
return SynthesisResult(
chunk_generator=chunk_generator(),
get_message_up_to=get_message_up_to
)
# ============================================================================
# Output Device
# ============================================================================
class WebsocketOutputDevice:
"""Sends audio chunks to client via WebSocket"""
def __init__(self, websocket: WebSocket):
self.websocket = websocket
async def consume_nonblocking(self, chunk: bytes):
"""Send audio chunk to client"""
await self.websocket.send_bytes(chunk)
# ============================================================================
# Conversation Orchestrator
# ============================================================================
class StreamingConversation:
"""Orchestrates the entire voice conversation pipeline"""
def __init__(
self,
output_device: WebsocketOutputDevice,
transcriber: DeepgramTranscriber,
agent: GeminiAgent,
synthesizer: ElevenLabsSynthesizer
):
self.output_device = output_device
self.transcriber = transcriber
self.agent = agent
self.synthesizer = synthesizer
self.is_human_speaking = True
self.interrupt_event = asyncio.Event()
async def start(self):
"""Start all workers"""
logger.info("🚀 [CONVERSATION] Starting...")
# Start workers
self.transcriber.start()
self.agent.start()
# Start processing pipelines
asyncio.create_task(self._process_transcriptions())
asyncio.create_task(self._process_agent_responses())
async def _process_transcriptions(self):
"""Process transcriptions from transcriber"""
while True:
transcription = await self.transcriber.output_queue.get()
# Check if this is an interrupt
if not self.is_human_speaking:
logger.info("⚠️ [INTERRUPT] User interrupted bot")
self.interrupt_event.set()
transcription.is_interrupt = True
self.is_human_speaking = True
# Send to agent
await self.agent.input_queue.put(transcription)
async def _process_agent_responses(self):
"""Process responses from agent and synthesize"""
while True:
response = await self.agent.output_queue.get()
self.is_human_speaking = False
# Mute transcriber to prevent echo
self.transcriber.mute()
# Synthesize and play
synthesis_result = await self.synthesizer.create_speech(response.message)
await self._send_speech_to_output(synthesis_result, seconds_per_chunk=0.1)
# Unmute transcriber
self.transcriber.unmute()
self.is_human_speaking = True
async def _send_speech_to_output(self, synthesis_result: SynthesisResult, seconds_per_chunk: float):
"""
Send synthesized audio to output with rate limiting
CRITICAL: Rate limiting enables interrupts to work
"""
chunk_idx = 0
async for chunk in synthesis_result.chunk_generator:
# Check for interrupt
if self.interrupt_event.is_set():
logger.info(f"🛑 [INTERRUPT] Stopped after {chunk_idx} chunks")
# Calculate what was actually spoken
seconds_spoken = chunk_idx * seconds_per_chunk
partial_message = synthesis_result.get_message_up_to(seconds_spoken)
logger.info(f"📝 [INTERRUPT] Partial message: '{partial_message}'")
# Clear interrupt event
self.interrupt_event.clear()
return
start_time = asyncio.get_event_loop().time()
# Send chunk to output device
await self.output_device.consume_nonblocking(chunk)
# CRITICAL: Wait for chunk to play before sending next one
# This is what makes interrupts work!
processing_time = asyncio.get_event_loop().time() - start_time
await asyncio.sleep(max(seconds_per_chunk - processing_time, 0))
chunk_idx += 1
def receive_audio(self, audio_chunk: bytes):
"""Receive audio from client"""
self.transcriber.send_audio(audio_chunk)
async def terminate(self):
"""Gracefully shut down all workers"""
logger.info("🛑 [CONVERSATION] Terminating...")
self.transcriber.terminate()
self.agent.terminate()
# Wait for queues to drain
await asyncio.sleep(0.5)
# ============================================================================
# WebSocket Endpoint
# ============================================================================
@app.websocket("/conversation")
async def conversation_endpoint(websocket: WebSocket):
"""WebSocket endpoint for voice conversations"""
await websocket.accept()
logger.info("✅ [WEBSOCKET] Client connected")
# Configuration
config = {
"transcriberProvider": "deepgram",
"llmProvider": "gemini",
"voiceProvider": "elevenlabs",
"prompt": "You are a helpful AI assistant.",
}
# Create components
transcriber = DeepgramTranscriber(config)
agent = GeminiAgent(config)
synthesizer = ElevenLabsSynthesizer(config)
output_device = WebsocketOutputDevice(websocket)
# Create conversation
conversation = StreamingConversation(
output_device=output_device,
transcriber=transcriber,
agent=agent,
synthesizer=synthesizer
)
# Start conversation
await conversation.start()
try:
# Process incoming audio
async for message in websocket.iter_bytes():
conversation.receive_audio(message)
except WebSocketDisconnect:
logger.info("❌ [WEBSOCKET] Client disconnected")
except Exception as e:
logger.error(f"❌ [WEBSOCKET] Error: {e}", exc_info=True)
finally:
await conversation.terminate()
# ============================================================================
# Main Entry Point
# ============================================================================
if __name__ == "__main__":
import uvicorn
logger.info("🚀 Starting Voice AI Engine...")
uvicorn.run(app, host="0.0.0.0", port=8000)

View File

@@ -0,0 +1,239 @@
"""
Example: Gemini Agent Implementation with Streaming
This example shows how to implement a Gemini-powered agent
that properly buffers responses to prevent audio jumping.
"""
import asyncio
from typing import AsyncGenerator, List, Dict
from dataclasses import dataclass
import logging
logger = logging.getLogger(__name__)
@dataclass
class Message:
role: str # "user" or "assistant"
content: str
@dataclass
class GeneratedResponse:
message: str
is_interruptible: bool = True
class GeminiAgent:
"""
LLM-powered conversational agent using Google Gemini
Key Features:
- Maintains conversation history
- Streams responses from Gemini API
- Buffers entire response before yielding (prevents audio jumping)
- Handles interrupts gracefully
"""
def __init__(self, config: Dict):
self.config = config
self.conversation_history: List[Message] = []
self.system_prompt = config.get("prompt", "You are a helpful AI assistant.")
self.current_task = None
async def generate_response(
self,
user_input: str,
is_interrupt: bool = False
) -> AsyncGenerator[GeneratedResponse, None]:
"""
Generate streaming response from Gemini
IMPORTANT: This buffers the entire LLM response before yielding
to prevent audio jumping/cutting off.
Args:
user_input: The user's message
is_interrupt: Whether this is an interrupt
Yields:
GeneratedResponse with complete buffered message
"""
# Add user message to history
self.conversation_history.append(
Message(role="user", content=user_input)
)
logger.info(f"🤖 [AGENT] Generating response for: '{user_input}'")
# Build conversation context for Gemini
contents = self._build_gemini_contents()
# Stream response from Gemini and buffer it
full_response = ""
try:
# In a real implementation, this would call Gemini API
# async for chunk in self._create_gemini_stream(contents):
# if isinstance(chunk, str):
# full_response += chunk
# For this example, simulate streaming
async for chunk in self._simulate_gemini_stream(user_input):
full_response += chunk
# Log progress (optional)
if len(full_response) % 50 == 0:
logger.debug(f"🤖 [AGENT] Buffered {len(full_response)} chars...")
except Exception as e:
logger.error(f"❌ [AGENT] Error generating response: {e}")
full_response = "I apologize, but I encountered an error. Could you please try again?"
# CRITICAL: Only yield after buffering the ENTIRE response
# This prevents multiple TTS calls that cause audio jumping
if full_response.strip():
# Add to conversation history
self.conversation_history.append(
Message(role="assistant", content=full_response)
)
logger.info(f"✅ [AGENT] Generated complete response ({len(full_response)} chars)")
yield GeneratedResponse(
message=full_response.strip(),
is_interruptible=True
)
def _build_gemini_contents(self) -> List[Dict]:
"""
Build conversation contents for Gemini API
Format:
[
{"role": "user", "parts": [{"text": "System: ..."}]},
{"role": "model", "parts": [{"text": "Understood."}]},
{"role": "user", "parts": [{"text": "Hello"}]},
{"role": "model", "parts": [{"text": "Hi there!"}]},
...
]
"""
contents = []
# Add system prompt as first user message
if self.system_prompt:
contents.append({
"role": "user",
"parts": [{"text": f"System Instruction: {self.system_prompt}"}]
})
contents.append({
"role": "model",
"parts": [{"text": "Understood."}]
})
# Add conversation history
for message in self.conversation_history:
role = "user" if message.role == "user" else "model"
contents.append({
"role": role,
"parts": [{"text": message.content}]
})
return contents
async def _simulate_gemini_stream(self, user_input: str) -> AsyncGenerator[str, None]:
"""
Simulate Gemini streaming response
In a real implementation, this would be:
async def _create_gemini_stream(self, contents):
response = await genai.GenerativeModel('gemini-pro').generate_content_async(
contents,
stream=True
)
async for chunk in response:
if chunk.text:
yield chunk.text
"""
# Simulate response
response = f"I understand you said: {user_input}. How can I assist you further?"
# Simulate streaming by yielding chunks
chunk_size = 10
for i in range(0, len(response), chunk_size):
chunk = response[i:i + chunk_size]
await asyncio.sleep(0.05) # Simulate network delay
yield chunk
def update_last_bot_message_on_cut_off(self, partial_message: str):
"""
Update conversation history when bot is interrupted
This ensures the conversation history reflects what was actually spoken,
not what was planned to be spoken.
Args:
partial_message: The partial message that was actually spoken
"""
if self.conversation_history and self.conversation_history[-1].role == "assistant":
# Update the last bot message with the partial message
self.conversation_history[-1].content = partial_message
logger.info(f"📝 [AGENT] Updated history with partial message: '{partial_message}'")
def cancel_current_task(self):
"""Cancel the current generation task (for interrupts)"""
if self.current_task and not self.current_task.done():
self.current_task.cancel()
logger.info("🛑 [AGENT] Cancelled current generation task")
def get_conversation_history(self) -> List[Message]:
"""Get the full conversation history"""
return self.conversation_history.copy()
def clear_conversation_history(self):
"""Clear the conversation history"""
self.conversation_history.clear()
logger.info("🗑️ [AGENT] Cleared conversation history")
# ============================================================================
# Example Usage
# ============================================================================
async def example_usage():
"""Example of how to use the GeminiAgent"""
# Configure agent
config = {
"prompt": "You are a helpful AI assistant specializing in voice conversations.",
"llmProvider": "gemini"
}
# Create agent
agent = GeminiAgent(config)
# Simulate conversation
user_messages = [
"Hello, how are you?",
"What's the weather like today?",
"Thank you!"
]
for user_message in user_messages:
print(f"\n👤 User: {user_message}")
# Generate response
async for response in agent.generate_response(user_message):
print(f"🤖 Bot: {response.message}")
# Print conversation history
print("\n📜 Conversation History:")
for i, message in enumerate(agent.get_conversation_history(), 1):
print(f"{i}. {message.role}: {message.content}")
if __name__ == "__main__":
asyncio.run(example_usage())

View File

@@ -0,0 +1,334 @@
"""
Example: Interrupt System Implementation
This example demonstrates how to implement a robust interrupt system
that allows users to interrupt the bot mid-sentence.
"""
import asyncio
import threading
from typing import Any
from dataclasses import dataclass
import logging
logger = logging.getLogger(__name__)
# ============================================================================
# InterruptibleEvent Pattern
# ============================================================================
class InterruptibleEvent:
"""
Wrapper for events that can be interrupted
Every event in the pipeline is wrapped in an InterruptibleEvent,
allowing the system to stop processing mid-stream.
"""
def __init__(self, payload: Any, is_interruptible: bool = True):
self.payload = payload
self.is_interruptible = is_interruptible
self.interruption_event = threading.Event() # Initially not set
self.interrupted = False
def interrupt(self) -> bool:
"""
Interrupt this event
Returns:
True if the event was interrupted, False if it was not interruptible
"""
if not self.is_interruptible:
return False
if not self.interrupted:
self.interruption_event.set() # Signal to stop!
self.interrupted = True
logger.info("⚠️ [INTERRUPT] Event interrupted")
return True
return False
def is_interrupted(self) -> bool:
"""Check if this event has been interrupted"""
return self.interruption_event.is_set()
# ============================================================================
# Conversation with Interrupt Support
# ============================================================================
class ConversationWithInterrupts:
"""
Conversation orchestrator with interrupt support
Key Features:
- Tracks all in-flight interruptible events
- Broadcasts interrupts to all workers
- Cancels current tasks
- Updates conversation history with partial messages
"""
def __init__(self):
self.is_human_speaking = True
self.interruptible_events = asyncio.Queue()
self.agent = None # Set externally
self.synthesizer_worker = None # Set externally
def broadcast_interrupt(self) -> bool:
"""
Broadcast interrupt to all in-flight events
This is called when the user starts speaking while the bot is speaking.
Returns:
True if any events were interrupted
"""
num_interrupts = 0
# Interrupt all queued events
while True:
try:
interruptible_event = self.interruptible_events.get_nowait()
if interruptible_event.interrupt():
num_interrupts += 1
except asyncio.QueueEmpty:
break
# Cancel current tasks
if self.agent:
self.agent.cancel_current_task()
if self.synthesizer_worker:
self.synthesizer_worker.cancel_current_task()
logger.info(f"⚠️ [INTERRUPT] Interrupted {num_interrupts} events")
return num_interrupts > 0
def add_interruptible_event(self, event: InterruptibleEvent):
"""Add an event to the interruptible queue"""
self.interruptible_events.put_nowait(event)
# ============================================================================
# Synthesis Worker with Interrupt Support
# ============================================================================
class SynthesisWorkerWithInterrupts:
"""
Synthesis worker that supports interrupts
Key Features:
- Checks for interrupts before sending each audio chunk
- Calculates partial message when interrupted
- Updates agent's conversation history with partial message
"""
def __init__(self, agent, output_device):
self.agent = agent
self.output_device = output_device
self.current_task = None
async def send_speech_to_output(
self,
message: str,
synthesis_result,
stop_event: threading.Event,
seconds_per_chunk: float = 0.1
) -> tuple[str, bool]:
"""
Send synthesized speech to output with interrupt support
Args:
message: The full message being synthesized
synthesis_result: SynthesisResult with chunk_generator and get_message_up_to
stop_event: Event that signals when to stop (interrupt)
seconds_per_chunk: Duration of each audio chunk in seconds
Returns:
Tuple of (message_sent, was_cut_off)
- message_sent: The actual message sent (partial if interrupted)
- was_cut_off: True if interrupted, False if completed
"""
chunk_idx = 0
async for chunk_result in synthesis_result.chunk_generator:
# CRITICAL: Check for interrupt before sending each chunk
if stop_event.is_set():
logger.info(f"🛑 [SYNTHESIZER] Interrupted after {chunk_idx} chunks")
# Calculate what was actually spoken
seconds_spoken = chunk_idx * seconds_per_chunk
partial_message = synthesis_result.get_message_up_to(seconds_spoken)
logger.info(f"📝 [SYNTHESIZER] Partial message: '{partial_message}'")
return partial_message, True # cut_off = True
start_time = asyncio.get_event_loop().time()
# Send chunk to output device
await self.output_device.consume_nonblocking(chunk_result.chunk)
# CRITICAL: Wait for chunk to play before sending next one
# This is what makes interrupts work!
processing_time = asyncio.get_event_loop().time() - start_time
await asyncio.sleep(max(seconds_per_chunk - processing_time, 0))
chunk_idx += 1
# Completed without interruption
logger.info(f"✅ [SYNTHESIZER] Completed {chunk_idx} chunks")
return message, False # cut_off = False
def cancel_current_task(self):
"""Cancel the current synthesis task"""
if self.current_task and not self.current_task.done():
self.current_task.cancel()
logger.info("🛑 [SYNTHESIZER] Cancelled current task")
# ============================================================================
# Transcription Worker with Interrupt Detection
# ============================================================================
class TranscriptionWorkerWithInterrupts:
"""
Transcription worker that detects interrupts
Key Features:
- Detects when user speaks while bot is speaking
- Marks transcription as interrupt
- Triggers broadcast_interrupt()
"""
def __init__(self, conversation):
self.conversation = conversation
async def process(self, transcription):
"""
Process transcription and detect interrupts
If the user starts speaking while the bot is speaking,
this is an interrupt.
"""
# Check if this is an interrupt
if not self.conversation.is_human_speaking:
logger.info("⚠️ [TRANSCRIPTION] User interrupted bot!")
# Broadcast interrupt to all in-flight events
interrupted = self.conversation.broadcast_interrupt()
transcription.is_interrupt = interrupted
# Update speaking state
self.conversation.is_human_speaking = True
# Continue processing transcription...
logger.info(f"🎤 [TRANSCRIPTION] Received: '{transcription.message}'")
# ============================================================================
# Example Usage
# ============================================================================
@dataclass
class MockTranscription:
message: str
is_interrupt: bool = False
@dataclass
class MockSynthesisResult:
async def chunk_generator(self):
"""Generate mock audio chunks"""
for i in range(10):
await asyncio.sleep(0.1)
yield type('obj', (object,), {'chunk': b'\x00' * 1024})()
def get_message_up_to(self, seconds: float) -> str:
"""Get partial message up to specified seconds"""
full_message = "I think the weather will be nice today and tomorrow and the day after."
chars_per_second = len(full_message) / 1.0 # Assume 1 second total
char_index = int(seconds * chars_per_second)
return full_message[:char_index]
async def example_interrupt_scenario():
"""
Example scenario: User interrupts bot mid-sentence
"""
print("🎬 Scenario: User interrupts bot mid-sentence\n")
# Create conversation
conversation = ConversationWithInterrupts()
# Create mock components
class MockAgent:
def cancel_current_task(self):
print("🛑 [AGENT] Task cancelled")
def update_last_bot_message_on_cut_off(self, partial_message):
print(f"📝 [AGENT] Updated history: '{partial_message}'")
class MockOutputDevice:
async def consume_nonblocking(self, chunk):
pass
agent = MockAgent()
output_device = MockOutputDevice()
conversation.agent = agent
# Create synthesis worker
synthesis_worker = SynthesisWorkerWithInterrupts(agent, output_device)
conversation.synthesizer_worker = synthesis_worker
# Create interruptible event
stop_event = threading.Event()
interruptible_event = InterruptibleEvent(
payload="Bot is speaking...",
is_interruptible=True
)
conversation.add_interruptible_event(interruptible_event)
# Start bot speaking
print("🤖 Bot starts speaking: 'I think the weather will be nice today and tomorrow and the day after.'\n")
conversation.is_human_speaking = False
# Simulate synthesis in background
synthesis_result = MockSynthesisResult()
synthesis_task = asyncio.create_task(
synthesis_worker.send_speech_to_output(
message="I think the weather will be nice today and tomorrow and the day after.",
synthesis_result=synthesis_result,
stop_event=stop_event,
seconds_per_chunk=0.1
)
)
# Wait a bit, then interrupt
await asyncio.sleep(0.3)
print("👤 User interrupts: 'Stop!'\n")
# Trigger interrupt
conversation.broadcast_interrupt()
stop_event.set()
# Wait for synthesis to finish
message_sent, was_cut_off = await synthesis_task
print(f"\n✅ Result:")
print(f" - Message sent: '{message_sent}'")
print(f" - Was cut off: {was_cut_off}")
# Update agent history
if was_cut_off:
agent.update_last_bot_message_on_cut_off(message_sent)
if __name__ == "__main__":
asyncio.run(example_interrupt_scenario())

View File

@@ -0,0 +1,471 @@
# Common Pitfalls and Solutions
This document covers common issues encountered when building voice AI engines and their solutions.
## 1. Audio Jumping/Cutting Off
### Problem
The bot's audio jumps or cuts off mid-response, creating a jarring user experience.
### Symptoms
- Audio plays in fragments
- Sentences are incomplete
- Multiple audio streams overlap
- Unnatural pauses or gaps
### Root Cause
Sending text to the synthesizer in small chunks (sentence-by-sentence or word-by-word) causes multiple TTS API calls. Each call generates a separate audio stream, resulting in:
- Multiple audio files being played sequentially
- Timing issues between chunks
- Potential overlapping audio
- Inconsistent voice characteristics between chunks
### Solution
Buffer the entire LLM response before sending it to the synthesizer:
**❌ Bad: Yields sentence-by-sentence**
```python
async def generate_response(self, prompt):
async for sentence in llm_stream:
# This creates multiple TTS calls!
yield GeneratedResponse(message=BaseMessage(text=sentence))
```
**✅ Good: Buffer entire response**
```python
async def generate_response(self, prompt):
# Buffer the entire response
full_response = ""
async for chunk in llm_stream:
full_response += chunk
# Yield once with complete response
yield GeneratedResponse(message=BaseMessage(text=full_response))
```
### Why This Works
- Single TTS call for the entire response
- Consistent voice characteristics
- Proper timing and pacing
- No gaps or overlaps
---
## 2. Echo/Feedback Loop
### Problem
The bot hears itself speaking and responds to its own audio, creating an infinite loop.
### Symptoms
- Bot responds to its own speech
- Conversation becomes nonsensical
- Transcriptions include bot's own words
- System becomes unresponsive
### Root Cause
The transcriber continues to process audio while the bot is speaking. If the bot's audio is being played through speakers and captured by the microphone, the transcriber will transcribe the bot's own speech.
### Solution
Mute the transcriber when the bot starts speaking:
```python
# Before sending audio to output
self.transcriber.mute()
# Send audio...
await self.send_speech_to_output(synthesis_result)
# After audio playback complete
self.transcriber.unmute()
```
### Implementation in Transcriber
```python
class BaseTranscriber:
def __init__(self):
self.is_muted = False
def send_audio(self, chunk: bytes):
"""Client calls this to send audio"""
if not self.is_muted:
self.input_queue.put_nowait(chunk)
else:
# Send silence instead (prevents echo)
self.input_queue.put_nowait(self.create_silent_chunk(len(chunk)))
def mute(self):
"""Called when bot starts speaking"""
self.is_muted = True
def unmute(self):
"""Called when bot stops speaking"""
self.is_muted = False
def create_silent_chunk(self, size: int) -> bytes:
"""Create a silent audio chunk"""
return b'\x00' * size
```
### Why This Works
- Transcriber receives silence while bot speaks
- No transcription of bot's own speech
- Prevents feedback loop
- Maintains audio stream continuity
---
## 3. Interrupts Not Working
### Problem
Users cannot interrupt the bot mid-sentence. The bot continues speaking even when the user starts talking.
### Symptoms
- Bot speaks over user
- User must wait for bot to finish
- Unnatural conversation flow
- Poor user experience
### Root Cause
All audio chunks are sent to the client immediately, buffering the entire message on the client side. By the time an interrupt is detected, all audio has already been sent and is queued for playback.
### Solution
Rate-limit audio chunks to match real-time playback:
**❌ Bad: Send all chunks immediately**
```python
async for chunk in synthesis_result.chunk_generator:
# Sends all chunks as fast as possible
output_device.consume_nonblocking(chunk)
```
**✅ Good: Rate-limit chunks**
```python
async for chunk in synthesis_result.chunk_generator:
# Check for interrupt
if stop_event.is_set():
# Calculate partial message
partial_message = synthesis_result.get_message_up_to(
chunk_idx * seconds_per_chunk
)
return partial_message, True # cut_off = True
start_time = time.time()
# Send chunk
output_device.consume_nonblocking(chunk)
# CRITICAL: Wait for chunk duration before sending next
processing_time = time.time() - start_time
await asyncio.sleep(max(seconds_per_chunk - processing_time, 0))
chunk_idx += 1
```
### Why This Works
- Only one chunk is buffered on client at a time
- Interrupts can stop mid-sentence
- Natural conversation flow
- Real-time playback maintained
### Calculating `seconds_per_chunk`
```python
# For LINEAR16 PCM audio at 16kHz
sample_rate = 16000 # Hz
chunk_size = 1024 # bytes
bytes_per_sample = 2 # 16-bit = 2 bytes
samples_per_chunk = chunk_size / bytes_per_sample
seconds_per_chunk = samples_per_chunk / sample_rate
# = 1024 / 2 / 16000 = 0.032 seconds
```
---
## 4. Memory Leaks from Unclosed Streams
### Problem
Memory usage grows over time, eventually causing the application to crash.
### Symptoms
- Increasing memory usage
- Slow performance over time
- WebSocket connections not closing
- Resource exhaustion
### Root Cause
WebSocket connections, API streams, or async tasks are not properly closed when conversations end or errors occur.
### Solution
Always use context managers and cleanup:
**❌ Bad: No cleanup**
```python
async def handle_conversation(websocket):
conversation = create_conversation()
await conversation.start()
async for message in websocket.iter_bytes():
conversation.receive_audio(message)
# No cleanup! Resources leak
```
**✅ Good: Proper cleanup**
```python
async def handle_conversation(websocket):
conversation = None
try:
conversation = create_conversation()
await conversation.start()
async for message in websocket.iter_bytes():
conversation.receive_audio(message)
except WebSocketDisconnect:
logger.info("Client disconnected")
except Exception as e:
logger.error(f"Error: {e}", exc_info=True)
finally:
# Always cleanup
if conversation:
await conversation.terminate()
```
### Proper Termination
```python
async def terminate(self):
"""Gracefully shut down all workers"""
self.active = False
# Stop all workers
self.transcriber.terminate()
self.agent.terminate()
self.synthesizer.terminate()
# Wait for queues to drain
await asyncio.sleep(0.5)
# Close connections
if self.websocket:
await self.websocket.close()
# Cancel tasks
for task in self.tasks:
if not task.done():
task.cancel()
```
---
## 5. Conversation History Not Updating
### Problem
The agent doesn't remember previous messages or context is lost.
### Symptoms
- Agent repeats itself
- No context from previous messages
- Each response is independent
- Poor conversation quality
### Root Cause
Conversation history is not being maintained or updated correctly.
### Solution
Maintain conversation history in the agent:
```python
class Agent:
def __init__(self):
self.conversation_history = []
async def generate_response(self, user_input):
# Add user message to history
self.conversation_history.append({
"role": "user",
"content": user_input
})
# Generate response with full history
response = await self.llm.generate(self.conversation_history)
# Add bot response to history
self.conversation_history.append({
"role": "assistant",
"content": response
})
return response
```
### Handling Interrupts
When the bot is interrupted, update history with partial message:
```python
def update_last_bot_message_on_cut_off(self, partial_message):
"""Update history when bot is interrupted"""
if self.conversation_history and \
self.conversation_history[-1]["role"] == "assistant":
# Update with what was actually spoken
self.conversation_history[-1]["content"] = partial_message
```
---
## 6. WebSocket Connection Drops
### Problem
WebSocket connections drop unexpectedly, interrupting conversations.
### Symptoms
- Frequent disconnections
- Connection timeouts
- "Connection closed" errors
- Unstable conversations
### Root Cause
- No heartbeat/ping mechanism
- Idle timeout
- Network issues
- Server overload
### Solution
Implement heartbeat and reconnection:
```python
@app.websocket("/conversation")
async def conversation_endpoint(websocket: WebSocket):
await websocket.accept()
# Start heartbeat
async def heartbeat():
while True:
try:
await websocket.send_json({"type": "ping"})
await asyncio.sleep(30) # Ping every 30 seconds
except:
break
heartbeat_task = asyncio.create_task(heartbeat())
try:
async for message in websocket.iter_bytes():
# Process message
pass
finally:
heartbeat_task.cancel()
```
---
## 7. High Latency / Slow Responses
### Problem
Long delays between user speech and bot response.
### Symptoms
- Noticeable lag
- Poor user experience
- Conversation feels unnatural
- Users repeat themselves
### Root Causes & Solutions
**1. Not using streaming**
```python
# ❌ Bad: Wait for entire response
response = await llm.complete(prompt)
# ✅ Good: Stream response
async for chunk in llm.complete(prompt, stream=True):
yield chunk
```
**2. Sequential processing**
```python
# ❌ Bad: Sequential
transcription = await transcriber.transcribe(audio)
response = await agent.generate(transcription)
audio = await synthesizer.synthesize(response)
# ✅ Good: Concurrent with queues
# All workers run simultaneously
```
**3. Large chunk sizes**
```python
# ❌ Bad: Large chunks (high latency)
chunk_size = 8192 # 0.25 seconds
# ✅ Good: Small chunks (low latency)
chunk_size = 1024 # 0.032 seconds
```
---
## 8. Audio Quality Issues
### Problem
Poor audio quality, distortion, or artifacts.
### Symptoms
- Robotic voice
- Crackling or popping
- Distorted audio
- Inconsistent volume
### Root Causes & Solutions
**1. Wrong audio format**
```python
# ✅ Use LINEAR16 PCM at 16kHz
audio_encoding = AudioEncoding.LINEAR16
sample_rate = 16000
```
**2. Incorrect format conversion**
```python
# ✅ Proper MP3 to PCM conversion
from pydub import AudioSegment
import io
def mp3_to_pcm(mp3_bytes):
audio = AudioSegment.from_mp3(io.BytesIO(mp3_bytes))
audio = audio.set_frame_rate(16000)
audio = audio.set_channels(1)
audio = audio.set_sample_width(2) # 16-bit
return audio.raw_data
```
**3. Buffer underruns**
```python
# ✅ Ensure consistent chunk timing
await asyncio.sleep(max(seconds_per_chunk - processing_time, 0))
```
---
## Summary
| Problem | Root Cause | Solution |
|---------|-----------|----------|
| Audio jumping | Multiple TTS calls | Buffer entire response |
| Echo/feedback | Transcriber active during bot speech | Mute transcriber |
| Interrupts not working | All chunks sent immediately | Rate-limit chunks |
| Memory leaks | Unclosed streams | Proper cleanup |
| Lost context | History not maintained | Update conversation history |
| Connection drops | No heartbeat | Implement ping/pong |
| High latency | Sequential processing | Use streaming + queues |
| Poor audio quality | Wrong format/conversion | Use LINEAR16 PCM 16kHz |
---
## Best Practices
1. **Always buffer LLM responses** before sending to synthesizer
2. **Always mute transcriber** when bot is speaking
3. **Always rate-limit audio chunks** to enable interrupts
4. **Always cleanup resources** in finally blocks
5. **Always maintain conversation history** for context
6. **Always use streaming** for low latency
7. **Always use LINEAR16 PCM** at 16kHz for audio
8. **Always implement error handling** in worker loops

View File

@@ -0,0 +1,515 @@
# Provider Comparison Guide
This guide compares different providers for transcription, LLM, and TTS services to help you choose the best option for your voice AI engine.
## Transcription Providers
### Deepgram
**Strengths:**
- ✅ Fastest transcription speed (< 300ms latency)
- ✅ Excellent streaming support
- ✅ High accuracy (95%+ on clear audio)
- ✅ Good pricing ($0.0043/minute)
- ✅ Nova-2 model optimized for real-time
- ✅ Excellent documentation
**Weaknesses:**
- ❌ Less accurate with heavy accents
- ❌ Smaller company (potential reliability concerns)
**Best For:**
- Real-time voice conversations
- Low-latency applications
- English-language applications
- Startups and small businesses
**Configuration:**
```python
{
"transcriberProvider": "deepgram",
"deepgramApiKey": "your-api-key",
"deepgramModel": "nova-2",
"language": "en-US"
}
```
---
### AssemblyAI
**Strengths:**
- ✅ Very high accuracy (96%+ on clear audio)
- ✅ Excellent with accents and dialects
- ✅ Good speaker diarization
- ✅ Competitive pricing ($0.00025/second)
- ✅ Strong customer support
**Weaknesses:**
- ❌ Slightly higher latency than Deepgram
- ❌ Streaming support is newer
**Best For:**
- Applications requiring highest accuracy
- Multi-speaker scenarios
- Diverse user base with accents
- Enterprise applications
**Configuration:**
```python
{
"transcriberProvider": "assemblyai",
"assemblyaiApiKey": "your-api-key",
"language": "en"
}
```
---
### Azure Speech
**Strengths:**
- ✅ Enterprise-grade reliability
- ✅ Excellent multi-language support (100+ languages)
- ✅ Strong security and compliance
- ✅ Integration with Azure ecosystem
- ✅ Custom model training available
**Weaknesses:**
- ❌ Higher cost ($1/hour)
- ❌ More complex setup
- ❌ Slower than specialized providers
**Best For:**
- Enterprise applications
- Multi-language requirements
- Azure-based infrastructure
- Compliance-sensitive applications
**Configuration:**
```python
{
"transcriberProvider": "azure",
"azureSpeechKey": "your-key",
"azureSpeechRegion": "eastus",
"language": "en-US"
}
```
---
### Google Cloud Speech
**Strengths:**
- ✅ Excellent multi-language support (125+ languages)
- ✅ Good accuracy
- ✅ Integration with Google Cloud
- ✅ Automatic punctuation
- ✅ Speaker diarization
**Weaknesses:**
- ❌ Higher latency for streaming
- ❌ Complex pricing model
- ❌ Requires Google Cloud account
**Best For:**
- Multi-language applications
- Google Cloud infrastructure
- Applications needing speaker diarization
**Configuration:**
```python
{
"transcriberProvider": "google",
"googleCredentials": "path/to/credentials.json",
"language": "en-US"
}
```
---
## LLM Providers
### OpenAI (GPT-4, GPT-3.5)
**Strengths:**
- ✅ Highest quality responses
- ✅ Excellent instruction following
- ✅ Fast streaming
- ✅ Large context window (128k for GPT-4)
- ✅ Best-in-class reasoning
**Weaknesses:**
- ❌ Higher cost ($0.01-0.03/1k tokens)
- ❌ Rate limits can be restrictive
- ❌ No free tier
**Best For:**
- High-quality conversational AI
- Complex reasoning tasks
- Production applications
- Enterprise use cases
**Configuration:**
```python
{
"llmProvider": "openai",
"openaiApiKey": "your-api-key",
"openaiModel": "gpt-4-turbo",
"prompt": "You are a helpful AI assistant."
}
```
**Pricing:**
- GPT-4 Turbo: $0.01/1k input tokens, $0.03/1k output tokens
- GPT-3.5 Turbo: $0.0005/1k input tokens, $0.0015/1k output tokens
---
### Google Gemini
**Strengths:**
- ✅ Excellent cost-effectiveness (free tier available)
- ✅ Multimodal capabilities
- ✅ Good streaming support
- ✅ Large context window (1M tokens for Pro)
- ✅ Fast response times
**Weaknesses:**
- ❌ Slightly lower quality than GPT-4
- ❌ Less predictable behavior
- ❌ Newer, less battle-tested
**Best For:**
- Cost-sensitive applications
- Multimodal applications
- Startups and prototypes
- High-volume applications
**Configuration:**
```python
{
"llmProvider": "gemini",
"geminiApiKey": "your-api-key",
"geminiModel": "gemini-pro",
"prompt": "You are a helpful AI assistant."
}
```
**Pricing:**
- Gemini Pro: Free up to 60 requests/minute
- Gemini Pro (paid): $0.00025/1k input tokens, $0.0005/1k output tokens
---
### Anthropic Claude
**Strengths:**
- ✅ Excellent safety and alignment
- ✅ Very long context window (200k tokens)
- ✅ High-quality responses
- ✅ Good at following complex instructions
- ✅ Strong reasoning capabilities
**Weaknesses:**
- ❌ Higher cost than Gemini
- ❌ Slower streaming than OpenAI
- ❌ More conservative responses
**Best For:**
- Safety-critical applications
- Long-context applications
- Nuanced conversations
- Enterprise applications
**Configuration:**
```python
{
"llmProvider": "claude",
"claudeApiKey": "your-api-key",
"claudeModel": "claude-3-opus",
"prompt": "You are a helpful AI assistant."
}
```
**Pricing:**
- Claude 3 Opus: $0.015/1k input tokens, $0.075/1k output tokens
- Claude 3 Sonnet: $0.003/1k input tokens, $0.015/1k output tokens
---
## TTS Providers
### ElevenLabs
**Strengths:**
- ✅ Most natural-sounding voices
- ✅ Excellent emotional range
- ✅ Voice cloning capabilities
- ✅ Good streaming support
- ✅ Multiple languages
**Weaknesses:**
- ❌ Higher cost ($0.30/1k characters)
- ❌ Rate limits on lower tiers
- ❌ Occasional pronunciation errors
**Best For:**
- Premium voice experiences
- Customer-facing applications
- Voice cloning needs
- High-quality audio requirements
**Configuration:**
```python
{
"voiceProvider": "elevenlabs",
"elevenlabsApiKey": "your-api-key",
"elevenlabsVoiceId": "voice-id",
"elevenlabsModel": "eleven_monolingual_v1"
}
```
**Pricing:**
- Free: 10k characters/month
- Starter: $5/month, 30k characters
- Creator: $22/month, 100k characters
---
### Azure TTS
**Strengths:**
- ✅ Enterprise-grade reliability
- ✅ Many languages (100+)
- ✅ Neural voices available
- ✅ SSML support for fine control
- ✅ Good pricing ($4/1M characters)
**Weaknesses:**
- ❌ Less natural than ElevenLabs
- ❌ More complex setup
- ❌ Requires Azure account
**Best For:**
- Enterprise applications
- Multi-language requirements
- Azure-based infrastructure
- Cost-sensitive high-volume applications
**Configuration:**
```python
{
"voiceProvider": "azure",
"azureSpeechKey": "your-key",
"azureSpeechRegion": "eastus",
"azureVoiceName": "en-US-JennyNeural"
}
```
**Pricing:**
- Neural voices: $16/1M characters
- Standard voices: $4/1M characters
---
### Google Cloud TTS
**Strengths:**
- ✅ Good quality neural voices
- ✅ Many languages (40+)
- ✅ WaveNet voices available
- ✅ Competitive pricing ($4/1M characters)
- ✅ SSML support
**Weaknesses:**
- ❌ Less natural than ElevenLabs
- ❌ Requires Google Cloud account
- ❌ Complex setup
**Best For:**
- Multi-language applications
- Google Cloud infrastructure
- Cost-effective neural voices
**Configuration:**
```python
{
"voiceProvider": "google",
"googleCredentials": "path/to/credentials.json",
"googleVoiceName": "en-US-Neural2-F"
}
```
**Pricing:**
- WaveNet voices: $16/1M characters
- Neural2 voices: $16/1M characters
- Standard voices: $4/1M characters
---
### Amazon Polly
**Strengths:**
- ✅ AWS integration
- ✅ Good pricing ($4/1M characters)
- ✅ Neural voices available
- ✅ SSML support
- ✅ Reliable service
**Weaknesses:**
- ❌ Less natural than ElevenLabs
- ❌ Fewer voice options
- ❌ Requires AWS account
**Best For:**
- AWS-based infrastructure
- Cost-effective neural voices
- Enterprise applications
**Configuration:**
```python
{
"voiceProvider": "polly",
"awsAccessKey": "your-access-key",
"awsSecretKey": "your-secret-key",
"awsRegion": "us-east-1",
"pollyVoiceId": "Joanna"
}
```
**Pricing:**
- Neural voices: $16/1M characters
- Standard voices: $4/1M characters
---
### Play.ht
**Strengths:**
- ✅ Voice cloning capabilities
- ✅ Natural-sounding voices
- ✅ Good streaming support
- ✅ Easy to use API
- ✅ Multiple languages
**Weaknesses:**
- ❌ Higher cost than cloud providers
- ❌ Smaller company
- ❌ Less documentation
**Best For:**
- Voice cloning applications
- Premium voice experiences
- Startups and small businesses
**Configuration:**
```python
{
"voiceProvider": "playht",
"playhtApiKey": "your-api-key",
"playhtUserId": "your-user-id",
"playhtVoiceId": "voice-id"
}
```
**Pricing:**
- Free: 2.5k characters
- Creator: $31/month, 50k characters
- Pro: $79/month, 150k characters
---
## Recommended Combinations
### Budget-Conscious Startup
```python
{
"transcriberProvider": "deepgram", # Fast and affordable
"llmProvider": "gemini", # Free tier available
"voiceProvider": "google" # Cost-effective neural voices
}
```
**Estimated cost:** ~$0.01 per minute of conversation
---
### Premium Experience
```python
{
"transcriberProvider": "assemblyai", # Highest accuracy
"llmProvider": "openai", # Best quality responses
"voiceProvider": "elevenlabs" # Most natural voices
}
```
**Estimated cost:** ~$0.05 per minute of conversation
---
### Enterprise Application
```python
{
"transcriberProvider": "azure", # Enterprise reliability
"llmProvider": "openai", # Best quality
"voiceProvider": "azure" # Enterprise reliability
}
```
**Estimated cost:** ~$0.03 per minute of conversation
---
### Multi-Language Application
```python
{
"transcriberProvider": "google", # 125+ languages
"llmProvider": "gemini", # Good multi-language support
"voiceProvider": "google" # 40+ languages
}
```
**Estimated cost:** ~$0.02 per minute of conversation
---
## Decision Matrix
| Priority | Transcriber | LLM | TTS |
|----------|-------------|-----|-----|
| **Lowest Cost** | Deepgram | Gemini | Google |
| **Highest Quality** | AssemblyAI | OpenAI | ElevenLabs |
| **Fastest Speed** | Deepgram | OpenAI | ElevenLabs |
| **Enterprise** | Azure | OpenAI | Azure |
| **Multi-Language** | Google | Gemini | Google |
| **Voice Cloning** | N/A | N/A | ElevenLabs/Play.ht |
---
## Testing Recommendations
Before committing to providers, test with your specific use case:
1. **Create test conversations** with representative audio
2. **Measure latency** end-to-end
3. **Evaluate quality** with real users
4. **Calculate costs** based on expected volume
5. **Test edge cases** (accents, background noise, interrupts)
---
## Switching Providers
The multi-provider factory pattern makes switching easy:
```python
# Just change the configuration
config = {
"transcriberProvider": "deepgram", # Change to "assemblyai"
"llmProvider": "gemini", # Change to "openai"
"voiceProvider": "google" # Change to "elevenlabs"
}
# No code changes needed!
factory = VoiceComponentFactory()
transcriber = factory.create_transcriber(config)
agent = factory.create_agent(config)
synthesizer = factory.create_synthesizer(config)
```

View File

@@ -0,0 +1,193 @@
"""
Template: Base Worker Implementation
Use this template as a starting point for creating new workers
in your voice AI pipeline.
"""
import asyncio
from typing import Any
import logging
logger = logging.getLogger(__name__)
class BaseWorker:
"""
Base class for all workers in the voice AI pipeline
Workers follow the producer-consumer pattern:
- Consume items from input_queue
- Process items
- Produce results to output_queue
All workers run concurrently via asyncio.
"""
def __init__(self, input_queue: asyncio.Queue, output_queue: asyncio.Queue):
"""
Initialize the worker
Args:
input_queue: Queue to consume items from
output_queue: Queue to produce results to
"""
self.input_queue = input_queue
self.output_queue = output_queue
self.active = False
self._task = None
def start(self):
"""Start the worker's processing loop"""
self.active = True
self._task = asyncio.create_task(self._run_loop())
logger.info(f"✅ [{self.__class__.__name__}] Started")
async def _run_loop(self):
"""
Main processing loop - runs forever until terminated
This loop:
1. Waits for items from input_queue
2. Processes each item
3. Handles errors gracefully
"""
while self.active:
try:
# Block until item arrives
item = await self.input_queue.get()
# Process the item
await self.process(item)
except asyncio.CancelledError:
# Task was cancelled (normal during shutdown)
logger.info(f"🛑 [{self.__class__.__name__}] Task cancelled")
break
except Exception as e:
# Log error but don't crash the worker
logger.error(
f"❌ [{self.__class__.__name__}] Error processing item: {e}",
exc_info=True
)
# Continue processing next item
async def process(self, item: Any):
"""
Process a single item
Override this method in your worker implementation.
Args:
item: The item to process
"""
raise NotImplementedError(
f"{self.__class__.__name__} must implement process()"
)
def terminate(self):
"""
Stop the worker gracefully
This sets active=False and cancels the processing task.
"""
self.active = False
if self._task and not self._task.done():
self._task.cancel()
logger.info(f"🛑 [{self.__class__.__name__}] Terminated")
async def wait_for_completion(self):
"""Wait for the worker task to complete"""
if self._task:
try:
await self._task
except asyncio.CancelledError:
pass
# ============================================================================
# Example: Custom Worker Implementation
# ============================================================================
class ExampleWorker(BaseWorker):
"""
Example worker that demonstrates how to extend BaseWorker
This worker receives strings, converts them to uppercase,
and sends them to the output queue.
"""
def __init__(self, input_queue: asyncio.Queue, output_queue: asyncio.Queue):
super().__init__(input_queue, output_queue)
# Add any custom initialization here
self.processed_count = 0
async def process(self, item: str):
"""
Process a single item
Args:
item: String to convert to uppercase
"""
# Simulate some processing time
await asyncio.sleep(0.1)
# Process the item
result = item.upper()
# Send to output queue
self.output_queue.put_nowait(result)
# Update counter
self.processed_count += 1
logger.info(
f"✅ [{self.__class__.__name__}] "
f"Processed '{item}' -> '{result}' "
f"(total: {self.processed_count})"
)
# ============================================================================
# Example Usage
# ============================================================================
async def example_usage():
"""Example of how to use the worker"""
# Create queues
input_queue = asyncio.Queue()
output_queue = asyncio.Queue()
# Create worker
worker = ExampleWorker(input_queue, output_queue)
# Start worker
worker.start()
# Send items to process
items = ["hello", "world", "voice", "ai"]
for item in items:
input_queue.put_nowait(item)
# Wait for processing
await asyncio.sleep(0.5)
# Get results
results = []
while not output_queue.empty():
results.append(await output_queue.get())
print(f"\n✅ Results: {results}")
# Terminate worker
worker.terminate()
await worker.wait_for_completion()
if __name__ == "__main__":
logging.basicConfig(level=logging.INFO)
asyncio.run(example_usage())

View File

@@ -0,0 +1,289 @@
"""
Template: Multi-Provider Factory
Use this template to create a factory that supports multiple providers
for transcription, LLM, and TTS services.
"""
from typing import Dict, Any
from abc import ABC, abstractmethod
import logging
logger = logging.getLogger(__name__)
# ============================================================================
# Provider Interfaces
# ============================================================================
class TranscriberProvider(ABC):
"""Abstract base class for transcriber providers"""
@abstractmethod
async def transcribe_stream(self, audio_stream):
"""Transcribe streaming audio"""
pass
class LLMProvider(ABC):
"""Abstract base class for LLM providers"""
@abstractmethod
async def generate_response(self, messages, stream=True):
"""Generate response from messages"""
pass
class TTSProvider(ABC):
"""Abstract base class for TTS providers"""
@abstractmethod
async def synthesize_speech(self, text):
"""Synthesize speech from text"""
pass
# ============================================================================
# Multi-Provider Factory
# ============================================================================
class VoiceComponentFactory:
"""
Factory for creating voice AI components with multiple provider support
Supports:
- Multiple transcription providers (Deepgram, AssemblyAI, Azure, Google)
- Multiple LLM providers (OpenAI, Gemini, Claude)
- Multiple TTS providers (ElevenLabs, Azure, Google, Polly, Play.ht)
"""
def __init__(self):
self.transcriber_providers = {
"deepgram": self._create_deepgram_transcriber,
"assemblyai": self._create_assemblyai_transcriber,
"azure": self._create_azure_transcriber,
"google": self._create_google_transcriber,
}
self.llm_providers = {
"openai": self._create_openai_agent,
"gemini": self._create_gemini_agent,
"claude": self._create_claude_agent,
}
self.tts_providers = {
"elevenlabs": self._create_elevenlabs_synthesizer,
"azure": self._create_azure_synthesizer,
"google": self._create_google_synthesizer,
"polly": self._create_polly_synthesizer,
"playht": self._create_playht_synthesizer,
}
def create_transcriber(self, config: Dict[str, Any]):
"""
Create transcriber based on configuration
Args:
config: Configuration dict with 'transcriberProvider' key
Returns:
Transcriber instance
Raises:
ValueError: If provider is not supported
"""
provider = config.get("transcriberProvider", "deepgram").lower()
if provider not in self.transcriber_providers:
raise ValueError(
f"Unknown transcriber provider: {provider}. "
f"Supported: {list(self.transcriber_providers.keys())}"
)
logger.info(f"🎤 Creating transcriber: {provider}")
return self.transcriber_providers[provider](config)
def create_agent(self, config: Dict[str, Any]):
"""
Create LLM agent based on configuration
Args:
config: Configuration dict with 'llmProvider' key
Returns:
Agent instance
Raises:
ValueError: If provider is not supported
"""
provider = config.get("llmProvider", "openai").lower()
if provider not in self.llm_providers:
raise ValueError(
f"Unknown LLM provider: {provider}. "
f"Supported: {list(self.llm_providers.keys())}"
)
logger.info(f"🤖 Creating agent: {provider}")
return self.llm_providers[provider](config)
def create_synthesizer(self, config: Dict[str, Any]):
"""
Create TTS synthesizer based on configuration
Args:
config: Configuration dict with 'voiceProvider' key
Returns:
Synthesizer instance
Raises:
ValueError: If provider is not supported
"""
provider = config.get("voiceProvider", "elevenlabs").lower()
if provider not in self.tts_providers:
raise ValueError(
f"Unknown voice provider: {provider}. "
f"Supported: {list(self.tts_providers.keys())}"
)
logger.info(f"🔊 Creating synthesizer: {provider}")
return self.tts_providers[provider](config)
# ========================================================================
# Transcriber Implementations
# ========================================================================
def _create_deepgram_transcriber(self, config: Dict[str, Any]):
"""Create Deepgram transcriber"""
# TODO: Implement Deepgram transcriber
# from .transcribers.deepgram import DeepgramTranscriber
# return DeepgramTranscriber(
# api_key=config.get("deepgramApiKey"),
# model=config.get("deepgramModel", "nova-2"),
# language=config.get("language", "en-US")
# )
raise NotImplementedError("Deepgram transcriber not implemented")
def _create_assemblyai_transcriber(self, config: Dict[str, Any]):
"""Create AssemblyAI transcriber"""
# TODO: Implement AssemblyAI transcriber
raise NotImplementedError("AssemblyAI transcriber not implemented")
def _create_azure_transcriber(self, config: Dict[str, Any]):
"""Create Azure Speech transcriber"""
# TODO: Implement Azure transcriber
raise NotImplementedError("Azure transcriber not implemented")
def _create_google_transcriber(self, config: Dict[str, Any]):
"""Create Google Cloud Speech transcriber"""
# TODO: Implement Google transcriber
raise NotImplementedError("Google transcriber not implemented")
# ========================================================================
# LLM Agent Implementations
# ========================================================================
def _create_openai_agent(self, config: Dict[str, Any]):
"""Create OpenAI agent"""
# TODO: Implement OpenAI agent
# from .agents.openai import OpenAIAgent
# return OpenAIAgent(
# api_key=config.get("openaiApiKey"),
# model=config.get("openaiModel", "gpt-4"),
# system_prompt=config.get("prompt", "You are a helpful assistant.")
# )
raise NotImplementedError("OpenAI agent not implemented")
def _create_gemini_agent(self, config: Dict[str, Any]):
"""Create Google Gemini agent"""
# TODO: Implement Gemini agent
# from .agents.gemini import GeminiAgent
# return GeminiAgent(
# api_key=config.get("geminiApiKey"),
# model=config.get("geminiModel", "gemini-pro"),
# system_prompt=config.get("prompt", "You are a helpful assistant.")
# )
raise NotImplementedError("Gemini agent not implemented")
def _create_claude_agent(self, config: Dict[str, Any]):
"""Create Anthropic Claude agent"""
# TODO: Implement Claude agent
raise NotImplementedError("Claude agent not implemented")
# ========================================================================
# TTS Synthesizer Implementations
# ========================================================================
def _create_elevenlabs_synthesizer(self, config: Dict[str, Any]):
"""Create ElevenLabs synthesizer"""
# TODO: Implement ElevenLabs synthesizer
# from .synthesizers.elevenlabs import ElevenLabsSynthesizer
# return ElevenLabsSynthesizer(
# api_key=config.get("elevenlabsApiKey"),
# voice_id=config.get("elevenlabsVoiceId"),
# model_id=config.get("elevenlabsModel", "eleven_monolingual_v1")
# )
raise NotImplementedError("ElevenLabs synthesizer not implemented")
def _create_azure_synthesizer(self, config: Dict[str, Any]):
"""Create Azure TTS synthesizer"""
# TODO: Implement Azure synthesizer
raise NotImplementedError("Azure synthesizer not implemented")
def _create_google_synthesizer(self, config: Dict[str, Any]):
"""Create Google Cloud TTS synthesizer"""
# TODO: Implement Google synthesizer
raise NotImplementedError("Google synthesizer not implemented")
def _create_polly_synthesizer(self, config: Dict[str, Any]):
"""Create Amazon Polly synthesizer"""
# TODO: Implement Polly synthesizer
raise NotImplementedError("Polly synthesizer not implemented")
def _create_playht_synthesizer(self, config: Dict[str, Any]):
"""Create Play.ht synthesizer"""
# TODO: Implement Play.ht synthesizer
raise NotImplementedError("Play.ht synthesizer not implemented")
# ============================================================================
# Example Usage
# ============================================================================
def example_usage():
"""Example of how to use the factory"""
# Configuration
config = {
"transcriberProvider": "deepgram",
"deepgramApiKey": "your-api-key",
"llmProvider": "gemini",
"geminiApiKey": "your-api-key",
"voiceProvider": "elevenlabs",
"elevenlabsApiKey": "your-api-key",
"elevenlabsVoiceId": "your-voice-id",
"prompt": "You are a helpful AI assistant."
}
# Create factory
factory = VoiceComponentFactory()
try:
# Create components
transcriber = factory.create_transcriber(config)
agent = factory.create_agent(config)
synthesizer = factory.create_synthesizer(config)
print("✅ All components created successfully!")
except ValueError as e:
print(f"❌ Configuration error: {e}")
except NotImplementedError as e:
print(f"⚠️ Not implemented: {e}")
if __name__ == "__main__":
logging.basicConfig(level=logging.INFO)
example_usage()

View File

@@ -2159,6 +2159,15 @@
"risk": "unknown",
"source": "vibeship-spawner-skills (Apache 2.0)"
},
{
"id": "voice-ai-engine-development",
"path": "skills/voice-ai-engine-development",
"category": "uncategorized",
"name": "voice-ai-engine-development",
"description": "Build real-time conversational AI voice engines using async worker pipelines, streaming transcription, LLM agents, and TTS synthesis with interrupt handling and multi-provider support",
"risk": "unknown",
"source": "unknown"
},
{
"id": "vr-ar",
"path": "skills/game-development/vr-ar",