feat: Add 57 skills from vibeship-spawner-skills

Ported 3 categories from Spawner Skills (Apache 2.0):
- AI Agents (21 skills): langfuse, langgraph, crewai, rag-engineer, etc.
- Integrations (25 skills): stripe, firebase, vercel, supabase, etc.
- Maker Tools (11 skills): micro-saas-launcher, browser-extension-builder, etc.

All skills converted from 4-file YAML to SKILL.md format.

Source: https://github.com/vibeforge1111/vibeship-spawner-skills
2026-01-19 12:18:43 +01:00
parent 6dcb7973ad
commit b5675d55ce
57 changed files with 7717 additions and 681 deletions
--- a/skills/rag-engineer/SKILL.md
+++ b/skills/rag-engineer/SKILL.md
@@ -0,0 +1,90 @@
+---
+name: rag-engineer
+description: "Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications. Use when: building RAG, vector search, embeddings, semantic search, document retrieval."
+source: vibeship-spawner-skills (Apache 2.0)
+---
+
+# RAG Engineer
+
+**Role**: RAG Systems Architect
+
+I bridge the gap between raw documents and LLM understanding. I know that
+retrieval quality determines generation quality - garbage in, garbage out.
+I obsess over chunking boundaries, embedding dimensions, and similarity
+metrics because they make the difference between helpful and hallucinating.
+
+## Capabilities
+
+- Vector embeddings and similarity search
+- Document chunking and preprocessing
+- Retrieval pipeline design
+- Semantic search implementation
+- Context window optimization
+- Hybrid search (keyword + semantic)
+
+## Requirements
+
+- LLM fundamentals
+- Understanding of embeddings
+- Basic NLP concepts
+
+## Patterns
+
+### Semantic Chunking
+
+Chunk by meaning, not arbitrary token counts
+
+```text
+- Use sentence boundaries, not token limits
+- Detect topic shifts with embedding similarity
+- Preserve document structure (headers, paragraphs)
+- Include overlap for context continuity
+- Add metadata for filtering
+```
+
+### Hierarchical Retrieval
+
+Multi-level retrieval for better precision
+
+```text
+- Index at multiple chunk sizes (paragraph, section, document)
+- First pass: coarse retrieval for candidates
+- Second pass: fine-grained retrieval for precision
+- Use parent-child relationships for context
+```
+
+### Hybrid Search
+
+Combine semantic and keyword search
+
+```text
+- BM25/TF-IDF for keyword matching
+- Vector similarity for semantic matching
+- Reciprocal Rank Fusion for merging the keyword and vector rankings (it fuses ranks, not raw scores)
+- Weight tuning based on query type
+```
+
+## Anti-Patterns
+
+### ❌ Fixed Chunk Size
+
+### ❌ Embedding Everything
+
+### ❌ Ignoring Evaluation
+
+## ⚠️ Sharp Edges
+
+| Issue | Severity | Solution |
+|-------|----------|----------|
+| Fixed-size chunking breaks sentences and context | high | Use semantic chunking that respects document structure |
+| Pure semantic search without metadata pre-filtering | medium | Implement hybrid filtering |
+| Using same embedding model for different content types | medium | Evaluate embeddings per content type |
+| Using first-stage retrieval results directly | medium | Add reranking step |
+| Cramming maximum context into LLM prompt | medium | Use relevance thresholds |
+| Not measuring retrieval quality separately from generation | high | Separate retrieval evaluation |
+| Not updating embeddings when source documents change | medium | Implement embedding refresh |
+| Same retrieval strategy for all query types | medium | Implement hybrid search |
+
+## Related Skills
+
+Works well with: `ai-agents-architect`, `prompt-engineer`, `database-architect`, `backend`
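
Note on the Hybrid Search pattern in `skills/rag-engineer/SKILL.md`: the skill names Reciprocal Rank Fusion but does not show it. The following is a minimal Python sketch of the idea, not code from the ported skill. It assumes each retriever returns a list of document IDs ordered best first. The function name `reciprocal_rank_fusion`, the sample IDs, and the constant `k=60` (a commonly used default) are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank is its 1-based position in that list. Only ranks are used,
    so keyword and vector scores never need to share a scale.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for a single query, best match first.
keyword_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from BM25
vector_hits = ["doc_c", "doc_a", "doc_d"]   # e.g. from vector similarity

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while a document found by only one retriever still appears lower down. The "weight tuning based on query type" bullet could be approximated by multiplying each list's contribution by a per-retriever weight, which this sketch leaves out.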