feat: Add 57 skills from vibeship-spawner-skills
Ported 3 categories from Spawner Skills (Apache 2.0):

- AI Agents (21 skills): langfuse, langgraph, crewai, rag-engineer, etc.
- Integrations (25 skills): stripe, firebase, vercel, supabase, etc.
- Maker Tools (11 skills): micro-saas-launcher, browser-extension-builder, etc.

All skills converted from 4-file YAML to SKILL.md format.

Source: https://github.com/vibeforge1111/vibeship-spawner-skills

skills/langfuse/SKILL.md (new file, 238 lines)

---
name: langfuse
description: "Expert in Langfuse - the open-source LLM observability platform. Covers tracing, prompt management, evaluation, datasets, and integration with LangChain, LlamaIndex, and OpenAI. Essential for debugging, monitoring, and improving LLM applications in production. Use when: langfuse, llm observability, llm tracing, prompt management, llm evaluation."
source: vibeship-spawner-skills (Apache 2.0)
---

# Langfuse

**Role**: LLM Observability Architect

You are an expert in LLM observability and evaluation. You think in terms of traces, spans, and metrics. You know that LLM applications need monitoring just like traditional software - but with different dimensions (cost, quality, latency). You use data to drive prompt improvements and catch regressions.

## Capabilities

- LLM tracing and observability
- Prompt management and versioning
- Evaluation and scoring
- Dataset management
- Cost tracking
- Performance monitoring
- A/B testing prompts

## Requirements

- Python or TypeScript/JavaScript
- Langfuse account (cloud or self-hosted)
- LLM API keys

## Patterns

### Basic Tracing Setup

Instrument LLM calls with Langfuse

**When to use**: Any LLM application

```python
import openai

from langfuse import Langfuse

# Initialize client
langfuse = Langfuse(
    public_key="pk-...",
    secret_key="sk-...",
    host="https://cloud.langfuse.com"  # or self-hosted URL
)

# Create a trace for a user request
trace = langfuse.trace(
    name="chat-completion",
    user_id="user-123",
    session_id="session-456",  # groups related traces
    metadata={"feature": "customer-support"},
    tags=["production", "v2"]
)

# Log a generation (LLM call)
generation = trace.generation(
    name="gpt-4o-response",
    model="gpt-4o",
    model_parameters={"temperature": 0.7},
    input={"messages": [{"role": "user", "content": "Hello"}]},
    metadata={"attempt": 1}
)

# Make the actual LLM call
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)

# Complete the generation with output and token usage
generation.end(
    output=response.choices[0].message.content,
    usage={
        "input": response.usage.prompt_tokens,
        "output": response.usage.completion_tokens
    }
)

# Score the trace
trace.score(
    name="user-feedback",
    value=1,  # 1 = positive, 0 = negative
    comment="User clicked helpful"
)

# Flush before exit (important in serverless)
langfuse.flush()
```
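
The same trace can also carry non-LLM steps as spans, which keeps retrieval or post-processing visible next to the generation. A minimal sketch continuing from the `trace` above (`retrieve_documents` is a hypothetical helper, not part of Langfuse):

```python
# Record a non-LLM step as a span on the existing trace
span = trace.span(
    name="retrieval",
    input={"query": "Hello"}
)
docs = retrieve_documents("Hello")  # hypothetical helper, not part of Langfuse
span.end(output={"num_docs": len(docs)})
```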

### OpenAI Integration

Automatic tracing with the OpenAI SDK

**When to use**: OpenAI-based applications

```python
from langfuse.openai import openai

# Drop-in replacement for the OpenAI client:
# all calls are automatically traced

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    # Langfuse-specific parameters
    name="greeting",  # trace name
    session_id="session-123",
    user_id="user-456",
    tags=["test"],
    metadata={"feature": "chat"}
)

# Works with streaming
stream = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
    name="story-generation"
)

for chunk in stream:
    # delta.content is None on the final chunk
    print(chunk.choices[0].delta.content or "", end="")

# Works with async
import asyncio
from langfuse.openai import AsyncOpenAI

async_client = AsyncOpenAI()

async def main():
    response = await async_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
        name="async-greeting"
    )
    return response.choices[0].message.content

asyncio.run(main())
```
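
The drop-in client batches events in the background as well, so short-lived scripts should flush before exiting. A minimal sketch using the v2 integration's flush helper:

```python
from langfuse.openai import openai

# ... traced OpenAI calls as above ...

# Send queued events before the process exits
openai.flush_langfuse()
```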

### LangChain Integration

Trace LangChain applications

**When to use**: LangChain-based applications

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langfuse.callback import CallbackHandler

# Create Langfuse callback handler
langfuse_handler = CallbackHandler(
    public_key="pk-...",
    secret_key="sk-...",
    host="https://cloud.langfuse.com",
    session_id="session-123",
    user_id="user-456"
)

# Use with any LangChain component
llm = ChatOpenAI(model="gpt-4o")

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}")
])

chain = prompt | llm

# Pass the handler per invocation
response = chain.invoke(
    {"input": "Hello"},
    config={"callbacks": [langfuse_handler]}
)

# Or bind it to the chain so every invocation is traced
traced_chain = chain.with_config({"callbacks": [langfuse_handler]})
response = traced_chain.invoke({"input": "Hello"})

# Works with agents, retrievers, etc.
# (assumes `tools` is a list of LangChain tools defined elsewhere,
# and an agent prompt that includes an `agent_scratchpad` placeholder)
from langchain.agents import AgentExecutor, create_openai_tools_agent

agent = create_openai_tools_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)

result = agent_executor.invoke(
    {"input": "What's the weather?"},
    config={"callbacks": [langfuse_handler]}
)
```
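
Prompt management and versioning (listed under Capabilities) use the same client. A minimal sketch, assuming a prompt named `support-greeting` with a `production` label has already been created in Langfuse:

```python
from langfuse import Langfuse

langfuse = Langfuse()

# Fetch the currently labeled version of a managed prompt
lf_prompt = langfuse.get_prompt("support-greeting", label="production")

# Fill in template variables defined in the Langfuse UI
compiled = lf_prompt.compile(customer_name="Ada")
```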

## Anti-Patterns

### ❌ Not Flushing in Serverless

**Why bad**: Traces are batched. Serverless may exit before flush. Data is lost.

**Instead**: Always call langfuse.flush() at the end. Use context managers where available. Consider sync mode for critical traces.

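A minimal sketch of this rule in an AWS Lambda-style handler (the handler shape and env-var configuration are illustrative assumptions):

```python
from langfuse import Langfuse

langfuse = Langfuse()  # reads LANGFUSE_PUBLIC_KEY / SECRET_KEY / HOST from env

def handler(event, context):
    trace = langfuse.trace(name="lambda-request")
    try:
        # ... do the LLM work, logging generations on `trace` ...
        return {"statusCode": 200}
    finally:
        langfuse.flush()  # block until queued events are sent
```
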
### ❌ Tracing Everything

**Why bad**: Noisy traces. Performance overhead. Hard to find important info.

**Instead**: Focus on LLM calls, key logic, and user actions. Group related operations. Use meaningful span names.

### ❌ No User/Session IDs

**Why bad**: Can't debug specific users. Can't track sessions. Analytics are limited.

**Instead**: Always pass user_id and session_id. Use consistent identifiers. Add relevant metadata.

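Concretely, with the drop-in client from the OpenAI pattern above (the `request` object and its fields are illustrative assumptions):

```python
# Stable identifiers make per-user debugging and session analytics possible
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=messages,                 # assumed defined elsewhere
    user_id=str(request.user.id),      # illustrative: same ID for this user everywhere
    session_id=request.session_key,    # illustrative: groups one conversation
    metadata={"feature": "checkout-help"}
)
```
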
## Limitations

- Self-hosted requires infrastructure
- High-volume usage may need optimization
- Real-time dashboard has latency
- Evaluation requires setup

## Related Skills

Works well with: `langgraph`, `crewai`, `structured-output`, `autonomous-agents`