feat: add 4 universal skills from cli-ai-skills

- Add audio-transcriber skill (v1.2.0): Transform audio to Markdown with Whisper
- Add youtube-summarizer skill (v1.2.0): Generate summaries from YouTube videos
- Update prompt-engineer skill: Enhanced with 11 optimization frameworks
- Update skill-creator skill: Improved automation workflow

All skills are zero-config, cross-platform (Claude Code, Copilot CLI, Codex) and follow Quality Bar V4 standards.

Source: https://github.com/ericgandrade/cli-ai-skills
2026-02-04 17:37:45 -03:00
parent 6070da6a63
commit 801c8fa475
21 changed files with 5012 additions and 579 deletions
--- a/skills/audio-transcriber/SKILL.md
+++ b/skills/audio-transcriber/SKILL.md
@@ -0,0 +1,558 @@
+---
+name: audio-transcriber
+description: "Transform audio recordings into professional Markdown documentation with intelligent summaries using LLM integration"
+version: 1.2.0
+author: Eric Andrade
+created: 2025-02-01
+updated: 2026-02-04
+platforms: [github-copilot-cli, claude-code, codex]
+category: content
+tags: [audio, transcription, whisper, meeting-minutes, speech-to-text]
+risk: safe
+---
+
+## Purpose
+
+This skill automates audio-to-text transcription with professional Markdown output, extracting rich technical metadata (speakers, timestamps, language, file size, duration) and generating structured meeting minutes and executive summaries. It uses Faster-Whisper or Whisper with zero configuration, working universally across projects without hardcoded paths or API keys.
+
+Inspired by tools like Plaud, this skill transforms raw audio recordings into actionable documentation, making it ideal for meetings, interviews, lectures, and content analysis.
+
+## When to Use
+
+Invoke this skill when:
+
+- User needs to transcribe audio/video files to text
+- User wants meeting minutes automatically generated from recordings
+- User requires speaker identification (diarization) in conversations
+- User needs subtitles/captions (SRT, VTT formats)
+- User wants executive summaries of long audio content
+- User asks variations of "transcribe this audio", "convert audio to text", "generate meeting notes from recording"
+- User has audio files in common formats (MP3, WAV, M4A, OGG, FLAC, WEBM)
+
+## Workflow
+
+### Step 0: Discovery (Auto-detect Transcription Tools)
+
+**Objective:** Identify available transcription engines without user configuration.
+
+**Actions:**
+
+Run detection commands to find installed tools:
+
+```bash
+# Check for Faster-Whisper (preferred - 4-5x faster)
+if python3 -c "import faster_whisper" 2>/dev/null; then
+    TRANSCRIBER="faster-whisper"
+    echo "✅ Faster-Whisper detected (optimized)"
+# Fallback to original Whisper
+elif python3 -c "import whisper" 2>/dev/null; then
+    TRANSCRIBER="whisper"
+    echo "✅ OpenAI Whisper detected"
+else
+    TRANSCRIBER="none"
+    echo "⚠️  No transcription tool found"
+fi
+
+# Check for ffmpeg (audio format conversion)
+if command -v ffmpeg &>/dev/null; then
+    echo "✅ ffmpeg available (format conversion enabled)"
+else
+    echo "ℹ️  ffmpeg not found (limited format support)"
+fi
+```
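The same discovery logic can be sketched in Python, which avoids spawning one interpreter per check. This is a minimal sketch and not part of the skill: `detect_transcriber` and `has_ffmpeg` are hypothetical names, and the check only confirms that a module can be located, not that it loads without errors.

```python
import importlib.util
import shutil


def detect_transcriber(candidates=("faster_whisper", "whisper")):
    """Return the first locatable engine name, or "none".

    Candidates are checked in order of preference, mirroring the
    bash detection above (Faster-Whisper first, then Whisper).
    """
    for module_name in candidates:
        # find_spec locates a top-level module without importing it
        if importlib.util.find_spec(module_name) is not None:
            return module_name.replace("_", "-")
    return "none"


def has_ffmpeg():
    """Return True if an ffmpeg executable is on PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    print(f"Transcriber: {detect_transcriber()}")
    print(f"ffmpeg available: {has_ffmpeg()}")
```

Passing the candidate list as a parameter keeps the preference order in one place and makes the function easy to exercise with modules that are known to exist.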
+
+**If no transcriber found:**
+
+Offer automatic installation using the provided script:
+
+```bash
+echo "⚠️  No transcription tool found"
+echo ""
+echo "🔧 Auto-install dependencies? (Recommended)"
+read -p "Run installation script? [Y/n]: " AUTO_INSTALL
+
+if [[ ! "$AUTO_INSTALL" =~ ^[Nn] ]]; then
+    # Get skill directory (works for both repo and symlinked installations)
+    SKILL_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+    
+    # Run installation script
+    if [[ -f "$SKILL_DIR/scripts/install-requirements.sh" ]]; then
+        bash "$SKILL_DIR/scripts/install-requirements.sh"
+    else
+        echo "❌ Installation script not found"
+        echo ""
+        echo "📦 Manual installation:"
+        echo "  pip install faster-whisper  # Recommended"
+        echo "  pip install openai-whisper  # Alternative"
+        echo "  brew install ffmpeg         # Optional (macOS)"
+        exit 1
+    fi
+    
+    # Verify installation succeeded
+    if python3 -c "import faster_whisper" 2>/dev/null || python3 -c "import whisper" 2>/dev/null; then
+        echo "✅ Installation successful! Proceeding with transcription..."
+    else
+        echo "❌ Installation failed. Please install manually."
+        exit 1
+    fi
+else
+    echo ""
+    echo "📦 Manual installation required:"
+    echo ""
+    echo "Recommended (fastest):"
+    echo "  pip install faster-whisper"
+    echo ""
+    echo "Alternative (original):"
+    echo "  pip install openai-whisper"
+    echo ""
+    echo "Optional (format conversion):"
+    echo "  brew install ffmpeg  # macOS"
+    echo "  apt install ffmpeg   # Linux"
+    echo ""
+    exit 1
+fi
+```
+
+This ensures users can install dependencies with one confirmation, or opt for manual installation if preferred.
+
+**If transcriber found:**
+
+Proceed to Step 0b (CLI Detection).
+
+
+### Step 1: Validate Audio File
+
+**Objective:** Verify file exists, check format, and extract metadata.
+
+**Actions:**
+
+1. **Accept file path or URL** from user:
+   - Local file: `meeting.mp3`
+   - URL: `https://example.com/audio.mp3` (download to temp directory)
+
+2. **Verify file exists:**
+
+```bash
+if [[ ! -f "$AUDIO_FILE" ]]; then
+    echo "❌ File not found: $AUDIO_FILE"
+    exit 1
+fi
+```
+
+3. **Extract metadata** using ffprobe or file utilities:
+
+```bash
+# Get file size
+FILE_SIZE=$(du -h "$AUDIO_FILE" | cut -f1)
+
+# Get duration and format using ffprobe
+DURATION=$(ffprobe -v error -show_entries format=duration \
+    -of default=noprint_wrappers=1:nokey=1 "$AUDIO_FILE" 2>/dev/null)
+FORMAT=$(ffprobe -v error -select_streams a:0 -show_entries \
+    stream=codec_name -of default=noprint_wrappers=1:nokey=1 "$AUDIO_FILE" 2>/dev/null)
+
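+# Convert duration (float seconds from ffprobe) to HH:MM:SS.
+# `date -r` is avoided: it takes whole seconds on macOS and a file path on Linux.
+if [[ "$DURATION" =~ ^[0-9]+ ]]; then
+    SECS="${BASH_REMATCH[0]}"
+    DURATION_HMS=$(printf '%02d:%02d:%02d' $((SECS / 3600)) $((SECS % 3600 / 60)) $((SECS % 60)))
+else
+    DURATION_HMS="Unknown"
+fi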
+```
+
+4. **Check file size** (warn if large for cloud APIs):
+
+```bash
+SIZE_MB=$(du -m "$AUDIO_FILE" | cut -f1)
+if [[ $SIZE_MB -gt 25 ]]; then
+    echo "⚠️  Large file ($FILE_SIZE) - processing may take several minutes"
+fi
+```
+
+5. **Validate format** (supported: MP3, WAV, M4A, OGG, FLAC, WEBM, MP4):
+
+```bash
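+EXTENSION="${AUDIO_FILE##*.}"
+# Lowercase with tr: the ${VAR,,} expansion needs bash 4+, and macOS ships bash 3.2
+EXTENSION_LC=$(printf '%s' "$EXTENSION" | tr '[:upper:]' '[:lower:]')
+SUPPORTED_FORMATS=("mp3" "wav" "m4a" "ogg" "flac" "webm" "mp4")
+
+if [[ ! " ${SUPPORTED_FORMATS[*]} " =~ " ${EXTENSION_LC} " ]]; then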
+    echo "⚠️  Unsupported format: $EXTENSION"
+    if command -v ffmpeg &>/dev/null; then
+        echo "🔄 Converting to WAV..."
+        ffmpeg -i "$AUDIO_FILE" -ar 16000 "${AUDIO_FILE%.*}.wav" -y
+        AUDIO_FILE="${AUDIO_FILE%.*}.wav"
+    else
+        echo "❌ Install ffmpeg to convert formats: brew install ffmpeg"
+        exit 1
+    fi
+fi
+```
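The duration formatting and extension check from this step can also be expressed without shell-specific features. The sketch below is illustrative only: `format_duration` and `is_supported` are hypothetical helpers, and the format set is copied from the `SUPPORTED_FORMATS` array above.

```python
from pathlib import Path

# Copied from the SUPPORTED_FORMATS array in the bash snippet above
SUPPORTED_FORMATS = {"mp3", "wav", "m4a", "ogg", "flac", "webm", "mp4"}


def format_duration(seconds):
    """Convert seconds (float or numeric string, as ffprobe prints) to HH:MM:SS.

    Returns "Unknown" when the value is missing or not a finite number.
    """
    try:
        total = int(float(seconds))
    except (TypeError, ValueError, OverflowError):
        return "Unknown"
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def is_supported(path):
    """Return True if the file extension is in SUPPORTED_FORMATS (case-insensitive)."""
    extension = Path(path).suffix.lstrip(".").lower()
    return extension in SUPPORTED_FORMATS


if __name__ == "__main__":
    print(format_duration("2732.4"))    # 00:45:32
    print(is_supported("Meeting.MP3"))  # True
```

Truncating to whole seconds before splitting into hours, minutes, and seconds keeps the output stable for the fractional values ffprobe reports.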
+
+
+### Step 3: Generate Markdown Output
+
+**Objective:** Create structured Markdown with metadata, transcription, meeting minutes, and summary.
+
+**Output Template:**
+
+```markdown
+# Audio Transcription Report
+
+## 📊 Metadata
+
+| Field | Value |
+|-------|-------|
+| **File Name** | {filename} |
+| **File Size** | {file_size} |
+| **Duration** | {duration_hms} |
+| **Language** | {language} ({language_code}) |
+| **Processed Date** | {process_date} |
+| **Speakers Identified** | {num_speakers} |
+| **Transcription Engine** | {engine} (model: {model}) |
+
+
+## 📋 Meeting Minutes
+
+### Participants
+- {speaker_1}
+- {speaker_2}
+- ...
+
+### Topics Discussed
+1. **{topic_1}** ({timestamp})
+   - {key_point_1}
+   - {key_point_2}
+
+2. **{topic_2}** ({timestamp})
+   - {key_point_1}
+
+### Decisions Made
+- ✅ {decision_1}
+- ✅ {decision_2}
+
+### Action Items
+- [ ] **{action_1}** - Assigned to: {speaker} - Due: {date_if_mentioned}
+- [ ] **{action_2}** - Assigned to: {speaker}
+
+
+*Generated by audio-transcriber skill v1.2.0*  
+*Transcription engine: {engine} | Processing time: {elapsed_time}s*
+```
+
+**Implementation:**
+
+Use Python or bash with AI model (Claude/GPT) for intelligent summarization:
+
+```python
+def generate_meeting_minutes(segments):
+    """Extract topics, decisions, action items from transcription."""
+    
+    # Group segments by topic (simple clustering by timestamps)
+    topics = cluster_by_topic(segments)
+    
+    # Identify action items (keywords: "should", "will", "need to", "action")
+    action_items = extract_action_items(segments)
+    
+    # Identify decisions (keywords: "decided", "agreed", "approved")
+    decisions = extract_decisions(segments)
+    
+    return {
+        "topics": topics,
+        "decisions": decisions,
+        "action_items": action_items
+    }
+
+def generate_summary(segments, max_paragraphs=5):
+    """Create executive summary using AI (Claude/GPT via API or local model)."""
+    
+    full_text = " ".join([s["text"] for s in segments])
+    
+    # Use Chain of Density approach (from prompt-engineer frameworks)
+    summary_prompt = f"""
+    Summarize the following transcription in {max_paragraphs} concise paragraphs.
+    Focus on key topics, decisions, and action items.
+    
+    Transcription:
+    {full_text}
+    """
+    
+    # Call AI model (placeholder - user can integrate Claude API or use local model)
+    summary = call_ai_model(summary_prompt)
+    
+    return summary
+```
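The keyword lists in the comments above suggest a simple rule-based pass before any AI summarization. The sketch below shows one way the `extract_action_items` and `extract_decisions` placeholders could be filled in; it assumes each segment is a dict with a `"text"` key, as in `generate_summary`, and it is a crude heuristic rather than the skill's actual implementation (a word like "will" matches any future-tense sentence).

```python
import re

# Keyword lists taken from the comments in generate_meeting_minutes
ACTION_KEYWORDS = ("should", "will", "need to", "action")
DECISION_KEYWORDS = ("decided", "agreed", "approved")


def _matches_any(text, keywords):
    """Return True if any keyword occurs in text as a whole word or phrase."""
    pattern = r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def extract_action_items(segments):
    """Return the segments whose text contains an action keyword."""
    return [s for s in segments if _matches_any(s["text"], ACTION_KEYWORDS)]


def extract_decisions(segments):
    """Return the segments whose text contains a decision keyword."""
    return [s for s in segments if _matches_any(s["text"], DECISION_KEYWORDS)]


if __name__ == "__main__":
    sample = [
        {"start": 0.0, "text": "We decided to ship on Friday."},
        {"start": 5.0, "text": "Ana will update the docs."},
        {"start": 9.0, "text": "The weather was nice."},
    ]
    print(len(extract_decisions(sample)), len(extract_action_items(sample)))
```

Word-boundary matching avoids hits inside longer words (for example "will" inside "goodwill"), but the results should still be treated as candidates for the AI model or a human to confirm.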
+
+**Output file naming:**
+
+```bash
+# v1.1.0: Use a timestamp to avoid overwriting earlier output
+TIMESTAMP=$(date +%Y%m%d-%H%M%S)
+TRANSCRIPT_FILE="transcript-${TIMESTAMP}.md"
+ATA_FILE="ata-${TIMESTAMP}.md"  # "ata" = meeting minutes
+
+echo "$TRANSCRIPT_CONTENT" > "$TRANSCRIPT_FILE"
+echo "✅ Transcript saved: $TRANSCRIPT_FILE"
+
+if [[ -n "$ATA_CONTENT" ]]; then
+    echo "$ATA_CONTENT" > "$ATA_FILE"
+    echo "✅ Minutes saved: $ATA_FILE"
+fi
+```
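For callers that drive the skill from Python, the same naming scheme might look like the sketch below. `output_paths` and `save_outputs` are hypothetical helpers; only the `transcript-` and `ata-` prefixes and the timestamp pattern come from the bash snippet above.

```python
from datetime import datetime
from pathlib import Path


def output_paths(directory=".", now=None):
    """Return (transcript_path, minutes_path) sharing one timestamp."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    base = Path(directory)
    return base / f"transcript-{stamp}.md", base / f"ata-{stamp}.md"


def save_outputs(transcript, minutes=None, directory=".", now=None):
    """Write the transcript and, if present, the minutes; return the paths written."""
    transcript_file, minutes_file = output_paths(directory, now)
    transcript_file.write_text(transcript, encoding="utf-8")
    written = [transcript_file]
    if minutes:
        # The minutes file is optional, as in the bash version
        minutes_file.write_text(minutes, encoding="utf-8")
        written.append(minutes_file)
    return written
```

Accepting `now` as a parameter is a design choice for testability: both files receive the same timestamp, and tests can pin the clock.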
+
+
+#### **SCENARIO A: User Provided Custom Prompt**
+
+**Workflow:**
+
+1. **Display user's prompt:**
+   ```
+   📝 User-provided prompt:
+   ┌──────────────────────────────────┐
+   │ [User's prompt preview]          │
+   └──────────────────────────────────┘
+   ```
+
+2. **Automatically improve with prompt-engineer (if available):**
+   ```bash
+   🔧 Improving prompt with prompt-engineer...
+   [Invokes: gh copilot -p "improve this prompt: {user_prompt}"]
+   ```
+
+3. **Show both versions:**
+   ```
+   ✨ Improved version:
+   ┌──────────────────────────────────┐
+   │ Role: You are a documenter...    │
+   │ Instructions: Transform...       │
+   │ Steps: 1) ... 2) ...             │
+   │ End Goal: ...                    │
+   └──────────────────────────────────┘
+
+   📝 Original version:
+   ┌──────────────────────────────────┐
+   │ [User's original prompt]         │
+   └──────────────────────────────────┘
+   ```
+
+4. **Ask which to use:**
+   ```bash
+   💡 Use improved version? [y/n] (default: y):
+   ```
+
+5. **Process with selected prompt:**
+   - If "y": use improved
+   - If "n": use original
+
+
+#### **LLM Processing (Both Scenarios)**
+
+Once prompt is finalized:
+
+```python
+import subprocess
+
+from rich.progress import Progress, SpinnerColumn, TextColumn
+
+def process_with_llm(transcript, prompt, cli_tool='claude'):
+    """Send the prompt plus transcript to the selected CLI; return its output, or None on failure."""
+    full_prompt = f"{prompt}\n\n---\n\nTranscript:\n\n{transcript}"
+    
+    with Progress(
+        SpinnerColumn(),
+        TextColumn("[progress.description]{task.description}"),
+        transient=True
+    ) as progress:
+        progress.add_task(
+            description=f"🤖 Processing with {cli_tool}...",
+            total=None
+        )
+        
+        if cli_tool == 'claude':
+            result = subprocess.run(
+                ['claude', '-'],
+                input=full_prompt,
+                capture_output=True,
+                text=True,
+                timeout=300  # 5 minutes
+            )
+        elif cli_tool == 'gh-copilot':
+            result = subprocess.run(
+                ['gh', 'copilot', 'suggest', '-t', 'shell', full_prompt],
+                capture_output=True,
+                text=True,
+                timeout=300
+            )
+        else:
+            # Without this branch, `result` would be undefined below
+            raise ValueError(f"Unsupported cli_tool: {cli_tool}")
+    
+    if result.returncode == 0:
+        return result.stdout.strip()
+    else:
+        return None
+```
+
+**Progress output:**
+```
+🤖 Processing with claude... ⠋
+[After completion:]
+✅ Minutes generated successfully!
+```
+
+
+#### **Final Output**
+
+**Success (both files):**
+```bash
+💾 Saving files...
+
+✅ Files created:
+  - transcript-20260203-023045.md  (raw transcript)
+  - ata-20260203-023045.md         (processed with LLM)
+
+🧹 Removed temporary files: metadata.json, transcription.json
+
+✅ Done! Total time: 3m 45s
+```
+
+**Transcript only (user declined LLM):**
+```bash
+💾 Saving files...
+
+✅ File created:
+  - transcript-20260203-023045.md
+
+ℹ️  Minutes not generated (LLM processing declined by user)
+
+🧹 Removed temporary files: metadata.json, transcription.json
+
+✅ Done!
+```
+
+
+### Step 5: Display Results Summary
+
+**Objective:** Show completion status and next steps.
+
+**Output:**
+
+```bash
+echo ""
+echo "✅ Transcription Complete!"
+echo ""
+echo "📊 Results:"
+echo "  File: $OUTPUT_FILE"
+echo "  Language: $LANGUAGE"
+echo "  Duration: $DURATION_HMS"
+echo "  Speakers: $NUM_SPEAKERS"
+echo "  Words: $WORD_COUNT"
+echo "  Processing time: ${ELAPSED_TIME}s"
+echo ""
+echo "📝 Generated:"
+echo "  - $OUTPUT_FILE (Markdown report)"
+# If alternative formats were requested:
+echo "  - ${OUTPUT_FILE%.*}.srt (Subtitles)"
+echo "  - ${OUTPUT_FILE%.*}.json (Structured data)"
+echo ""
+echo "🎯 Next steps:"
+echo "  1. Review meeting minutes and action items"
+echo "  2. Share report with participants"
+echo "  3. Track action items to completion"
+```
+
+
+## Example Usage
+
+### **Example 1: Basic Transcription**
+
+**User Input:**
+```bash
+copilot> transcribe audio to markdown: meeting-2026-02-02.mp3
+```
+
+**Skill Output:**
+
+```bash
+✅ Faster-Whisper detected (optimized)
+✅ ffmpeg available (format conversion enabled)
+
+📂 File: meeting-2026-02-02.mp3
+📊 Size: 12.3 MB
+⏱️  Duration: 00:45:32
+
+🎙️  Processing...
+[████████████████████] 100%
+
+✅ Language detected: Portuguese (pt-BR)
+👥 Speakers identified: 4
+📝 Generating Markdown output...
+
+✅ Transcription Complete!
+
+📊 Results:
+  File: meeting-2026-02-02.md
+  Language: pt-BR
+  Duration: 00:45:32
+  Speakers: 4
+  Words: 6,842
+  Processing time: 127s
+
+📝 Generated:
+  - meeting-2026-02-02.md (Markdown report)
+
+🎯 Next steps:
+  1. Review meeting minutes and action items
+  2. Share report with participants
+  3. Track action items to completion
+```
+
+
+### **Example 3: Batch Processing**
+
+**User Input:**
+```bash
+copilot> transcribe these audio files: recordings/*.mp3
+```
+
+**Skill Output:**
+
+```bash
+📦 Batch mode: 5 files found
+  1. team-standup.mp3
+  2. client-call.mp3
+  3. brainstorm-session.mp3
+  4. product-demo.mp3
+  5. retrospective.mp3
+
+🎙️  Processing batch...
+
+[1/5] team-standup.mp3 ✅ (2m 34s)
+[2/5] client-call.mp3 ✅ (15m 12s)
+[3/5] brainstorm-session.mp3 ✅ (8m 47s)
+[4/5] product-demo.mp3 ✅ (22m 03s)
+[5/5] retrospective.mp3 ✅ (11m 28s)
+
+✅ Batch Complete!
+📝 Generated 5 Markdown reports
+⏱️  Total processing time: 6m 15s
+```
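A batch run like the one above boils down to expanding the glob, processing files in a stable order, and reporting per-file status so one failure does not abort the rest. The sketch below illustrates that loop; `process_batch` is a hypothetical helper, and `transcribe` stands in for whatever single-file transcription function the caller supplies.

```python
import glob
import time


def process_batch(pattern, transcribe):
    """Run `transcribe(path)` on every file matching `pattern`.

    `transcribe` is a caller-supplied callable (an assumption of this
    sketch). Returns a list of (path, status, elapsed_seconds) tuples.
    """
    files = sorted(glob.glob(pattern))
    results = []
    for index, path in enumerate(files, start=1):
        started = time.monotonic()
        try:
            transcribe(path)
            status = "ok"
        except Exception as exc:
            # Record the failure and continue with the remaining files
            status = f"failed: {exc}"
        elapsed = time.monotonic() - started
        print(f"[{index}/{len(files)}] {path} {status} ({elapsed:.1f}s)")
        results.append((path, status, elapsed))
    return results
```

Sorting the matches gives a deterministic order across platforms, since `glob.glob` makes no ordering guarantee.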
+
+
+### **Example 5: Large File Warning**
+
+**User Input:**
+```bash
+copilot> transcribe audio to markdown: conference-keynote.mp3
+```
+
+**Skill Output:**
+
+```bash
+✅ Faster-Whisper detected (optimized)
+
+📂 File: conference-keynote.mp3
+📊 Size: 87.2 MB
+⏱️  Duration: 02:15:47
+⚠️  Large file (87.2 MB) - processing may take several minutes
+
+Continue? [Y/n]:
+```
+
+**User:** `Y`
+
+```bash
+🎙️  Processing... (this may take 10-15 minutes)
+[████░░░░░░░░░░░░░░░░] 20% - Estimated time remaining: 12m
+```
+
+
+This skill is **platform-agnostic**: it works in any terminal where a supported CLI (GitHub Copilot CLI, Claude Code, or Codex) is available. Following the zero-configuration philosophy, it does not depend on project-specific configuration or external APIs.