feat: Add Official Microsoft & Gemini Skills (845+ Total)
## 🚀 Impact

Significantly expands the capabilities of **Antigravity Awesome Skills** by integrating official skill collections from **Microsoft** and **Google Gemini**. This update raises the total skill count to **845+**, making the library even more comprehensive for AI coding assistants.

## ✨ Key Changes

### 1. New Official Skills
- **Microsoft Skills**: Added a large collection of official skills from [microsoft/skills](https://github.com/microsoft/skills).
  - Includes Azure, .NET, Python, TypeScript, and Semantic Kernel skills.
  - Preserves the original directory structure under `skills/official/microsoft/`.
  - Includes plugin skills from the `.github/plugins` directory.
- **Gemini Skills**: Added official Gemini API development skills under `skills/gemini-api-dev/`.

### 2. New Scripts & Tooling
- **`scripts/sync_microsoft_skills.py`**: A robust synchronization script that:
  - Clones the official Microsoft repository.
  - Preserves the original directory hierarchy.
  - Handles symlinks and plugin locations.
  - Generates attribution metadata.
- **`scripts/tests/inspect_microsoft_repo.py`**: Debug tool for inspecting the remote repository structure.
- **`scripts/tests/test_comprehensive_coverage.py`**: Verification script that ensures 100% of skills are captured during sync.

### 3. Core Improvements
- **`scripts/generate_index.py`**: Enhanced frontmatter parsing to safely handle unquoted values containing `@` symbols and commas (fixing issues with some Microsoft skill descriptions).
- **`package.json`**: Added `sync:microsoft` and `sync:all-official` scripts for easier maintenance.

### 4. Documentation
- Updated `README.md` to reflect the new skill count (845+) and added Microsoft/Gemini to the provider list.
- Updated `CATALOG.md` and `skills_index.json` with the new skills.

## 🧪 Verification
- Ran `scripts/tests/test_comprehensive_coverage.py` to verify all Microsoft skills are detected.
- Validated the `generate_index.py` fixes by successfully indexing the new skills.
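The frontmatter fix described above can be illustrated with a minimal sketch. This is a hypothetical helper, not the actual `generate_index.py` code: the idea is that a strict YAML loader can choke on unquoted values containing `@` or commas, so the parser treats everything after the first `:` on a line as a raw string.

```python
def parse_frontmatter(text: str) -> dict:
    """Parse simple 'key: value' frontmatter, tolerating unquoted
    values that contain '@' symbols or commas (no YAML library needed)."""
    fields = {}
    for line in text.strip().splitlines():
        if ":" not in line:
            continue  # skip lines without a key/value separator
        key, _, value = line.partition(":")  # split on the FIRST colon only
        fields[key.strip()] = value.strip()
    return fields

fm = parse_frontmatter("name: azure-skill\ndescription: Email me @ team, or file an issue")
# fm["description"] keeps the '@' and comma intact
```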
skills/official/microsoft/python/foundry/voicelive/SKILL.md (new file, 309 lines)
---
name: azure-ai-voicelive-py
description: Build real-time voice AI applications using Azure AI Voice Live SDK (azure-ai-voicelive). Use this skill when creating Python applications that need real-time bidirectional audio communication with Azure AI, including voice assistants, voice-enabled chatbots, real-time speech-to-speech translation, voice-driven avatars, or any WebSocket-based audio streaming with AI models. Supports Server VAD (Voice Activity Detection), turn-based conversation, function calling, MCP tools, avatar integration, and transcription.
package: azure-ai-voicelive
---
# Azure AI Voice Live SDK

Build real-time voice AI applications with bidirectional WebSocket communication.

## Installation

```bash
pip install azure-ai-voicelive aiohttp azure-identity
```

## Environment Variables

```bash
AZURE_COGNITIVE_SERVICES_ENDPOINT=https://<region>.api.cognitive.microsoft.com
# For API key auth (not recommended for production)
AZURE_COGNITIVE_SERVICES_KEY=<api-key>
```
## Authentication

**DefaultAzureCredential (preferred)**:
```python
import os

from azure.ai.voicelive.aio import connect
from azure.identity.aio import DefaultAzureCredential

async with connect(
    endpoint=os.environ["AZURE_COGNITIVE_SERVICES_ENDPOINT"],
    credential=DefaultAzureCredential(),
    model="gpt-4o-realtime-preview",
    credential_scopes=["https://cognitiveservices.azure.com/.default"]
) as conn:
    ...
```

**API Key**:
```python
import os

from azure.ai.voicelive.aio import connect
from azure.core.credentials import AzureKeyCredential

async with connect(
    endpoint=os.environ["AZURE_COGNITIVE_SERVICES_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_COGNITIVE_SERVICES_KEY"]),
    model="gpt-4o-realtime-preview"
) as conn:
    ...
```
## Quick Start

```python
import asyncio
import os

from azure.ai.voicelive.aio import connect
from azure.identity.aio import DefaultAzureCredential

async def main():
    async with connect(
        endpoint=os.environ["AZURE_COGNITIVE_SERVICES_ENDPOINT"],
        credential=DefaultAzureCredential(),
        model="gpt-4o-realtime-preview",
        credential_scopes=["https://cognitiveservices.azure.com/.default"]
    ) as conn:
        # Update session with instructions
        await conn.session.update(session={
            "instructions": "You are a helpful assistant.",
            "modalities": ["text", "audio"],
            "voice": "alloy"
        })

        # Listen for events
        async for event in conn:
            print(f"Event: {event.type}")
            if event.type == "response.audio_transcript.done":
                print(f"Transcript: {event.transcript}")
            elif event.type == "response.done":
                break

asyncio.run(main())
```
## Core Architecture

### Connection Resources

The `VoiceLiveConnection` exposes these resources:

| Resource | Purpose | Key Methods |
|----------|---------|-------------|
| `conn.session` | Session configuration | `update(session=...)` |
| `conn.response` | Model responses | `create()`, `cancel()` |
| `conn.input_audio_buffer` | Audio input | `append()`, `commit()`, `clear()` |
| `conn.output_audio_buffer` | Audio output | `clear()` |
| `conn.conversation` | Conversation state | `item.create()`, `item.delete()`, `item.truncate()` |
| `conn.transcription_session` | Transcription config | `update(session=...)` |
## Session Configuration

```python
from azure.ai.voicelive.models import RequestSession, FunctionTool

await conn.session.update(session=RequestSession(
    instructions="You are a helpful voice assistant.",
    modalities=["text", "audio"],
    voice="alloy",  # or "echo", "shimmer", "sage", etc.
    input_audio_format="pcm16",
    output_audio_format="pcm16",
    turn_detection={
        "type": "server_vad",
        "threshold": 0.5,
        "prefix_padding_ms": 300,
        "silence_duration_ms": 500
    },
    tools=[
        FunctionTool(
            type="function",
            name="get_weather",
            description="Get current weather",
            parameters={
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        )
    ]
))
```
## Audio Streaming

### Send Audio (Base64 PCM16)

```python
import base64

# Read audio chunk (16-bit PCM, 24kHz mono)
audio_chunk = await read_audio_from_microphone()
b64_audio = base64.b64encode(audio_chunk).decode()

await conn.input_audio_buffer.append(audio=b64_audio)
```

### Receive Audio

```python
async for event in conn:
    if event.type == "response.audio.delta":
        audio_bytes = base64.b64decode(event.delta)
        await play_audio(audio_bytes)
    elif event.type == "response.audio.done":
        print("Audio complete")
```
## Event Handling

```python
import base64
import json

async for event in conn:
    match event.type:
        # Session events
        case "session.created":
            print(f"Session: {event.session}")
        case "session.updated":
            print("Session updated")

        # Audio input events
        case "input_audio_buffer.speech_started":
            print(f"Speech started at {event.audio_start_ms}ms")
        case "input_audio_buffer.speech_stopped":
            print(f"Speech stopped at {event.audio_end_ms}ms")

        # Transcription events
        case "conversation.item.input_audio_transcription.completed":
            print(f"User said: {event.transcript}")
        case "conversation.item.input_audio_transcription.delta":
            print(f"Partial: {event.delta}")

        # Response events
        case "response.created":
            print(f"Response started: {event.response.id}")
        case "response.audio_transcript.delta":
            print(event.delta, end="", flush=True)
        case "response.audio.delta":
            audio = base64.b64decode(event.delta)
        case "response.done":
            print(f"Response complete: {event.response.status}")

        # Function calls
        case "response.function_call_arguments.done":
            result = handle_function(event.name, event.arguments)
            await conn.conversation.item.create(item={
                "type": "function_call_output",
                "call_id": event.call_id,
                "output": json.dumps(result)
            })
            await conn.response.create()

        # Errors
        case "error":
            print(f"Error: {event.error.message}")
```
## Common Patterns

### Manual Turn Mode (No VAD)

```python
await conn.session.update(session={"turn_detection": None})

# Manually control turns
await conn.input_audio_buffer.append(audio=b64_audio)
await conn.input_audio_buffer.commit()  # End of user turn
await conn.response.create()  # Trigger response
```

### Interrupt Handling

```python
async for event in conn:
    if event.type == "input_audio_buffer.speech_started":
        # User interrupted - cancel current response
        await conn.response.cancel()
        await conn.output_audio_buffer.clear()
```
### Conversation History

```python
# Add system message
await conn.conversation.item.create(item={
    "type": "message",
    "role": "system",
    "content": [{"type": "input_text", "text": "Be concise."}]
})

# Add user message
await conn.conversation.item.create(item={
    "type": "message",
    "role": "user",
    "content": [{"type": "input_text", "text": "Hello!"}]
})

await conn.response.create()
```
## Voice Options

| Voice | Description |
|-------|-------------|
| `alloy` | Neutral, balanced |
| `echo` | Warm, conversational |
| `shimmer` | Clear, professional |
| `sage` | Calm, authoritative |
| `coral` | Friendly, upbeat |
| `ash` | Deep, measured |
| `ballad` | Expressive |
| `verse` | Storytelling |

Azure voices: Use `AzureStandardVoice`, `AzureCustomVoice`, or `AzurePersonalVoice` models.
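As a sketch of selecting an Azure voice, the dict form of the session payload can carry a voice object instead of a bare OpenAI voice name. The discriminator string `"azure-standard"` and the voice name `en-US-AvaNeural` are assumed examples here, not verified values; check the SDK's voice models and the Azure voice gallery for the real ones.

```python
# Sketch: choosing an Azure standard voice via the dict session form.
# "azure-standard" is an assumed discriminator for AzureStandardVoice,
# and "en-US-AvaNeural" is an assumed example voice name.
session_config = {
    "voice": {
        "type": "azure-standard",
        "name": "en-US-AvaNeural",
    },
    "modalities": ["text", "audio"],
}

# Applied like any other session update:
#   await conn.session.update(session=session_config)
```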
## Audio Formats

| Format | Sample Rate | Use Case |
|--------|-------------|----------|
| `pcm16` | 24kHz | Default, high quality |
| `pcm16-8000hz` | 8kHz | Telephony |
| `pcm16-16000hz` | 16kHz | Voice assistants |
| `g711_ulaw` | 8kHz | Telephony (US) |
| `g711_alaw` | 8kHz | Telephony (EU) |
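For example, a telephony integration would pick the 8 kHz formats from the table. This is a minimal sketch of the session fields involved, using the format strings listed above:

```python
# Sketch: session fields for a US telephony integration, taking the
# 8 kHz format names from the table above.
telephony_session = {
    "input_audio_format": "pcm16-8000hz",  # 8 kHz PCM from the phone line
    "output_audio_format": "g711_ulaw",    # mu-law encoding (US telephony)
}

# Applied like any other session update:
#   await conn.session.update(session=telephony_session)
```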
## Turn Detection Options

```python
# Server VAD (default)
{"type": "server_vad", "threshold": 0.5, "silence_duration_ms": 500}

# Azure Semantic VAD (smarter detection)
{"type": "azure_semantic_vad"}
{"type": "azure_semantic_vad_en"}  # English optimized
{"type": "azure_semantic_vad_multilingual"}
```
## Error Handling

```python
from azure.ai.voicelive.aio import ConnectionError, ConnectionClosed

try:
    async with connect(...) as conn:
        async for event in conn:
            if event.type == "error":
                print(f"API Error: {event.error.code} - {event.error.message}")
except ConnectionClosed as e:
    print(f"Connection closed: {e.code} - {e.reason}")
except ConnectionError as e:
    print(f"Connection error: {e}")
```
## References

- **Detailed API Reference**: See [references/api-reference.md](references/api-reference.md)
- **Complete Examples**: See [references/examples.md](references/examples.md)
- **All Models & Types**: See [references/models.md](references/models.md)