refactor: flatten Microsoft skills from nested to flat directory structure

Rewrote sync_microsoft_skills.py (v4) to use each SKILL.md's frontmatter 'name' field as the flat directory name under skills/, replacing the nested skills/official/microsoft/<lang>/<category>/<service>/ hierarchy. This fixes CI failures caused by the indexing, validation, and catalog scripts expecting skills/<id>/SKILL.md (depth 1).

Changes:
- Rewrite scripts/sync_microsoft_skills.py for flat output with collision detection
- Update scripts/tests/inspect_microsoft_repo.py for flat name mapping
- Update scripts/tests/test_comprehensive_coverage.py for name uniqueness checks
- Delete skills/official/ nested directory
- Add 129 Microsoft skills as flat directories (e.g. skills/azure-mgmt-botservice-dotnet/)
- Move attribution files to docs/ (LICENSE-MICROSOFT, microsoft-skills-attribution.json)
- Rebuild skills_index.json, CATALOG.md, README.md (845 total skills)
2026-02-12 00:07:15 +05:00
parent e06454dafd
commit e7ae616385
142 changed files with 5683 additions and 6097 deletions
--- a/skills/azure-ai-voicelive-ts/SKILL.md
+++ b/skills/azure-ai-voicelive-ts/SKILL.md
@@ -0,0 +1,465 @@
+---
+name: azure-ai-voicelive-ts
+description: |
+  Azure AI Voice Live SDK for JavaScript/TypeScript. Build real-time voice AI applications with bidirectional WebSocket communication. Use for voice assistants, conversational AI, real-time speech-to-speech, and voice-enabled chatbots in Node.js or browser environments. Triggers: "voice live", "real-time voice", "VoiceLiveClient", "VoiceLiveSession", "voice assistant TypeScript", "bidirectional audio", "speech-to-speech JavaScript".
+package: "@azure/ai-voicelive"
+---
+
+# @azure/ai-voicelive (JavaScript/TypeScript)
+
+Real-time voice AI SDK for building bidirectional voice assistants with Azure AI in Node.js and browser environments.
+
+## Installation
+
+```bash
+npm install @azure/ai-voicelive @azure/identity
+# TypeScript users (type definitions are a dev dependency)
+npm install --save-dev @types/node
+```
+
+**Current Version**: 1.0.0-beta.3
+
+**Supported Environments**:
+- Node.js LTS versions (20+)
+- Modern browsers (Chrome, Firefox, Safari, Edge)
+
+## Environment Variables
+
+```bash
+AZURE_VOICELIVE_ENDPOINT=https://<resource>.cognitiveservices.azure.com
+# Optional: API key if not using Entra ID
+AZURE_VOICELIVE_API_KEY=<your-api-key>
+# Optional: Logging
+AZURE_LOG_LEVEL=info
+```
+
+## Authentication
+
+### Microsoft Entra ID (Recommended)
+
+```typescript
+import { DefaultAzureCredential } from "@azure/identity";
+import { VoiceLiveClient } from "@azure/ai-voicelive";
+
+const credential = new DefaultAzureCredential();
+const endpoint = "https://your-resource.cognitiveservices.azure.com";
+
+const client = new VoiceLiveClient(endpoint, credential);
+```
+
+### API Key
+
+```typescript
+import { AzureKeyCredential } from "@azure/core-auth";
+import { VoiceLiveClient } from "@azure/ai-voicelive";
+
+const endpoint = "https://your-resource.cognitiveservices.azure.com";
+const credential = new AzureKeyCredential("your-api-key");
+
+const client = new VoiceLiveClient(endpoint, credential);
+```
+
+## Client Hierarchy
+
+```
+VoiceLiveClient
+└── VoiceLiveSession (WebSocket connection)
+    ├── updateSession()      → Configure session options
+    ├── subscribe()          → Event handlers (Azure SDK pattern)
+    ├── sendAudio()          → Stream audio input
+    ├── addConversationItem() → Add messages/function outputs
+    └── sendEvent()          → Send raw protocol events
+```
+
+## Quick Start
+
+```typescript
+import { DefaultAzureCredential } from "@azure/identity";
+import { VoiceLiveClient } from "@azure/ai-voicelive";
+
+const credential = new DefaultAzureCredential();
+const endpoint = process.env.AZURE_VOICELIVE_ENDPOINT!;
+
+// Create client and start session
+const client = new VoiceLiveClient(endpoint, credential);
+const session = await client.startSession("gpt-4o-mini-realtime-preview");
+
+// Configure session
+await session.updateSession({
+  modalities: ["text", "audio"],
+  instructions: "You are a helpful AI assistant. Respond naturally.",
+  voice: {
+    type: "azure-standard",
+    name: "en-US-AvaNeural",
+  },
+  turnDetection: {
+    type: "server_vad",
+    threshold: 0.5,
+    prefixPaddingMs: 300,
+    silenceDurationMs: 500,
+  },
+  inputAudioFormat: "pcm16",
+  outputAudioFormat: "pcm16",
+});
+
+// Subscribe to events
+const subscription = session.subscribe({
+  onResponseAudioDelta: async (event, context) => {
+    // Handle streaming audio output
+    const audioData = event.delta;
+    playAudioChunk(audioData);
+  },
+  onResponseTextDelta: async (event, context) => {
+    // Handle streaming text
+    process.stdout.write(event.delta);
+  },
+  onInputAudioTranscriptionCompleted: async (event, context) => {
+    console.log("User said:", event.transcript);
+  },
+});
+
+// Send audio from microphone; await so send failures surface to the caller
+async function sendAudioChunk(audioBuffer: ArrayBuffer): Promise<void> {
+  await session.sendAudio(audioBuffer);
+}
+```
+
+## Session Configuration
+
+```typescript
+await session.updateSession({
+  // Modalities
+  modalities: ["audio", "text"],
+  
+  // System instructions
+  instructions: "You are a customer service representative.",
+  
+  // Voice selection
+  voice: {
+    type: "azure-standard",  // or "azure-custom", "openai"
+    name: "en-US-AvaNeural",
+  },
+  
+  // Turn detection (VAD)
+  turnDetection: {
+    type: "server_vad",      // or "azure_semantic_vad"
+    threshold: 0.5,
+    prefixPaddingMs: 300,
+    silenceDurationMs: 500,
+  },
+  
+  // Audio formats
+  inputAudioFormat: "pcm16",
+  outputAudioFormat: "pcm16",
+  
+  // Tools (function calling)
+  tools: [
+    {
+      type: "function",
+      name: "get_weather",
+      description: "Get current weather",
+      parameters: {
+        type: "object",
+        properties: {
+          location: { type: "string" }
+        },
+        required: ["location"]
+      }
+    }
+  ],
+  toolChoice: "auto",
+});
+```
+
+## Event Handling (Azure SDK Pattern)
+
+The SDK uses a subscription-based event handling pattern:
+
+```typescript
+const subscription = session.subscribe({
+  // Connection lifecycle
+  onConnected: async (args, context) => {
+    console.log("Connected:", args.connectionId);
+  },
+  onDisconnected: async (args, context) => {
+    console.log("Disconnected:", args.code, args.reason);
+  },
+  onError: async (args, context) => {
+    console.error("Error:", args.error.message);
+  },
+  
+  // Session events
+  onSessionCreated: async (event, context) => {
+    console.log("Session created:", context.sessionId);
+  },
+  onSessionUpdated: async (event, context) => {
+    console.log("Session updated");
+  },
+  
+  // Audio input events (VAD)
+  onInputAudioBufferSpeechStarted: async (event, context) => {
+    console.log("Speech started at:", event.audioStartMs);
+  },
+  onInputAudioBufferSpeechStopped: async (event, context) => {
+    console.log("Speech stopped at:", event.audioEndMs);
+  },
+  
+  // Transcription events
+  onConversationItemInputAudioTranscriptionCompleted: async (event, context) => {
+    console.log("User said:", event.transcript);
+  },
+  onConversationItemInputAudioTranscriptionDelta: async (event, context) => {
+    process.stdout.write(event.delta);
+  },
+  
+  // Response events
+  onResponseCreated: async (event, context) => {
+    console.log("Response started");
+  },
+  onResponseDone: async (event, context) => {
+    console.log("Response complete");
+  },
+  
+  // Streaming text
+  onResponseTextDelta: async (event, context) => {
+    process.stdout.write(event.delta);
+  },
+  onResponseTextDone: async (event, context) => {
+    console.log("\n--- Text complete ---");
+  },
+  
+  // Streaming audio
+  onResponseAudioDelta: async (event, context) => {
+    const audioData = event.delta;
+    playAudioChunk(audioData);
+  },
+  onResponseAudioDone: async (event, context) => {
+    console.log("Audio complete");
+  },
+  
+  // Audio transcript (what assistant said)
+  onResponseAudioTranscriptDelta: async (event, context) => {
+    process.stdout.write(event.delta);
+  },
+  
+  // Function calling
+  onResponseFunctionCallArgumentsDone: async (event, context) => {
+    if (event.name === "get_weather") {
+      const args = JSON.parse(event.arguments);
+      const result = await getWeather(args.location);
+      
+      await session.addConversationItem({
+        type: "function_call_output",
+        callId: event.callId,
+        output: JSON.stringify(result),
+      });
+      
+      await session.sendEvent({ type: "response.create" });
+    }
+  },
+  
+  // Catch-all for debugging
+  onServerEvent: async (event, context) => {
+    console.log("Event:", event.type);
+  },
+});
+
+// Clean up when done
+await subscription.close();
+```
+
+## Function Calling
+
+```typescript
+// Define tools in session config
+await session.updateSession({
+  modalities: ["audio", "text"],
+  instructions: "Help users with weather information.",
+  tools: [
+    {
+      type: "function",
+      name: "get_weather",
+      description: "Get current weather for a location",
+      parameters: {
+        type: "object",
+        properties: {
+          location: {
+            type: "string",
+            description: "City and state or country",
+          },
+        },
+        required: ["location"],
+      },
+    },
+  ],
+  toolChoice: "auto",
+});
+
+// Handle function calls
+const subscription = session.subscribe({
+  onResponseFunctionCallArgumentsDone: async (event, context) => {
+    if (event.name === "get_weather") {
+      const args = JSON.parse(event.arguments);
+      const weatherData = await fetchWeather(args.location);
+      
+      // Send function result
+      await session.addConversationItem({
+        type: "function_call_output",
+        callId: event.callId,
+        output: JSON.stringify(weatherData),
+      });
+      
+      // Trigger response generation
+      await session.sendEvent({ type: "response.create" });
+    }
+  },
+});
+```
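
Note on the handler above: `JSON.parse(event.arguments)` throws if the model emits malformed JSON, and `args.location` is not checked before use. A minimal sketch of defensive parsing follows, assuming only that the arguments arrive as a string. `parseToolArguments` and `requireString` are hypothetical helpers, not part of `@azure/ai-voicelive`.

```typescript
// Hypothetical helpers; not part of @azure/ai-voicelive.
type ParsedArgs =
  | { ok: true; args: Record<string, unknown> }
  | { ok: false; error: string };

// Parses the raw argument string and accepts only a JSON object.
function parseToolArguments(raw: string): ParsedArgs {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return { ok: false, error: "Arguments are not valid JSON" };
  }
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return { ok: false, error: "Arguments must be a JSON object" };
  }
  return { ok: true, args: value as Record<string, unknown> };
}

// Returns the named argument if it is a non-empty string, otherwise undefined.
function requireString(
  args: Record<string, unknown>,
  key: string,
): string | undefined {
  const value = args[key];
  return typeof value === "string" && value.trim() !== "" ? value : undefined;
}
```

Inside the handler, a failed parse or a missing `location` could be reported back as the function output (for example `JSON.stringify({ error })`) so the model can recover instead of the handler throwing.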
+
+## Voice Options
+
+| Voice Type | Config | Example |
+|------------|--------|---------|
+| Azure Standard | `{ type: "azure-standard", name: "..." }` | `"en-US-AvaNeural"` |
+| Azure Custom | `{ type: "azure-custom", name: "...", endpointId: "..." }` | Custom voice endpoint |
+| Azure Personal | `{ type: "azure-personal", speakerProfileId: "..." }` | Personal voice clone |
+| OpenAI | `{ type: "openai", name: "..." }` | `"alloy"`, `"echo"`, `"shimmer"` |
+
+## Supported Models
+
+| Model | Description | Use Case |
+|-------|-------------|----------|
+| `gpt-4o-realtime-preview` | GPT-4o with real-time audio | High-quality conversational AI |
+| `gpt-4o-mini-realtime-preview` | Lightweight GPT-4o | Fast, efficient interactions |
+| `phi4-mm-realtime` | Phi multimodal | Cost-effective applications |
+
+## Turn Detection Options
+
+```typescript
+// Server VAD (default)
+turnDetection: {
+  type: "server_vad",
+  threshold: 0.5,
+  prefixPaddingMs: 300,
+  silenceDurationMs: 500,
+}
+
+// Azure Semantic VAD (smarter detection)
+turnDetection: {
+  type: "azure_semantic_vad",
+}
+
+// Azure Semantic VAD (English optimized)
+turnDetection: {
+  type: "azure_semantic_vad_en",
+}
+
+// Azure Semantic VAD (Multilingual)
+turnDetection: {
+  type: "azure_semantic_vad_multilingual",
+}
+```
+
+## Audio Formats
+
+| Format | Sample Rate | Use Case |
+|--------|-------------|----------|
+| `pcm16` | 24kHz | Default, high quality |
+| `pcm16-8000hz` | 8kHz | Telephony |
+| `pcm16-16000hz` | 16kHz | Voice assistants |
+| `g711_ulaw` | 8kHz | Telephony (US) |
+| `g711_alaw` | 8kHz | Telephony (EU) |
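
Note on the format table above: the sample rate and encoding determine how many bytes each audio chunk holds, which matters when sizing buffers for streaming. The sketch below assumes mono audio, 2 bytes per sample for the `pcm16` variants, and 1 byte per sample for G.711. `AUDIO_FORMATS` and `chunkBytes` are hypothetical names, not SDK exports.

```typescript
// Sample rate and bytes per sample for each format in the table above.
// Assumption: mono audio. pcm16 is 16-bit linear PCM; G.711 is 8-bit companded.
const AUDIO_FORMATS = {
  "pcm16": { sampleRate: 24000, bytesPerSample: 2 },
  "pcm16-8000hz": { sampleRate: 8000, bytesPerSample: 2 },
  "pcm16-16000hz": { sampleRate: 16000, bytesPerSample: 2 },
  "g711_ulaw": { sampleRate: 8000, bytesPerSample: 1 },
  "g711_alaw": { sampleRate: 8000, bytesPerSample: 1 },
} as const;

type AudioFormat = keyof typeof AUDIO_FORMATS;

// Number of bytes in `durationMs` milliseconds of audio in the given format.
function chunkBytes(format: AudioFormat, durationMs: number): number {
  const { sampleRate, bytesPerSample } = AUDIO_FORMATS[format];
  return Math.round((sampleRate * durationMs) / 1000) * bytesPerSample;
}
```

Under these assumptions a 20 ms chunk is 960 bytes for `pcm16` and 160 bytes for `g711_ulaw`.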
+
+## Key Types Reference
+
+| Type | Purpose |
+|------|---------|
+| `VoiceLiveClient` | Main client for creating sessions |
+| `VoiceLiveSession` | Active WebSocket session |
+| `VoiceLiveSessionHandlers` | Event handler interface |
+| `VoiceLiveSubscription` | Active event subscription |
+| `ConnectionContext` | Context for connection events |
+| `SessionContext` | Context for session events |
+| `ServerEventUnion` | Union of all server events |
+
+## Error Handling
+
+```typescript
+import {
+  VoiceLiveError,
+  VoiceLiveConnectionError,
+  VoiceLiveAuthenticationError,
+  VoiceLiveProtocolError,
+} from "@azure/ai-voicelive";
+
+const subscription = session.subscribe({
+  onError: async (args, context) => {
+    const { error } = args;
+    
+    if (error instanceof VoiceLiveConnectionError) {
+      console.error("Connection error:", error.message);
+    } else if (error instanceof VoiceLiveAuthenticationError) {
+      console.error("Auth error:", error.message);
+    } else if (error instanceof VoiceLiveProtocolError) {
+      console.error("Protocol error:", error.message);
+    }
+  },
+  
+  onServerError: async (event, context) => {
+    console.error("Server error:", event.error?.message);
+  },
+});
+```
+
+## Logging
+
+```typescript
+import { setLogLevel } from "@azure/logger";
+
+// Enable info-level logging (other levels: "verbose", "warning", "error")
+setLogLevel("info");
+
+// Or via environment variable
+// AZURE_LOG_LEVEL=info
+```
+
+## Browser Usage
+
+```typescript
+// Browser requires bundler (Vite, webpack, etc.)
+import { VoiceLiveClient } from "@azure/ai-voicelive";
+import { InteractiveBrowserCredential } from "@azure/identity";
+
+// Use browser-compatible credential
+const credential = new InteractiveBrowserCredential({
+  clientId: "your-client-id",
+  tenantId: "your-tenant-id",
+});
+
+const client = new VoiceLiveClient("https://your-resource.cognitiveservices.azure.com", credential);
+
+// Request microphone access
+const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
+const audioContext = new AudioContext({ sampleRate: 24000 });
+
+// Process audio and send to session
+// ... (see samples for full implementation)
+```
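
Note on the elided audio-processing step above: the Web Audio API delivers samples as `Float32Array` values in the range [-1, 1], while the `pcm16` format expects 16-bit signed integers. A minimal conversion sketch follows, assuming little-endian byte order and an `AudioContext` already running at 24 kHz so no resampling is needed. `floatTo16BitPCM` is a hypothetical helper, not an SDK function.

```typescript
// Converts Web Audio samples (Float32, nominally in [-1, 1]) to 16-bit PCM.
// Assumption: the service expects little-endian samples.
function floatTo16BitPCM(samples: Float32Array): ArrayBuffer {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp out-of-range samples so they do not wrap around.
    const s = Math.max(-1, Math.min(1, samples[i] ?? 0));
    // Negative values scale to -32768, positive values to 32767.
    const scaled = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
    view.setInt16(i * 2, scaled, true);
  }
  return buffer;
}
```

The returned `ArrayBuffer` has the same type that `sendAudioChunk` in the Quick Start accepts.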
+
+## Best Practices
+
+1. **Prefer Microsoft Entra ID credentials**: use `DefaultAzureCredential` in Node.js or `InteractiveBrowserCredential` in browsers, and never hardcode API keys
+2. **Set both modalities**: include `["text", "audio"]` for voice assistants
+3. **Use Azure Semantic VAD**: better turn detection than basic server VAD
+4. **Handle all error types**: connection, auth, and protocol errors
+5. **Clean up subscriptions**: call `subscription.close()` when done
+6. **Use an appropriate audio format**: PCM16 at 24kHz for best quality
+
+## Reference Links
+
+| Resource | URL |
+|----------|-----|
+| npm Package | https://www.npmjs.com/package/@azure/ai-voicelive |
+| GitHub Source | https://github.com/Azure/azure-sdk-for-js/tree/main/sdk/ai/ai-voicelive |
+| Samples | https://github.com/Azure/azure-sdk-for-js/tree/main/sdk/ai/ai-voicelive/samples |
+| API Reference | https://learn.microsoft.com/javascript/api/@azure/ai-voicelive |