feat: Add Official Microsoft & Gemini Skills (845+ Total)
## 🚀 Impact

Significantly expands the capabilities of **Antigravity Awesome Skills** by integrating official skill collections from **Microsoft** and **Google Gemini**. This update increases the total skill count to **845+**, making the library even more comprehensive for AI coding assistants.

## ✨ Key Changes

### 1. New Official Skills

- **Microsoft Skills**: Added a massive collection of official skills from [microsoft/skills](https://github.com/microsoft/skills).
  - Includes Azure, .NET, Python, TypeScript, and Semantic Kernel skills.
  - Preserves the original directory structure under `skills/official/microsoft/`.
  - Includes plugin skills from the `.github/plugins` directory.
- **Gemini Skills**: Added official Gemini API development skills under `skills/gemini-api-dev/`.

### 2. New Scripts & Tooling

- **`scripts/sync_microsoft_skills.py`**: A robust synchronization script that:
  - Clones the official Microsoft repository.
  - Preserves the original directory hierarchy.
  - Handles symlinks and plugin locations.
  - Generates attribution metadata.
- **`scripts/tests/inspect_microsoft_repo.py`**: Debug tool to inspect the remote repository structure.
- **`scripts/tests/test_comprehensive_coverage.py`**: Verification script to ensure 100% of skills are captured during sync.

### 3. Core Improvements

- **`scripts/generate_index.py`**: Enhanced frontmatter parsing to safely handle unquoted values containing `@` symbols and commas (fixing issues with some Microsoft skill descriptions).
- **`package.json`**: Added `sync:microsoft` and `sync:all-official` scripts for easy maintenance.

### 4. Documentation

- Updated `README.md` to reflect the new skill count (845+) and added Microsoft/Gemini to the provider list.
- Updated `CATALOG.md` and `skills_index.json` with the new skills.

## 🧪 Verification

- Ran `scripts/tests/test_comprehensive_coverage.py` to verify all Microsoft skills are detected.
- Validated the `generate_index.py` fixes by successfully indexing the new skills.
skills/official/microsoft/dotnet/foundry/voicelive/SKILL.md
Normal file
@@ -0,0 +1,265 @@
---
name: azure-ai-voicelive-dotnet
description: |
  Azure AI Voice Live SDK for .NET. Build real-time voice AI applications with bidirectional WebSocket communication. Use for voice assistants, conversational AI, real-time speech-to-speech, and voice-enabled chatbots. Triggers: "voice live", "real-time voice", "VoiceLiveClient", "VoiceLiveSession", "voice assistant .NET", "bidirectional audio", "speech-to-speech".
package: Azure.AI.VoiceLive
---

# Azure.AI.VoiceLive (.NET)

Real-time voice AI SDK for building bidirectional voice assistants with Azure AI.

## Installation

```bash
dotnet add package Azure.AI.VoiceLive
dotnet add package Azure.Identity
dotnet add package NAudio  # For audio capture/playback
```

**Current Versions**: Stable v1.0.0, Preview v1.1.0-beta.1

## Environment Variables

```bash
AZURE_VOICELIVE_ENDPOINT=https://<resource>.services.ai.azure.com/
AZURE_VOICELIVE_MODEL=gpt-4o-realtime-preview
AZURE_VOICELIVE_VOICE=en-US-AvaNeural
# Optional: API key if not using Entra ID
AZURE_VOICELIVE_API_KEY=<your-api-key>
```
## Authentication

### Microsoft Entra ID (Recommended)

```csharp
using Azure.Identity;
using Azure.AI.VoiceLive;

Uri endpoint = new Uri("https://your-resource.cognitiveservices.azure.com");
DefaultAzureCredential credential = new DefaultAzureCredential();
VoiceLiveClient client = new VoiceLiveClient(endpoint, credential);
```

**Required Role**: `Cognitive Services User` (assign in Azure Portal → Access control)

### API Key

```csharp
Uri endpoint = new Uri("https://your-resource.cognitiveservices.azure.com");
AzureKeyCredential credential = new AzureKeyCredential("your-api-key");
VoiceLiveClient client = new VoiceLiveClient(endpoint, credential);
```
## Client Hierarchy

```
VoiceLiveClient
└── VoiceLiveSession (WebSocket connection)
    ├── ConfigureSessionAsync()
    ├── GetUpdatesAsync() → SessionUpdate events
    ├── AddItemAsync() → UserMessageItem, FunctionCallOutputItem
    ├── SendAudioAsync()
    └── StartResponseAsync()
```
## Core Workflow

### 1. Start Session and Configure

```csharp
using Azure.Identity;
using Azure.AI.VoiceLive;

var endpoint = new Uri(Environment.GetEnvironmentVariable("AZURE_VOICELIVE_ENDPOINT"));
var client = new VoiceLiveClient(endpoint, new DefaultAzureCredential());

var model = "gpt-4o-mini-realtime-preview";

// Start session
using VoiceLiveSession session = await client.StartSessionAsync(model);

// Configure session
VoiceLiveSessionOptions sessionOptions = new()
{
    Model = model,
    Instructions = "You are a helpful AI assistant. Respond naturally.",
    Voice = new AzureStandardVoice("en-US-AvaNeural"),
    TurnDetection = new AzureSemanticVadTurnDetection()
    {
        Threshold = 0.5f,
        PrefixPadding = TimeSpan.FromMilliseconds(300),
        SilenceDuration = TimeSpan.FromMilliseconds(500)
    },
    InputAudioFormat = InputAudioFormat.Pcm16,
    OutputAudioFormat = OutputAudioFormat.Pcm16
};

// Set modalities (both text and audio for voice assistants)
sessionOptions.Modalities.Clear();
sessionOptions.Modalities.Add(InteractionModality.Text);
sessionOptions.Modalities.Add(InteractionModality.Audio);

await session.ConfigureSessionAsync(sessionOptions);
```
### 2. Process Events

```csharp
await foreach (SessionUpdate serverEvent in session.GetUpdatesAsync())
{
    switch (serverEvent)
    {
        case SessionUpdateResponseAudioDelta audioDelta:
            byte[] audioData = audioDelta.Delta.ToArray();
            // Play audio via NAudio or other audio library
            break;

        case SessionUpdateResponseTextDelta textDelta:
            Console.Write(textDelta.Delta);
            break;

        case SessionUpdateResponseFunctionCallArgumentsDone functionCall:
            // Handle function call (see Function Calling section)
            break;

        case SessionUpdateError error:
            Console.WriteLine($"Error: {error.Error.Message}");
            break;

        case SessionUpdateResponseDone:
            Console.WriteLine("\n--- Response complete ---");
            break;
    }
}
```
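The `// Play audio via NAudio` comment above can be filled in with a small buffered playback helper. The following is a minimal sketch, not part of the official skill: it assumes the 24 kHz / 16-bit / mono PCM output format described under Audio Configuration below, and the buffer sizing is illustrative.

```csharp
using NAudio.Wave;

// Sketch: buffered playback of PCM16 audio deltas (assumes 24 kHz, 16-bit, mono output).
var outputFormat = new WaveFormat(24000, 16, 1);
var playbackBuffer = new BufferedWaveProvider(outputFormat)
{
    BufferDuration = TimeSpan.FromMinutes(2),   // illustrative; size to your latency needs
    DiscardOnBufferOverflow = true
};
using var waveOut = new WaveOutEvent();
waveOut.Init(playbackBuffer);
waveOut.Play();

// Inside the event loop, enqueue each audio delta as it arrives:
// case SessionUpdateResponseAudioDelta audioDelta:
//     byte[] audioData = audioDelta.Delta.ToArray();
//     playbackBuffer.AddSamples(audioData, 0, audioData.Length);
//     break;
```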
### 3. Send User Message

```csharp
await session.AddItemAsync(new UserMessageItem("Hello, can you help me?"));
await session.StartResponseAsync();
```
### 4. Function Calling

```csharp
using System.Text.Json; // required for JsonSerializer below

// Define function
var weatherFunction = new VoiceLiveFunctionDefinition("get_current_weather")
{
    Description = "Get the current weather for a given location",
    Parameters = BinaryData.FromString("""
    {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state or country"
            }
        },
        "required": ["location"]
    }
    """)
};

// Add to session options
sessionOptions.Tools.Add(weatherFunction);

// Handle function call in event loop
if (serverEvent is SessionUpdateResponseFunctionCallArgumentsDone functionCall)
{
    if (functionCall.Name == "get_current_weather")
    {
        var parameters = JsonSerializer.Deserialize<Dictionary<string, string>>(functionCall.Arguments);
        string location = parameters?["location"] ?? "";

        // Call external service
        string weatherInfo = $"The weather in {location} is sunny, 75°F.";

        // Send response
        await session.AddItemAsync(new FunctionCallOutputItem(functionCall.CallId, weatherInfo));
        await session.StartResponseAsync();
    }
}
```
## Voice Options

| Voice Type | Class | Example |
|------------|-------|---------|
| Azure Standard | `AzureStandardVoice` | `"en-US-AvaNeural"` |
| Azure HD | `AzureStandardVoice` | `"en-US-Ava:DragonHDLatestNeural"` |
| Azure Custom | `AzureCustomVoice` | Custom voice with endpoint ID |
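For example, switching to the HD voice from the table only changes the `Voice` assignment on the session options shown earlier; this is a small sketch using the voice name listed above:

```csharp
// Select the Azure HD voice from the table above on the session options.
sessionOptions.Voice = new AzureStandardVoice("en-US-Ava:DragonHDLatestNeural");

// For a custom neural voice, use AzureCustomVoice instead; its exact constructor
// arguments (voice name and endpoint ID) should be checked against the API reference.
```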
## Supported Models

| Model | Description |
|-------|-------------|
| `gpt-4o-realtime-preview` | GPT-4o with real-time audio |
| `gpt-4o-mini-realtime-preview` | Lightweight, fast interactions |
| `phi4-mm-realtime` | Cost-effective multimodal |

## Key Types Reference

| Type | Purpose |
|------|---------|
| `VoiceLiveClient` | Main client for creating sessions |
| `VoiceLiveSession` | Active WebSocket session |
| `VoiceLiveSessionOptions` | Session configuration |
| `AzureStandardVoice` | Standard Azure voice provider |
| `AzureSemanticVadTurnDetection` | Voice activity detection |
| `VoiceLiveFunctionDefinition` | Function tool definition |
| `UserMessageItem` | User text message |
| `FunctionCallOutputItem` | Function call response |
| `SessionUpdateResponseAudioDelta` | Audio chunk event |
| `SessionUpdateResponseTextDelta` | Text chunk event |
## Best Practices

1. **Always set both modalities** — Include `Text` and `Audio` for voice assistants
2. **Use `AzureSemanticVadTurnDetection`** — Provides natural conversation flow
3. **Configure appropriate silence duration** — 500ms typical to avoid premature cutoffs
4. **Use `using` statement** — Ensures proper session disposal
5. **Handle all event types** — Check for errors, audio, text, and function calls
6. **Use DefaultAzureCredential** — Never hardcode API keys (see the sketch after this list)
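Practices 4 and 6 can be combined with the optional `AZURE_VOICELIVE_API_KEY` variable from the Environment Variables section. The following is a minimal sketch (not from the official skill) that prefers Entra ID and only falls back to a key when one is present in the environment:

```csharp
using Azure;
using Azure.Identity;
using Azure.AI.VoiceLive;

// Sketch: prefer DefaultAzureCredential; fall back to the optional API key env var.
var endpoint = new Uri(Environment.GetEnvironmentVariable("AZURE_VOICELIVE_ENDPOINT")!);
string? apiKey = Environment.GetEnvironmentVariable("AZURE_VOICELIVE_API_KEY");

VoiceLiveClient client = string.IsNullOrEmpty(apiKey)
    ? new VoiceLiveClient(endpoint, new DefaultAzureCredential())
    : new VoiceLiveClient(endpoint, new AzureKeyCredential(apiKey));

// Dispose the session deterministically (practice 4).
using VoiceLiveSession session = await client.StartSessionAsync(
    Environment.GetEnvironmentVariable("AZURE_VOICELIVE_MODEL") ?? "gpt-4o-realtime-preview");
```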
## Error Handling

```csharp
if (serverEvent is SessionUpdateError error)
{
    if (error.Error.Message.Contains("Cancellation failed: no active response"))
    {
        // Benign error, can ignore
    }
    else
    {
        Console.WriteLine($"Error: {error.Error.Message}");
    }
}
```
## Audio Configuration

- **Input Format**: `InputAudioFormat.Pcm16` (16-bit PCM)
- **Output Format**: `OutputAudioFormat.Pcm16`
- **Sample Rate**: 24 kHz recommended
- **Channels**: Mono (see the capture sketch below)
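A microphone capture loop matching this format might look like the sketch below. It is illustrative rather than official: the NAudio capture types are standard, but the exact `SendAudioAsync` overload (raw bytes vs. `BinaryData`/stream) should be confirmed against the API reference.

```csharp
using NAudio.Wave;

// Sketch: capture 24 kHz, 16-bit, mono PCM from the default microphone and stream it
// to an open session. SendAudioAsync is assumed here to accept raw PCM bytes.
var waveIn = new WaveInEvent
{
    WaveFormat = new WaveFormat(24000, 16, 1),
    BufferMilliseconds = 50   // small buffers keep latency low
};

waveIn.DataAvailable += async (_, e) =>
{
    byte[] chunk = new byte[e.BytesRecorded];
    Array.Copy(e.Buffer, chunk, e.BytesRecorded);
    await session.SendAudioAsync(chunk);
};

waveIn.StartRecording();
// ... later, when the conversation ends:
waveIn.StopRecording();
```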
## Related SDKs

| SDK | Purpose | Install |
|-----|---------|---------|
| `Azure.AI.VoiceLive` | Real-time voice (this SDK) | `dotnet add package Azure.AI.VoiceLive` |
| `Microsoft.CognitiveServices.Speech` | Speech-to-text, text-to-speech | `dotnet add package Microsoft.CognitiveServices.Speech` |
| `NAudio` | Audio capture/playback | `dotnet add package NAudio` |

## Reference Links

| Resource | URL |
|----------|-----|
| NuGet Package | https://www.nuget.org/packages/Azure.AI.VoiceLive |
| API Reference | https://learn.microsoft.com/dotnet/api/azure.ai.voicelive |
| GitHub Source | https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/ai/Azure.AI.VoiceLive |
| Quickstart | https://learn.microsoft.com/azure/ai-services/speech-service/voice-live-quickstart |