refactor: flatten Microsoft skills from nested to flat directory structure

Rewrote sync_microsoft_skills.py (v4) to use each SKILL.md's frontmatter 'name' field as the flat directory name under skills/, replacing the nested skills/official/microsoft/<lang>/<category>/<service>/ hierarchy. This fixes CI failures caused by the indexing, validation, and catalog scripts expecting skills/<id>/SKILL.md (depth 1).

Changes:
- Rewrite scripts/sync_microsoft_skills.py for flat output with collision detection
- Update scripts/tests/inspect_microsoft_repo.py for flat name mapping
- Update scripts/tests/test_comprehensive_coverage.py for name uniqueness checks
- Delete skills/official/ nested directory
- Add 129 Microsoft skills as flat directories (e.g. skills/azure-mgmt-botservice-dotnet/)
- Move attribution files to docs/ (LICENSE-MICROSOFT, microsoft-skills-attribution.json)
- Rebuild skills_index.json, CATALOG.md, README.md (845 total skills)
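
The flattening rule described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of sync_microsoft_skills.py: the helper names `frontmatter_name` and `plan_flat_layout` are hypothetical, the frontmatter is read with a simple regular expression rather than a YAML parser, and a name collision is treated as a hard error.

```python
import re
from pathlib import PurePosixPath

# Leading `---` ... `---` frontmatter block at the very start of the file.
_FRONTMATTER = re.compile(r"\A---\s*\n(.*?)\n---\s*(?:\n|\Z)", re.DOTALL)
# Top-level (unindented) `name:` key inside that block.
_NAME = re.compile(r"^name:\s*(\S+)\s*$", re.MULTILINE)


def frontmatter_name(skill_md_text):
    """Return the frontmatter 'name' value, or None if it is missing."""
    block = _FRONTMATTER.match(skill_md_text)
    if block is None:
        return None
    name = _NAME.search(block.group(1))
    return name.group(1).strip("'\"") if name else None


def plan_flat_layout(nested_files):
    """Map nested SKILL.md paths to flat skills/<name>/SKILL.md targets.

    nested_files: dict of {nested source path: file text} (hypothetical input shape).
    Raises ValueError on a missing name or when two sources claim the same name.
    """
    targets = {}
    for source, text in sorted(nested_files.items()):
        name = frontmatter_name(text)
        if name is None:
            raise ValueError(f"{source}: no frontmatter 'name'")
        target = str(PurePosixPath("skills") / name / "SKILL.md")
        if target in targets:
            raise ValueError(
                f"name collision on {name!r}: {targets[target]} and {source}"
            )
        targets[target] = source
    return targets
```

Returning a plan (target path to source path) before touching the filesystem keeps collision detection separate from copying, so a conflict can abort the sync without leaving a half-written skills/ tree.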
69
skills/azure-ai-transcription-py/SKILL.md
Normal file
@@ -0,0 +1,69 @@
---
name: azure-ai-transcription-py
description: |
  Azure AI Transcription SDK for Python. Use for real-time and batch speech-to-text transcription with timestamps and diarization.
  Triggers: "transcription", "speech to text", "Azure AI Transcription", "TranscriptionClient".
package: azure-ai-transcription
---

# Azure AI Transcription SDK for Python

Client library for Azure AI Transcription (speech-to-text) with real-time and batch transcription.

## Installation

```bash
pip install azure-ai-transcription
```

## Environment Variables

```bash
TRANSCRIPTION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
TRANSCRIPTION_KEY=<your-key>
```
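
Before constructing a client it can help to fail fast when either variable is unset. A minimal sketch using only the standard library; `load_transcription_settings` is a hypothetical helper and not part of the SDK:

```python
import os


def load_transcription_settings(env=None):
    """Return (endpoint, key) from the environment, or raise if one is unset."""
    env = os.environ if env is None else env
    required = ("TRANSCRIPTION_ENDPOINT", "TRANSCRIPTION_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Drop a trailing slash so later URL joins do not produce '//'.
    return env["TRANSCRIPTION_ENDPOINT"].rstrip("/"), env["TRANSCRIPTION_KEY"]
```

Calling `load_transcription_settings()` with no argument reads `os.environ`; passing a plain dict makes the helper easy to exercise in tests.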

## Authentication

Use subscription key authentication (DefaultAzureCredential is not supported for this client):

```python
import os

from azure.ai.transcription import TranscriptionClient
from azure.core.credentials import AzureKeyCredential

client = TranscriptionClient(
    endpoint=os.environ["TRANSCRIPTION_ENDPOINT"],
    # Wrap the raw key: Azure SDK clients expect a credential object, not a bare string
    credential=AzureKeyCredential(os.environ["TRANSCRIPTION_KEY"])
)
```

## Transcription (Batch)

```python
job = client.begin_transcription(
    name="meeting-transcription",
    locale="en-US",
    content_urls=["https://<storage>/audio.wav"],
    diarization_enabled=True
)
result = job.result()
print(result.status)
```

## Transcription (Real-time)

```python
stream = client.begin_stream_transcription(locale="en-US")
stream.send_audio_file("audio.wav")
for event in stream:
    print(event.text)
```

## Best Practices

1. **Enable diarization** when multiple speakers are present
2. **Use batch transcription** for long files stored in blob storage
3. **Capture timestamps** for subtitle generation
4. **Specify language** to improve recognition accuracy
5. **Handle streaming backpressure** for real-time transcription
6. **Close transcription sessions** when complete
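
As an illustration of practice 3, captured timestamps can be turned into SubRip (SRT) subtitles with plain Python. This sketch assumes each transcribed phrase has already been reduced to a `(start_ms, duration_ms, text)` tuple; that shape is an assumption for the example, not the SDK's result type, and `srt_timestamp` and `to_srt` are hypothetical helpers.

```python
def srt_timestamp(ms):
    """Format a millisecond offset as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(int(ms), 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(phrases):
    """Build SRT text from an iterable of (start_ms, duration_ms, text) tuples."""
    blocks = []
    for index, (start_ms, duration_ms, text) in enumerate(phrases, start=1):
        start = srt_timestamp(start_ms)
        end = srt_timestamp(start_ms + duration_ms)
        # Each cue: sequence number, time range, then the caption text.
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


print(to_srt([(0, 1500, "Hello"), (1500, 500, "World")]))
```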