Rewrote sync_microsoft_skills.py (v4) to use each SKILL.md's frontmatter 'name' field as the flat directory name under skills/, replacing the nested skills/official/microsoft/<lang>/<category>/<service>/ hierarchy. This fixes CI failures caused by the indexing, validation, and catalog scripts expecting skills/<id>/SKILL.md (depth 1). Changes: - Rewrite scripts/sync_microsoft_skills.py for flat output with collision detection - Update scripts/tests/inspect_microsoft_repo.py for flat name mapping - Update scripts/tests/test_comprehensive_coverage.py for name uniqueness checks - Delete skills/official/ nested directory - Add 129 Microsoft skills as flat directories (e.g. skills/azure-mgmt-botservice-dotnet/) - Move attribution files to docs/ (LICENSE-MICROSOFT, microsoft-skills-attribution.json) - Rebuild skills_index.json, CATALOG.md, README.md (845 total skills)
1.7 KiB
1.7 KiB
name, description, package
| name | description | package |
|---|---|---|
| azure-ai-transcription-py | Azure AI Transcription SDK for Python. Use for real-time and batch speech-to-text transcription with timestamps and diarization. Triggers: "transcription", "speech to text", "Azure AI Transcription", "TranscriptionClient". | azure-ai-transcription |
Azure AI Transcription SDK for Python
Client library for Azure AI Transcription (speech-to-text) with real-time and batch transcription.
Installation
pip install azure-ai-transcription
Environment Variables
TRANSCRIPTION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
TRANSCRIPTION_KEY=<your-key>
Authentication
Use subscription key authentication (DefaultAzureCredential is not supported for this client):
import os
from azure.ai.transcription import TranscriptionClient
client = TranscriptionClient(
endpoint=os.environ["TRANSCRIPTION_ENDPOINT"],
credential=os.environ["TRANSCRIPTION_KEY"]
)
Transcription (Batch)
job = client.begin_transcription(
name="meeting-transcription",
locale="en-US",
content_urls=["https://<storage>/audio.wav"],
diarization_enabled=True
)
result = job.result()
print(result.status)
Transcription (Real-time)
stream = client.begin_stream_transcription(locale="en-US")
stream.send_audio_file("audio.wav")
for event in stream:
print(event.text)
Best Practices
- Enable diarization when multiple speakers are present
- Use batch transcription for long files stored in blob storage
- Capture timestamps for subtitle generation
- Specify language to improve recognition accuracy
- Handle streaming backpressure for real-time transcription
- Close transcription sessions when complete