refactor: flatten Microsoft skills from nested to flat directory structure

Rewrote sync_microsoft_skills.py (v4) to use each SKILL.md's frontmatter 'name' field as the flat directory name under skills/, replacing the nested skills/official/microsoft/<lang>/<category>/<service>/ hierarchy. This fixes CI failures caused by the indexing, validation, and catalog scripts expecting skills/<id>/SKILL.md (depth 1).

Changes:
- Rewrite scripts/sync_microsoft_skills.py for flat output with collision detection
- Update scripts/tests/inspect_microsoft_repo.py for flat name mapping
- Update scripts/tests/test_comprehensive_coverage.py for name uniqueness checks
- Delete skills/official/ nested directory
- Add 129 Microsoft skills as flat directories (e.g. skills/azure-mgmt-botservice-dotnet/)
- Move attribution files to docs/ (LICENSE-MICROSOFT, microsoft-skills-attribution.json)
- Rebuild skills_index.json, CATALOG.md, README.md (845 total skills)
2026-02-12 00:07:15 +05:00
parent e06454dafd
commit e7ae616385
142 changed files with 5683 additions and 6097 deletions
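The flattening described in the commit message (frontmatter 'name' as the directory name, with collision detection) could look roughly like the sketch below. It is not the code from sync_microsoft_skills.py: the helper names `frontmatter_name` and `plan_flat_layout`, the in-memory `skill_files` mapping, and the choice to report collisions rather than resolve them are all assumptions made for illustration.

```python
from __future__ import annotations

import re

# Hypothetical pattern: matches a top-level "name: value" line in the frontmatter.
NAME_RE = re.compile(r"^name:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def frontmatter_name(skill_md_text: str) -> str | None:
    """Return the 'name' field from a leading '---' frontmatter block, if present."""
    if not skill_md_text.startswith("---"):
        return None
    parts = skill_md_text.split("---", 2)
    if len(parts) < 3:
        return None  # frontmatter block was never closed
    match = NAME_RE.search(parts[1])
    return match.group(1) if match else None


def plan_flat_layout(
    skill_files: dict[str, str],
) -> tuple[dict[str, str], dict[str, list[str]]]:
    """Map nested source paths to skills/<name>/SKILL.md and report name collisions."""
    by_name: dict[str, list[str]] = {}
    for source_path, text in sorted(skill_files.items()):
        name = frontmatter_name(text)
        if name is None:
            continue  # no usable name; this sketch simply skips the file
        by_name.setdefault(name, []).append(source_path)

    layout: dict[str, str] = {}
    collisions: dict[str, list[str]] = {}
    for name, sources in by_name.items():
        if len(sources) > 1:
            collisions[name] = sources  # two sources want the same flat directory
        else:
            layout[sources[0]] = f"skills/{name}/SKILL.md"
    return layout, collisions
```

Returning collisions separately lets the caller decide whether to fail the sync or rename the directories; which of these the real script does is not stated in the commit message.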
--- a/skills/azure-ai-transcription-py/SKILL.md
+++ b/skills/azure-ai-transcription-py/SKILL.md
@@ -0,0 +1,69 @@
+---
+name: azure-ai-transcription-py
+description: |
+  Azure AI Transcription SDK for Python. Use for real-time and batch speech-to-text transcription with timestamps and diarization.
+  Triggers: "transcription", "speech to text", "Azure AI Transcription", "TranscriptionClient".
+package: azure-ai-transcription
+---
+
+# Azure AI Transcription SDK for Python
+
+Client library for Azure AI Transcription (speech-to-text) with real-time and batch transcription.
+
+## Installation
+
+```bash
+pip install azure-ai-transcription
+```
+
+## Environment Variables
+
+```bash
+TRANSCRIPTION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
+TRANSCRIPTION_KEY=<your-key>
+```
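Before constructing a client, it can help to check that both variables above are actually set. The following is a minimal sketch using only the standard library; `load_transcription_settings` is a hypothetical helper, not part of the azure-ai-transcription package.

```python
import os


def load_transcription_settings(env=None):
    """Return (endpoint, key), raising early if either variable is missing or empty."""
    env = os.environ if env is None else env
    required = ("TRANSCRIPTION_ENDPOINT", "TRANSCRIPTION_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Strip a trailing slash so later URL joins do not produce "//".
    return env["TRANSCRIPTION_ENDPOINT"].rstrip("/"), env["TRANSCRIPTION_KEY"]
```

Accepting an optional mapping instead of always reading `os.environ` keeps the helper easy to test without touching the real environment.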
+
+## Authentication
+
+Use subscription key authentication (DefaultAzureCredential is not supported for this client):
+
+```python
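+import os
+from azure.core.credentials import AzureKeyCredential
+from azure.ai.transcription import TranscriptionClient
+
+# Azure SDK clients generally expect the key wrapped in AzureKeyCredential rather than a raw string.
+client = TranscriptionClient(
+    endpoint=os.environ["TRANSCRIPTION_ENDPOINT"],
+    credential=AzureKeyCredential(os.environ["TRANSCRIPTION_KEY"])
+)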
+```
+
+## Transcription (Batch)
+
+```python
+job = client.begin_transcription(
+    name="meeting-transcription",
+    locale="en-US",
+    content_urls=["https://<storage>/audio.wav"],
+    diarization_enabled=True
+)
+result = job.result()
+print(result.status)
+```
+
+## Transcription (Real-time)
+
+```python
+stream = client.begin_stream_transcription(locale="en-US")
+stream.send_audio_file("audio.wav")
+for event in stream:
+    print(event.text)
+```
+
+## Best Practices
+
+1. **Enable diarization** when multiple speakers are present
+2. **Use batch transcription** for long files stored in blob storage
+3. **Capture timestamps** for subtitle generation
+4. **Specify language** to improve recognition accuracy
+5. **Handle streaming backpressure** for real-time transcription
+6. **Close transcription sessions** when complete
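For the timestamp practice above, one way to turn captured timestamps into subtitles is to emit SubRip (SRT) text. This sketch assumes each segment is available as a plain `(start_seconds, end_seconds, text)` tuple; that shape is an assumption for illustration, not the result type returned by the SDK, so results would need to be mapped into it first.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render (start_seconds, end_seconds, text) tuples as numbered SRT cues."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

When diarization is enabled, a speaker label could be prefixed to `text` before rendering.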