feat: add webhook support for /llm/job endpoint

Add comprehensive webhook notification support for the /llm/job endpoint,
following the same pattern as the existing /crawl/job implementation.

Changes:
- Add webhook_config field to LlmJobPayload model (job.py)
- Implement webhook notifications in process_llm_extraction() with 4
  notification points: success, provider validation failure, extraction
  failure, and general exceptions (api.py)
- Store webhook_config in Redis task data for job tracking
- Initialize WebhookDeliveryService with exponential backoff retry logic
Documentation:
- Add Example 6 to WEBHOOK_EXAMPLES.md showing LLM extraction with webhooks
- Update Flask webhook handler to support both crawl and llm_extraction tasks
- Add TypeScript client examples for LLM jobs
- Add comprehensive examples to docker_webhook_example.py with schema support
- Clarify data structure differences between webhook and API responses

Testing:
- Add test_llm_webhook_feature.py with 7 validation tests (all passing)
- Verify pattern consistency with /crawl/job implementation
- Add implementation guide (WEBHOOK_LLM_JOB_IMPLEMENTATION.md)
This commit is contained in:
ntohidi
2025-10-22 13:03:09 +02:00
parent f8606f6865
commit d670dcde0a
6 changed files with 770 additions and 31 deletions

View File

@@ -14,7 +14,8 @@ import json
from datetime import datetime, timezone
# Add deploy/docker to path to import modules
sys.path.insert(0, '/home/user/crawl4ai/deploy/docker')
# sys.path.insert(0, '/home/user/crawl4ai/deploy/docker')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'deploy', 'docker'))
def test_imports():
"""Test that all webhook-related modules can be imported"""
@@ -237,7 +238,8 @@ def test_api_integration():
try:
# Check if api.py can import webhook module
with open('/home/user/crawl4ai/deploy/docker/api.py', 'r') as f:
api_path = os.path.join(os.path.dirname(__file__), 'deploy', 'docker', 'api.py')
with open(api_path, 'r') as f:
api_content = f.read()
if 'from webhook import WebhookDeliveryService' in api_content: