Add comprehensive webhook notification support for the /llm/job endpoint,
following the same pattern as the existing /crawl/job implementation.
Changes:
- Add webhook_config field to LlmJobPayload model (job.py)
- Implement webhook notifications in process_llm_extraction() with 4
notification points: success, provider validation failure, extraction
failure, and general exceptions (api.py)
- Store webhook_config in Redis task data for job tracking
- Initialize WebhookDeliveryService with exponential backoff retry logic
Documentation:
- Add Example 6 to WEBHOOK_EXAMPLES.md showing LLM extraction with webhooks
- Update Flask webhook handler to support both crawl and llm_extraction tasks
- Add TypeScript client examples for LLM jobs
- Add comprehensive examples to docker_webhook_example.py with schema support
- Clarify data structure differences between webhook and API responses
Testing:
- Add test_llm_webhook_feature.py with 7 validation tests (all passing)
- Verify pattern consistency with /crawl/job implementation
- Add implementation guide (WEBHOOK_LLM_JOB_IMPLEMENTATION.md)
Use model_dump(mode='json') instead of deprecated dict() method to ensure
Pydantic special types (HttpUrl, UUID, etc.) are properly serialized to
JSON-compatible native Python types.
This fixes webhook delivery failures caused by HttpUrl objects remaining
as Pydantic types in the webhook_config dict, which caused JSON
serialization errors and httpx request failures.
Also update mcp requirement to >=1.18.0 for compatibility.
Added end-to-end test script that automates webhook feature testing:
Script Features (test_webhook_feature.sh):
- Automatic branch switching and dependency installation
- Redis and server startup/shutdown management
- Webhook receiver implementation
- Integration test for webhook notifications
- Comprehensive cleanup and error handling
- Returns to original branch after completion
Test Flow:
1. Fetch and checkout webhook feature branch
2. Activate venv and install dependencies
3. Start Redis and Crawl4AI server
4. Submit crawl job with webhook config
5. Verify webhook delivery and payload
6. Clean up all processes and return to original branch
Documentation:
- WEBHOOK_TEST_README.md with usage instructions
- Troubleshooting guide
- Exit codes and safety features
Usage: ./tests/test_webhook_feature.sh
Generated with Claude Code https://claude.com/claude-code
Co-Authored-By: Claude <noreply@anthropic.com>
Added comprehensive test suite to validate webhook implementation:
- Module import verification
- WebhookDeliveryService initialization
- Pydantic model validation (WebhookConfig)
- Payload construction logic
- Exponential backoff calculation
- API integration checks
All tests pass (6/6), confirming implementation is correct.
Generated with Claude Code https://claude.com/claude-code
Co-Authored-By: Claude <noreply@anthropic.com>
Added docker_webhook_example.py demonstrating:
- Submitting crawl jobs with webhook configuration
- Flask-based webhook receiver implementation
- Three usage patterns:
1. Webhook notification only (fetch data separately)
2. Webhook with full data in payload
3. Traditional polling approach for comparison
Includes comprehensive comments explaining:
- Webhook payload structure
- Authentication headers setup
- Error handling
- Production deployment tips
Example is fully functional and ready to run with Flask installed.
Generated with Claude Code https://claude.com/claude-code
Co-Authored-By: Claude <noreply@anthropic.com>
Added comprehensive webhook section to README.md including:
- Overview of asynchronous job queue with webhooks
- Benefits and use cases
- Quick start examples
- Webhook authentication
- Global webhook configuration
- Job status polling alternative
Updated table of contents and summary to include webhook feature.
Maintains consistent tone and style with rest of README.
Generated with Claude Code https://claude.com/claude-code
Co-Authored-By: Claude <noreply@anthropic.com>
Implements webhook support for the crawl job API to eliminate polling requirements.
Changes:
- Added WebhookConfig and WebhookPayload schemas to schemas.py
- Created webhook.py with WebhookDeliveryService class
- Integrated webhook notifications in api.py handle_crawl_job
- Updated job.py CrawlJobPayload to accept webhook_config
- Added webhook configuration section to config.yml
- Included comprehensive usage examples in WEBHOOK_EXAMPLES.md
Features:
- Webhook notifications on job completion (success/failure)
- Configurable data inclusion in webhook payload
- Custom webhook headers support
- Global default webhook URL configuration
- Exponential backoff retry logic (5 attempts: 1s, 2s, 4s, 8s, 16s)
- 30-second timeout per webhook call
Usage:
POST /crawl/job with optional webhook_config:
- webhook_url: URL to receive notifications
- webhook_data_in_payload: include full results (default: false)
- webhook_headers: custom headers for authentication
Generated with Claude Code https://claude.com/claude-code
Co-Authored-By: Claude <noreply@anthropic.com>
- Cross-checked every section against actual docs
- Fixed BM25ContentFilter parameters (user_query, bm25_threshold)
- Removed incorrect wait_for selector from basic example
- Added comprehensive test suite (4 test files)
- All examples now tested and verified working
- Tests validate: basic crawling, markdown generation, data extraction, advanced patterns
- Package size: 76.6 KB (includes tests for future validation)
- Create comprehensive skill package for AI coding assistants
- Include complete SDK reference (23K words, v0.7.4)
- Add three extraction scripts (basic, batch, pipeline)
- Implement version tracking in skill and scripts
- Add prominent download section on homepage
- Place skill in docs/assets for web distribution
The skill enables AI assistants like Claude, Cursor, and Windsurf
to effectively use Crawl4AI with optimized workflows for markdown
generation and data extraction.
Add hooks_to_string() utility function that converts Python function objects
to string representations for the Docker API, enabling developers to write hooks
as regular Python functions instead of strings.
Core Changes:
- New hooks_to_string() utility in crawl4ai/utils.py using inspect.getsource()
- Docker client now accepts both function objects and strings for hooks
- Automatic detection and conversion in Crawl4aiDockerClient._prepare_request()
- New hooks and hooks_timeout parameters in client.crawl() method
Documentation:
- Docker client examples with function-based hooks (docs/examples/docker_client_hooks_example.py)
- Updated main Docker deployment guide with comprehensive hooks section
- Added unit tests for hooks utility (tests/docker/test_hooks_utility.py)
Add hooks_to_string() utility function that converts Python function objects
to string representations for the Docker API, enabling developers to write hooks
as regular Python functions instead of strings.
Core Changes:
- New hooks_to_string() utility in crawl4ai/utils.py using inspect.getsource()
- Docker client now accepts both function objects and strings for hooks
- Automatic detection and conversion in Crawl4aiDockerClient._prepare_request()
- New hooks and hooks_timeout parameters in client.crawl() method
Documentation:
- Docker client examples with function-based hooks (docs/examples/docker_client_hooks_example.py)
- Updated main Docker deployment guide with comprehensive hooks section
- Added unit tests for hooks utility (tests/docker/test_hooks_utility.py)
- Fixed hero image to 200px height with min/max constraints
- Added object-fit: cover to hero-image img elements
- Changed secondary-featured align-items from stretch to flex-start
- Fixed secondary-card height to 118px (no flex: 1 stretching)
- Updated responsive grid layouts for wider screens
- Added flex: 1 to hero-content for better content distribution
These changes ensure a rigid, predictable layout that prevents:
1. Large images from pushing text content down
2. Single secondary cards from stretching to fill entire height
- Fixed JavaScript errors from missing HTML elements (install-code, usage-code, integration-code)
- Added missing CSS classes for tabs, overview layout, sidebar, and integration content
- Fixed tab navigation to display horizontally in single line
- Added proper padding to tab content sections (removed from container, added to content)
- Fixed tab selector from .nav-tab to .tab-btn to match HTML structure
- Added sidebar styling with stats grid and metadata display
- Improved responsive design with mobile-friendly tab scrolling
- Fixed code block positioning for copy buttons
- Removed margin from first headings to prevent extra spacing
- Added null checks for DOM elements in JavaScript to prevent errors
These changes resolve the routing issue where clicking on apps caused page redirects,
and fix the broken layout where CSS was not properly applied to the app detail page.
- Change API_BASE to relative '/api' for production
- Move marketplace to /marketplace instead of /marketplace/frontend
- Update MkDocs navigation
- Fix logo path in marketplace index