Add new features to enhance browser automation and HTML extraction:
- Add CDP browser launch capability with customizable ports and profiles
- Implement JsonLxmlExtractionStrategy for faster HTML parsing
- Add CLI command 'crwl cdp' for launching standalone CDP browsers
- Support connecting to external CDP browsers via URL
- Optimize selector caching and context-sensitive queries
BREAKING CHANGE: LLMConfig import path changed from crawl4ai.types to crawl4ai
Rename LlmConfig to LLMConfig across the codebase to follow consistent naming conventions.
Update all imports and usages to use the new name.
Update documentation and examples to reflect the change.
BREAKING CHANGE: LlmConfig has been renamed to LLMConfig. Users need to update their imports and usage.
* feature: Add LlmConfig to easily configure and pass LLM configs to different strategies
* pulled in next branch and resolved conflicts
* feat: Add gemini and deepseek providers. Make ignore_cache in llm content filter to true by default to avoid confusions
* Refactor: Update LlmConfig in LLMExtractionStrategy class and deprecate old params
* updated tests, docs and readme
Add JWT token-based authentication to Docker server and client.
Refactor server architecture for better code organization and error handling.
Move Dockerfile to root deploy directory and update configuration.
Add comprehensive documentation and examples.
BREAKING CHANGE: Docker server now requires authentication by default.
Endpoints require JWT tokens when security.jwt_enabled is true in config.
Improve configuration serialization with better handling of frozensets and slots.
Expand deep crawling module exports and documentation.
Add comprehensive API usage examples in Docker README.
- Add support for frozenset serialization
- Improve error handling in config loading
- Export additional deep crawling components
- Enhance Docker API documentation with detailed examples
- Fix ContentTypeFilter initialization
Changes cache mode from BYPASS to WRITE_ONLY when cache is disabled to ensure
results are still cached for future use. Also adds error handling for non-JSON
LLM responses and comprehensive API test suite.
- Changes default cache fallback from BYPASS to WRITE_ONLY
- Adds error handling for LLM JSON parsing
- Introduces new test suite for API endpoints
Add comprehensive Docker deployment configuration with:
- New .dockerignore and .llm.env.example files
- Enhanced Dockerfile with multi-stage build and optimizations
- Detailed README with setup instructions and environment configurations
- Improved requirements.txt with Gunicorn
- Better error handling in async_configs.py
BREAKING CHANGE: Docker deployment now requires .llm.env file for API keys
Add Docker service integration with FastAPI server and client implementation.
Implement serialization utilities for BrowserConfig and CrawlerRunConfig to support
Docker service communication. Clean up imports and improve error handling.
- Add Crawl4aiDockerClient class
- Implement config serialization/deserialization
- Add FastAPI server with streaming support
- Add health check endpoint
- Clean up imports and type hints