Commit Graph

1 Commits

Author SHA1 Message Date
Soham Kukreti
6eb3baed50 feat: Add ConfigHealthMonitor for automated crawler configuration health monitoring
Implement a comprehensive health monitoring system for crawler configurations
that automatically detects failures and applies resolution strategies.

Features

- **Continuous Health Monitoring**: Periodic health checks for multiple crawler
  configurations with configurable check intervals
- **Automatic Failure Detection**: Detects failures based on HTTP status codes,
  empty HTML responses, and logger errors
- **Resolution Strategies**: Built-in and custom resolution strategies that
  automatically attempt to fix failing configurations
- **Resolution Chains**: Support for sequential resolution strategies that
  validate each step before proceeding
- **Metrics Collection**: Comprehensive metrics tracking including success rates,
  response times, resolution attempts, and uptime statistics
- **Graceful Shutdown**: Robust cleanup mechanism that waits for active health
  checks to complete before shutting down
- **Error Tracking**: Integrated logger error tracking to detect non-critical
  errors that don't fail HTTP requests but indicate issues

Implementation Details

- New module `crawl4ai/config_health_monitor.py` containing:
  - `ConfigHealthMonitor`: Main monitoring class
  - `ConfigHealthState`: Health state tracking dataclass
  - `ResolutionResult`: Resolution strategy result dataclass
  - `ResolutionStrategy`: Type alias for resolution callables
  - `_ErrorTrackingLogger`: Proxy logger for error event tracking

- Key capabilities:
  - Register/unregister configurations for monitoring
  - Manual and automatic health checks
  - Config-specific or global resolution strategies
  - Thread-safe state management with asyncio locks
  - Per-config and global metrics reporting
  - Context manager support for automatic cleanup

Testing

- Comprehensive test suite in `tests/general/test_config_health_monitor.py`:
  - Basic functionality tests (initialization, registration)
  - Lifecycle management tests (start/stop, context manager)
  - Health checking tests (success/failure scenarios)
  - Resolution strategy tests
  - Metrics and status query tests
  - Property validation tests

Examples

- Example usage in `docs/examples/config_health_monitor_example.py`:
  - Demonstrates monitor initialization and configuration
  - Shows custom resolution strategies (incremental backoff, magic mode toggle)
  - Implements resolution chains with validation
  - Displays metrics reporting and status monitoring
  - Includes context manager usage pattern

Technical Notes

- Uses `copy.deepcopy()` for safe configuration mutation
- Implements `_ErrorTrackingLogger` to capture logger errors during health checks
- Tracks active health check tasks for graceful shutdown
- Uses `CacheMode.BYPASS` for health check configs to ensure fresh data
- Minimum check interval enforced at 10 seconds

This feature enables production-grade monitoring of crawler configurations,
automatically detecting and resolving issues before they impact crawling
operations.
2025-11-25 23:49:15 +05:30