feat(api): implement crawler pool manager for improved resource handling

Adds a new CrawlerManager class to handle browser instance pooling and failover:
- Implements auto-scaling based on system resources
- Adds primary/backup crawler management
- Integrates memory monitoring and throttling
- Adds streaming support with memory tracking
- Updates API endpoints to use pooled crawlers

BREAKING CHANGE: API endpoints now require CrawlerManager initialization
This commit is contained in:
UncleCode
2025-04-18 22:26:24 +08:00
parent 907cba194f
commit 16b2318242
9 changed files with 2082 additions and 59 deletions

View File

@@ -37,8 +37,8 @@ from crawl4ai import (
DEFAULT_SITE_PATH = "test_site"
DEFAULT_PORT = 8000
DEFAULT_MAX_SESSIONS = 16
DEFAULT_URL_COUNT = 100
DEFAULT_CHUNK_SIZE = 10 # Define chunk size for batch logging
DEFAULT_URL_COUNT = 1
DEFAULT_CHUNK_SIZE = 1 # Define chunk size for batch logging
DEFAULT_REPORT_PATH = "reports"
DEFAULT_STREAM_MODE = False
DEFAULT_MONITOR_MODE = "DETAILED"