Major refactoring to eliminate memory leaks and enable high-scale crawling: - **Smart 3-Tier Browser Pool**: - Permanent browser (always-ready default config) - Hot pool (configs used 3+ times, longer TTL) - Cold pool (new/rare configs, short TTL) - Auto-promotion: cold → hot after 3 uses - 100% pool reuse achieved in tests - **Container-Aware Memory Detection**: - Read cgroup v1/v2 memory limits (not host metrics) - Accurate memory pressure detection in Docker - Memory-based browser creation blocking - **Adaptive Janitor**: - Dynamic cleanup intervals (10s/30s/60s based on memory) - Tiered TTLs: cold 30-300s, hot 120-600s - Aggressive cleanup at high memory pressure - **Unified Pool Usage**: - All endpoints now use pool (/html, /screenshot, /pdf, /execute_js, /md, /llm) - Fixed config signature mismatch (permanent browser matches endpoints) - get_default_browser_config() helper for consistency - **Configuration**: - Reduced idle_ttl: 1800s → 300s (30min → 5min) - Fixed port: 11234 → 11235 (match Gunicorn) **Performance Results** (from stress tests): - Memory: 10x reduction (500-700MB × N → 270MB permanent) - Latency: 30-50x faster (<100ms pool hits vs 3-5s startup) - Reuse: 100% for default config, 60%+ for variants - Capacity: 100+ concurrent requests (vs ~20 before) - Leak: 0 MB/cycle (stable across tests) **Test Infrastructure**: - 7-phase sequential test suite (tests/) - Docker stats integration + log analysis - Pool promotion verification - Memory leak detection - Full endpoint coverage Fixes memory issues reported in production deployments.
6.8 KiB
Executable File
6.8 KiB
Executable File