perf(crawler): major performance improvements & raw HTML support
- Switch to lxml parser (~4x speedup)
- Add raw HTML & local file crawling support
- Fix cache headers & async cleanup
- Add browser process monitoring
- Optimize BeautifulSoup operations
- Pre-compile regex patterns

Breaking: Raw HTML handling requires new URL prefixes
Fixes: #256, #253
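
To illustrate the breaking change and the regex item, here is a minimal sketch of prefix-based input dispatch with a pattern compiled once at import time. The prefix strings `raw:` and `file://`, and the names `classify_source` and `normalize_whitespace`, are assumptions made for this example; the commit message does not spell out the actual prefixes or the crawler's internals.

```python
import re
from typing import Tuple

# Hypothetical prefixes, assumed for illustration only;
# the commit message does not state the real values.
RAW_PREFIX = "raw:"
FILE_PREFIX = "file://"

# Compiled once at import time instead of on every call.
WHITESPACE_RE = re.compile(r"\s+")


def classify_source(url: str) -> Tuple[str, str]:
    """Return (kind, payload) for a crawl target based on its prefix."""
    if url.startswith(RAW_PREFIX):
        # Everything after the prefix is treated as the HTML itself.
        return "raw", url[len(RAW_PREFIX):]
    if url.startswith(FILE_PREFIX):
        # Everything after the prefix is treated as a local path.
        return "file", url[len(FILE_PREFIX):]
    # Anything else is assumed to be a normal web URL.
    return "web", url


def normalize_whitespace(text: str) -> str:
    """Collapse whitespace runs to single spaces using the shared pattern."""
    return WHITESPACE_RE.sub(" ", text).strip()
```

Under these assumptions, a caller that previously passed bare HTML strings would need to add the prefix explicitly, which is why the change is flagged as breaking.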
2179
tests/async/sample_wikipedia.html
Normal file
File diff suppressed because one or more lines are too long