feat(crawler): optimize single URL handling and add performance comparison

Add special handling for single URL requests in Docker API to use arun() instead of arun_many()
Add new example script demonstrating performance differences between sequential and parallel crawling
Update cache mode from aggressive to bypass in examples and tests
Remove unused dependencies (zstandard, msgpack)

BREAKING CHANGE: Changed default cache_mode from aggressive to bypass in examples
This commit is contained in:
UncleCode
2025-03-13 22:15:15 +08:00
parent dc36997a08
commit b750542e6d
6 changed files with 95 additions and 10 deletions

View File

@@ -73,7 +73,7 @@ async def test_stream_crawl(session, token: str):
# "https://news.ycombinator.com/news"
],
"browser_config": {"headless": True, "viewport": {"width": 1200}},
"crawler_config": {"stream": True, "cache_mode": "aggressive"}
"crawler_config": {"stream": True, "cache_mode": "bypass"}
}
headers = {"Authorization": f"Bearer {token}"}
print(f"\nTesting Streaming Crawl: {url}")