feat: Add HTTP-only crawling endpoints and related models

- Introduced HTTPCrawlRequest and HTTPCrawlRequestWithHooks models for HTTP-only crawling.
- Implemented /crawl/http and /crawl/http/stream endpoints for fast, lightweight crawling without browser rendering.
- Enhanced server.py to handle HTTP crawl requests and streaming responses.
- Updated utils.py to set memory_wait_timeout to None, disabling the memory wait timeout for testing.
- Expanded API documentation to include new HTTP crawling features.
- Added tests for HTTP crawling endpoints, including error handling and streaming responses.
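
As a rough illustration of how a client might call the new endpoints, the sketch below builds a POST request for /crawl/http (or /crawl/http/stream) using only the standard library. The base URL, the port, the helper name `build_http_crawl_request`, and the `urls` payload field are all assumptions made for illustration; this commit message does not describe the HTTPCrawlRequest schema, so check the model definition before relying on any of them.

```python
import json
import urllib.request

# Assumption: the server listens on this address; adjust to your deployment.
BASE_URL = "http://localhost:11235"


def build_http_crawl_request(urls, stream=False, base_url=BASE_URL):
    """Build (but do not send) a POST request for the HTTP-only crawl endpoints.

    Hypothetical helper, not part of the commit.
    """
    # The two endpoint paths are taken from the commit message.
    path = "/crawl/http/stream" if stream else "/crawl/http"
    # Assumption: the body carries a "urls" list. The real HTTPCrawlRequest
    # model may require different or additional fields.
    body = json.dumps({"urls": list(urls)}).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_http_crawl_request(["https://example.com"])
    print(req.get_method(), req.full_url)
    # To send it against a running server:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Building the request separately from sending it keeps the sketch runnable without a live server; passing `stream=True` only switches the path, since how the streamed response is framed is not stated here.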
AHMET YILMAZ
2025-10-15 17:45:58 +08:00
parent aebf5a3694
commit 674d0741da
8 changed files with 1091 additions and 45 deletions

@@ -59,7 +59,7 @@ DISPATCHER_DEFAULTS = {
"check_interval": 1.0,
"max_session_permit": 20,
"fairness_timeout": 600.0,
-"memory_wait_timeout": 600.0,
+"memory_wait_timeout": None, # Disable memory timeout for testing
},
"semaphore": {
"semaphore_count": 5,