feat: Add HTTP-only crawling endpoints and related models

- Introduced HTTPCrawlRequest and HTTPCrawlRequestWithHooks models for HTTP-only crawling.
- Implemented /crawl/http and /crawl/http/stream endpoints for fast, lightweight crawling without browser rendering.
- Enhanced server.py to handle HTTP crawl requests and streaming responses.
- Updated utils.py to disable memory wait timeout for testing.
- Expanded API documentation to include new HTTP crawling features.
- Added tests for HTTP crawling endpoints, including error handling and streaming responses.
2025-10-15 17:45:58 +08:00
parent aebf5a3694
commit 674d0741da
8 changed files with 1091 additions and 45 deletions
--- a/deploy/docker/utils.py
+++ b/deploy/docker/utils.py
@@ -59,7 +59,7 @@ DISPATCHER_DEFAULTS = {
        "check_interval": 1.0,
        "max_session_permit": 20,
        "fairness_timeout": 600.0,
-        "memory_wait_timeout": 600.0,
+        "memory_wait_timeout": None,  # Disable memory timeout for testing
    },
    "semaphore": {
        "semaphore_count": 5,