diff --git a/deploy/docker/ARCHITECTURE.md b/deploy/docker/ARCHITECTURE.md new file mode 100644 index 00000000..e4446aba --- /dev/null +++ b/deploy/docker/ARCHITECTURE.md @@ -0,0 +1,822 @@ +# Crawl4AI Docker Architecture - AI Context Map + +**Purpose:** Dense technical reference for AI agents to understand complete system architecture. +**Format:** Symbolic, compressed, high-information-density documentation. + +--- + +## System Overview + +``` +┌─────────────────────────────────────────────────────────────┐ +│ CRAWL4AI DOCKER ORCHESTRATION SYSTEM │ +├─────────────────────────────────────────────────────────────┤ +│ Modes: Single (N=1) | Swarm (N>1) | Compose+Nginx (N>1) │ +│ Entry: cnode CLI → deploy/docker/cnode_cli.py │ +│ Core: deploy/docker/server_manager.py │ +│ Server: deploy/docker/server.py (FastAPI) │ +│ API: deploy/docker/api.py (crawl endpoints) │ +│ Monitor: deploy/docker/monitor.py + monitor_routes.py │ +└─────────────────────────────────────────────────────────────┘ +``` + +--- + +## Directory Structure & File Map + +``` +deploy/ +├── docker/ # Server runtime & orchestration +│ ├── server.py # FastAPI app entry [CRITICAL] +│ ├── api.py # /crawl, /screenshot, /pdf endpoints +│ ├── server_manager.py # Docker orchestration logic [CORE] +│ ├── cnode_cli.py # CLI interface (Click-based) +│ ├── monitor.py # Real-time metrics collector +│ ├── monitor_routes.py # /monitor dashboard routes +│ ├── crawler_pool.py # Browser pool management +│ ├── hook_manager.py # Pre/post crawl hooks +│ ├── job.py # Job queue schema +│ ├── utils.py # Helpers (port check, health) +│ ├── auth.py # API key authentication +│ ├── schemas.py # Pydantic models +│ ├── mcp_bridge.py # MCP protocol bridge +│ ├── supervisord.conf # Process manager config +│ ├── config.yml # Server config template +│ ├── requirements.txt # Python deps +│ ├── static/ # Web assets +│ │ ├── monitor/ # Dashboard UI +│ │ └── playground/ # API playground +│ └── tests/ # Test suite +│ +└── installer/ # User-facing installation + ├── cnode_pkg/ # Standalone package + │ ├── cli.py # Copy of cnode_cli.py + │ ├── server_manager.py # Copy of server_manager.py + │ └── requirements.txt # click, rich, anyio, pyyaml + ├── install-cnode.sh # Remote installer (git sparse-checkout) + ├── sync-cnode.sh # Dev tool (source→pkg sync) + ├── USER_GUIDE.md # Human-readable guide + ├── README.md # Developer documentation + └── QUICKSTART.md # Cheat sheet +``` + +--- + +## Core Components Deep Dive + +### 1. `server_manager.py` - Orchestration Engine + +**Role:** Manages Docker container lifecycle, auto-detects deployment mode. + +**Key Classes:** +- `ServerManager` - Main orchestrator + - `start(replicas, mode, port, env_file, image)` → Deploy server + - `stop(remove_volumes)` → Teardown + - `status()` → Health check + - `scale(replicas)` → Live scaling + - `logs(follow, tail)` → Stream logs + - `cleanup(force)` → Emergency cleanup + +**State Management:** +- File: `~/.crawl4ai/server_state.yml` +- Schema: `{mode, replicas, port, image, started_at, containers[]}` +- Atomic writes with lock file + +**Deployment Modes:** +```python +if replicas == 1: + mode = "single" # docker run +elif swarm_available(): + mode = "swarm" # docker stack deploy +else: + mode = "compose" # docker-compose + nginx +``` + +**Container Naming:** +- Single: `crawl4ai-server` +- Swarm: `crawl4ai-stack_crawl4ai` +- Compose: `crawl4ai-server-{1..N}`, `crawl4ai-nginx` + +**Networks:** +- `crawl4ai-network` (bridge mode for all) + +**Volumes:** +- `crawl4ai-redis-data` - Persistent queue +- `crawl4ai-profiles` - Browser profiles + +**Health Checks:** +- Endpoint: `http://localhost:{port}/health` +- Timeout: 30s startup +- Retry: 3 attempts + +--- + +### 2. `server.py` - FastAPI Application + +**Role:** HTTP server exposing crawl API + monitoring. + +**Startup Flow:** +```python +app = FastAPI() +@app.on_event("startup") +async def startup(): + init_crawler_pool() # Pre-warm browsers + init_redis_connection() # Job queue + start_monitor_collector() # Metrics +``` + +**Key Endpoints:** +``` +POST /crawl → api.py:crawl_endpoint() +POST /crawl/stream → api.py:crawl_stream_endpoint() +POST /screenshot → api.py:screenshot_endpoint() +POST /pdf → api.py:pdf_endpoint() +GET /health → server.py:health_check() +GET /monitor → monitor_routes.py:dashboard() +WS /monitor/ws → monitor_routes.py:websocket_endpoint() +GET /playground → static/playground/index.html +``` + +**Process Manager:** +- Uses `supervisord` to manage: + - FastAPI server (port 11235) + - Redis (port 6379) + - Background workers + +**Environment:** +```bash +CRAWL4AI_PORT=11235 +REDIS_URL=redis://localhost:6379 +MAX_CONCURRENT_CRAWLS=5 +BROWSER_POOL_SIZE=3 +``` + +--- + +### 3. `api.py` - Crawl Endpoints + +**Main Endpoint:** `POST /crawl` + +**Request Schema:** +```json +{ + "urls": ["https://example.com"], + "priority": 10, + "browser_config": { + "type": "BrowserConfig", + "params": {"headless": true, "viewport_width": 1920} + }, + "crawler_config": { + "type": "CrawlerRunConfig", + "params": {"cache_mode": "bypass", "extraction_strategy": {...}} + } +} +``` + +**Processing Flow:** +``` +1. Validate request (Pydantic) +2. Queue job → Redis +3. Get browser from pool → crawler_pool.py +4. Execute crawl → AsyncWebCrawler +5. Apply hooks → hook_manager.py +6. Return result (JSON) +7. Release browser to pool +``` + +**Memory Management:** +- Browser pool: Max 3 instances +- LRU eviction when pool full +- Explicit cleanup: `browser.close()` in finally block +- Redis TTL: 1 hour for completed jobs + +**Error Handling:** +```python +try: + result = await crawler.arun(url, config) +except PlaywrightError as e: + # Browser crash - release & recreate + await pool.invalidate(browser_id) +except TimeoutError as e: + # Timeout - kill & retry + await crawler.kill() +except Exception as e: + # Unknown - log & fail gracefully + logger.error(f"Crawl failed: {e}") +``` + +--- + +### 4. `crawler_pool.py` - Browser Pool Manager + +**Role:** Manage persistent browser instances to avoid startup overhead. + +**Class:** `CrawlerPool` +- `get_crawler()` → Lease browser (async with context manager) +- `release_crawler(id)` → Return to pool +- `warm_up(count)` → Pre-launch browsers +- `cleanup()` → Close all browsers + +**Pool Strategy:** +```python +pool = { + "browser_1": {"crawler": AsyncWebCrawler(), "in_use": False}, + "browser_2": {"crawler": AsyncWebCrawler(), "in_use": False}, + "browser_3": {"crawler": AsyncWebCrawler(), "in_use": False}, +} + +async with pool.get_crawler() as crawler: + result = await crawler.arun(url) + # Auto-released on context exit +``` + +**Anti-Leak Mechanisms:** +1. Context managers enforce cleanup +2. Watchdog thread kills stale browsers (>10min idle) +3. Max lifetime: 1 hour per browser +4. Force GC after browser close + +--- + +### 5. `monitor.py` + `monitor_routes.py` - Real-time Dashboard + +**Architecture:** +``` +[Browser] <--WebSocket--> [monitor_routes.py] <--Events--> [monitor.py] + ↓ + [Redis Pub/Sub] + ↓ + [Metrics Collector] +``` + +**Metrics Collected:** +- Requests/sec (sliding window) +- Active crawls (real-time count) +- Response times (p50, p95, p99) +- Error rate (5min rolling) +- Memory usage (RSS, heap) +- Browser pool utilization + +**WebSocket Protocol:** +```json +// Server → Client +{ + "type": "metrics", + "data": { + "rps": 45.3, + "active_crawls": 12, + "p95_latency": 1234, + "error_rate": 0.02 + } +} + +// Client → Server +{ + "type": "subscribe", + "channels": ["metrics", "logs"] +} +``` + +**Dashboard Route:** `/monitor` +- Real-time graphs (Chart.js) +- Request log stream +- Container health status +- Resource utilization + +--- + +### 6. `cnode_cli.py` - CLI Interface + +**Framework:** Click (Python CLI framework) + +**Command Structure:** +``` +cnode +├── start [--replicas N] [--port P] [--mode M] [--image I] +├── stop [--remove-volumes] +├── status +├── scale N +├── logs [--follow] [--tail N] +├── restart [--replicas N] +└── cleanup [--force] +``` + +**Execution Flow:** +```python +@cli.command("start") +def start_cmd(replicas, mode, port, env_file, image): + manager = ServerManager() + result = anyio.run(manager.start(...)) # Async bridge + if result["success"]: + console.print(success_panel) +``` + +**User Feedback:** +- Rich library for colors/tables +- Progress spinners during operations +- Error messages with hints +- Status tables with health indicators + +**State Persistence:** +- Saves deployment config to `~/.crawl4ai/server_state.yml` +- Enables stateless commands (status, scale, restart) + +--- + +### 7. Docker Orchestration Details + +**Single Container Mode (N=1):** +```bash +docker run -d \ + --name crawl4ai-server \ + --network crawl4ai-network \ + -p 11235:11235 \ + -v crawl4ai-redis-data:/data \ + unclecode/crawl4ai:latest +``` + +**Docker Swarm Mode (N>1, Swarm available):** +```yaml +# docker-compose.swarm.yml +version: '3.8' +services: + crawl4ai: + image: unclecode/crawl4ai:latest + deploy: + replicas: 5 + update_config: + parallelism: 2 + delay: 10s + restart_policy: + condition: on-failure + ports: + - "11235:11235" + networks: + - crawl4ai-network +``` + +Deploy: `docker stack deploy -c docker-compose.swarm.yml crawl4ai-stack` + +**Docker Compose + Nginx Mode (N>1, fallback):** +```yaml +# docker-compose.yml +services: + crawl4ai-1: + image: unclecode/crawl4ai:latest + networks: [crawl4ai-network] + + crawl4ai-2: + image: unclecode/crawl4ai:latest + networks: [crawl4ai-network] + + nginx: + image: nginx:alpine + ports: ["11235:80"] + volumes: + - ./nginx.conf:/etc/nginx/nginx.conf + networks: [crawl4ai-network] +``` + +Nginx config (round-robin load balancing): +```nginx +upstream crawl4ai_backend { + server crawl4ai-1:11235; + server crawl4ai-2:11235; + server crawl4ai-3:11235; +} + +server { + listen 80; + location / { + proxy_pass http://crawl4ai_backend; + proxy_set_header Host $host; + } +} +``` + +--- + +## Memory Leak Prevention Strategy + +### Problem Areas & Solutions + +**1. Browser Instances** +```python +# ❌ BAD - Leak risk +crawler = AsyncWebCrawler() +result = await crawler.arun(url) +# Browser never closed! + +# ✅ GOOD - Guaranteed cleanup +async with AsyncWebCrawler() as crawler: + result = await crawler.arun(url) + # Auto-closed on exit +``` + +**2. WebSocket Connections** +```python +# monitor_routes.py +active_connections = set() + +@app.websocket("/monitor/ws") +async def websocket_endpoint(websocket): + await websocket.accept() + active_connections.add(websocket) + try: + while True: + await websocket.send_json(get_metrics()) + finally: + active_connections.remove(websocket) # Critical! +``` + +**3. Redis Connections** +```python +# Use connection pooling +redis_pool = aioredis.ConnectionPool( + host="localhost", + port=6379, + max_connections=10, + decode_responses=True +) + +# Reuse connections +async def get_job(job_id): + async with redis_pool.get_connection() as conn: + data = await conn.get(f"job:{job_id}") + # Connection auto-returned to pool +``` + +**4. Async Task Cleanup** +```python +# Track background tasks +background_tasks = set() + +async def crawl_task(url): + try: + result = await crawl(url) + finally: + background_tasks.discard(asyncio.current_task()) + +# On shutdown +async def shutdown(): + tasks = list(background_tasks) + for task in tasks: + task.cancel() + await asyncio.gather(*tasks, return_exceptions=True) +``` + +**5. File Descriptor Leaks** +```python +# Use context managers for files +async def save_screenshot(url): + async with aiofiles.open(f"{job_id}.png", "wb") as f: + await f.write(screenshot_bytes) + # File auto-closed +``` + +--- + +## Installation & Distribution + +### User Installation Flow + +**Script:** `deploy/installer/install-cnode.sh` + +**Steps:** +1. Check Python 3.8+ exists +2. Check pip available +3. Check Docker installed (warn if missing) +4. Create temp dir: `mktemp -d` +5. Git sparse-checkout: + ```bash + git init + git remote add origin https://github.com/unclecode/crawl4ai.git + git config core.sparseCheckout true + echo "deploy/installer/cnode_pkg/*" > .git/info/sparse-checkout + git pull --depth=1 origin main + ``` +6. Install deps: `pip install click rich anyio pyyaml` +7. Copy package: `cnode_pkg/ → /usr/local/lib/cnode/` +8. Create wrapper: `/usr/local/bin/cnode` + ```bash + #!/usr/bin/env bash + export PYTHONPATH="/usr/local/lib/cnode:$PYTHONPATH" + exec python3 -m cnode_pkg.cli "$@" + ``` +9. Cleanup temp dir + +**Result:** +- Binary-like experience (fast startup: ~0.1s) +- No need for PyInstaller (49x faster) +- Platform-independent (any OS with Python) + +--- + +## Development Workflow + +### Source Code Sync (Auto) + +**Git Hook:** `.githooks/pre-commit` + +**Trigger:** When committing `deploy/docker/cnode_cli.py` or `server_manager.py` + +**Action:** +```bash +1. Diff source vs package +2. If different: + - Run sync-cnode.sh + - Copy cnode_cli.py → cnode_pkg/cli.py + - Fix imports: s/deploy.docker/cnode_pkg/g + - Copy server_manager.py → cnode_pkg/ + - Stage synced files +3. Continue commit +``` + +**Setup:** `./setup-hooks.sh` (configures `git config core.hooksPath .githooks`) + +**Smart Behavior:** +- Silent when no sync needed +- Only syncs if content differs +- Minimal output: `✓ cnode synced` + +--- + +## API Request/Response Flow + +### Example: POST /crawl + +**Request:** +```bash +curl -X POST http://localhost:11235/crawl \ + -H "Content-Type: application/json" \ + -d '{ + "urls": ["https://example.com"], + "browser_config": { + "type": "BrowserConfig", + "params": {"headless": true} + }, + "crawler_config": { + "type": "CrawlerRunConfig", + "params": {"cache_mode": "bypass"} + } + }' +``` + +**Processing:** +``` +1. FastAPI receives request → api.py:crawl_endpoint() +2. Validate schema → Pydantic models in schemas.py +3. Create job → job.py:Job(id=uuid4(), urls=[...]) +4. Queue to Redis → LPUSH crawl_queue {job_json} +5. Get browser from pool → crawler_pool.py:get_crawler() +6. Execute crawl: + a. Launch page → browser.new_page() + b. Navigate → page.goto(url) + c. Extract → extraction_strategy.extract() + d. Generate markdown → markdown_generator.generate() +7. Store result → Redis SETEX result:{job_id} 3600 {result_json} +8. Release browser → pool.release(browser_id) +9. Return response: + { + "success": true, + "result": { + "url": "https://example.com", + "markdown": "# Example Domain...", + "metadata": {"title": "Example Domain"}, + "extracted_content": {...} + } + } +``` + +**Error Cases:** +- 400: Invalid request schema +- 429: Rate limit exceeded +- 500: Internal error (browser crash, timeout) +- 503: Service unavailable (all browsers busy) + +--- + +## Scaling Behavior + +### Scale-Up (1 → 10 replicas) + +**Command:** `cnode scale 10` + +**Swarm Mode:** +```bash +docker service scale crawl4ai-stack_crawl4ai=10 +# Docker handles: +# - Container creation +# - Network attachment +# - Load balancer update +# - Rolling deployment +``` + +**Compose Mode:** +```bash +# Update docker-compose.yml +# Change replica count in all service definitions +docker-compose up -d --scale crawl4ai=10 +# Regenerate nginx.conf with 10 upstreams +docker exec nginx nginx -s reload +``` + +**Load Distribution:** +- Swarm: Built-in ingress network (VIP-based round-robin) +- Compose: Nginx upstream (round-robin, can configure least_conn) + +**Zero-Downtime:** +- Swarm: Yes (rolling update, parallelism=2) +- Compose: Partial (nginx reload is graceful, but brief spike) + +--- + +## Configuration Files + +### `config.yml` - Server Configuration + +```yaml +server: + port: 11235 + host: "0.0.0.0" + workers: 4 + +crawler: + max_concurrent: 5 + timeout: 30 + retries: 3 + +browser: + pool_size: 3 + headless: true + args: + - "--no-sandbox" + - "--disable-dev-shm-usage" + +redis: + host: "localhost" + port: 6379 + db: 0 + +monitoring: + enabled: true + metrics_interval: 5 # seconds +``` + +### `supervisord.conf` - Process Management + +```ini +[supervisord] +nodaemon=true + +[program:redis] +command=redis-server --port 6379 +autorestart=true + +[program:fastapi] +command=uvicorn server:app --host 0.0.0.0 --port 11235 +autorestart=true +stdout_logfile=/var/log/crawl4ai/api.log + +[program:monitor] +command=python monitor.py +autorestart=true +``` + +--- + +## Testing & Quality + +### Test Structure + +``` +deploy/docker/tests/ +├── cli/ # CLI command tests +│ └── test_commands.py # start, stop, scale, status +├── monitor/ # Dashboard tests +│ └── test_websocket.py # WS connection, metrics +└── codebase_test/ # Integration tests + └── test_api.py # End-to-end crawl tests +``` + +### Key Test Cases + +**CLI Tests:** +- `test_start_single()` - Starts 1 replica +- `test_start_cluster()` - Starts N replicas +- `test_scale_up()` - Scales 1→5 +- `test_scale_down()` - Scales 5→2 +- `test_status()` - Reports correct state +- `test_logs()` - Streams logs + +**API Tests:** +- `test_crawl_success()` - Basic crawl works +- `test_crawl_timeout()` - Handles slow sites +- `test_concurrent_crawls()` - Parallel requests +- `test_browser_pool()` - Reuses browsers +- `test_memory_cleanup()` - No leaks after 100 crawls + +**Monitor Tests:** +- `test_websocket_connect()` - WS handshake +- `test_metrics_stream()` - Receives updates +- `test_multiple_clients()` - Handles N connections + +--- + +## Critical File Cross-Reference + +| Component | Primary File | Dependencies | +|-----------|--------------|--------------| +| **CLI Entry** | `cnode_cli.py:482` | `server_manager.py`, `click`, `rich` | +| **Orchestrator** | `server_manager.py:45` | `docker`, `yaml`, `anyio` | +| **API Server** | `server.py:120` | `api.py`, `monitor_routes.py` | +| **Crawl Logic** | `api.py:78` | `crawler_pool.py`, `AsyncWebCrawler` | +| **Browser Pool** | `crawler_pool.py:23` | `AsyncWebCrawler`, `asyncio` | +| **Monitoring** | `monitor.py:156` | `redis`, `psutil` | +| **Dashboard** | `monitor_routes.py:89` | `monitor.py`, `websockets` | +| **Hooks** | `hook_manager.py:12` | `api.py`, custom user hooks | + +**Startup Chain:** +``` +cnode start + └→ cnode_cli.py:start_cmd() + └→ server_manager.py:start() + └→ docker run/stack/compose + └→ supervisord + ├→ redis-server + ├→ server.py + │ └→ api.py (routes) + │ └→ crawler_pool.py (init) + └→ monitor.py (collector) +``` + +--- + +## Symbolic Notation Summary + +``` +⊕ Addition/Creation ⊖ Removal/Cleanup +⊗ Multiplication/Scale ⊘ Division/Split +→ Flow/Dependency ← Reverse flow +⇄ Bidirectional ⇵ Up/Down scale +✓ Success/Complete ✗ Failure/Error +⚠ Warning ⚡ Performance critical +🔒 Lock/Exclusive 🔓 Unlock/Shared +📊 Metrics 📝 Logs +🌐 Network 💾 Storage +🧠 Memory 🐳 Docker +``` + +**State Machine:** +``` +[STOPPED] →start→ [STARTING] →ready→ [RUNNING] +[RUNNING] →scale→ [SCALING] →done→ [RUNNING] +[RUNNING] →stop→ [STOPPING] →done→ [STOPPED] +[ANY] →error→ [ERROR] →cleanup→ [STOPPED] +``` + +--- + +## Quick Reference Commands + +```bash +# Install +curl -sSL https://raw.githubusercontent.com/unclecode/crawl4ai/main/deploy/installer/install-cnode.sh | bash + +# Deploy +cnode start --replicas 5 --port 11235 + +# Monitor +open http://localhost:11235/monitor + +# Scale +cnode scale 10 + +# Test +curl -X POST http://localhost:11235/crawl -H "Content-Type: application/json" -d '{"urls":["https://example.com"]}' + +# Logs +cnode logs -f + +# Teardown +cnode stop --remove-volumes +``` + +--- + +## Future AI Agent Instructions + +**When debugging:** Read this file first, then check specific source files mentioned. + +**When extending:** Follow existing patterns (context managers, async/await, type hints). + +**When optimizing:** Profile first (monitor.py metrics), then optimize hot paths (crawler_pool.py, api.py). + +**When troubleshooting memory:** Check browser pool cleanup, WebSocket connection tracking, Redis connection pooling. + +**When scaling issues:** Verify Docker mode (swarm vs compose), check nginx config if compose, review load balancer logs. + +--- + +**END OF ARCHITECTURE MAP** +*Version: 1.0.0 | Last Updated: 2025-10-21 | Token-Optimized for AI Consumption* diff --git a/deploy/installer/QUICKSTART.md b/deploy/installer/QUICKSTART.md new file mode 100644 index 00000000..4ff8ee3f --- /dev/null +++ b/deploy/installer/QUICKSTART.md @@ -0,0 +1,147 @@ +# Crawl4AI cnode - Quick Start Cheat Sheet + +Fast reference for getting started with cnode. + +--- + +## 📥 Install + +```bash +# Install cnode +curl -sSL https://raw.githubusercontent.com/unclecode/crawl4ai/main/deploy/installer/install-cnode.sh | bash +``` + +--- + +## 🚀 Launch Cluster + +```bash +# Single server (development) +cnode start + +# Production cluster with 5 replicas +cnode start --replicas 5 + +# Custom port +cnode start --replicas 3 --port 8080 +``` + +--- + +## 📊 Check Status + +```bash +# View server status +cnode status + +# View logs +cnode logs -f +``` + +--- + +## ⚙️ Scale Cluster + +```bash +# Scale to 10 replicas (live, no downtime) +cnode scale 10 + +# Scale down to 2 +cnode scale 2 +``` + +--- + +## 🔄 Restart/Stop + +```bash +# Restart server +cnode restart + +# Stop server +cnode stop +``` + +--- + +## 🌐 Test the API + +```bash +# Simple test - crawl example.com +curl -X POST http://localhost:11235/crawl \ + -H "Content-Type: application/json" \ + -d '{ + "urls": ["https://example.com"], + "priority": 10 + }' + +# Pretty print with jq +curl -X POST http://localhost:11235/crawl \ + -H "Content-Type: application/json" \ + -d '{ + "urls": ["https://example.com"], + "priority": 10 + }' | jq '.result.markdown' -r + +# Health check +curl http://localhost:11235/health +``` + +--- + +## 📱 Monitor Dashboard + +```bash +# Open in browser +open http://localhost:11235/monitor + +# Or playground +open http://localhost:11235/playground +``` + +--- + +## 🐍 Python Example + +```python +import requests + +response = requests.post( + "http://localhost:11235/crawl", + json={ + "urls": ["https://example.com"], + "priority": 10 + } +) + +result = response.json() +print(result['result']['markdown']) +``` + +--- + +## 🎯 Common Commands + +| Command | Description | +|---------|-------------| +| `cnode start` | Start server | +| `cnode start -r 5` | Start with 5 replicas | +| `cnode status` | Check status | +| `cnode scale 10` | Scale to 10 replicas | +| `cnode logs -f` | Follow logs | +| `cnode restart` | Restart server | +| `cnode stop` | Stop server | +| `cnode --help` | Show all commands | + +--- + +## 📚 Full Documentation + +- **User Guide:** `deploy/installer/USER_GUIDE.md` +- **Developer Docs:** `deploy/installer/README.md` +- **Docker Guide:** `deploy/docker/README.md` +- **Agent Context:** `deploy/docker/AGENT.md` + +--- + +**That's it!** You're ready to crawl at scale 🚀