unclecode
aba4036ab6
Add demo and test scripts for monitor dashboard activity
...
- Introduced a demo script (`demo_monitor_dashboard.py`) to showcase various monitoring features through simulated activity.
- Implemented a test script (`test_monitor_demo.py`) to generate dashboard activity and verify monitor health and endpoint statistics.
- Added a logo image to the static assets for branding purposes.
2025-10-17 22:43:06 +08:00
unclecode
e2af031b09
feat(monitor): add real-time monitoring dashboard with Redis persistence
...
Complete observability solution for production deployments with terminal-style UI.
**Backend Implementation:**
- `monitor.py`: Stats manager tracking requests, browsers, errors, timeline data
- `monitor_routes.py`: REST API endpoints for all monitor functionality
  - GET /monitor/health - System health snapshot
  - GET /monitor/requests - Active & completed requests
  - GET /monitor/browsers - Browser pool details
  - GET /monitor/endpoints/stats - Aggregated endpoint analytics
  - GET /monitor/timeline - Time-series data (memory, requests, browsers)
  - GET /monitor/logs/{janitor,errors} - Event logs
  - POST /monitor/actions/{cleanup,kill_browser,restart_browser} - Control actions
  - POST /monitor/stats/reset - Reset counters
- Redis persistence for endpoint stats (survives restart)
- Timeline tracking (5min window, 5s resolution, 60 data points)
**Frontend Dashboard** (`/dashboard`):
- **System Health Bar**: CPU%, Memory%, Network I/O, Uptime
- **Pool Status**: Live counts (permanent/hot/cold browsers + memory)
- **Live Activity Tabs**:
  - Requests: Active (realtime) + recent completed (last 100)
  - Browsers: Detailed table with actions (kill/restart)
  - Janitor: Cleanup event log with timestamps
  - Errors: Recent errors with stack traces
- **Endpoint Analytics**: Count, avg latency, success%, pool hit%
- **Resource Timeline**: SVG charts (memory/requests/browsers) with terminal aesthetics
- **Control Actions**: Force cleanup, restart permanent, reset stats
- **Auto-refresh**: 5s polling (toggleable)
**Integration:**
- Janitor events tracked (close_cold, close_hot, promote)
- Crawler pool promotion events logged
- Timeline updater background task (5s interval)
- Lifespan hooks for monitor initialization
**UI Design:**
- Terminal vibe matching Crawl4AI theme
- Dark background, cyan/pink accents, monospace font
- Neon glow effects on charts
- Responsive layout, hover interactions
- Cross-navigation: Playground ↔ Monitor
**Key Features:**
- Zero-config: Works out of the box with existing Redis
- Real-time visibility into pool efficiency
- Manual browser management (kill/restart)
- Historical data persistence
- DevOps-friendly UX
Routes:
- API: `/monitor/*` (backend endpoints)
- UI: `/dashboard` (static HTML)
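
The timeline figures in this commit (5min window, 5s resolution, 60 data points) fit a fixed-size ring buffer. The sketch below is only an illustration of that arithmetic: `Timeline`, `TimelinePoint`, and the field names are hypothetical and are not taken from `monitor.py`.

```python
from collections import deque
from dataclasses import dataclass
from typing import List

WINDOW_SECONDS = 300      # 5-minute window, per the commit message
RESOLUTION_SECONDS = 5    # one sample every 5 seconds
MAX_POINTS = WINDOW_SECONDS // RESOLUTION_SECONDS  # 300 / 5 = 60 points


@dataclass
class TimelinePoint:
    """Hypothetical sample shape; the real fields may differ."""
    timestamp: float
    memory_percent: float
    active_requests: int
    browsers: int


class Timeline:
    """Fixed-size ring buffer: once full, the oldest sample is dropped."""

    def __init__(self, max_points: int = MAX_POINTS) -> None:
        self._points = deque(maxlen=max_points)

    def record(self, point: TimelinePoint) -> None:
        self._points.append(point)

    def snapshot(self) -> List[TimelinePoint]:
        return list(self._points)


# Simulate 100 ticks of the 5s updater; only the newest 60 survive.
timeline = Timeline()
for i in range(100):
    timeline.record(
        TimelinePoint(
            timestamp=float(i * RESOLUTION_SECONDS),
            memory_percent=40.0,
            active_requests=0,
            browsers=1,
        )
    )
```

A bounded `deque` keeps memory use constant no matter how long the server runs, which is presumably why the window is capped at 60 points.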
2025-10-17 21:36:25 +08:00
unclecode
b97eaeea4c
feat(docker): implement smart browser pool with 10x memory efficiency
...
Major refactoring to eliminate memory leaks and enable high-scale crawling:
- **Smart 3-Tier Browser Pool**:
  - Permanent browser (always-ready default config)
  - Hot pool (configs used 3+ times, longer TTL)
  - Cold pool (new/rare configs, short TTL)
  - Auto-promotion: cold → hot after 3 uses
  - 100% pool reuse achieved in tests
- **Container-Aware Memory Detection**:
  - Read cgroup v1/v2 memory limits (not host metrics)
  - Accurate memory pressure detection in Docker
  - Memory-based browser creation blocking
- **Adaptive Janitor**:
  - Dynamic cleanup intervals (10s/30s/60s based on memory)
  - Tiered TTLs: cold 30-300s, hot 120-600s
  - Aggressive cleanup at high memory pressure
- **Unified Pool Usage**:
  - All endpoints now use pool (/html, /screenshot, /pdf, /execute_js, /md, /llm)
  - Fixed config signature mismatch (permanent browser matches endpoints)
  - get_default_browser_config() helper for consistency
- **Configuration**:
  - Reduced idle_ttl: 1800s → 300s (30min → 5min)
  - Fixed port: 11234 → 11235 (match Gunicorn)
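
The cold → hot promotion rule listed above can be sketched as a use counter keyed by config signature. This is a minimal illustration, not the pool's actual code: `TierTracker` and `record_use` are hypothetical names, and it assumes promotion happens on the third use ("configs used 3+ times").

```python
from collections import Counter

HOT_THRESHOLD = 3  # "configs used 3+ times", per the commit message


class TierTracker:
    """Tracks how often each browser config signature has been requested."""

    def __init__(self, permanent_signature: str) -> None:
        self.permanent_signature = permanent_signature
        self.uses = Counter()

    def record_use(self, signature: str) -> str:
        """Count one use and return the tier the config belongs to afterwards."""
        if signature == self.permanent_signature:
            return "permanent"  # the default config never leaves this tier
        self.uses[signature] += 1
        return "hot" if self.uses[signature] >= HOT_THRESHOLD else "cold"


tracker = TierTracker(permanent_signature="default")
tiers = [tracker.record_use("custom-viewport") for _ in range(4)]
```

Here `tiers` moves from cold to hot on the third request for the same signature, while the default signature always maps to the permanent browser.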
**Performance Results** (from stress tests):
- Memory: 10x reduction (500-700MB × N → 270MB permanent)
- Latency: 30-50x faster (<100ms pool hits vs 3-5s startup)
- Reuse: 100% for default config, 60%+ for variants
- Capacity: 100+ concurrent requests (vs ~20 before)
- Leak: 0 MB/cycle (stable across tests)
**Test Infrastructure**:
- 7-phase sequential test suite (tests/)
- Docker stats integration + log analysis
- Pool promotion verification
- Memory leak detection
- Full endpoint coverage
Fixes memory issues reported in production deployments.
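
For the container-aware memory detection mentioned in this commit, a sketch of reading the cgroup limit might look like the following. The two file paths are the standard cgroup v2 and v1 locations; the function names and the cutoff used to recognise an "unlimited" v1 value are assumptions for illustration, not the project's code.

```python
from pathlib import Path
from typing import Optional

CGROUP_V2_LIMIT = Path("/sys/fs/cgroup/memory.max")
CGROUP_V1_LIMIT = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")
# cgroup v1 reports a very large number when no limit is set.
# This cutoff is an assumption chosen for the sketch.
V1_UNLIMITED_CUTOFF = 1 << 60


def parse_limit(raw: str) -> Optional[int]:
    """Return the limit in bytes, or None when the cgroup is unlimited."""
    text = raw.strip()
    if text == "max":  # cgroup v2 spelling of "no limit"
        return None
    value = int(text)
    return None if value >= V1_UNLIMITED_CUTOFF else value


def container_memory_limit() -> Optional[int]:
    """Try cgroup v2 first, then v1; None means no container limit was found."""
    for path in (CGROUP_V2_LIMIT, CGROUP_V1_LIMIT):
        try:
            return parse_limit(path.read_text())
        except (OSError, ValueError):
            continue  # file missing or unreadable: try the next location
    return None
```

When no limit is found, a caller would have to fall back to host memory figures, which is the situation the commit describes as inaccurate inside Docker.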
2025-10-17 20:38:39 +08:00
unclecode
216019f29a
fix(marketplace): prevent hero image overflow and secondary card stretching
...
- Fixed hero image to 200px height with min/max constraints
- Added object-fit: cover to hero-image img elements
- Changed secondary-featured align-items from stretch to flex-start
- Fixed secondary-card height to 118px (no flex: 1 stretching)
- Updated responsive grid layouts for wider screens
- Added flex: 1 to hero-content for better content distribution
These changes ensure a rigid, predictable layout that prevents:
1. Large images from pushing text content down
2. Single secondary cards from stretching to fill entire height
2025-10-11 12:52:04 +08:00
unclecode
abe8a92561
fix(marketplace): resolve app detail page routing and styling issues
...
- Fixed JavaScript errors from missing HTML elements (install-code, usage-code, integration-code)
- Added missing CSS classes for tabs, overview layout, sidebar, and integration content
- Fixed tab navigation to display horizontally in single line
- Added proper padding to tab content sections (removed from container, added to content)
- Fixed tab selector from .nav-tab to .tab-btn to match HTML structure
- Added sidebar styling with stats grid and metadata display
- Improved responsive design with mobile-friendly tab scrolling
- Fixed code block positioning for copy buttons
- Removed margin from first headings to prevent extra spacing
- Added null checks for DOM elements in JavaScript to prevent errors
These changes resolve the routing issue where clicking on apps caused page redirects,
and fix the broken layout where CSS was not properly applied to the app detail page.
2025-10-11 11:51:22 +08:00
unclecode
5a4f21fad9
fix(marketplace): isolate api under marketplace prefix
2025-10-09 22:26:15 +08:00
unclecode
2c373f0642
fix(marketplace): align admin api with backend endpoints
2025-10-08 18:42:19 +08:00
unclecode
d2c7f345ab
feat(docs): add chatgpt quick link to page actions
2025-10-07 11:59:25 +08:00
unclecode
8c62277718
feat(marketplace): add sponsor logo uploads
...
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
2025-10-06 20:58:35 +08:00
unclecode
5145d42df7
fix(docs): hide copy menu on non-markdown pages
2025-10-03 20:11:20 +08:00
Nasrin
80aa6c11d9
Merge pull request #1530 from Sjoeborg/fix/arun-many-returns-none
...
Fix: run_urls() returns None, crashing arun_many()
2025-10-03 12:57:06 +08:00
unclecode
749d200866
fix(marketplace): Update URLs to use /marketplace path and relative API endpoints
...
- Change API_BASE to relative '/api' for production
- Move marketplace to /marketplace instead of /marketplace/frontend
- Update MkDocs navigation
- Fix logo path in marketplace index
2025-10-02 17:08:50 +08:00
unclecode
408ad1b750
feat(marketplace): Add Crawl4AI marketplace with secure configuration
...
- Implement marketplace frontend and admin dashboard
- Add FastAPI backend with environment-based configuration
- Use .env file for secrets management
- Include data generation scripts
- Add proper CORS configuration
- Remove hardcoded password from admin login
- Update gitignore for security
2025-10-02 16:41:11 +08:00
Martin Sjöborg
35dd206925
fix: always return a list, even if we catch an exception
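
The pattern named in this subject line can be sketched as follows. The commit's own code is not shown here, so `run_all` and `flaky_fetch` are hypothetical stand-ins; the point is only that the function returns a list on every path, including after a caught exception.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_all(
    urls: List[str], fetch: Callable[[str], Awaitable[str]]
) -> List[str]:
    """Collect results; on failure, return what was gathered instead of None."""
    results: List[str] = []
    try:
        for url in urls:
            results.append(await fetch(url))
    except Exception as exc:
        # Log and fall through, so the caller can still iterate the result.
        print(f"crawl aborted: {exc}")
    return results


async def flaky_fetch(url: str) -> str:
    """Hypothetical fetcher that fails on one input."""
    if "bad" in url:
        raise RuntimeError("boom")
    return url.upper()


partial = asyncio.run(run_all(["a", "bad", "c"], flaky_fetch))
```

A caller that loops over the return value no longer crashes on `None`; it sees the partial list instead.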
2025-10-02 09:21:44 +02:00
Martin Sjöborg
8d30662647
fix: remove this import as it causes python to treat "json" as a variable in the except block
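
The pitfall behind this fix is general Python scoping: an `import` statement anywhere inside a function makes that name local to the whole function. The original code is not shown in the log, so the reconstruction below uses hypothetical function names to demonstrate the effect.

```python
import json


def parse_broken(text: str, verbose: bool = False):
    if verbose:
        # Redundant local import: "json" is now a local name for the whole
        # function, even on calls where this branch never runs.
        import json
        print(json.dumps({"input": text}))
    try:
        return json.loads(text)
    except json.JSONDecodeError:  # "json" is an unbound local here
        return None


def parse_fixed(text: str, verbose: bool = False):
    if verbose:
        print(json.dumps({"input": text}))  # uses the module-level import
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None
```

Calling `parse_broken("{}")` raises `UnboundLocalError` rather than parsing the input, while `parse_fixed` behaves as intended once the inner import is removed.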
2025-10-02 09:19:15 +02:00
unclecode
ef46df10da
Update gitignore: add local scripts folder
2025-09-30 18:31:57 +08:00
unclecode
0d8d043109
feat(docs): add brand book and page copy functionality
...
- Add comprehensive brand book with color system, typography, components
- Add page copy dropdown with markdown copy/view functionality
- Update mkdocs.yml with new assets and branding navigation
- Use terminal-style ASCII icons and condensed menu design
2025-09-30 18:28:05 +08:00
ntohidi
3fe49a766c
fix(docker-deployment): replace console.log with print for metadata extraction
2025-09-25 14:12:59 +08:00
ntohidi
fef715a891
Merge branch 'feature/docker-hooks' into develop
2025-09-25 14:11:46 +08:00
Nasrin
69e8ca3d0d
Merge pull request #1508 from unclecode/docker/base_config_overrides
...
#1505 fix(api): update config handling to only set base config if not provided by user
2025-09-22 18:02:14 +08:00
AHMET YILMAZ
a1950afd98
#1505 fix(api): update config handling to only set base config if not provided by user
2025-09-22 17:19:27 +08:00
Nasrin
d0eb5a6ffe
Merge pull request #1501 from unclecode/fix/n-playwright-stealth
...
feat(StealthAdapter): fix stealth features for Playwright integration
2025-09-19 14:17:35 +08:00
ntohidi
77559f3373
feat(StealthAdapter): fix stealth features for Playwright integration. ref #1481
2025-09-18 15:39:06 +08:00
Nasrin
3899ac3d3b
Merge pull request #1464 from unclecode/fix/proxy_deprecation
...
Fix/proxy deprecation
2025-09-16 15:48:45 +08:00
Nasrin
23431d8109
Merge pull request #1389 from unclecode/fix/deep-crawl-scoring
...
fix(deep-crawl): BestFirst priority inversion
2025-09-16 15:45:54 +08:00
AHMET YILMAZ
1717827732
refactor(BrowserConfig): change deprecation warning for 'proxy' parameter to UserWarning
2025-09-12 11:10:38 +08:00
Nasrin
f8eaf01ed1
Merge pull request #1467 from unclecode/fix/request-crawl-stream
...
Fix: issue when requesting /crawl with stream: true
2025-09-11 17:40:43 +08:00
Nasrin
14b42b1f9a
Merge pull request #1471 from unclecode/fix/adaptive-crawler-llm-config
...
Fix: allow custom LLM providers for adaptive crawler embedding config…
2025-09-09 12:56:33 +08:00
ntohidi
3bc56dd028
fix: allow custom LLM providers for adaptive crawler embedding config. ref: #1291
...
- Change embedding_llm_config from Dict to Union[LLMConfig, Dict] for type safety
- Add backward-compatible conversion property _embedding_llm_config_dict
- Replace all hardcoded OpenAI embedding configs with configurable options
- Fix LLMConfig object attribute access in query expansion logic
- Add comprehensive example demonstrating multiple provider configurations
- Update documentation with both LLMConfig object and dictionary usage patterns
Users can now specify any LLM provider for query expansion in embedding strategy:
- New: embedding_llm_config=LLMConfig(provider='anthropic/claude-3', api_token='key')
- Old: embedding_llm_config={'provider': 'openai/gpt-4', 'api_token': 'key'} (still works)
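
The backward-compatible conversion described in this commit could work roughly as sketched below. The `LLMConfig` here is a simplified stand-in defined for the example, not the real Crawl4AI class, and `EmbeddingSettings` is a hypothetical holder. Only the property name `_embedding_llm_config_dict` comes from the commit message.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional, Union


@dataclass
class LLMConfig:
    """Simplified stand-in with only the two fields used in the commit text."""
    provider: str
    api_token: Optional[str] = None


@dataclass
class EmbeddingSettings:
    """Hypothetical holder that accepts either form of the config."""
    embedding_llm_config: Optional[Union[LLMConfig, Dict]] = None

    @property
    def _embedding_llm_config_dict(self) -> Optional[Dict]:
        """Normalise both accepted forms to a plain dict for older code paths."""
        cfg = self.embedding_llm_config
        if cfg is None:
            return None
        if isinstance(cfg, dict):
            return dict(cfg)  # copy so callers cannot mutate the original
        return asdict(cfg)


new_style = EmbeddingSettings(LLMConfig(provider="anthropic/claude-3", api_token="key"))
old_style = EmbeddingSettings({"provider": "openai/gpt-4", "api_token": "key"})
```

Both objects expose the same dictionary shape through the property, so code written against the old dict form keeps working.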
2025-09-09 12:49:55 +08:00
AHMET YILMAZ
1874a7b8d2
fix: update option labels in request builder for clarity
2025-09-05 17:06:25 +08:00
Nasrin
0482c1eafc
Merge pull request #1469 from unclecode/fix/docker-jwt
...
Fix(auth): Fixed Docker JWT authentication
2025-09-04 15:00:15 +08:00
AHMET YILMAZ
6a3b3e9d38
Commit without API
2025-09-03 17:02:40 +08:00
Nasrin
1eacea1d2d
Merge pull request #1432 from unclecode/example/web2api-example
...
feat: Add comprehensive website to API example with frontend
2025-09-03 16:30:39 +08:00
Nasrin
bc6d8147d2
Merge pull request #1451 from unclecode/fix/remove-python3.9-version
...
Remove python 3.9 from supported versions and require Python >= 3.10
2025-09-02 16:50:40 +08:00
ntohidi
487839640f
fix: raise error on last attempt failure in perform_completion_with_backoff. ref #989
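
A retry loop that re-raises on the final attempt, as this subject line describes, can be sketched like this. `call_with_backoff` and its parameters are hypothetical and do not reproduce the signature of `perform_completion_with_backoff`.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry with exponential backoff; surface the error after the last try."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # last attempt: propagate instead of swallowing
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


calls: List[int] = []


def always_fails() -> str:
    calls.append(1)
    raise RuntimeError("rate limited")


try:
    call_with_backoff(always_fails, max_attempts=3, sleep=lambda seconds: None)
except RuntimeError as exc:
    final_error = str(exc)
```

Without the re-raise, a loop like this would exit silently after the last failure, leaving the caller unaware that the call never succeeded.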
2025-09-02 16:49:01 +08:00
ntohidi
6772134a3a
remove: delete unused yoyo snapshot subproject
2025-09-02 12:07:08 +08:00
Nasrin
ae67d66b81
Merge pull request #1454 from nafeqq-1306/docstring-changes
...
issue #1329: Docs are not detected because the triple quotes are not on the first line
2025-09-02 11:59:59 +08:00
Nasrin
af28e84a21
Merge pull request #1441 from unclecode/fix/improve-docker-error-handling
...
Improve docker error handling
2025-09-02 11:56:01 +08:00
Nasrin
5e7fcb17e1
Merge pull request #1448 from unclecode/fix/https-reditrect
...
feat: add preserve_https_for_internal_links flag to maintain HTTPS during crawling
2025-09-01 16:11:25 +08:00
ntohidi
6e728096fa
fix(auth): fixed Docker JWT authentication. ref #1442
2025-09-01 12:48:16 +08:00
Nasrin
2de200c1ba
Merge pull request #1433 from Thermofish/fix/excluded_selector
...
fix(deps): reintroduce cssselect to restore excluded_selector support (#1405)
2025-08-29 16:08:24 +08:00
nafeqq-1306
9749e2832d
issue #1329 refactor(crawler): move unwanted properties to CrawlerRunConfig class
2025-08-29 10:20:47 +08:00
Soham Kukreti
70f473b84d
fix: drop Python 3.9 support and require Python >=3.10.
...
The library no longer supports Python 3.9, so all references to it have been removed.
The following changes were made:
- pyproject.toml: set requires-python to ">=3.10"; remove 3.9 classifier
- setup.py: set python_requires to ">=3.10"; remove 3.9 classifier
- docs: update Python version mentions
- deploy/docker/c4ai-doc-context.md: options -> 3.10, 3.11, 3.12, 3.13
2025-08-28 19:31:19 +05:30
ntohidi
bdacf61ca9
feat: update documentation for preserve_https_for_internal_links. ref #1410
2025-08-28 17:48:12 +08:00
ntohidi
f566c5a376
feat: add preserve_https_for_internal_links flag to maintain HTTPS during crawling. Ref #1410
...
Added a new `preserve_https_for_internal_links` configuration flag that preserves the original HTTPS scheme for same-domain links even when the server redirects to HTTP.
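
The behaviour described here can be sketched as a small link-rewriting helper. `preserve_https` is a hypothetical function written for illustration and is not the library's implementation. It assumes "same-domain" means an exact hostname match and ignores explicit ports.

```python
from urllib.parse import urlsplit, urlunsplit


def preserve_https(link: str, original_url: str) -> str:
    """Upgrade an http:// link back to https:// when it points at the same
    host as an originally HTTPS page; leave every other link untouched."""
    origin = urlsplit(original_url)
    target = urlsplit(link)
    same_host = target.hostname is not None and target.hostname == origin.hostname
    if origin.scheme == "https" and target.scheme == "http" and same_host:
        return urlunsplit(target._replace(scheme="https"))
    return link


internal = preserve_https("http://example.com/a?q=1", "https://example.com/")
external = preserve_https("http://other.org/a", "https://example.com/")
```

Only the internal link is rewritten. Links to other hosts keep whatever scheme they were served with.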
2025-08-28 17:38:40 +08:00
AHMET YILMAZ
4ed33fce9e
Remove deprecated test for 'proxy' parameter in BrowserConfig and update .gitignore to include test_scripts directory.
2025-08-28 17:26:10 +08:00
AHMET YILMAZ
f7a3366f72
#1375: refactor(proxy): Deprecate 'proxy' parameter in BrowserConfig and enhance proxy string parsing
...
- Updated ProxyConfig.from_string to support multiple proxy formats, including URLs with credentials.
- Deprecated the 'proxy' parameter in BrowserConfig, replacing it with 'proxy_config' for better flexibility.
- Added warnings for deprecated usage and clarified behavior when both parameters are provided.
- Updated documentation and tests to reflect changes in proxy configuration handling.
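
Parsing several proxy string formats, as the first bullet describes, might look like the sketch below. The accepted formats (`scheme://user:pass@host:port`, `host:port`, and `host:port:user:pass`) are assumptions for illustration, and `ProxySettings` and `parse_proxy` are hypothetical stand-ins rather than the real `ProxyConfig.from_string`.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class ProxySettings:
    """Hypothetical stand-in for the library's proxy configuration object."""
    server: str
    username: Optional[str] = None
    password: Optional[str] = None


def parse_proxy(text: str) -> ProxySettings:
    """Accept a URL with optional credentials, or a colon-separated string."""
    if "://" in text:
        parts = urlsplit(text)
        server = f"{parts.scheme}://{parts.hostname}"
        if parts.port:
            server += f":{parts.port}"
        return ProxySettings(server, parts.username, parts.password)
    fields = text.split(":")
    if len(fields) == 2:  # host:port
        return ProxySettings(f"http://{fields[0]}:{fields[1]}")
    if len(fields) == 4:  # host:port:user:pass
        host, port, user, password = fields
        return ProxySettings(f"http://{host}:{port}", user, password)
    raise ValueError(f"unrecognised proxy string: {text!r}")
```

Keeping credentials in separate fields, rather than embedded in the server URL, matches the stated goal of a structured `proxy_config` replacing the plain `proxy` string.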
2025-08-28 17:21:49 +08:00
Nasrin
4e1c4bd24e
Merge pull request #1436 from unclecode/fix/docker-filter
...
fix(docker): resolve filter serialization and JSON encoding errors in deep crawl strategy
2025-08-27 11:08:42 +08:00
Soham Kukreti
2ad3fb5fc8
feat(docker): improve docker error handling
...
- Return comprehensive error messages along with status codes for internal API errors.
- Fix fit_html property serialization issue in both /crawl and /crawl/stream endpoints
- Add sanitization to ensure fit_html is always JSON-serializable (string or None)
- Add comprehensive error handling test suite.
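
The sanitization step listed above ("string or None") can be sketched as a small guard applied before JSON encoding. `sanitize_fit_html` is a hypothetical helper, and mapping unexpected types to `None` is an assumed policy, not necessarily what the commit does.

```python
import json
from typing import Any, Optional


def sanitize_fit_html(value: Any) -> Optional[str]:
    """Coerce fit_html to something json.dumps can always encode."""
    if value is None or isinstance(value, str):
        return value
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode("utf-8", errors="replace")
    # Anything else (for example an unevaluated property object) is dropped.
    return None


payload = {
    "fit_html": sanitize_fit_html(property(lambda self: "<p>hi</p>")),
    "raw": sanitize_fit_html(b"<p>hi</p>"),
}
encoded = json.dumps(payload)
```

Without the guard, passing a non-serializable object straight to `json.dumps` raises `TypeError`, which is the kind of failure the commit attributes to the /crawl and /crawl/stream endpoints.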
2025-08-26 23:18:35 +05:30
Nasrin
cce3390a2d
Merge pull request #1426 from unclecode/fix/update-quickstart-and-adaptive-strategies-docs
...
Update Quickstart and Adaptive Strategies documentation
2025-08-26 16:53:47 +08:00