unclecode
78120df47e
chore: update .gitignore from main
2025-11-09 19:19:52 +08:00
unclecode
b79311b3f6
feat(agent): migrate from Claude SDK to OpenAI Agents SDK with enhanced UI
...
Major architectural changes:
- Migrate from Claude Agent SDK to OpenAI Agents SDK for better performance and reliability
- Complete rewrite of core agent system with improved conversation memory
- Enhanced terminal UI with Claude Code-inspired design
Core Changes:
1. SDK Migration
- Replace Claude SDK (@tool decorator) with OpenAI SDK (@function_tool)
- Simplify tool response format (direct returns vs wrapped content)
- Remove ClaudeSDKClient, use Agent + Runner pattern
- Add conversation history tracking for context retention across turns
- Set max_turns=100 for complex multi-step tasks
2. Tool System (crawl_tools.py)
- Convert all 7 tools to @function_tool decorator
- Simplify return types (JSON strings vs content blocks)
- Type-safe parameters with proper annotations
- Maintain browser singleton pattern for efficiency
3. Chat Mode Improvements
- Add persistent conversation history for better context
- Fix streaming response display (extract from message_output_item)
- Tool visibility: show name and key arguments during execution
- Remove duplicate tips (moved to header)
4. Terminal UI Overhaul
- Claude Code-inspired header with vertical divider
- Left panel: Crawl4AI logo (cyan), version, current directory
- Right panel: Tips, session info
- Proper styling: white headers, dim text, cyan highlights
- Centered logo and text alignment using Rich Table
5. Input Handling Enhancement
- Reverse keybindings: Enter=submit, Option+Enter/Ctrl+J=newline
- Support multiple newline methods (Option+Enter, Esc+Enter, Ctrl+J)
- Remove redundant tip messages
- Better iTerm2 compatibility with Option key
6. Module Organization
- Rename c4ai_tools.py → crawl_tools.py
- Rename c4ai_prompts.py → crawl_prompts.py
- Update __init__.py exports (remove CrawlAgent to fix import warning)
- Generate unique session IDs (session_<timestamp>)
7. Bug Fixes
- Fix module import warning when running with python -m
- Fix text extraction from OpenAI message_output_item
- Fix tool name extraction from raw_item.name
- Remove leftover old file references
Performance Improvements:
- 20x faster startup (no CLI subprocess)
- Direct API calls vs spawning claude process
- Cleaner async patterns with Runner.run_streamed()
Files Changed:
- crawl4ai/agent/__init__.py - Update exports
- crawl4ai/agent/agent_crawl.py - Rewrite with OpenAI SDK
- crawl4ai/agent/chat_mode.py - Add conversation memory, fix streaming
- crawl4ai/agent/terminal_ui.py - Complete UI redesign
- crawl4ai/agent/crawl_tools.py - New (renamed from c4ai_tools.py)
- crawl4ai/agent/crawl_prompts.py - New (renamed from c4ai_prompts.py)
Breaking Changes:
- Requires openai-agents-sdk (pip install git+https://github.com/openai/openai-agents-python.git)
- Tool response format changed (affects custom tools)
- OPENAI_API_KEY required instead of ANTHROPIC_API_KEY
Version: 0.1.0
2025-10-17 21:51:43 +08:00
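The "direct returns vs wrapped content" change listed above can be pictured with plain Python. This is a minimal sketch, not the project's code: `crawl_wrapped`, `crawl_direct`, `new_session_id`, and the payload fields are hypothetical, and the wrapped shape (a list of text content blocks) is an assumption about what the old format looked like.

```python
import json
import time


def crawl_wrapped(url: str) -> dict:
    """Old style (assumed shape): the JSON payload sits inside a content block."""
    payload = {"success": True, "url": url}
    return {"content": [{"type": "text", "text": json.dumps(payload)}]}


def crawl_direct(url: str) -> str:
    """New style: the tool returns the JSON string itself."""
    return json.dumps({"success": True, "url": url})


def new_session_id() -> str:
    """Hypothetical helper following the session_<timestamp> naming above."""
    return f"session_{int(time.time())}"
```

Both functions carry the same payload; the direct form simply removes one layer that every caller previously had to unwrap.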
unclecode
7667cd146f
failed agent sdk using claude code
2025-10-17 16:38:59 +08:00
unclecode
31741e571a
feat(agent): implement Claude Code SDK agent with chat mode and persistent browser
...
Implementation:
- Singleton browser pattern (BrowserManager) - one instance for entire session
- 7 MCP tools for Crawl4AI (quick_crawl, sessions, navigation, extraction, JS execution, screenshots)
- Interactive chat mode with streaming I/O using Claude SDK message generator
- Rich-based terminal UI with markdown rendering and syntax highlighting
- Single-shot and chat modes (--chat flag)
- Comprehensive test suite: component tests, tool tests, 9 multi-turn scenarios
Architecture:
- agent_crawl.py: CLI entry point with SessionStorage (JSONL logging)
- browser_manager.py: Singleton pattern for persistent AsyncWebCrawler
- c4ai_tools.py: MCP tools using @tool decorator, integrated with BrowserManager
- chat_mode.py: Streaming input mode per Claude SDK spec
- terminal_ui.py: Rich-based beautiful terminal output
- test_scenarios.py: Automated multi-turn conversation tests (simple/medium/complex)
- TECH_SPEC.md: Complete AI-to-AI knowledge transfer document
Key fixes:
- Use result.markdown (not deprecated result.markdown_v2)
- Handle both str and MarkdownGenerationResult types
- Track current URL per session for extract_data/execute_js/screenshot tools
- Manual browser lifecycle (start/close) instead of context managers
Tools enabled:
- Crawl4AI: quick_crawl, start_session, navigate, extract_data, execute_js, screenshot, close_session
- Claude SDK built-in: Read, Write, Edit, Glob, Grep, Bash, NotebookEdit
Total: 12 files, 2820 lines
2025-10-17 12:25:45 +08:00
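The singleton browser with a manual start/close lifecycle described above might look roughly like the following. It is a sketch under stated assumptions: `FakeCrawler` is a stand-in so the example runs without a real browser, and the method names `get_browser` and `close_browser` are hypothetical, not taken from browser_manager.py.

```python
import asyncio


class FakeCrawler:
    """Stand-in for a real crawler; only tracks whether it was started."""

    def __init__(self):
        self.started = False

    async def start(self):
        self.started = True

    async def close(self):
        self.started = False


class BrowserManager:
    """Keeps one crawler instance alive for the whole session."""

    _crawler = None

    @classmethod
    async def get_browser(cls):
        # Create and start the crawler on first use, then reuse it.
        # No locking here: concurrent first calls would need an asyncio.Lock.
        if cls._crawler is None:
            crawler = FakeCrawler()
            await crawler.start()
            cls._crawler = crawler
        return cls._crawler

    @classmethod
    async def close_browser(cls):
        # Manual shutdown instead of relying on a context manager.
        if cls._crawler is not None:
            await cls._crawler.close()
            cls._crawler = None


async def demo() -> bool:
    first = await BrowserManager.get_browser()
    second = await BrowserManager.get_browser()
    same = first is second
    await BrowserManager.close_browser()
    return same
```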
unclecode
216019f29a
fix(marketplace): prevent hero image overflow and secondary card stretching
...
- Fixed hero image to 200px height with min/max constraints
- Added object-fit: cover to hero-image img elements
- Changed secondary-featured align-items from stretch to flex-start
- Fixed secondary-card height to 118px (no flex: 1 stretching)
- Updated responsive grid layouts for wider screens
- Added flex: 1 to hero-content for better content distribution
These changes ensure a rigid, predictable layout that prevents:
1. Large images from pushing text content down
2. Single secondary cards from stretching to fill entire height
2025-10-11 12:52:04 +08:00
unclecode
abe8a92561
fix(marketplace): resolve app detail page routing and styling issues
...
- Fixed JavaScript errors from missing HTML elements (install-code, usage-code, integration-code)
- Added missing CSS classes for tabs, overview layout, sidebar, and integration content
- Fixed tab navigation to display horizontally in single line
- Added proper padding to tab content sections (removed from container, added to content)
- Fixed tab selector from .nav-tab to .tab-btn to match HTML structure
- Added sidebar styling with stats grid and metadata display
- Improved responsive design with mobile-friendly tab scrolling
- Fixed code block positioning for copy buttons
- Removed margin from first headings to prevent extra spacing
- Added null checks for DOM elements in JavaScript to prevent errors
These changes resolve the routing issue where clicking on apps caused page redirects,
and fix the broken layout where CSS was not properly applied to the app detail page.
2025-10-11 11:51:22 +08:00
unclecode
5a4f21fad9
fix(marketplace): isolate api under marketplace prefix
2025-10-09 22:26:15 +08:00
unclecode
2c373f0642
fix(marketplace): align admin api with backend endpoints
2025-10-08 18:42:19 +08:00
unclecode
d2c7f345ab
feat(docs): add chatgpt quick link to page actions
2025-10-07 11:59:25 +08:00
unclecode
8c62277718
feat(marketplace): add sponsor logo uploads
...
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
2025-10-06 20:58:35 +08:00
unclecode
5145d42df7
fix(docs): hide copy menu on non-markdown pages
2025-10-03 20:11:20 +08:00
Nasrin
80aa6c11d9
Merge pull request #1530 from Sjoeborg/fix/arun-many-returns-none
...
Fix: run_urls() returns None, crashing arun_many()
2025-10-03 12:57:06 +08:00
unclecode
749d200866
fix(marketplace): Update URLs to use /marketplace path and relative API endpoints
...
- Change API_BASE to relative '/api' for production
- Move marketplace to /marketplace instead of /marketplace/frontend
- Update MkDocs navigation
- Fix logo path in marketplace index
2025-10-02 17:08:50 +08:00
unclecode
408ad1b750
feat(marketplace): Add Crawl4AI marketplace with secure configuration
...
- Implement marketplace frontend and admin dashboard
- Add FastAPI backend with environment-based configuration
- Use .env file for secrets management
- Include data generation scripts
- Add proper CORS configuration
- Remove hardcoded password from admin login
- Update gitignore for security
2025-10-02 16:41:11 +08:00
Martin Sjöborg
35dd206925
fix: always return a list, even if we catch an exception
2025-10-02 09:21:44 +02:00
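The idea of always returning a list, even when an exception is caught, can be sketched as follows. Only the name `run_urls` comes from the log; the body, the `fetch` helper, and the result fields are hypothetical.

```python
import asyncio


async def fetch(url: str) -> dict:
    """Hypothetical fetch step that fails for some URLs."""
    if "bad" in url:
        raise RuntimeError(f"cannot fetch {url}")
    return {"url": url, "success": True}


async def run_urls(urls):
    results = []
    try:
        for url in urls:
            results.append(await fetch(url))
    except Exception as exc:
        # Record the failure instead of falling through and returning None.
        results.append({"success": False, "error": str(exc)})
    return results  # always a list, so callers can iterate safely
```

A caller that loops over the result no longer has to guard against `None`.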
Martin Sjöborg
8d30662647
fix: remove this import as it causes python to treat "json" as a variable in the except block
2025-10-02 09:19:15 +02:00
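The scoping problem named in this commit can be reproduced in a few lines. An `import json` anywhere inside a function makes `json` a local name for the entire function, so an `except` block that runs before the import has executed fails with `UnboundLocalError`. The functions `broken` and `fixed` are illustrative, not the project's code.

```python
import json


def broken(raw: str) -> str:
    try:
        value = int(raw)  # raises ValueError before the local import runs
        import json  # makes `json` local to the whole function
        return json.dumps(value)
    except ValueError:
        # `json` is local but not yet bound here: UnboundLocalError.
        return json.dumps({"error": "bad input"})


def fixed(raw: str) -> str:
    # Without the local import, `json` resolves to the module-level import.
    try:
        return json.dumps(int(raw))
    except ValueError:
        return json.dumps({"error": "bad input"})
```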
unclecode
ef46df10da
Update gitignore add local scripts folder
2025-09-30 18:31:57 +08:00
unclecode
0d8d043109
feat(docs): add brand book and page copy functionality
...
- Add comprehensive brand book with color system, typography, components
- Add page copy dropdown with markdown copy/view functionality
- Update mkdocs.yml with new assets and branding navigation
- Use terminal-style ASCII icons and condensed menu design
2025-09-30 18:28:05 +08:00
ntohidi
3fe49a766c
fix(docker-deployment): replace console.log with print for metadata extraction
2025-09-25 14:12:59 +08:00
ntohidi
fef715a891
Merge branch 'feature/docker-hooks' into develop
2025-09-25 14:11:46 +08:00
Nasrin
69e8ca3d0d
Merge pull request #1508 from unclecode/docker/base_config_overrides
...
#1505 fix(api): update config handling to only set base config if not provided by user
2025-09-22 18:02:14 +08:00
AHMET YILMAZ
a1950afd98
#1505 fix(api): update config handling to only set base config if not provided by user
2025-09-22 17:19:27 +08:00
Nasrin
d0eb5a6ffe
Merge pull request #1501 from unclecode/fix/n-playwright-stealth
...
feat(StealthAdapter): fix stealth features for Playwright integration
2025-09-19 14:17:35 +08:00
ntohidi
77559f3373
feat(StealthAdapter): fix stealth features for Playwright integration. ref #1481
2025-09-18 15:39:06 +08:00
Nasrin
3899ac3d3b
Merge pull request #1464 from unclecode/fix/proxy_deprecation
...
Fix/proxy deprecation
2025-09-16 15:48:45 +08:00
Nasrin
23431d8109
Merge pull request #1389 from unclecode/fix/deep-crawl-scoring
...
fix(deep-crawl): BestFirst priority inversion
2025-09-16 15:45:54 +08:00
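The log does not say what caused the priority inversion. One common source in best-first search is pushing raw scores into a min-heap, which pops the lowest score first. The sketch below shows the usual remedy of negating the score; it is an assumption about the class of bug, not a description of the actual fix, and `best_first_order` is a hypothetical name.

```python
import heapq


def best_first_order(scored_urls):
    """Return URLs from highest to lowest score using a min-heap."""
    # heapq pops the smallest item, so store the negated score.
    heap = [(-score, url) for url, score in scored_urls]
    heapq.heapify(heap)
    order = []
    while heap:
        _, url = heapq.heappop(heap)
        order.append(url)
    return order
```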
AHMET YILMAZ
1717827732
refactor(BrowserConfig): change deprecation warning for 'proxy' parameter to UserWarning
2025-09-12 11:10:38 +08:00
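A likely motivation for this change is visibility: Python's default filters hide `DeprecationWarning` unless it is triggered by code in `__main__`, while `UserWarning` is shown. The sketch below illustrates the pattern; `make_config` is a hypothetical stand-in for the config constructor, and letting an explicit `proxy_config` take precedence is an assumption.

```python
import warnings


def make_config(proxy=None, proxy_config=None) -> dict:
    if proxy is not None:
        # UserWarning is displayed under Python's default warning filters.
        warnings.warn(
            "'proxy' is deprecated; use 'proxy_config' instead.",
            UserWarning,
            stacklevel=2,
        )
        if proxy_config is None:  # assumption: explicit proxy_config wins
            proxy_config = {"server": proxy}
    return {"proxy_config": proxy_config}
```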
Nasrin
f8eaf01ed1
Merge pull request #1467 from unclecode/fix/request-crawl-stream
...
Fix: request /crawl with stream: true issue
2025-09-11 17:40:43 +08:00
Nasrin
14b42b1f9a
Merge pull request #1471 from unclecode/fix/adaptive-crawler-llm-config
...
Fix: allow custom LLM providers for adaptive crawler embedding config…
2025-09-09 12:56:33 +08:00
ntohidi
3bc56dd028
fix: allow custom LLM providers for adaptive crawler embedding config. ref: #1291
...
- Change embedding_llm_config from Dict to Union[LLMConfig, Dict] for type safety
- Add backward-compatible conversion property _embedding_llm_config_dict
- Replace all hardcoded OpenAI embedding configs with configurable options
- Fix LLMConfig object attribute access in query expansion logic
- Add comprehensive example demonstrating multiple provider configurations
- Update documentation with both LLMConfig object and dictionary usage patterns
Users can now specify any LLM provider for query expansion in embedding strategy:
- New: embedding_llm_config=LLMConfig(provider='anthropic/claude-3', api_token='key')
- Old: embedding_llm_config={'provider': 'openai/gpt-4', 'api_token': 'key'} (still works)
2025-09-09 12:49:55 +08:00
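The backward-compatible conversion described above can be sketched with a small property that accepts either form. The classes here are simplified stand-ins: only the names `LLMConfig`, `embedding_llm_config`, and `_embedding_llm_config_dict` come from the commit message, and `AdaptiveConfigSketch` and the field set are hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional, Union


@dataclass
class LLMConfig:
    """Simplified stand-in with just the two fields shown in the log."""

    provider: str
    api_token: Optional[str] = None


@dataclass
class AdaptiveConfigSketch:
    embedding_llm_config: Optional[Union[LLMConfig, Dict]] = None

    @property
    def _embedding_llm_config_dict(self) -> Optional[Dict]:
        """Normalize both accepted forms to a plain dict."""
        cfg = self.embedding_llm_config
        if cfg is None:
            return None
        if isinstance(cfg, dict):
            return cfg  # old dictionary style still works
        return asdict(cfg)  # new object style
```

Code that previously read the dictionary can keep doing so through the property, regardless of which form the user passed.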
AHMET YILMAZ
1874a7b8d2
fix: update option labels in request builder for clarity
2025-09-05 17:06:25 +08:00
Nasrin
0482c1eafc
Merge pull request #1469 from unclecode/fix/docker-jwt
...
Fix(auth): Fixed Docker JWT authentication
2025-09-04 15:00:15 +08:00
AHMET YILMAZ
6a3b3e9d38
Commit without API
2025-09-03 17:02:40 +08:00
Nasrin
1eacea1d2d
Merge pull request #1432 from unclecode/example/web2api-example
...
feat: Add comprehensive website to API example with frontend
2025-09-03 16:30:39 +08:00
Nasrin
bc6d8147d2
Merge pull request #1451 from unclecode/fix/remove-python3.9-version
...
Remove python 3.9 from supported versions and require Python >= 3.10
2025-09-02 16:50:40 +08:00
ntohidi
487839640f
fix: raise error on last attempt failure in perform_completion_with_backoff. ref #989
2025-09-02 16:49:01 +08:00
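Raising on the final failed attempt, rather than swallowing the error, is the usual shape of a retry loop with backoff. The following is a generic sketch of that pattern; `call_with_backoff` and its parameters are hypothetical and not the signature of `perform_completion_with_backoff`.

```python
import time


def call_with_backoff(func, max_attempts: int = 3, base_delay: float = 0.01):
    """Retry func with exponential backoff; re-raise once attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # last attempt: surface the error to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))
```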
ntohidi
6772134a3a
remove: delete unused yoyo snapshot subproject
2025-09-02 12:07:08 +08:00
Nasrin
ae67d66b81
Merge pull request #1454 from nafeqq-1306/docstring-changes
...
issue #1329: Docs are not detected because the triple quotes are not on the first line
2025-09-02 11:59:59 +08:00
Nasrin
af28e84a21
Merge pull request #1441 from unclecode/fix/improve-docker-error-handling
...
Improve docker error handling
2025-09-02 11:56:01 +08:00
Nasrin
5e7fcb17e1
Merge pull request #1448 from unclecode/fix/https-reditrect
...
feat: add preserve_https_for_internal_links flag to maintain HTTPS during crawling
2025-09-01 16:11:25 +08:00
ntohidi
6e728096fa
fix(auth): fixed Docker JWT authentication. ref #1442
2025-09-01 12:48:16 +08:00
Nasrin
2de200c1ba
Merge pull request #1433 from Thermofish/fix/excluded_selector
...
fix(deps): reintroduce cssselect to restore excluded_selector support (#1405)
2025-08-29 16:08:24 +08:00
nafeqq-1306
9749e2832d
issue #1329 refactor(crawler): move unwanted properties to CrawlerRunConfig class
2025-08-29 10:20:47 +08:00
Soham Kukreti
70f473b84d
fix: drop Python 3.9 support and require Python >=3.10.
...
The library no longer supports Python 3.9, so all references to Python 3.9 have been dropped.
The following changes have been made:
- pyproject.toml: set requires-python to ">=3.10"; remove 3.9 classifier
- setup.py: set python_requires to ">=3.10"; remove 3.9 classifier
- docs: update Python version mentions
- deploy/docker/c4ai-doc-context.md: options -> 3.10, 3.11, 3.12, 3.13
2025-08-28 19:31:19 +05:30
ntohidi
bdacf61ca9
feat: update documentation for preserve_https_for_internal_links. ref #1410
2025-08-28 17:48:12 +08:00
ntohidi
f566c5a376
feat: add preserve_https_for_internal_links flag to maintain HTTPS during crawling. Ref #1410
...
Added a new `preserve_https_for_internal_links` configuration flag that preserves the original HTTPS scheme for same-domain links even when the server redirects to HTTP.
2025-08-28 17:38:40 +08:00
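The behavior described for `preserve_https_for_internal_links` can be approximated with the standard library. This is a sketch of the idea only: the helper `preserve_https` is hypothetical, and comparing hostnames is an assumption about how "same-domain" is decided.

```python
from urllib.parse import urlparse, urlunparse


def preserve_https(link: str, original_url: str) -> str:
    """Upgrade a same-host http link to https if the original URL was https."""
    original = urlparse(original_url)
    parsed = urlparse(link)
    if (
        original.scheme == "https"
        and parsed.scheme == "http"
        and parsed.hostname == original.hostname
    ):
        return urlunparse(parsed._replace(scheme="https"))
    return link  # external links and non-http schemes are left alone
```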
AHMET YILMAZ
4ed33fce9e
Remove deprecated test for 'proxy' parameter in BrowserConfig and update .gitignore to include test_scripts directory.
2025-08-28 17:26:10 +08:00
AHMET YILMAZ
f7a3366f72
#1375: refactor(proxy) Deprecate 'proxy' parameter in BrowserConfig and enhance proxy string parsing
...
- Updated ProxyConfig.from_string to support multiple proxy formats, including URLs with credentials.
- Deprecated the 'proxy' parameter in BrowserConfig, replacing it with 'proxy_config' for better flexibility.
- Added warnings for deprecated usage and clarified behavior when both parameters are provided.
- Updated documentation and tests to reflect changes in proxy configuration handling.
2025-08-28 17:21:49 +08:00
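Parsing several proxy string formats, including URLs with credentials, can be sketched with `urllib.parse`. The function `parse_proxy`, the returned keys, and the choice to default a bare `host:port` string to `http` are all assumptions for illustration; the log does not list which formats `ProxyConfig.from_string` accepts.

```python
from urllib.parse import urlparse


def parse_proxy(proxy: str) -> dict:
    """Split a proxy string into server, username, and password."""
    if "://" not in proxy:
        proxy = "http://" + proxy  # assumption: bare host:port means http
    parsed = urlparse(proxy)
    server = f"{parsed.scheme}://{parsed.hostname}"
    if parsed.port:
        server += f":{parsed.port}"
    return {
        "server": server,
        "username": parsed.username,
        "password": parsed.password,
    }
```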
Nasrin
4e1c4bd24e
Merge pull request #1436 from unclecode/fix/docker-filter
...
fix(docker): resolve filter serialization and JSON encoding errors in deep crawl strategy
2025-08-27 11:08:42 +08:00
Soham Kukreti
2ad3fb5fc8
feat(docker): improve docker error handling
...
- Return comprehensive error messages along with status codes for internal API errors
- Fix fit_html property serialization issue in both /crawl and /crawl/stream endpoints
- Add sanitization to ensure fit_html is always JSON-serializable (string or None)
- Add comprehensive error handling test suite
2025-08-26 23:18:35 +05:30
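The fit_html sanitization step above amounts to coercing the value to a string or None before it is serialized. A minimal sketch, assuming that non-string values are converted with `str()` and dropped to None if that fails; `sanitize_fit_html` is a hypothetical name.

```python
import json


def sanitize_fit_html(value):
    """Return a str or None so the field can always be JSON-encoded."""
    if value is None or isinstance(value, str):
        return value
    try:
        return str(value)  # assumption: coerce rather than discard
    except Exception:
        return None


def to_response(result: dict) -> str:
    """Serialize a result after sanitizing its fit_html field."""
    cleaned = dict(result, fit_html=sanitize_fit_html(result.get("fit_html")))
    return json.dumps(cleaned)
```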