Commit Graph

  • 2bbcb1dc7d "Claude PR Assistant workflow" UncleCode 2025-10-04 12:45:18 +08:00
  • 7dfe528d43 fix(docs): standardize C4A-Script tutorial, add CLI identity-based crawling, and add sponsorship CTA Soham Kukreti 2025-10-03 22:00:46 +05:30
  • 5145d42df7 fix(docs): hide copy menu on non-markdown pages unclecode 2025-10-03 20:09:48 +08:00
  • 5dc34dd210 feat: enhance crawling functionality with anti-bot strategies and headless mode options (Browser adapters, 12. Undetected/stealth browser) AHMET YILMAZ 2025-10-03 18:02:10 +08:00
  • 9900f63f97 Merge pull request #1531 from unclecode/develop Nasrin 2025-10-03 13:24:51 +08:00
  • 9292b265fc Merge branch 'develop' of https://github.com/unclecode/crawl4ai into develop ntohidi 2025-10-03 12:57:23 +08:00
  • 80aa6c11d9 Merge pull request #1530 from Sjoeborg/fix/arun-many-returns-none Nasrin 2025-10-03 12:57:06 +08:00
  • 749d200866 fix(marketplace): Update URLs to use /marketplace path and relative API endpoints unclecode 2025-10-02 17:08:50 +08:00
  • 408ad1b750 feat(marketplace): Add Crawl4AI marketplace with secure configuration unclecode 2025-10-02 16:41:11 +08:00
  • 35dd206925 fix: always return a list, even if we catch an exception Martin Sjöborg 2025-10-02 09:20:59 +02:00
  • 8d30662647 fix: remove this import as it causes python to treat "json" as a variable in the except block Martin Sjöborg 2025-10-02 09:17:32 +02:00
  • a599db8f7b feat(docker): add routers directory to Dockerfile AHMET YILMAZ 2025-10-01 16:21:24 +08:00
  • 1a8e0236af feat(adaptive-crawling): implement adaptive crawling endpoints and integrate with server AHMET YILMAZ 2025-10-01 15:53:56 +08:00
  • ef46df10da Update gitignore add local scripts folder unclecode 2025-09-30 18:31:57 +08:00
  • 0d8d043109 feat(docs): add brand book and page copy functionality unclecode 2025-09-30 18:28:05 +08:00
  • a62cfeebd9 feat(adaptive-crawling): implement adaptive crawling endpoints and job management AHMET YILMAZ 2025-09-30 18:17:40 +08:00
  • bb3b29042f chore: remove yoyo snapshot subproject and implemented adaptive crawling AHMET YILMAZ 2025-09-30 18:17:26 +08:00
  • 1ea021b721 feat(api): add seed URL endpoint and related request model AHMET YILMAZ 2025-09-30 13:35:08 +08:00
  • 70af81d9d7 refactor(release): remove memory management section for cleaner documentation. ref #1443 ntohidi 2025-09-30 11:54:21 +08:00
  • 2dc6588573 fix: remove_overlay_elements functionality by calling injected JS function. ref: #1396 Soham Kukreti 2025-09-29 20:40:08 +05:30
  • 34c0996ee4 fix: Add CDP endpoint verification with exponential backoff for managed browsers (#1445) Soham Kukreti 2025-09-29 19:14:50 +05:30
  • 361499d291 Release v0.7.5: The Update ntohidi 2025-09-29 18:05:26 +08:00
  • 3fe49a766c fix(docker-deployment): replace console.log with print for metadata extraction ntohidi 2025-09-25 14:12:59 +08:00
  • fef715a891 Merge branch 'feature/docker-hooks' into develop ntohidi 2025-09-25 14:11:46 +08:00
  • d48d382d18 feat(tests): Implement comprehensive testing framework for telemetry system feature/telemetry AHMET YILMAZ 2025-09-22 19:06:20 +08:00
  • 69e8ca3d0d Merge pull request #1508 from unclecode/docker/base_config_overrides Nasrin 2025-09-22 18:02:14 +08:00
  • a1950afd98 #1505 fix(api): update config handling to only set base config if not provided by user docker/base_config_overrides AHMET YILMAZ 2025-09-22 17:19:27 +08:00
  • d0eb5a6ffe Merge pull request #1501 from unclecode/fix/n-playwright-stealth Nasrin 2025-09-19 14:17:35 +08:00
  • 89679cee67 #1489 refactor(normalize_url): enhance URL normalization logic and add comprehensive test suite fix/case_senstive_params AHMET YILMAZ 2025-09-18 18:31:07 +08:00
  • 77559f3373 feat(StealthAdapter): fix stealth features for Playwright integration. ref #1481 fix/n-playwright-stealth ntohidi 2025-09-18 15:39:06 +08:00
  • 84ba78c852 #1489 refactor(normalize_url): improve query parameter handling and sorting AHMET YILMAZ 2025-09-17 18:56:45 +08:00
  • e3467c08f6 #1490 feat(ManagedBrowser): add viewport size configuration for browser launch fix/viewport_in_managed_browser AHMET YILMAZ 2025-09-17 17:40:38 +08:00
  • 3899ac3d3b Merge pull request #1464 from unclecode/fix/proxy_deprecation Nasrin 2025-09-16 15:48:45 +08:00
  • 23431d8109 Merge pull request #1389 from unclecode/fix/deep-crawl-scoring Nasrin 2025-09-16 15:45:54 +08:00
  • 1717827732 refactor(BrowserConfig): change deprecation warning for 'proxy' parameter to UserWarning fix/proxy_deprecation AHMET YILMAZ 2025-09-12 11:10:38 +08:00
  • f8eaf01ed1 Merge pull request #1467 from unclecode/fix/request-crawl-stream Nasrin 2025-09-11 17:40:43 +08:00
  • 14b42b1f9a Merge pull request #1471 from unclecode/fix/adaptive-crawler-llm-config Nasrin 2025-09-09 12:56:33 +08:00
  • 3bc56dd028 fix: allow custom LLM providers for adaptive crawler embedding config. ref: #1291 fix/adaptive-crawler-llm-config ntohidi 2025-09-09 12:49:55 +08:00
  • 813b1f5534 #1268 fix: update redirected_url to current page URL and enhance normalize_url function fix/relative_url AHMET YILMAZ 2025-09-08 19:09:33 +08:00
  • 1874a7b8d2 fix: update option labels in request builder for clarity fix/request-crawl-stream AHMET YILMAZ 2025-09-05 17:06:25 +08:00
  • 0482c1eafc Merge pull request #1469 from unclecode/fix/docker-jwt Nasrin 2025-09-04 15:00:15 +08:00
  • 6a3b3e9d38 Commit without API AHMET YILMAZ 2025-09-03 17:02:40 +08:00
  • 1eacea1d2d Merge pull request #1432 from unclecode/example/web2api-example Nasrin 2025-09-03 16:30:39 +08:00
  • bc6d8147d2 Merge pull request #1451 from unclecode/fix/remove-python3.9-version Nasrin 2025-09-02 16:50:40 +08:00
  • 487839640f fix: raise error on last attempt failure in perform_completion_with_backoff. ref #989 ntohidi 2025-09-02 16:49:01 +08:00
  • 6772134a3a remove: delete unused yoyo snapshot subproject ntohidi 2025-09-02 12:07:08 +08:00
  • ae67d66b81 Merge pull request #1454 from nafeqq-1306/docstring-changes Nasrin 2025-09-02 11:59:59 +08:00
  • af28e84a21 Merge pull request #1441 from unclecode/fix/improve-docker-error-handling Nasrin 2025-09-02 11:56:01 +08:00
  • edd0b576b1 Fix: Use correct URL variable for raw HTML extraction (#1116) rbushria 2025-08-28 10:46:44 +03:00
  • b54c200c74 feat: make device_scale_factor configurable via BrowserConfig TristanDonze 2025-09-01 17:04:34 +02:00
  • 5e7fcb17e1 Merge pull request #1448 from unclecode/fix/https-reditrect Nasrin 2025-09-01 16:11:25 +08:00
  • 6e728096fa fix(auth): fixed Docker JWT authentication. ref #1442 fix/docker-jwt ntohidi 2025-09-01 12:48:16 +08:00
  • 367190fc75 Merge branch 'unclecode:main' into patch-versionmanager Vladimir Mandic 2025-08-30 23:22:26 -04:00
  • 2de200c1ba Merge pull request #1433 from Thermofish/fix/excluded_selector Nasrin 2025-08-29 16:08:24 +08:00
  • 9749e2832d issue #1329 refactor(crawler): move unwanted properties to CrawlerRunConfig class nafeqq-1306 2025-08-29 10:20:47 +08:00
  • 70f473b84d fix: drop Python 3.9 support and require Python >=3.10. Soham Kukreti 2025-08-28 19:27:33 +05:30
      The library no longer supports Python 3.9 and so it was important to drop all references to python 3.9. Following changes have been made:
      - pyproject.toml: set requires-python to ">=3.10"; remove 3.9 classifier
      - setup.py: set python_requires to ">=3.10"; remove 3.9 classifier
      - docs: update Python version mentions
      - deploy/docker/c4ai-doc-context.md: options -> 3.10, 3.11, 3.12, 3.13
  • bdacf61ca9 feat: update documentation for preserve_https_for_internal_links. ref #1410 fix/https-reditrect ntohidi 2025-08-28 17:48:12 +08:00
  • f566c5a376 feat: add preserve_https_for_internal_links flag to maintain HTTPS during crawling. Ref #1410 ntohidi 2025-08-28 17:38:40 +08:00
  • 4ed33fce9e Remove deprecated test for 'proxy' parameter in BrowserConfig and update .gitignore to include test_scripts directory. AHMET YILMAZ 2025-08-28 17:26:10 +08:00
  • f7a3366f72 #1375 : refactor(proxy) Deprecate 'proxy' parameter in BrowserConfig and enhance proxy string parsing AHMET YILMAZ 2025-08-28 17:21:49 +08:00
  • 4e1c4bd24e Merge pull request #1436 from unclecode/fix/docker-filter Nasrin 2025-08-27 11:08:42 +08:00
  • 2ad3fb5fc8 feat(docker): improve docker error handling Soham Kukreti 2025-08-26 23:18:35 +05:30
      - Return comprehensive error messages along with status codes for api internal errors.
      - Fix fit_html property serialization issue in both /crawl and /crawl/stream endpoints
      - Add sanitization to ensure fit_html is always JSON-serializable (string or None)
      - Add comprehensive error handling test suite.
  • cce3390a2d Merge pull request #1426 from unclecode/fix/update-quickstart-and-adaptive-strategies-docs Nasrin 2025-08-26 16:53:47 +08:00
  • 4fe2d01361 Merge pull request #1440 from unclecode/feature/docker-llm-parameters Nasrin 2025-08-26 16:48:17 +08:00
  • 159207b86f feat(docker): Add temperature and base_url parameters for LLM configuration. ref #1035 feature/docker-llm-parameters ntohidi 2025-08-26 16:44:07 +08:00
  • 38f3ea42a7 fix(logger): ensure logger is a Logger instance in crawling strategies. ref #1437 fix/docker-filter ntohidi 2025-08-26 12:06:56 +08:00
  • 102352eac4 fix(docker): resolve filter serialization and JSON encoding errors in deep crawl strategy (ref #1419) ntohidi 2025-08-25 14:04:08 +08:00
  • f2da460bb9 fix(dependencies): add cssselect to project dependencies James T. Wood 2025-08-24 22:12:20 -04:00
  • b1dff5a4d3 feat: Add comprehensive website to API example with frontend Soham Kukreti 2025-08-24 18:20:15 +05:30
  • 40ab287c90 fix(utils): Improve URL normalization by avoiding quote/unquote to preserve '+' signs. ref #1332 ntohidi 2025-08-22 12:05:21 +08:00
  • c09a57644f docs: update adaptive crawler docs and cache defaults; remove deprecated examples (#1330) Soham Kukreti 2025-08-21 19:11:31 +05:30
      - Replace BaseStrategy with CrawlStrategy in custom strategy examples (DomainSpecificStrategy, HybridStrategy)
      - Remove “Custom Link Scoring” and “Caching Strategy” sections no longer aligned with current library
      - Revise memory pruning example to use adaptive.get_relevant_content and index-based retention of top 500 docs
      - Correct Quickstart note: default cache mode is CacheMode.BYPASS; instruct enabling with CacheMode.ENABLED
  • 90af453506 Merge branch 'develop' of https://github.com/unclecode/crawl4ai into develop ntohidi 2025-08-21 14:10:01 +08:00
  • 8bb0e68cce Merge pull request #1422 from unclecode/fix/docker-llmEnvFile Nasrin 2025-08-21 14:05:06 +08:00
  • 95051020f4 fix(docker): Fix LLM API key handling for multi-provider support fix/docker-llmEnvFile ntohidi 2025-08-21 14:01:04 +08:00
  • 69961cf40b Merge branch 'develop' of https://github.com/unclecode/crawl4ai into develop ntohidi 2025-08-20 16:56:19 +08:00
  • 7f360577d9 feat(telemetry): Add opt-in telemetry system for error tracking and stability improvement ntohidi 2025-08-20 16:49:44 +08:00
  • ef174a4c7a Merge pull request #1104 from emmanuel-ferdman/main Nasrin 2025-08-20 10:57:39 +08:00
  • f4206d6ba1 Merge pull request #1369 from NezarAli/main Nasrin 2025-08-18 14:22:54 +08:00
  • 9447054a65 docs: update Docker instructions to use the latest release tag ntohidi 2025-08-18 14:20:05 +08:00
  • dad7c51481 Merge pull request #1398 from unclecode/fix/update-url-seeding-docs Nasrin 2025-08-18 13:00:26 +08:00
  • f4a432829e fix(crawler): Removed the incorrect reference in browser_config variable #1310 ntohidi 2025-08-18 10:59:14 +08:00
  • e651e045c4 Release v0.7.4: Merge release branch v0.7.4 UncleCode 2025-08-17 19:46:48 +08:00
  • 5398acc7d2 docs: add v0.7.4 release blog post and update documentation release/v0.7.4 UncleCode 2025-08-17 19:45:23 +08:00
  • 22c7932ba3 chore(version): update version to 0.7.4 UncleCode 2025-08-17 19:22:23 +08:00
  • 2ab0bf27c2 refactor(utils): move memory utilities to utils and update imports UncleCode 2025-08-17 19:14:55 +08:00
  • d30dc9fdc1 fix(http-crawler): bring back HTTP crawler strategy ntohidi 2025-08-16 09:27:23 +08:00
  • e6044e6053 Merge branch 'develop' of https://github.com/unclecode/crawl4ai into develop ntohidi 2025-08-15 19:44:06 +08:00
  • a50e47adad Merge branch 'feature/table-extraction-strategies' into develop ntohidi 2025-08-15 19:41:37 +08:00
  • ada7441bd1 refactor: Update LLMTableExtraction examples and tests ntohidi 2025-08-15 18:47:31 +08:00
  • 9f7fee91a9 feat: 🚀 Introduce revolutionary LLMTableExtraction with intelligent chunking for massive tables ntohidi 2025-08-14 18:21:24 +08:00
  • 7f48655cf1 feat(browser-profiler): implement cross-platform keyboard listeners and improve quit handling AHMET YILMAZ 2025-08-08 11:18:34 +08:00
  • 1417a67e90 chore(profile-test): fix filename typo ( test_crteate_profile.py → test_create_profile.py ) prokopis3 2025-06-12 14:38:32 +03:00
  • 19398d33ef fix(browser_profiler): improve keyboard input handling prokopis3 2025-06-12 14:33:12 +03:00
  • 263d362daa fix(browser_profiler): cross-platform 'q' to quit prokopis3 2025-05-30 14:43:18 +03:00
  • bac92a47e4 refactor: Update LLMTableExtraction examples and tests ntohidi 2025-08-15 18:47:31 +08:00
  • 8e1362acf5 Fix async generator type mismatch in Docker Client streaming feat/ahmed_dev AHMET YILMAZ 2025-08-15 15:49:11 +08:00
  • 07e9d651fb feat: Comprehensive deep crawl streaming functionality restoration AHMET YILMAZ 2025-08-15 15:31:36 +08:00
  • a51545c883 feat: 🚀 Introduce revolutionary LLMTableExtraction with intelligent chunking for massive tables ntohidi 2025-08-14 18:21:24 +08:00
  • ecbe5ffb84 docs: Update URL seeding examples to use proper async context managers Soham Kukreti 2025-08-13 18:16:46 +05:30
      - Wrap all AsyncUrlSeeder usage with async context managers
      - Update URL seeding adventure example to use "sitemap+cc" source, focus on course posts, and add stream=True parameter to fix runtime error
  • 926e41aab8 Merge pull request #1378 from unclecode/fix/exit_with_q Nasrin 2025-08-13 14:16:47 +08:00
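
Note on commits 8d30662647 and 35dd206925: the messages describe a Python scoping pitfall (an import inside a function makes "json" a local name, so the except block can fail) and a fix that always returns a list. The sketch below illustrates that general pitfall only. It is not the project's code: the function names load_buggy and load_fixed and the empty-input check are hypothetical, chosen for illustration.

```python
import json


def load_buggy(text):
    """Hypothetical example: a function-level import shadows the module name."""
    try:
        if not text:
            raise ValueError("empty input")
        # This import makes "json" a local variable for the WHOLE function,
        # including the except clause below.
        import json
        return json.loads(text)
    except (ValueError, json.JSONDecodeError):
        # If the import above never ran, "json" is an unbound local here,
        # so evaluating this except clause raises UnboundLocalError.
        return []


def load_fixed(text):
    """Same logic without the local import; always returns a list on failure."""
    try:
        if not text:
            raise ValueError("empty input")
        return json.loads(text)
    except (ValueError, json.JSONDecodeError):
        return []
```

Calling load_buggy("") raises UnboundLocalError because the exception occurs before the local import runs, while load_fixed("") falls through to the except block and returns an empty list.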