Commit Graph

  • 9d8ead59b8 📝 Add docstrings to codex/find-and-fix-a-bug (#1123) codex/find-and-fix-a-bug coderabbitai[bot] 2025-05-17 10:52:55 +08:00
  • 32fcacafa6 📝 Add docstrings to codex/find-and-fix-a-bug coderabbitai/docstrings/14vTVzYa3bH06l5wYNY9jTghrrj9FxxWL coderabbitai[bot] 2025-05-17 02:37:00 +00:00
  • 45f1652d98 Fix merge_chunks splitter usage and remove incorrect return UncleCode 2025-05-17 10:31:19 +08:00
  • ac9981a1f5 feat(favicon): add favicon image and update mkdocs configuration UncleCode 2025-05-16 21:59:23 +08:00
  • 83ef15fd47 feat(favicon): add favicon.ico for improved branding UncleCode 2025-05-16 21:55:07 +08:00
  • a3cb938675 feat(theme): enable dark color mode in mkdocs configuration UncleCode 2025-05-16 21:44:56 +08:00
  • 9b60988232 feat(feedback): add feedback modal styles and integrate into mkdocs configuration UncleCode 2025-05-16 21:25:10 +08:00
  • 98e951f611 fix(mkdocs): remove duplicate gtag.js entry in extra_javascript UncleCode 2025-05-16 20:52:41 +08:00
  • baca2df8df feat(analytics): add Google Tag Manager script and gtag.js for tracking UncleCode 2025-05-16 20:49:02 +08:00
  • 8a5e23d374 feat(crawler): add separate timeout for wait_for condition UncleCode 2025-05-16 17:00:45 +08:00
  • 22725ca87b fix(crawler): initialize captured_console to prevent unbound local error for local HTML files. REF: #1072 ntohidi 2025-05-15 11:29:36 +02:00
  • e0fbd2b0a0 fix(schema): update f parameter description to use lowercase enum values. REF: #1070 ntohidi 2025-05-15 10:45:23 +02:00
  • 32966bea11 fix(extraction): resolve 'str' object has no attribute 'choices' error in LLMExtractionStrategy. Refs: #979 ntohidi 2025-05-15 10:09:19 +02:00
  • a3b0cab52a #1088 is solved: flag -bc is now for --byPass-cache Ahmed-Tawfik94 2025-05-15 11:25:06 +08:00
  • 137556b3dc fix the EXTRACT to match the styling of the other methods medo94my 2025-05-14 16:01:10 +08:00
  • 260e2dc347 fix(browser): create browser config before launching managed browser instance. REF: https://discord.com/channels/1278297938551902308/1278298697540567132/1371683009459392716 ntohidi 2025-05-13 14:03:20 +02:00
  • 25d97d56e4 fix(dependencies): remove duplicated aiofiles from project dependencies. REF #1045 ntohidi 2025-05-13 13:56:12 +02:00
  • 98a56e6e01 Merge next branch Aravind Karnam 2025-05-13 17:12:11 +05:30
  • 1e1c887a2f fix(docker-api): migrate to modern datetime library API Emmanuel Ferdman 2025-05-13 00:04:58 -07:00
  • 897e017361 Set version to 0.6.3 vr0.6.3 v0.6.3 UncleCode 2025-05-12 21:20:10 +08:00
  • a3e9ef91ad fix(crawler): remove automatic page closure in screenshot methods UncleCode 2025-05-12 21:17:57 +08:00
  • 76dd86d1b3 Merge remote-tracking branch 'origin/linkedin-prep' into next UncleCode 2025-05-08 17:13:59 +08:00
  • 206a9dfabd feat(crawler): add session management and view-source support UncleCode 2025-05-08 17:13:35 +08:00
  • 1af3d1c2e0 Merge branch '2025-APR-1' of https://github.com/unclecode/crawl4ai into 2025-APR-1 ntohidi 2025-05-08 11:11:32 +02:00
  • c1041b9bbe fix: exclude_external_images flag simply discards elements ref:https://github.com/unclecode/crawl4ai/issues/345 Aravind Karnam 2025-05-07 18:43:29 +05:30
  • f6e25e2a6b fix: check_robots_txt to support wildcard rules ref: #699 Aravind Karnam 2025-05-07 17:53:30 +05:30
  • ee93acbd06 fix(async_playwright_crawler): use config directly instead of self.config for verbosity check ntohidi 2025-05-07 12:32:38 +02:00
  • 2b17f234f8 docs: update direct passing of content_filter to CrawlerRunConfig and instead pass it via MarkdownGenerator. Ref: #603 Aravind Karnam 2025-05-07 15:20:36 +05:30
  • eebb8c84f0 fix(requirements): add PyPDF2 dependency for PDF processing ntohidi 2025-05-07 11:18:44 +02:00
  • 12783fabda fix(dependencies): update pillow version constraint to allow newer releases. ref #709 ntohidi 2025-05-07 11:18:13 +02:00
  • 39e3b792a1 Merge branch 'next' into 2025-APR-1 Aravind Karnam 2025-05-07 10:25:25 +05:30
  • aaf05910eb fix: removed unnecessary imports and installs Aravind Karnam 2025-05-06 15:53:55 +05:30
  • a0555d5fa6 merge:from next branch Aravind Karnam 2025-05-06 15:16:47 +05:30
  • 9a0585c8f6 fix bs4 warning on text kwarg - switch to string RoyLeviLangware 2025-05-06 11:44:48 +03:00
  • 38ebcbb304 fix: provide support for local llm by adding it to the arguments Aravind Karnam 2025-05-05 10:34:38 +05:30
  • 9b5ccac76e feat(extraction): add RegexExtractionStrategy for pattern-based extraction UncleCode 2025-05-02 21:15:24 +08:00
  • 87d4b0fff4 format bash scripts properly so copy & paste may work without issues Aravind Karnam 2025-05-02 17:21:09 +05:30
  • bd5a9ac632 updated readme with arguments for litellm Aravind Karnam 2025-05-02 17:04:42 +05:30
  • 6650b2f34a fix: replace openAI with litellm to support multiple llm providers Aravind Karnam 2025-05-02 16:51:15 +05:30
  • 5cc58f9bb3 fix: 1. duplicate verbose flag 2. inconsistency in argument name --profile-name 3. duplicate initialisation of env_defaults Aravind Karnam 2025-05-02 16:40:58 +05:30
  • baf7f6a6f5 fix: typo in readme Aravind Karnam 2025-05-02 16:33:11 +05:30
  • e0cd3e10de fix(crawler): initialize captured_console variable for local file processing ntohidi 2025-05-02 10:35:35 +02:00
  • 94e9959fe0 feat(docker-api): add job-based polling endpoints for crawl and LLM tasks UncleCode 2025-05-01 21:24:52 +08:00
  • 7c2fd5202e fix: incorrect params and commands in linkedin app readme Aravind Karnam 2025-05-01 18:27:03 +05:30
  • ee01b81f3e Merge branch 'merge-pr971' into next UncleCode 2025-05-01 18:58:41 +08:00
  • 0e5d672763 Merge branch 'pr-971' into merge-pr971 merge-pr971 UncleCode 2025-05-01 18:57:28 +08:00
  • cd2b490b40 refactor(logger): Apply the Enumeration for color wakaka6 2025-05-01 16:59:33 +08:00
  • 50f0b83fcd feat(linkedin): add prospect-wizard app with scraping and visualization UncleCode 2025-04-30 19:38:25 +08:00
  • 1d6a2b9979 fix(crawler): surface real redirect status codes and keep redirect chain. the 30x response instead of always returning 200. Refs #660 ntohidi 2025-04-30 12:29:17 +02:00
  • 039be1b1ce feat: add pdf2image dependency to requirements ntohidi 2025-04-30 11:41:35 +02:00
  • 9499164d3c feat(browser): improve browser profile management and cleanup UncleCode 2025-04-29 23:04:32 +08:00
  • 53245e4e0e Fix: README.md urls list Marc Sacristán 2025-04-29 16:26:35 +02:00
  • 2140d9aca4 fix(browser): correct headless mode default behavior UncleCode 2025-04-26 21:09:50 +08:00
  • ccec40ed17 feat(models): add dedicated tables field to CrawlResult UncleCode 2025-04-24 18:36:25 +08:00
  • 094201ab2a Merge next + resolve conflicts Aravind Karnam 2025-04-23 19:44:50 +05:30
  • ad4dfb21e1 Remove "rc1" UncleCode 2025-04-23 21:00:00 +08:00
  • 7784b2468e feat(docs): enhance Ask AI button UX and add v0.6.0 release notes UncleCode 2025-04-23 20:07:03 +08:00
  • 146f9d415f Update README vr0.6.0 UncleCode 2025-04-23 19:50:33 +08:00
  • 37fd80e4b9 feat(docs): add mobile-friendly navigation menu UncleCode 2025-04-23 19:44:25 +08:00
  • 949a93982e feat(docs): update documentation and disable Ask AI feature UncleCode 2025-04-23 19:02:39 +08:00
  • c4f5651199 chore(deps): upgrade to Python 3.12 and prepare for 0.6.0 release UncleCode 2025-04-23 16:35:15 +08:00
  • b0aa8bc9f7 Update README vr0.6.0rc1 UncleCode 2025-04-22 23:21:42 +08:00
  • c98ffe2130 Update CHANGELOG UncleCode 2025-04-22 22:36:41 +08:00
  • 4812f08a73 feat(docker): update Docker deployment for v0.6.0 UncleCode 2025-04-22 22:35:25 +08:00
  • f3ebb38edf Merge PR #899 into next, resolve conflicts in server.py and docs/browser-crawler-config.md v0.5.5 unclecode 2025-04-22 14:56:47 +08:00
  • 0007aea204 Update changelog UncleCode 2025-04-21 23:21:49 +08:00
  • b5c25731e6 feat(browser): add geolocation, locale and timezone support UncleCode 2025-04-21 23:20:59 +08:00
  • 5297e362f3 feat(mcp): Implement MCP protocol and enhance server capabilities UncleCode 2025-04-21 22:22:02 +08:00
  • 14a31456ef fix(docs): update browser-crawler-config example to include LLMContentFilter and DefaultMarkdownGenerator, fix syntax errors ntohidi 2025-04-21 13:59:49 +02:00
  • a58c8000aa refactor(server): migrate to pool-based crawler management UncleCode 2025-04-20 20:14:26 +08:00
  • b27bb367e8 merge next. Resolve conflicts. Fix some import errors and error handling in server.py Aravind Karnam 2025-04-19 20:27:47 +05:30
  • d2648eaa39 fix: solved with deepcopy of elements https://github.com/unclecode/crawl4ai/issues/902 Aravind Karnam 2025-04-19 20:08:36 +05:30
  • c2902fd200 reverse:last change in order of execution for it introduced a new issue in content generated. https://github.com/unclecode/crawl4ai/issues/902 Aravind Karnam 2025-04-19 19:46:20 +05:30
  • 16b2318242 feat(api): implement crawler pool manager for improved resource handling UncleCode 2025-04-18 22:26:24 +08:00
  • 907cba194f Merge branch 'next-stress' into next UncleCode 2025-04-17 22:34:43 +08:00
  • 3bf78ff47a refactor(docker-demo): enhance error handling and output formatting UncleCode 2025-04-17 22:32:58 +08:00
  • 921e0c46b6 feat(tests): implement high volume stress testing framework UncleCode 2025-04-17 22:31:51 +08:00
  • fd899f66aa Merge branch 'next-fix-markdown-source' into next UncleCode 2025-04-17 20:16:15 +08:00
  • 30ec4f571f feat(docs): add comprehensive Docker API demo script UncleCode 2025-04-17 20:16:11 +08:00
  • 7db6b468d9 feat(markdown): add content source selection for markdown generation UncleCode 2025-04-17 20:13:53 +08:00
  • 0886153d6a fix(async_playwright_crawler): improve segment handling and viewport adjustments during screenshot capture (Fixed bug: Capturing Screenshot Twice and Increasing Image Size) ntohidi 2025-04-17 12:48:11 +02:00
  • 0ec3c4a788 fix(crawler): handle navigation aborts during file downloads in AsyncPlaywrightCrawlerStrategy ntohidi 2025-04-17 12:11:12 +02:00
  • eed7f88f29 Merge branch 'next' into 2025-MAR-ALPHA-1 Aravind Karnam 2025-04-17 10:50:02 +05:30
  • 94d486579c docs(tests): clarify server URL comments in deep crawl tests UncleCode 2025-04-15 22:32:27 +08:00
  • 5206c6f2d6 Modify the test file UncleCode 2025-04-15 22:28:01 +08:00
  • 230f22da86 refactor(proxy): move ProxyConfig to async_configs and improve LLM token handling UncleCode 2025-04-15 22:27:18 +08:00
  • 05085b6e3d fix(requirements): add fake-useragent to requirements ntohidi 2025-04-15 13:05:19 +02:00
  • 793668a413 Remove parameter_updates.txt UncleCode 2025-04-14 23:05:24 +08:00
  • 82aa53aa59 Merge branch 'next-alpine-docker' into next UncleCode 2025-04-14 23:01:22 +08:00
  • cd7ff6f9c1 feat(docs): add AI assistant interface and code copy button next-alpine-docker UncleCode 2025-04-14 23:00:47 +08:00
  • c56974cf59 feat(docs): enhance documentation UI with ToC and GitHub stats UncleCode 2025-04-14 20:46:32 +08:00
  • 1f3b1251d0 docs(cli): add Crawl4AI CLI installation instructions to the CLI guide ntohidi 2025-04-14 12:16:31 +02:00
  • 7b9aabc64a fix(crawler): ensure max_pages limit is respected during batch processing in crawling strategies ntohidi 2025-04-14 12:11:22 +02:00
  • dcc265458c fix: Add a nominal wait time for remove overlay elements since it's already controllable through delay_before_return_html Aravind Karnam 2025-04-14 12:39:05 +05:30
  • ecec53a8c1 Docker tested on Windows machine. UncleCode 2025-04-13 20:14:41 +08:00
  • 7d8e81fb2e fix: fix target_elements, in a less invasive and more efficient way simply by changing order of execution :) https://github.com/unclecode/crawl4ai/issues/902 Aravind Karnam 2025-04-12 12:44:00 +05:30
  • 9fc5d315af fix: revert the old target_elms code in LXMLwebscraping strategy Aravind Karnam 2025-04-12 12:07:04 +05:30
  • d84508b4d5 fix: revert the old target_elms code in regular webscraping strategy Aravind Karnam 2025-04-12 12:05:17 +05:30
  • 022f5c9e25 Merged next branch Aravind Karnam 2025-04-12 10:47:02 +05:30
  • 3179d6ad0c fix(core): improve error handling and stability in core components UncleCode 2025-04-11 20:58:39 +08:00