11b310edef
Merge pull request #1378 from unclecode/fix/exit_with_q
Nasrin
2025-08-13 14:16:47 +08:00
489981e670
Merge pull request #1390 from unclecode/fix/docker-raw-html
Nasrin
2025-08-13 13:56:33 +08:00
b92be4ef66
Merge pull request #1371 from unclecode/bug/proxy_config
Nasrin
2025-08-12 16:55:52 +08:00
7c0edaf266
Merge pull request #1384 from unclecode/fix/update_docker_examples
Nasrin
2025-08-12 16:53:42 +08:00
dfcfd8ae57
fix(dispatcher): enable true concurrency for fast-completing tasks in arun_many. REF: #560
ntohidi
2025-08-12 16:51:22 +08:00
955110a8b0
Merge branch 'develop' of https://github.com/unclecode/crawl4ai into develop
ntohidi
2025-08-12 12:22:25 +08:00
f30811b524
fix: Check for raw: and raw:// URLs before auto-appending https:// prefix - Add raw HTML URL validation alongside http/https checks - Fix URL preprocessing logic to handle raw: and raw:// prefixes - Update error message and add comprehensive test cases
Soham Kukreti
2025-08-11 22:10:53 +05:30
8146d477e9
Merge branch 'main' into develop
ntohidi
2025-08-11 18:56:15 +08:00
96c4b0de67
fix(browser_manager): serialize new_page on persistent context to avoid races ref #1198 - Add _page_lock and guarded creation; handle empty context.pages safely - Prevents BrowserContext.new_page “Target page/context closed” during concurrent arun_many
ntohidi
2025-08-11 18:55:43 +08:00
57c14db7cb
Merge pull request #1381 from unclecode/fix/base-tag-link-resolution
Nasrin
2025-08-11 18:32:32 +08:00
88a9fbbb7e
fix(deep-crawl): BestFirst priority inversion; remove pre-scoring truncation. ref #1253
fix/deep-crawl-scoring
ntohidi
2025-08-11 18:16:57 +08:00
be63c98db3
feat(docker): add user-provided hooks support to Docker API
feature/docker-hooks
ntohidi
2025-08-11 13:25:17 +08:00
cd2dd68e4c
docs: remove CRAWL4AI_API_TOKEN references and use correct endpoints in Docker example scripts (#1015)
Soham Kukreti
2025-08-09 19:15:11 +05:30
f0ce7b2710
feat: add v0.7.3 release notes, changelog updates, and documentation for new features
UncleCode
2025-08-09 21:04:18 +08:00
21f79fe166
Release v0.7.3: Merge release branch
v0.7.3
UncleCode
2025-08-09 20:11:35 +08:00
a9a2d798b4
feat: update sponsorship tier details and add custom arrangements note
release/v0.7.3
unclecode
2025-08-09 20:10:32 +08:00
612270fcb0
feat: add scheduling link to contact information in SPONSORS.md
unclecode
2025-08-09 20:05:59 +08:00
bc099fdd76
Merge branch 'main' into release/v0.7.3
unclecode
2025-08-09 19:30:46 +08:00
18504d782e
Add Founding Sponsors section and update README with detailed project information
unclecode
2025-08-09 19:11:32 +08:00
ad547607b9
feat: add GitHub Sponsors support with 4 tiers
unclecode
2025-08-09 17:57:47 +08:00
18ad3ef159
fix: Implement base tag support in link extraction (#1147) - Extract base href from <head><base> tag using XPath in _process_element method - Use base URL as the primary URL for link normalization when present - Add error handling with logging for malformed or problematic base tags - Maintain backward compatibility when no base tag is present - Add test to verify the functionality of the base tag extraction.
Soham Kukreti
2025-08-08 20:00:11 +05:30
b61b2ee676
feat(browser-profiler): implement cross-platform keyboard listeners and improve quit handling
AHMET YILMAZ
2025-08-08 11:18:34 +08:00
0541b61405
feat(browser-profiler): implement cross-platform keyboard listeners and improve quit handling
fix/exit_with_q
AHMET YILMAZ
2025-08-08 11:18:34 +08:00
66925eb1d6
fix(deep_crawling): fix priority queue ordering and link truncation in BestFirstCrawlingStrategy - ref #1253
fix/deep-crawl-scoring-priority
ntohidi
2025-08-07 15:28:43 +08:00
89cf5aba2b
#1057: enhance ProxyConfig initialization to support dict and string formats
bug/proxy_config
AHMET YILMAZ
2025-08-06 18:34:23 +08:00
6b0b5301ba
Release v0.7.3:
ntohidi
2025-08-06 17:52:01 +08:00
7a8190ecb6
Fix examples in README.md
Nezar Ali
2025-08-06 11:58:29 +03:00
64f37792a7
Merge pull request #1170 from prokopis3/fix/create-profile
Nasrin
2025-08-06 16:29:14 +08:00
6735c68288
Merge pull request #1170 from prokopis3/fix/create-profile
Nasrin
2025-08-06 16:29:14 +08:00
a5bcac4c9d
feat(docs): enhance table data access example with a real url
ntohidi
2025-08-06 15:19:37 +08:00
45d8327d23
Merge pull request #1366 from unclecode/fix/update-tables-documentation
Nasrin
2025-08-06 15:15:24 +08:00
437395e490
Merge branch 'feat/undetected-browser' into develop-future
ntohidi
2025-08-06 15:03:30 +08:00
fddae303fb
docs: Update README.md and modify Media and Tables Documentation (#1271) - Update Table-to-DataFrame Extraction example in README.md - Replace old method of accessing tables via result.media directly with result.tables in the documentation - Remove tables section from links & media page. - Add tables section to crawler result page.
Soham Kukreti
2025-08-05 23:23:17 +05:30
660d7011b9
In obtaining cleaned_html, the tag "script" needs to be processed separately.
lizhuxiong
2025-08-05 16:27:03 +08:00
6d3444ba17
In obtaining cleaned_html, the tag "script" needs to be processed separately.
lizhuxiong
2025-08-05 16:18:34 +08:00
ff6ea41ac3
feat(docker): add flexible LLM provider configuration
ntohidi
2025-08-05 14:09:54 +08:00
31a435fb0e
Merge branch 'develop' of https://github.com/unclecode/crawl4ai into develop
ntohidi
2025-08-04 19:12:19 +08:00
5de6a28055
Merge pull request #1361 from unclecode/fix/crawler-result-docs
Nasrin
2025-08-04 19:12:09 +08:00
de1561ad14
Merge branch 'develop' of https://github.com/unclecode/crawl4ai into develop
ntohidi
2025-08-04 19:04:50 +08:00
337b588732
Merge pull request #1358 from shonenada/patch-1
Nasrin
2025-08-04 19:04:42 +08:00
7a6ad547f0
Squashed commit of the following:
ntohidi
2025-08-04 19:02:01 +08:00
e6692b987d
docs: Update CrawlResult documentation with missing fields. - Add missing fields: fit_html, js_execution_result, redirected_url, network_requests, console_messages, tables
Soham Kukreti
2025-08-04 15:40:33 +05:30
307fe28b32
fix: Correct URL matcher fallback behavior and improve memory monitoring
ntohidi
2025-08-03 16:50:54 +08:00
438a103b17
Fix typos in examples.md
Yaoda Liu
2025-08-03 14:33:10 +08:00
a03e68fa2f
feat: Add URL-specific crawler configurations for multi-URL crawling
ntohidi
2025-08-02 19:10:36 +08:00
864d87afb2
Merge pull request #1339 from charlaie/fix-sitemap-redirect
Nasrin
2025-07-31 15:21:03 +08:00
508b6fc233
fix: Enable following redirects in sitemap fetching for seeder
Charlie C
2025-07-25 15:57:09 +08:00
8a906fcad0
fix(dependencies): Update and clean up package versions in pyproject.toml, the bundle size will be much smaller.
next
UncleCode
2025-07-29 19:56:27 +08:00
54ae10d957
feat(extraction_strategy): Enhance schema generation with improved validation and task description handling
UncleCode
2025-07-29 19:33:36 +08:00
8e3c411a3e
Merge branch 'main' into main
Emmanuel Ferdman
2025-07-29 14:05:35 +03:00
e3281935bc
fix: Add write permissions for GitHub release creation
UncleCode
2025-07-25 18:22:45 +08:00
48647300b4
chore: Bump version to 0.7.2
v0.7.2
release/v0.7.2
UncleCode
2025-07-25 17:42:48 +08:00
9f9ea3bb3b
chore: Clean up test artifacts and disable test workflow
release/v0.7.1
UncleCode
2025-07-25 17:31:52 +08:00
d58b93c207
fix: Re-enable multi-platform Docker builds for ARM64 support
UncleCode
2025-07-25 16:38:11 +08:00
e2b4705010
fix: Use hardcoded Docker repository name to avoid masking issues
UncleCode
2025-07-25 15:52:26 +08:00
4a1abd5086
fix: Handle existing version on Test PyPI gracefully
UncleCode
2025-07-25 15:41:16 +08:00
04258cd4f2
fix: Speed up Docker test builds by using single platform and caching
UncleCode
2025-07-25 15:37:44 +08:00
84e462d9f8
Merge remote-tracking branch 'origin/develop'
UncleCode
2025-07-25 15:35:53 +08:00
9546773a07
fix: Move sentence-transformers to optional dependencies
UncleCode
2025-07-24 21:24:40 +08:00
66a979ad11
fix: Install dependencies before version check in workflows
UncleCode
2025-07-24 21:01:36 +08:00
0c31e91b53
feat: Add CI/CD workflows for automated PyPI and Docker releases
UncleCode
2025-07-24 20:58:43 +08:00
843457a9cb
Refactor adaptive crawling state management
UncleCode
2025-07-24 20:11:43 +08:00
1b6a31f88f
fix: encode PDF results to base64 in /crawl endpoint. ref #1301
ntohidi
2025-07-23 13:52:18 +02:00
b8c261780f
Merge pull request #1319 from volumetric/fix_for_bug_#1310
Nasrin
2025-07-23 12:45:12 +02:00
db6ad7a79d
fix: update links in README and C4A-Script documentation for accuracy
ntohidi
2025-07-23 09:47:18 +02:00
004d514f33
Merge pull request #1265 from unclecode/feature/nasrin-cli-deep-crawl
Nasrin
2025-07-23 09:40:33 +02:00
d1de82a332
feat(crawl4ai): Implement SMART cache mode
UncleCode
2025-07-21 21:19:37 +08:00
8a04351406
feat(crawl4ai): Update to version 0.7.1 with improvements and new tests
UncleCode
2025-07-18 16:27:19 +08:00
3a9e2c716e
Removed the incorrect reference in browser_config variable
Vinit Agrawal
2025-07-18 10:01:00 +05:30
0163bd797c
Merge branch 'release/v0.7.1'
v0.7.1
unclecode
2025-07-17 17:42:04 +08:00
26bad799e4
chore: update version to 0.7.1
ntohidi
2025-07-17 11:37:41 +02:00
cf8badfe27
feat: cleanup unused code and enhance documentation for v0.7.1
ntohidi
2025-07-17 11:35:16 +02:00
805c498adf
docs: add simple anti-bot examples
feat/undetected-browser
unclecode
2025-07-17 17:05:35 +08:00
6a728cbe5b
feat: add stealth mode and enhance undetected browser support
unclecode
2025-07-17 16:59:10 +08:00
ccbe3c105c
refactor: improve link scoring output format in release notes
ntohidi
2025-07-17 09:13:20 +02:00
761c19d54b
Merge pull request #1307 from unclecode/fix/json-infinity-serialization
Nasrin
2025-07-16 13:34:25 +02:00
14b0ecb137
Merge pull request #1305 from unclecode/fix/release-notes-demo-code
Nasrin
2025-07-16 13:33:53 +02:00
65902a4773
feat: Enhance stealth compatibility with new and legacy APIs, add configuration support
fix/playwright-stealth
AHMET YILMAZ
2025-07-16 17:41:47 +08:00
0eaa9f9895
fix: handle infinity values in JSON serialization for API responses
fix/json-infinity-serialization
ntohidi
2025-07-15 13:49:07 +02:00
1d1970ae69
docs: Update release notes and docs for v0.7.0 with the correct parameters and explanations
fix/release-notes-demo-code
ntohidi
2025-07-15 11:32:04 +02:00
205df1e330
docs: Fix virtual scroll configuration
ntohidi
2025-07-15 10:29:47 +02:00
2640dc73a5
docs: Enhance session management example for dynamic content crawling with improved JavaScript handling and extraction schema. ref #226
ntohidi
2025-07-15 10:19:29 +02:00
58024755c5
docs: Update adaptive crawling parameters and examples in README and release notes
ntohidi
2025-07-15 10:15:05 +02:00
5c13baf574
feat: Add stealth option to BrowserConfig for enhanced browser behavior
AHMET YILMAZ
2025-07-15 15:48:23 +08:00
d2759824ef
fix: Update playwright-stealth to v2.0.0+ compatibility
AHMET YILMAZ
2025-07-15 15:09:53 +08:00
5c33cbcca2
feat: add undetected browser support with adapter pattern
unclecode
2025-07-14 17:29:50 +08:00
83b323f13a
fix VersionManager not using CRAWL4_AI_BASE_DIRECTORY
Vladimir Mandic
2025-07-12 17:40:34 -04:00
dd5ee752cf
docs: Add missing documentation pages to mkdocs.yml
UncleCode
2025-07-12 19:58:26 +08:00
bde1bba6a2
docs: Add missing documentation pages to mkdocs.yml
UncleCode
2025-07-12 19:56:33 +08:00
7b80eb6b99
docs: Add missing documentation pages to mkdocs.yml
UncleCode
2025-07-12 19:55:35 +08:00
14f690d751
docs: Update documentation for v0.7.0 release
UncleCode
2025-07-12 19:08:17 +08:00
7b9ba3015f
Merge branch 'release/v0.7.0' - The Adaptive Intelligence Update
v0.7.0
UncleCode
2025-07-12 18:54:20 +08:00
0c8bb742b7
Release v0.7.0-r1: The Adaptive Intelligence Update
release/v0.7.0
UncleCode
2025-07-12 18:51:13 +08:00
ba2ed53ff1
test(releases): Add test cases for release 0.7.0
UncleCode
2025-07-11 22:27:18 +08:00
a93efcb650
Merge PR #1285: 2025 APR, MAY, and JUN bug fixes
UncleCode
2025-07-11 21:22:34 +08:00
8794852a26
Merge PR #1285: 2025 APR, MAY, and JUN bug fixes
UncleCode
2025-07-11 21:22:03 +08:00
fb25a4a769
docs(examples): update crawl4ai showcase script
UncleCode
2025-07-11 20:55:37 +08:00
afe852935e
fix: show /llm API response in playground. ref #1288
next-MAY
ntohidi
2025-07-09 16:59:17 +02:00
0ebce590f8
Merge branch '2025-JUN-1' into next-MAY
ntohidi
2025-07-09 09:41:03 +02:00
026e96a2df
feat: Add social media and community links to README and index documentation
ntohidi
2025-07-08 15:48:40 +02:00