Commit Graph

  • 9e89d27fcd chore(version): bump version to 0.5.0.post2 UncleCode 2025-03-05 14:18:29 +08:00
  • b3ec7ce960 Merge branch 'vr0.5.0.post1' into next UncleCode 2025-03-05 14:17:19 +08:00
  • baee4949d3 refactor(llm): rename LlmConfig to LLMConfig for consistency UncleCode 2025-03-05 14:17:04 +08:00
  • 14fe5ef873 Update config.yml UncleCode 2025-03-05 14:16:24 +08:00
  • e12d2e29e5 Update config.yml unclecode-patch-8 UncleCode 2025-03-05 14:15:57 +08:00
  • fc425023f5 Update config.yml UncleCode 2025-03-05 12:51:07 +08:00
  • 9c58e4ce2e fix(docs): correct section numbering in deepcrawl_example.py tutorial v0.5.0.post1 UncleCode 2025-03-04 20:57:33 +08:00
  • df6a6d5f4f refactor(docs): reorganize tutorial sections and update wrap-up example UncleCode 2025-03-04 20:55:09 +08:00
  • e896c08f9c chore(version): bump version to 0.5.0.post1 vr0.5.0.post1 UncleCode 2025-03-04 20:29:27 +08:00
  • 56bc3c6e45 refactor(cli): improve CLI default command handling UncleCode 2025-03-04 20:28:16 +08:00
  • cbef406f9b docs: update README for version 0.5.0 release with new features and CLI commands UncleCode 2025-03-04 19:24:46 +08:00
  • 8a76563018 chore(docs): update site version to v0.5.x in mkdocs configuration UncleCode 2025-03-04 18:30:03 +08:00
  • 415c1c5bee refactor(core): replace float('inf') with math.inf UncleCode 2025-03-04 18:23:55 +08:00
  • f334daa979 feat(deep-crawling): add max_pages and score_threshold parameters for improved crawling control UncleCode 2025-03-03 21:54:58 +08:00
  • 504207faa6 docs: update text in llm-strategies.md to reflect new changes in LlmConfig Aravind Karnam 2025-03-03 19:24:44 +05:30
  • d024749633 refactor(deep-crawl): add max_pages limit and improve crawl control UncleCode 2025-03-03 21:51:11 +08:00
  • f14e4a4b67 Merge pull request #776 from jawshoeadan/patch-1 Aravind 2025-03-03 19:01:30 +05:30
  • 1e819cdb26 fixes: https://github.com/unclecode/crawl4ai/issues/774 Aravind Karnam 2025-03-03 11:53:15 +05:30
  • 5edfea279d Fix LiteLLM branding and link jawshoeadan 2025-03-02 16:58:00 +01:00
  • c612f9a852 feat(profiles): add CLI command for crawling with browser profiles UncleCode 2025-03-02 21:33:33 +08:00
  • 95175cb394 feat(cli): add browser profile management functionality UncleCode 2025-03-02 20:54:45 +08:00
  • cba4a466e5 feat(browser): add BrowserProfiler class for identity-based browsing UncleCode 2025-03-02 20:32:29 +08:00
  • 7c1705712d fix: https://github.com/unclecode/crawl4ai/issues/756 Aravind Karnam 2025-03-01 18:17:11 +05:30
  • a9e24307cc Release prep (#749) Aravind 2025-02-28 17:23:35 +05:30
  • 3a87b4e43b fix(dependencies): update cchardet to faust-cchardet for compatibility UncleCode 2025-02-26 18:25:58 +08:00
  • 4bcd4cbda1 refactor(pdf): improve PDF processor dependency handling UncleCode 2025-02-25 22:27:55 +08:00
  • 71ce01c9e1 feat(browser): add cdp_url parameter to BrowserManager initialization UncleCode 2025-02-24 14:48:02 +08:00
  • c6d48080a4 feat(logger): add abstract logger base class and file logger implementation UncleCode 2025-02-23 21:23:41 +08:00
  • 46d2f12851 chore: remove old Dockerfile and server script UncleCode 2025-02-22 13:45:04 +08:00
  • 367cd71db9 feat(core): release version 0.5.0 with deep crawling and CLI UncleCode 2025-02-21 19:55:02 +08:00
  • 2af958e12c Feat/llm config (#724) Aravind 2025-02-21 13:11:37 +05:30
  • 3cb28875c3 refactor(config): enhance serialization and config handling UncleCode 2025-02-19 17:23:25 +08:00
  • dad592c801 2025 feb alpha 1 (#685) Aravind 2025-02-19 11:43:17 +05:30
  • c171891999 Merge branch 'main' into next UncleCode 2025-02-19 13:26:42 +08:00
  • 3b1025abbb Merge branch 'main' of https://github.com/unclecode/crawl4ai UncleCode 2025-02-19 13:24:18 +08:00
  • f00dcc276f Update README.md (#562) UncleCode 2025-01-26 04:00:28 +01:00
  • 392c923980 feat(docker): add JWT authentication and improve server architecture UncleCode 2025-02-18 22:07:13 +08:00
  • 2864015469 feat(docker): implement supervisor and secure API endpoints UncleCode 2025-02-17 20:31:20 +08:00
  • 27af4cc27b Fix "raw://" URL parsing logic João Martins 2025-02-15 15:34:59 +00:00
  • 8bb799068e feat(crawler): add HTTP crawler strategy for lightweight web scraping UncleCode 2025-02-15 19:26:30 +08:00
  • 063df572b0 docs(examples): add SERP API project example UncleCode 2025-02-14 23:06:16 +08:00
  • 966fb47e64 feat(config): enhance serialization and add deep crawling exports UncleCode 2025-02-13 21:45:19 +08:00
  • 43e09da694 refactor(crawler): remove content filter functionality UncleCode 2025-02-12 21:59:19 +08:00
  • 69705df0b3 fix(install): ensure proper exit after running doctor command UncleCode 2025-02-11 19:48:23 +08:00
  • 91a5fea11f feat(cli): add command line interface with comprehensive features UncleCode 2025-02-10 16:58:52 +08:00
  • 467be9ac76 feat(deep-crawling): add DFS strategy and update exports; refactor CLI entry point UncleCode 2025-02-09 20:23:40 +08:00
  • 19df96ed56 feat(proxy): add proxy rotation strategy UncleCode 2025-02-09 18:49:10 +08:00
  • b957ff2ecd refactor(crawler): improve HTML handling and cleanup codebase UncleCode 2025-02-07 21:56:27 +08:00
  • 91073c1244 refactor(crawling): improve type hints and code cleanup UncleCode 2025-02-07 19:01:59 +08:00
  • 926beee832 base-config structure is changed (#618) Sezer Bozkır 2025-02-07 12:11:51 +03:00
  • a9415aaaf6 refactor(deep-crawling): reorganize deep crawling strategies and add new implementations UncleCode 2025-02-05 22:50:39 +08:00
  • c308a794e8 refactor(deep-crawl): reorganize deep crawling functionality into dedicated module UncleCode 2025-02-04 23:28:17 +08:00
  • bc7559586f feat(crawler): add deep crawling capabilities with BFS strategy UncleCode 2025-02-04 01:24:49 +08:00
  • 04bc643cec feat(api): improve cache handling and add API tests UncleCode 2025-02-02 20:53:31 +08:00
  • 33a21d6a7a refactor(docker): improve server architecture and configuration UncleCode 2025-02-02 20:19:51 +08:00
  • 7b1ef07c41 refactor(docker): remove unused models and utilities for cleaner codebase UncleCode 2025-02-01 20:10:13 +08:00
  • 2f15976b34 feat(docker): enhance Docker deployment setup and configuration UncleCode 2025-02-01 19:33:27 +08:00
  • 20920fa17b refactor(docker): clean up import statements in server.py UncleCode 2025-02-01 14:28:28 +08:00
  • 53ac3ec0b4 feat(docker): add Docker service integration and config serialization UncleCode 2025-01-31 18:00:16 +08:00
  • ce4f04dad2 feat(docker): add Docker deployment configuration and API server UncleCode 2025-01-31 15:22:21 +08:00
  • f7ce2d42c9 feat: Add deep crawl capabilities to arun_many function feature/scraper Aravind Karnam 2025-01-30 17:49:58 +05:30
  • f81712eb91 refactor(core): reorganize project structure and remove legacy code UncleCode 2025-01-30 19:35:06 +08:00
  • f6edb8342e Refactor: remove the old deep_crawl method Aravind Karnam 2025-01-30 16:22:41 +05:30
  • ca3f0126d3 Refactor:Moved deep_crawl_strategy, inside crawler run config Aravind Karnam 2025-01-30 16:18:15 +05:30
  • 31938fb922 feat(crawler): enhance JavaScript execution and PDF processing UncleCode 2025-01-29 21:03:39 +08:00
  • 858c18df39 fix: removed child_urls from CrawlResult Aravind Karnam 2025-01-29 18:08:34 +05:30
  • 2c8f2ec5a6 Refactor: Renamed scrape to traverse and deep_crawl in a few sections where it applies Aravind Karnam 2025-01-29 16:24:11 +05:30
  • 9ef43bc5f0 Refactor: Move adeep_crawl as method of crawler itself. Create attributes in CrawlResult to reconstruct the tree once deep crawling is completed Aravind Karnam 2025-01-29 15:58:21 +05:30
  • 84ffdaab9a Refactor: Move adeep_crawl as method of crawler itself. Create attributes in CrawlResult to reconstruct the tree once deep crawling is completed Aravind Karnam 2025-01-29 13:06:09 +05:30
  • 78223bc847 feat: create ScraperPageResult model to attach score and depth attributes to yielded/returned crawl results Aravind Karnam 2025-01-28 16:47:30 +05:30
  • 60ce8bbf55 Merge: with v-0.4.3b Aravind Karnam 2025-01-28 12:59:53 +05:30
  • 85847ff13f feat: Aravind Karnam 2025-01-28 12:39:45 +05:30
  • f34b4878cf fix: code formatting Aravind Karnam 2025-01-28 10:00:01 +05:30
  • f8fd9d9eff feat(pdf): add PDF processing capabilities UncleCode 2025-01-27 21:24:15 +08:00
  • d9324e3454 fix: Move the creation of crawler outside the main loop Aravind Karnam 2025-01-27 18:31:13 +05:30
  • 0ff95c83bc feat: change input params to scraper, Add asynchronous context manager to AsyncWebScraper, Optimise filter application Aravind Karnam 2025-01-27 18:13:33 +05:30
  • bb6450f458 Remove robots.txt compliance from scraper Aravind Karnam 2025-01-27 11:58:54 +05:30
  • 513d008de5 feat: Merge reviews from unclecode for scorers and filters & Remove the robots.txt compliance from scraper since that will be now handled by crawler Aravind Karnam 2025-01-27 11:54:10 +05:30
  • 0f00821df5 Fix version vr0.4.3b3 UncleCode 2025-01-26 18:08:24 +08:00
  • dde14eba7d Update README.md (#562) UncleCode 2025-01-26 04:00:28 +01:00
  • 149b69c832 Update README.md unclecode-patch-7 UncleCode 2025-01-26 10:59:48 +08:00
  • 54c84079c4 docs(api): improve formatting and readability of API documentation UncleCode 2025-01-25 22:06:11 +08:00
  • d0586f09a9 Merge branch 'vr0.4.3b3' UncleCode 2025-01-25 21:57:29 +08:00
  • 09ac7ed008 feat(demo): uncomment feature demos and add fake-useragent dependency UncleCode 2025-01-25 21:56:08 +08:00
  • 97796f39d2 docs(examples): update proxy rotation demo and disable other demos UncleCode 2025-01-25 21:52:35 +08:00
  • 4d7f91b378 refactor(user-agent): improve user agent generation system UncleCode 2025-01-25 21:16:39 +08:00
  • 69a77222ef feat(browser): add CDP URL configuration support UncleCode 2025-01-24 15:53:47 +08:00
  • 0afc3e9e5e refactor(examples): update API usage in features demo UncleCode 2025-01-23 22:37:29 +08:00
  • 65d33bcc0f style(docs): improve code formatting in features demo UncleCode 2025-01-23 22:36:58 +08:00
  • 6a01008a2b docs(multi-url): improve documentation clarity and update examples UncleCode 2025-01-23 22:33:36 +08:00
  • cf3e1e748d feat(scraper): add optimized URL scoring system UncleCode 2025-01-23 20:46:33 +08:00
  • 6dc01eae3a refactor(core): improve type hints and remove unused file UncleCode 2025-01-23 18:53:22 +08:00
  • 7b7fe84e0d docs(readme): resolve merge conflict and update version info UncleCode 2025-01-22 20:52:42 +08:00
  • 5c36f4308f Merge branch 'main' of https://github.com/unclecode/crawl4ai UncleCode 2025-01-22 20:51:52 +08:00
  • 45809d1c91 Merge branch 'vr0.4.3b2' UncleCode 2025-01-22 20:51:46 +08:00
  • 357414c345 docs(readme): update version references and fix links UncleCode 2025-01-22 20:46:39 +08:00
  • 260b9120c3 docs(examples): update v0.4.3 features demo to v0.4.3b2 vr0.4.3b2 UncleCode 2025-01-22 20:41:43 +08:00
  • 976ea52167 docs(examples): update demo scripts and fix output formats UncleCode 2025-01-22 20:40:03 +08:00
  • e6ef8d91ba refactor(scraper): optimize URL validation and filter performance UncleCode 2025-01-22 19:45:56 +08:00
  • d21ffad3a2 chore(git): update gitignore patterns scrapper UncleCode 2025-01-22 17:22:26 +08:00