crawl4ai

Author	SHA1	Message	Date
UncleCode	e4acd18429	docs: update README for version 0.3.743 with new features, enhancements, and contributor acknowledgments	2024-11-28 13:06:30 +08:00
zhounan	73661f7d1f	docs: enhance development installation instructions (#286) Thanks for your contribution. I'm merging your changes and I'll add your name to our contributor list. Thank you so much.	2024-11-27 15:04:20 +08:00
unclecode	d7c5b900b8	feat: add support for arm64 platform in Docker commands and update INSTALL_TYPE variable in docker-compose	2024-11-24 19:35:53 +08:00
UncleCode	8dea3f470f	chore: update README to include new features and improvements for version 0.3.74	2024-11-22 18:50:12 +08:00
UncleCode	e02935dc5b	chore: update README to reflect new features and improvements in version 0.3.74	2024-11-22 18:49:22 +08:00
UncleCode	571dda6549	Update Redme	2024-11-22 18:27:43 +08:00
UncleCode	006bee4a5a	feat: enhance image processing capabilities - Enhanced image processing with srcset support and validation checks for better image selection.	2024-11-22 16:00:17 +08:00
UncleCode	b6af94cbbb	Merge remote-tracking branch 'origin/main' into 0.3.74	2024-11-18 21:15:04 +08:00
UncleCode	152ac35bc2	feat(docs): update README for version 0.3.74 with new features and improvements fix(version): update version number to 0.3.74 refactor(async_webcrawler): enhance logging and add domain-based request delay	2024-11-17 21:09:26 +08:00
UncleCode	df63a40606	feat(docs): update examples and documentation to replace bypass_cache with cache_mode for improved clarity	2024-11-17 19:44:45 +08:00
UncleCode	3a524a3bdd	fix(docs): remove unnecessary blank line in README for improved readability	2024-11-17 16:00:39 +08:00
UncleCode	4b45b28f25	feat(docs): enhance deployment documentation with one-click setup, API security details, and Docker Compose examples	2024-11-16 18:44:47 +08:00
UncleCode	bf91adf3f8	fix: Resolve unexpected BrowserContext closure during crawl in Docker - Removed __del__ method in AsyncPlaywrightCrawlerStrategy to ensure reliable browser lifecycle management by using explicit context managers. - Added process monitoring in ManagedBrowser to detect and log unexpected terminations of the browser subprocess. - Updated Docker configuration to expose port 9222 for remote debugging and allocate extra shared memory to prevent browser crashes. - Improved error handling and resource cleanup for browser instances, particularly in Docker environments. Resolves Issue #256	2024-11-13 15:37:16 +08:00
UncleCode	8c22396d8b	Merge pull request #234 from devatnull/patch-1 Fix typo: scrapper → scraper	2024-11-12 08:37:14 +01:00
UncleCode	a098483cbb	Update Roadmap	2024-11-09 20:40:30 +08:00
UncleCode	f7574230a1	Update API server request object. text_docker file and Readme	2024-11-07 19:29:31 +08:00
devatnull	2879344d9c	Update README.md	2024-11-06 17:36:46 +03:00
UncleCode	1e7db0d293	docs(README): update release notes for version 0.3.73 with new features and improvements	2024-11-05 20:12:20 +08:00
UncleCode	1c20b815b3	docs(README): update Docker usage instructions and add deployment options	2024-11-05 20:10:24 +08:00
UncleCode	43a2b26f63	Merge branch 'main' of https://github.com/unclecode/crawl4ai	2024-11-05 20:08:20 +08:00
UncleCode	67a23c3182	feat(core): Release v0.3.73 with Browser Takeover and Docker Support Major changes: - Add browser takeover feature using CDP for authentic browsing - Implement Docker support with full API server documentation - Enhance Mockdown with tag preservation system - Improve parallel crawling performance This release focuses on authenticity and scalability, introducing the ability to use users' own browsers while providing containerized deployment options. Breaking changes include modified browser handling and API response structure. See CHANGELOG.md for detailed migration guide.	2024-11-05 20:04:18 +08:00
unclecode	54d5a3a259	Improved database management and error handling, updated README instructions, refined .gitignore, enhanced async web crawling capabilities, and updated dependencies.	2024-11-04 13:22:13 +08:00
UncleCode	07f508bd0c	Merge pull request #218 from timoa/main chore(docs): fix documentation links + markdown lint fix	2024-11-03 06:59:30 +01:00
UncleCode	62a86dbe8d	Refactor mission section in README and add mission diagram	2024-10-31 16:38:56 +08:00
UncleCode	d8eef02867	Add link to mission statement in README	2024-10-31 15:23:58 +08:00
UncleCode	6c7235d6a7	Add mission.md file	2024-10-31 15:22:00 +08:00
Damien Laureaux	0a09d78fa5	chore(docs): fix documentation links + markdown lint	2024-10-31 05:50:22 +01:00
UncleCode	e97e8df6ba	Update README: Fix typo in project name	2024-10-30 20:45:20 +08:00
UncleCode	cb6f5323ae	Update README	2024-10-30 20:44:57 +08:00
UncleCode	47464cedec	Update README	2024-10-30 20:42:27 +08:00
UncleCode	9307c19f35	Update documents, upload new version of quickstart.	2024-10-30 20:39:35 +08:00
UncleCode	3529c2e732	Update new tutorial documents and added to the docs folder.	2024-10-30 00:16:18 +08:00
UncleCode	d913e20edc	Update Readme	2024-10-28 15:09:37 +08:00
UncleCode	4239654722	Update Documentation	2024-10-27 19:24:46 +08:00
unclecode	9ffa34b697	Update README	2024-10-14 22:58:27 +08:00
hitesh22rana	768b93140f	docs: fixed css_selector for example	2024-10-05 00:25:41 +09:00
unclecode	bccadec887	Remove dependency on psutil, PyYaml, and extend requests version range	2024-09-29 17:07:06 +08:00
unclecode	7f1c020746	Update README to add link to previous version in branch V0.2.76	2024-09-28 00:31:53 +08:00
unclecode	64190dd0c4	Update README	2024-09-25 17:26:13 +08:00
unclecode	10cdad039d	Update documents and README	2024-09-25 16:52:11 +08:00
unclecode	f1eee09cf4	Update README, add manifest, make selenium optional library	2024-09-25 16:35:14 +08:00
unclecode	4d48bd31ca	Push async version last changes for merge to main branch	2024-09-24 20:52:08 +08:00
unclecode	eb131bebdf	Create series of quickstart files.	2024-09-04 15:33:24 +08:00
unclecode	5c15837677	chore: Update README, generate new notbook for quickstart	2024-09-04 14:46:22 +08:00
datehoer	2ba70b9501	add use proxy and llm baseurl examples	2024-08-27 10:14:54 +08:00
unclecode	e5e6a34e80	## [v0.2.77] - 2024-08-04 Significant improvements in text processing and performance: - 🚀 Dependency reduction: Removed dependency on spaCy model for text chunk labeling in cosine extraction strategy. - 🤖 Transformer upgrade: Implemented text sequence classification using a transformer model for labeling text chunks. - ⚡ Performance enhancement: Improved model loading speed due to removal of spaCy dependency. - 🔧 Future-proofing: Laid groundwork for potential complete removal of spaCy dependency in future versions. These changes address issue #68 and provide a foundation for faster, more efficient text processing in Crawl4AI.	2024-08-04 14:54:18 +08:00
unclecode	897e766728	Update README	2024-08-02 16:04:14 +08:00
unclecode	9200a6731d	## [v0.2.76] - 2024-08-02 Major improvements in functionality, performance, and cross-platform compatibility! 🚀 - 🐳 Docker enhancements: Significantly improved Dockerfile for easy installation on Linux, Mac, and Windows. - 🌐 Official Docker Hub image: Launched our first official image on Docker Hub for streamlined deployment (unclecode/crawl4ai). - 🔧 Selenium upgrade: Removed dependency on ChromeDriver, now using Selenium's built-in capabilities for better compatibility. - 🖼️ Image description: Implemented ability to generate textual descriptions for extracted images from web pages. - ⚡ Performance boost: Various improvements to enhance overall speed and performance.	2024-08-02 16:02:42 +08:00
unclecode	61c166ab19	refactor: Update Crawl4AI version to v0.2.76 This commit updates the Crawl4AI version from v0.2.7765 to v0.2.76. The version number is updated in the README.md file. This change ensures consistency and reflects the correct version of the software.	2024-08-02 15:55:53 +08:00
unclecode	659c8cd953	refactor: Update image description minimum word threshold in get_content_of_website_optimized	2024-08-02 15:55:32 +08:00

133 Commits