Commit Graph

15 Commits

Author SHA1 Message Date
unclecode
4d283ab386 ## [v0.2.74] - 2024-07-08
A slew of exciting updates to improve the crawler's stability and robustness! 🎉

- 💻 **UTF encoding fix**: Resolved the Windows \"charmap\" error by adding UTF encoding.
- 🛡️ **Error handling**: Implemented MaxRetryError exception handling in LocalSeleniumCrawlerStrategy.
- 🧹 **Input sanitization**: Improved input sanitization and handled encoding issues in LLMExtractionStrategy.
- 🚮 **Database cleanup**: Removed existing database file and initialized a new one.
2024-07-08 16:33:25 +08:00
unclecode
3cd1b3719f Bump version to v0.2.73, update documentation, and resolve installation issues 2024-07-03 15:26:43 +08:00
unclecode
940df4631f Update ChangeLog 2024-06-30 00:18:40 +08:00
unclecode
f8a11779fe Update change log 2024-06-26 16:48:36 +08:00
unclecode
d11a83c232 ## [0.2.71] 2024-06-26
• Refactored `crawler_strategy.py` to handle exceptions and improve error messages
• Improved `get_content_of_website_optimized` function in `utils.py` for better performance
• Updated `utils.py` with latest changes
• Migrated to `ChromeDriverManager` for resolving Chrome driver download issues
2024-06-26 15:34:15 +08:00
unclecode
3255c7a3fa Update CHANGELOG.md with recent commits 2024-06-26 15:20:34 +08:00
unclecode
2c2362b4d3 issue 19 is resolved
- Update Dockerfile to install mkdocs and build documentation
2024-06-22 17:18:00 +08:00
unclecode
350ca1511b chore: Update configuration values, create new example, and update Dockerfile and README 2024-06-19 18:48:20 +08:00
unclecode
3f0e265baf Merge branch 'format-inline-tags' 2024-06-19 00:48:38 +08:00
unclecode
853b9d59d8 feat: Add hooks for enhanced control over Selenium drivers
- Added six hooks: on_driver_created, before_get_url, after_get_url, before_return_html, on_user_agent_updated.
- Included example usage in quickstart.py.
- Updated README and changelog.
2024-06-18 20:00:51 +08:00
unclecode
413595542a Enhancement: Replaced inline HTML tags with textual format for better LLM context handling #24 2024-06-17 15:14:34 +08:00
unclecode
42a5da854d Update version and change log. 2024-06-17 14:47:58 +08:00
unclecode
b3a0edaa6d - User agent
- Extract Links
- Extract Metadata
- Update Readme
- Update REST API document
2024-06-08 17:59:42 +08:00
unclecode
1635a92218 chore: Update Crawl4AI quickstart script in README.md 2024-05-17 18:25:32 +08:00
unclecode
aac4e07389 chore: Update README.md and project structure 2024-05-12 12:39:31 +08:00