chore: Update configuration values for chunk token threshold, overlap rate, and minimum word threshold. Create a new example for LLMExtraction Strategy, update Dockerfile, and README

This commit is contained in:
unclecode
2024-06-19 18:32:20 +08:00
parent 3f0e265baf
commit 539263a8ba
11 changed files with 212 additions and 130 deletions

View File

@@ -22,6 +22,7 @@ Crawl4AI has one clear task: to simplify crawling and extract useful information
- 🟡 on_user_agent_updated: Called when the user changes the user_agent, causing the driver to reinitialize.
- 📄 Added an example in [`quickstart.py`](https://github.com/unclecode/crawl4ai/blob/main/docs/examples/quickstart.py) in the example folder under the docs.
- ✨ Maintaining the semantic context of inline tags (e.g., abbreviation, DEL, INS) for improved LLM-friendliness.
- 🐳 Updated Dockerfile to ensure compatibility across multiple platforms (Hopefully!).
### v0.2.4
- 🐞 Resolve the issue with the long url. (Issue #22)