Commit Graph

9 Commits

Author SHA1 Message Date
UncleCode
d5ed451299 Enhance crawler capabilities and documentation
- Add llm.txt generator
  - Added SSL certificate extraction in AsyncWebCrawler.
  - Introduced new content filters and chunking strategies for more robust data extraction.
  - Updated documentation.
2024-12-25 21:34:31 +08:00
unclecode
f6e59157bf - Test all methods
- Update index.html
- Update Readme
- Resolve some bugs
2024-05-14 21:27:41 +08:00
unclecode
82706129f5 Update:
- Text Categorization
- Crawler, Extraction, and Chunking strategies
- Clustering for semantic segmentation
2024-05-12 22:37:21 +08:00
unclecode
50d7a7e45d chore: Update forced flag for single page fetch to use default value 2024-05-09 22:21:12 +08:00
unclecode
3ff1d15702 Change the project folder name from crawler to crawl4ai 2024-05-09 22:16:28 +08:00
unclecode
b9d9d2bbd4 chore: Update URL for single page fetch to NBC News 2024-05-09 20:05:59 +08:00
unclecode
181250cb93 chore: Add function to clear the database 2024-05-09 19:42:43 +08:00
unclecode
51095062d4 Update file names 2024-05-09 19:26:16 +08:00
unclecode
b8e743cd8d Initial Commit 2024-05-09 19:10:25 +08:00