feat(content-filter): add LLMContentFilter for intelligent markdown generation

Add new LLMContentFilter class that uses LLMs to generate high-quality markdown content:
- Implement intelligent content filtering with customizable instructions
- Add chunk processing for handling large documents
- Support parallel processing of content chunks
- Include caching mechanism for filtered results
- Add usage tracking and statistics
- Update documentation with examples and use cases
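
The chunking, parallel processing, caching, and usage tracking listed above could be sketched roughly as follows. This is a minimal illustration, not the LLMContentFilter implementation: the class `ChunkedFilterSketch`, the helper `split_into_chunks`, and the `process_chunk` callable (standing in for an LLM call) are hypothetical names chosen for this example. It assumes word-count chunking and a cache keyed on a hash of the input.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Split text into chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


class ChunkedFilterSketch:
    """Hypothetical stand-in for illustration; not the LLMContentFilter API."""

    def __init__(self, process_chunk: Callable[[str], str],
                 max_words: int = 500, workers: int = 4):
        self.process_chunk = process_chunk  # assumed to wrap one LLM call per chunk
        self.max_words = max_words
        self.workers = workers
        self.cache: Dict[str, List[str]] = {}
        self.stats = {"calls": 0, "cache_hits": 0, "chunks_processed": 0}

    def filter(self, text: str) -> List[str]:
        self.stats["calls"] += 1
        # Key the cache on the input text so repeated inputs skip processing.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key in self.cache:
            self.stats["cache_hits"] += 1
            return self.cache[key]
        chunks = split_into_chunks(text, self.max_words)
        # Process chunks concurrently; Executor.map returns results in input order.
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            results = list(pool.map(self.process_chunk, chunks))
        self.stats["chunks_processed"] += len(chunks)
        self.cache[key] = results
        return results
```

With `str.upper` as the stand-in processor and `max_words=2`, a five-word input is split into three chunks, and a second call with the same input is served from the cache. A real filter would probably also fold the instruction text into the cache key.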

Also includes minor changes:
- Disable Pydantic warnings in __init__.py
- Add new prompt template for content filtering
UncleCode
2025-01-18 19:31:07 +08:00
parent 2d6b19e1a2
commit 3d09b6a221
5 changed files with 495 additions and 5 deletions


@@ -76,3 +76,10 @@ else:
    WebCrawler = None
    # import warnings
    # print("Warning: Synchronous WebCrawler is not available. Install crawl4ai[sync] for synchronous support. However, please note that the synchronous version will be deprecated soon.")
import warnings
from pydantic import warnings as pydantic_warnings
# Disable all Pydantic warnings
warnings.filterwarnings("ignore", module="pydantic")
# pydantic_warnings.filter_warnings()