Files
crawl4ai/requirements.txt
UncleCode f3ae5a657c feat(scraping): add LXML-based scraping mode for improved performance
Adds a new ScrapingMode enum to allow switching between BeautifulSoup and LXML parsing.
LXML mode offers 10-20x better performance for large HTML documents.

Key changes:
- Added ScrapingMode enum with BEAUTIFULSOUP and LXML options
- Implemented LXMLWebScrapingStrategy class
- Added LXML-based metadata extraction
- Updated documentation with scraping mode usage and performance considerations
- Added cssselect dependency

BREAKING CHANGE: None
2025-01-12 20:46:23 +08:00

23 lines
478 B
Plaintext

# Note: These requirements are also specified in pyproject.toml
# This file is kept for development environment setup and compatibility
aiosqlite~=0.20
lxml~=5.3
litellm>=1.53.1
numpy>=1.26.0,<3
pillow~=10.4
playwright>=1.49.0
python-dotenv~=1.0
requests~=2.26
beautifulsoup4~=4.12
tf-playwright-stealth>=1.1.0
xxhash~=3.4
rank-bm25~=0.2
aiofiles>=24.1.0
colorama~=0.4
snowballstemmer~=2.2
pydantic>=2.10
pyOpenSSL>=24.3.0
psutil>=6.1.1
nltk>=3.9.1
rich>=13.9.4
cssselect>=1.2.0