crawl4ai/docs/examples/dispatcher_example.py at b71d624168cc668982c6ed5ebddfd15d0e615cbb

UncleCode 9547bada3a feat(content): add target_elements parameter for selective content extraction

Adds new target_elements parameter to CrawlerRunConfig that allows more flexible content selection than css_selector. This enables focusing markdown generation and data extraction on specific elements while still processing the entire page for links and media.
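The distinction described above (restrict text extraction to chosen elements, but still collect links from the whole page) can be sketched with the standard library alone. This is an illustrative sketch, not crawl4ai's implementation: `TargetedExtractor`, `extract`, and the sample `PAGE` are hypothetical names, and plain tag names stand in for the CSS selectors that `target_elements` presumably accepts.

```python
from html.parser import HTMLParser


class TargetedExtractor(HTMLParser):
    """Collect links from the whole page, but text only from target tags."""

    def __init__(self, target_tags):
        super().__init__()
        # Assumption: tag names stand in for real CSS selectors.
        self.target_tags = set(target_tags)
        self.depth = 0        # > 0 while inside a target element
        self.links = []       # every href found anywhere on the page
        self.text_parts = []  # text found inside target elements only

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        if tag in self.target_tags:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.target_tags and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep text only while inside at least one target element.
        if self.depth > 0 and data.strip():
            self.text_parts.append(data.strip())


def extract(html, target_tags):
    parser = TargetedExtractor(target_tags)
    parser.feed(html)
    parser.close()
    return {"text": " ".join(parser.text_parts), "links": parser.links}


PAGE = """
<nav><a href="/home">Home</a></nav>
<article><h1>Title</h1><p>Body with <a href="/ref">a link</a>.</p></article>
<footer><a href="/about">About</a></footer>
"""

result = extract(PAGE, ["article"])
print(result["links"])
print(result["text"])
```

Here the navigation and footer text are excluded from the extracted text, while their links are still collected alongside the one inside the article.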

Key changes:
- Added target_elements list parameter to CrawlerRunConfig
- Modified WebScrapingStrategy and LXMLWebScrapingStrategy to handle target_elements
- Updated documentation with examples and comparison between css_selector and target_elements
- Fixed table extraction in content_scraping_strategy.py

BREAKING CHANGE: Table extraction logic has been modified to better handle thead/tbody structures
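The commit message does not spell out the new table logic, so the following is only one plausible reading of "better handle thead/tbody structures": take column headers from the first row inside `thead` and treat every other row as data. `TableExtractor` and `SAMPLE` are hypothetical names, and none of this is taken from content_scraping_strategy.py.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Split a table into headers (first thead row) and data rows."""

    def __init__(self):
        super().__init__()
        self.headers = []
        self.rows = []
        self._in_thead = False
        self._row = None             # cells of the row being read
        self._cell = None            # text fragments of the cell being read
        self._row_is_header = False

    def handle_starttag(self, tag, attrs):
        if tag == "thead":
            self._in_thead = True
        elif tag == "tr":
            self._row = []
            self._row_is_header = self._in_thead
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "thead":
            self._in_thead = False
        elif tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row_is_header and not self.headers:
                self.headers = self._row
            elif self._row:
                self.rows.append(self._row)
            self._row = None


SAMPLE = """
<table>
  <thead><tr><th>Name</th><th>Size</th></tr></thead>
  <tbody>
    <tr><td>a.py</td><td>4.6 KiB</td></tr>
    <tr><td>b.py</td><td>1 KiB</td></tr>
  </tbody>
</table>
"""

table = TableExtractor()
table.feed(SAMPLE)
table.close()
print(table.headers)
print(table.rows)
```

Code that previously relied on header cells appearing as the first data row would need adjusting under a scheme like this, which is the kind of behavior change a BREAKING CHANGE note typically warns about.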

2025-03-10 18:54:51 +08:00

4.6 KiB
