Enhance Crawl4AI with new features and documentation
- Fix crawler text mode for improved performance; cover missing `srcset` and `data_srcset` attributes in image tags. - Introduced Managed Browsers for enhanced crawling experience. - Updated documentation for clearer navigation on configuration. - Changed 'text_only' to 'text_mode' in configuration and methods. - Improved performance and relevance in content filtering strategies.
This commit is contained in:
@@ -37,11 +37,11 @@ Here’s how to turn it on:
|
||||
|
||||
```python
|
||||
crawler = AsyncPlaywrightCrawlerStrategy(
|
||||
text_only=True # Set this to True to enable text-only crawling
|
||||
text_mode=True # Set this to True to enable text-only crawling
|
||||
)
|
||||
```
|
||||
|
||||
When `text_only=True`, the crawler automatically:
|
||||
When `text_mode=True`, the crawler automatically:
|
||||
- Disables GPU processing.
|
||||
- Blocks image and JavaScript resources.
|
||||
- Reduces the viewport size to 800x600 (you can override this with `viewport_width` and `viewport_height`).
|
||||
|
||||
Reference in New Issue
Block a user