feat(filters): add reverse option to URLPatternFilter

Adds a new 'reverse' parameter to URLPatternFilter that inverts the filter's logic. When reverse=True, URLs that would normally match are rejected, and URLs that would normally be rejected are accepted.

Also removes the unused 'scraped_html' key from the WebScrapingStrategy output to reduce memory usage.

BREAKING CHANGE: WebScrapingStrategy no longer returns 'scraped_html' in its output dictionary.
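
To illustrate the inversion described above, here is a minimal, self-contained sketch. It is not the library's implementation: `SimplePatternFilter`, its `apply` method, the glob-style matching via `fnmatchcase`, and the example URLs are all assumptions made for illustration. Only the `reverse` flag and its meaning come from this commit message.

```python
from fnmatch import fnmatchcase
from typing import Iterable


class SimplePatternFilter:
    """Hypothetical stand-in for URLPatternFilter (illustration only)."""

    def __init__(self, patterns: Iterable[str], reverse: bool = False):
        self.patterns = list(patterns)
        self.reverse = reverse

    def apply(self, url: str) -> bool:
        # True means "keep this URL".
        matched = any(fnmatchcase(url, p) for p in self.patterns)
        # With reverse=True the decision is flipped: matches are rejected,
        # non-matches are accepted.
        return not matched if self.reverse else matched


# Made-up URLs, used only to exercise both modes.
urls = [
    "https://example.com/blog/post-1",
    "https://example.com/admin/login",
]

keep_blog = SimplePatternFilter(["*/blog/*"])
drop_blog = SimplePatternFilter(["*/blog/*"], reverse=True)

kept_normal = [u for u in urls if keep_blog.apply(u)]
kept_reversed = [u for u in urls if drop_blog.apply(u)]
print(kept_normal)
print(kept_reversed)
```

With the same pattern list, the two filters partition the input: every URL kept by one is dropped by the other.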
@@ -848,7 +848,7 @@ class WebScrapingStrategy(ContentScrapingStrategy):
 
         return {
             # **markdown_content,
-            "scraped_html": html,
+            # "scraped_html": html,
             "cleaned_html": cleaned_html,
             "success": success,
             "media": media,
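
Because the 'scraped_html' key is no longer returned, downstream code that indexes it directly would raise a KeyError. A hedged sketch of a tolerant accessor follows. The helper name `raw_html_or_none` and the sample dictionaries are hypothetical; only the key names are taken from the hunk above.

```python
from typing import Any, Dict, Optional


def raw_html_or_none(result: Dict[str, Any]) -> Optional[str]:
    """Return the raw HTML if the result still carries it, else None.

    Hypothetical helper: uses dict.get so that results produced after
    this change (without 'scraped_html') do not raise KeyError.
    """
    return result.get("scraped_html")


# Sample dictionaries with made-up values; only the key names mirror the diff.
new_style = {"cleaned_html": "<p>hi</p>", "success": True, "media": {}}
old_style = {**new_style, "scraped_html": "<html><p>hi</p></html>"}

print(raw_html_or_none(old_style))
print(raw_html_or_none(new_style))
```

Callers that genuinely need the raw page would have to keep their own copy of the HTML they passed in, since the output dictionary no longer carries it.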