Fix: Use correct URL variable for raw HTML extraction (#1116)
- Prevents full HTML content from being passed as URL to extraction strategies - Added unit tests to verify raw HTML and regular URL processing Fix: Wrong URL variable used for extraction of raw html
This commit is contained in:
@@ -615,7 +615,7 @@ class AsyncWebCrawler:
|
||||
else config.chunking_strategy
|
||||
)
|
||||
sections = chunking.chunk(content)
|
||||
extracted_content = config.extraction_strategy.run(url, sections)
|
||||
extracted_content = config.extraction_strategy.run(_url, sections)
|
||||
extracted_content = json.dumps(
|
||||
extracted_content, indent=4, default=str, ensure_ascii=False
|
||||
)
|
||||
|
||||
Reference in New Issue
Block a user