Fix Base64 image parsing in WebScrappingStrategy (issue 182)

- Add support for extracting Base64 encoded images
- Improve image format detection to include Base64 images
- Enhance compatibility with locally saved HTML files using Base64 image encoding
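
For illustration, a minimal sketch of the kind of extraction the bullets above describe, using only the Python standard library. This is not the code from this commit: `DATA_URI_RE`, `Base64ImageCollector`, and `extract_base64_images` are hypothetical names, and the sketch assumes images are embedded as `data:image/<format>;base64,<payload>` values in the `src` attribute of `<img>` tags.

```python
import base64
import re
from html.parser import HTMLParser

# Assumed shape of an embedded image: data:image/<format>;base64,<payload>
DATA_URI_RE = re.compile(
    r"^data:image/(?P<fmt>[a-zA-Z0-9.+-]+);base64,(?P<data>.+)$",
    re.DOTALL,
)


class Base64ImageCollector(HTMLParser):
    """Hypothetical helper: collects Base64 images found in <img src=...>."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = (dict(attrs).get("src") or "").strip()
        match = DATA_URI_RE.match(src)
        if match is None:
            return  # regular URL or missing src; not a Base64 image
        self.images.append(
            {
                "format": match.group("fmt").lower(),
                "bytes": base64.b64decode(match.group("data")),
            }
        )


def extract_base64_images(html):
    """Return a list of {'format', 'bytes'} dicts for embedded images."""
    collector = Base64ImageCollector()
    collector.feed(html)
    return collector.images
```

Images referenced by an ordinary URL are skipped here, so a caller would still handle them through its existing path; this matters mostly for locally saved HTML files, where browsers often inline images as data URIs.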
UncleCode
2024-10-20 19:25:25 +08:00
parent 1dd36f9035
commit 04d16e6d2b
2 changed files with 8 additions and 1 deletion

@@ -2,6 +2,9 @@
## [v0.3.72] - 2024-10-20
### Fixed
- Added support for parsing Base64 encoded images in WebScrappingStrategy
### Added
- Forked and integrated a customized version of the html2text library for more control over Markdown generation
- New configuration options for controlling external content: