crawl4ai/crawl4ai/html2text
Yurii Chukhlib 2016d669a9 fix: Respect <base> tag for relative link resolution in html2text
Fixes #1680

The HTML2Text class was ignoring the <base> tag, causing relative links
to be resolved against the page URL instead of the base URL specified
in the <base href="..."> attribute.

Added <base> tag handling in both HTML2Text and CustomHTML2Text to update
self.baseurl when the tag is encountered, ensuring proper link resolution
according to HTML standards.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-17 12:16:01 +01:00
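
The behaviour described in the commit message can be illustrated with a minimal sketch. This is not the crawl4ai code: `BaseAwareLinkCollector` is a hypothetical class built on the standard library's `html.parser.HTMLParser`, and only the `baseurl` attribute name is taken from the commit message. It assumes that only the first `<base>` element with an `href` is honoured, which is what the HTML standard specifies; whether the actual fix does the same is not stated in the commit.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BaseAwareLinkCollector(HTMLParser):
    """Hypothetical illustration; not the crawl4ai HTML2Text class."""

    def __init__(self, baseurl=""):
        super().__init__()
        self.baseurl = baseurl    # starts out as the page URL
        self.base_seen = False    # assumption: only the first <base href> counts
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if tag == "base":
            if href and not self.base_seen:
                # The <base> href may itself be relative to the page URL.
                self.baseurl = urljoin(self.baseurl, href)
                self.base_seen = True
        elif tag == "a" and href:
            # Resolve links against the current base URL.
            self.links.append(urljoin(self.baseurl, href))


collector = BaseAwareLinkCollector(baseurl="https://example.com/docs/page.html")
collector.feed(
    '<head><base href="https://cdn.example.com/assets/"></head>'
    '<body><a href="guide/intro.html">Intro</a></body>'
)
print(collector.links)  # ['https://cdn.example.com/assets/guide/intro.html']
```

Without the `<base>` branch, the same link would resolve to `https://example.com/docs/guide/intro.html`, which is the bug the commit describes.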