crawl4ai/tests
UncleCode dc36997a08 feat(schema): improve HTML preprocessing for schema generation
Add a new preprocess_html_for_schema utility function that cleans HTML before
schema generation. It replaces the previous optimize_html function in the
GoogleSearchCrawler and adds smarter attribute handling and pattern detection.

Other changes:
- Update default provider to gpt-4o
- Add DEFAULT_PROVIDER_API_KEY constant
- Make LLMConfig creation more flexible with create_llm_config helper
- Add new dependencies: zstandard and msgpack

This change improves schema generation reliability while reducing noise in the
processed HTML.
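
To illustrate the kind of noise reduction described above, here is a minimal sketch using only the Python standard library. It is not the code from this commit: the name simplify_html_sketch, the attribute allowlist, and the max_attr_len threshold are all hypothetical, and the real preprocess_html_for_schema may work quite differently.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical choice: tags whose whole content is treated as noise.
SKIP_TAGS = {"script", "style", "noscript"}


class _Cleaner(HTMLParser):
    """Re-emits HTML while dropping noisy tags, attributes, and comments."""

    def __init__(self, keep_attrs, max_attr_len):
        super().__init__(convert_charrefs=True)
        self.keep_attrs = set(keep_attrs)
        self.max_attr_len = max_attr_len
        self.out = []
        self.skip_depth = 0  # > 0 while inside a skipped tag

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        kept = []
        for name, value in attrs:
            # Keep only allowlisted attributes with reasonably short values.
            if name not in self.keep_attrs or value is None:
                continue
            if len(value) > self.max_attr_len:
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Whitespace-only text is dropped; comments are ignored by default.
        if not self.skip_depth and data.strip():
            self.out.append(escape(data.strip(), quote=False))


def simplify_html_sketch(html, keep_attrs=("id", "class", "href", "src"),
                         max_attr_len=200):
    """Hypothetical helper: return a reduced version of `html`."""
    cleaner = _Cleaner(keep_attrs, max_attr_len)
    cleaner.feed(html)
    cleaner.close()
    return "".join(cleaner.out)


sample = ('<div class="card" style="color:red" data-x="1">'
          '<script>var a=1;</script>'
          '<a href="/p/1" onclick="x()">Item</a></div>')
print(simplify_html_sketch(sample))
```

The sketch keeps the structural attributes a selector-based schema is likely to rely on (id, class, href, src) and discards inline styles, event handlers, and script bodies. Which attributes are worth keeping is a design choice that depends on the target pages.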
2025-03-12 22:40:46 +08:00