- Updated version to 0.7.8
- Introduced focused stability release addressing 11 community-reported bugs.
- Key fixes include Docker API improvements, LLM extraction enhancements, URL handling corrections, and dependency updates.
- Added detailed release notes for v0.7.8 in the blog and created a dedicated verification script to ensure all fixes are functioning as intended.
- Updated documentation to reflect recent changes and improvements.
- extend LLMConfig with backoff delay/attempt/factor fields and thread them
through LLMExtractionStrategy, LLMContentFilter, table extraction, and
Docker API handlers
- expose the backoff parameter knobs on perform_completion_with_backoff/aperform_completion_with_backoff
and document them in the md_v2 guides
The remove_empty_elements_fast() method was removing whitespace-only
span elements inside <pre> and <code> tags, causing import statements
like "import torch" to become "importtorch". Now skips elements inside
code blocks where whitespace is significant.
- Prevents full HTML content from being passed as URL to extraction strategies
- Added unit tests to verify raw HTML and regular URL processing
Fix: Wrong URL variable used for extraction of raw html
raiseException(f"Failed to generate schema: {str(e)}")
raiseException(f"Failed to generate schema: {str(e)}")
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.