Add new PDF processing module with the following features:
- PDF text extraction and formatting to HTML/Markdown
- Image extraction with multiple format support (JPEG, PNG, TIFF)
- Link extraction from PDF documents
- Metadata extraction including title, author, dates
- Support for both local and remote PDF files
Also includes:
- New configuration options for HTML attribute handling
- Internal/external link filtering improvements
- Version bump to 0.4.300b4