feat: Add hooks for enhanced control over Selenium drivers

- Added five hooks: on_driver_created, before_get_url, after_get_url, before_return_html, on_user_agent_updated.
- Included example usage in quickstart.py.
- Updated README and changelog.
unclecode
2024-06-18 20:00:51 +08:00
parent 6d04284c44
commit 853b9d59d8
5 changed files with 26 additions and 4 deletions

@@ -1,5 +1,15 @@
# Changelog
## [0.2.5] - 2024-06-18
### Added
- Added five important hooks to the crawler:
  - `on_driver_created`: Called when the driver is ready for initializations.
  - `before_get_url`: Called right before Selenium fetches the URL.
  - `after_get_url`: Called after Selenium fetches the URL.
  - `before_return_html`: Called when the data is parsed and ready.
  - `on_user_agent_updated`: Called when the user changes the user_agent, causing the driver to reinitialize.
- Added an example in `quickstart.py` in the example folder under the docs.
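
The order in which the five hooks fire can be illustrated with a minimal, self-contained sketch. It does not use the crawler's real classes: `HookedCrawler`, `FakeDriver`, `set_hook`, `crawl`, and `update_user_agent` are hypothetical names, and the hook arguments and the ordering after a user-agent change are assumptions made for illustration. See `quickstart.py` for the actual usage.

```python
from typing import Any, Callable, Dict, List, Optional

# Hook names taken from the changelog entry above.
HOOK_NAMES = (
    "on_driver_created",
    "before_get_url",
    "after_get_url",
    "before_return_html",
    "on_user_agent_updated",
)


class FakeDriver:
    """Stand-in for a Selenium WebDriver so the sketch runs without a browser."""

    def __init__(self, user_agent: str) -> None:
        self.user_agent = user_agent
        self.page_source = ""

    def get(self, url: str) -> None:
        # Pretend to load the page; a real driver would fetch it over the network.
        self.page_source = f"<html><body>{url}</body></html>"


class HookedCrawler:
    """Hypothetical crawler that fires the five hooks (not the library's class)."""

    def __init__(self, user_agent: str = "default-agent") -> None:
        self.hooks: Dict[str, Callable[..., Any]] = {}
        self.user_agent = user_agent
        self.driver: Optional[FakeDriver] = None

    def set_hook(self, name: str, func: Callable[..., Any]) -> None:
        if name not in HOOK_NAMES:
            raise ValueError(f"unknown hook: {name}")
        self.hooks[name] = func

    def _fire(self, name: str, *args: Any) -> None:
        hook = self.hooks.get(name)
        if hook is not None:
            hook(*args)

    def _create_driver(self) -> FakeDriver:
        self.driver = FakeDriver(self.user_agent)
        self._fire("on_driver_created", self.driver)
        return self.driver

    def update_user_agent(self, user_agent: str) -> None:
        # Assumption: changing the user agent rebuilds the driver first,
        # then notifies the on_user_agent_updated hook.
        self.user_agent = user_agent
        driver = self._create_driver()
        self._fire("on_user_agent_updated", driver)

    def crawl(self, url: str) -> str:
        driver = self.driver or self._create_driver()
        self._fire("before_get_url", driver)
        driver.get(url)
        self._fire("after_get_url", driver)
        html = driver.page_source
        self._fire("before_return_html", driver, html)
        return html


# Record every hook call to make the firing order visible.
events: List[str] = []
crawler = HookedCrawler()
for hook_name in HOOK_NAMES:
    crawler.set_hook(hook_name, lambda *args, n=hook_name: events.append(n))

html = crawler.crawl("https://example.com")
crawler.update_user_agent("custom-agent")
print(events)
```

Keeping the hooks in a plain name-to-callable dictionary means unregistered hooks cost nothing and a misspelled hook name fails early in `set_hook` rather than silently never firing.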
## [0.2.4] - 2024-06-17
### Fixed
- Fix issue #22: Use MD5 hash for caching HTML files to handle long URLs