From f334daa979dcd7161649deab84a3642fa2fa57aa Mon Sep 17 00:00:00 2001
From: UncleCode
Date: Mon, 3 Mar 2025 21:54:58 +0800
Subject: [PATCH] feat(deep-crawling): add max_pages and score_threshold parameters for improved crawling control

---
 CHANGELOG.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index cebaab96..96b1eb0f 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -13,6 +13,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - *(cli)* Add interactive profile management to CLI with rich UI
 - *(profiles)* Add ability to crawl directly from profile management interface
 - *(browser)* Support identity-based browsing with persistent profiles
+- *(deep-crawling)* Add max_pages parameter to limit the number of pages crawled in all deep crawling strategies
+- *(deep-crawling)* Add score_threshold parameter to BFS and DFS strategies to filter URLs by score

 ### Changed

@@ -20,11 +22,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - *(cli)* Enhance CLI with profile selection and status display for crawling
 - *(examples)* Update identity-based browsing example to use BrowserProfiler class
 - *(docs)* Update identity-based crawling documentation
+- *(docs)* Update deep crawling documentation with max_pages and score_threshold parameters
+- *(examples)* Add example demonstrating the use of max_pages and score_threshold parameters

 ### Fixed

 - *(browser)* Fix profile detection and management on different platforms
 - *(cli)* Fix CLI command structure for better user experience
+- *(deep-crawling)* Improve BFS and DFS strategies to handle page count limits more efficiently

 ## Version 0.5.0 (2025-02-21)