ayrisdev/crawl4ai
crawl4ai/crawl4ai/scraper at commit a677c2b61d1e451e4b6a80e7c7cca993ed1863c6
Latest commit: 7a5f83b76f by Aravind Karnam, 2024-12-18 10:33:09 +05:30
fix: Added browser config and crawler run config from 0.4.22
| File | Last commit message | Date |
| --- | --- | --- |
| __init__.py | Fixed a few bugs, import errors and changed to asyncio wait_for instead of timeout to support python versions < 3.11 | 2024-11-23 12:39:25 +05:30 |
| async_web_scraper.py | Refactored AsyncWebScraper to include comprehensive error handling and progress tracking capabilities. Introduced a ScrapingProgress data class to monitor processed and failed URLs. Enhanced scraping methods to log errors and track stats throughout the scraping process. | 2024-11-06 21:09:47 +08:00 |
| bfs_scraper_strategy.py | fix: Added browser config and crawler run config from 0.4.22 | 2024-12-18 10:33:09 +05:30 |
| filters.py | feat(scraper): Enhance URL filtering and scoring systems | 2024-11-08 19:02:28 +08:00 |
| models.py | Parallel processing with retry on failure with exponential backoff - Simplified URL validation and normalisation - respecting Robots.txt | 2024-09-19 12:34:12 +05:30 |
| scorers.py | feat(scraper): Enhance URL filtering and scoring systems | 2024-11-08 19:02:28 +08:00 |
| scraper_strategy.py | updated definition of can_process_url to include dept as an argument, as it's needed to skip filters for start_url | 2024-11-26 18:26:57 +05:30 |