Unclecode
90ba51b52f
fix(mkdocs): correct typo in Docker Deployment navigation entry
2024-12-04 12:31:41 +00:00
Unclecode
11721eb0ce
Merge branch 'main' of https://github.com/unclecode/crawl4ai
2024-11-05 13:02:59 +00:00
UncleCode
b51263664e
feat(api): add CORS support and static file serving, update root redirect
2024-11-05 21:02:47 +08:00
Unclecode
1222e456fb
Merge branch 'main' of https://github.com/unclecode/crawl4ai
2024-11-05 12:58:30 +00:00
UncleCode
1e7db0d293
docs(README): update release notes for version 0.3.73 with new features and improvements
2024-11-05 20:12:20 +08:00
UncleCode
2a54f3c048
refactor(core): remove main_v0.py file and associated functionality
2024-11-05 20:11:07 +08:00
UncleCode
1c20b815b3
docs(README): update Docker usage instructions and add deployment options
2024-11-05 20:10:24 +08:00
UncleCode
43a2b26f63
Merge branch 'main' of https://github.com/unclecode/crawl4ai
2024-11-05 20:08:20 +08:00
UncleCode
3cf19a1bc2
chore(version): bump version to 0.3.73
2024-11-05 20:05:58 +08:00
UncleCode
67a23c3182
feat(core): Release v0.3.73 with Browser Takeover and Docker Support
Major changes:
- Add browser takeover feature using CDP for authentic browsing
- Implement Docker support with full API server documentation
- Enhance Markdown with tag preservation system
- Improve parallel crawling performance
This release focuses on authenticity and scalability, introducing the ability
to use users' own browsers while providing containerized deployment options.
Breaking changes include modified browser handling and API response structure.
See CHANGELOG.md for detailed migration guide.
2024-11-05 20:04:18 +08:00
UncleCode
c4c6227962
Creating the API server component
2024-11-04 20:33:15 +08:00
UncleCode
e6c914d2fa
Refactor version management and remove deprecated gitignore.dev file
2024-11-04 16:51:59 +08:00
UncleCode
be8f4fc59a
Merge branch '0.3.73' of https://github.com/unclecode/crawl4ai into 0.3.73
2024-11-04 14:12:07 +08:00
unclecode
fbdf870fbf
Update CHANGELOG
2024-11-04 14:10:27 +08:00
UncleCode
7b0cca41b4
Update gitignore
2024-11-04 13:48:26 +08:00
UncleCode
33d0e9ec8c
Update dev gitignore
2024-11-04 13:42:37 +08:00
UncleCode
42f1c67ca8
Merge branch '0.3.73' of https://github.com/unclecode/crawl4ai into 0.3.73
2024-11-04 13:39:39 +08:00
UncleCode
e28c49a8fe
Refactor .gitignore.dev file: Add ignore patterns for various files and directories
2024-11-04 13:39:38 +08:00
unclecode
54d5a3a259
Improved database management and error handling, updated README instructions, refined .gitignore, enhanced async web crawling capabilities, and updated dependencies.
2024-11-04 13:22:13 +08:00
UncleCode
de6b43f334
Merge pull request #215 from mjvankampen/build/flexible-requirements
build: make requirements more flexible
2024-11-03 08:30:06 +01:00
UncleCode
07f508bd0c
Merge pull request #218 from timoa/main
chore(docs): fix documentation links + markdown lint fix
2024-11-03 06:59:30 +01:00
UncleCode
62a86dbe8d
Refactor mission section in README and add mission diagram
2024-10-31 16:38:56 +08:00
UncleCode
492ada0ed4
Add mission diagram to MISSION.md
2024-10-31 15:26:43 +08:00
UncleCode
d8eef02867
Add link to mission statement in README
2024-10-31 15:23:58 +08:00
UncleCode
6c7235d6a7
Add mission.md file
2024-10-31 15:22:00 +08:00
Damien Laureaux
0a09d78fa5
chore(docs): fix documentation links + markdown lint
2024-10-31 05:50:22 +01:00
Unclecode
e8aaa57cb2
Merge branch 'main' of https://github.com/unclecode/crawl4ai
2024-10-30 12:59:34 +00:00
UncleCode
19c3f3efb2
Refactor tutorial markdown files: Update numbering and formatting
2024-10-30 20:58:07 +08:00
Unclecode
a661b3173d
Merge branch 'main' of https://github.com/unclecode/crawl4ai
2024-10-30 12:47:07 +00:00
UncleCode
e97e8df6ba
Update README: Fix typo in project name
2024-10-30 20:45:20 +08:00
UncleCode
cb6f5323ae
Update README
2024-10-30 20:44:57 +08:00
UncleCode
47464cedec
Update README
2024-10-30 20:42:27 +08:00
UncleCode
982d203d91
Merge branch '0.3.73'
2024-10-30 20:40:09 +08:00
UncleCode
9307c19f35
Update documents, upload new version of quickstart.
2024-10-30 20:39:35 +08:00
Mark Jan van Kampen
605a82793b
fix dev requirements and lock playwright due to failing tests
2024-10-30 10:41:37 +01:00
Mark Jan van Kampen
df9ee44d42
build: make requirements more flexible
According to #102, the specified requirements are minimum versions. Currently they are pinned to fixed versions in requirements.txt and setup.py, so projects consuming this package are limited to exactly those versions instead of a more flexible range. This PR addresses this.
2024-10-30 10:03:22 +01:00
UncleCode
e9f7d5e73a
Merge branch '0.3.73'
2024-10-30 00:16:49 +08:00
UncleCode
3529c2e732
Update new tutorial documents and add them to the docs folder.
2024-10-30 00:16:18 +08:00
UncleCode
d9e0b7abab
Fix README badge
2024-10-28 15:14:16 +08:00
UncleCode
b2800fefc6
Add badges to README
2024-10-28 15:10:12 +08:00
UncleCode
d913e20edc
Update Readme
2024-10-28 15:09:37 +08:00
Unclecode
b781b6df96
Merge branch 'main' of https://github.com/unclecode/crawl4ai
2024-10-27 11:42:23 +00:00
UncleCode
c2a71a5abe
Update Docs folder, prepare branch for new version 0.3.73
v.3.72
2024-10-27 19:35:13 +08:00
UncleCode
d61615e0b0
Merge branch '0.3.72'
2024-10-27 19:33:05 +08:00
UncleCode
ac9d83c72f
Update gitignore
2024-10-27 19:29:04 +08:00
UncleCode
ff9149b5c9
Merge branch 'main' of https://github.com/unclecode/crawl4ai
2024-10-27 19:28:05 +08:00
UncleCode
4239654722
Update Documentation
2024-10-27 19:24:46 +08:00
UncleCode
38474bd66a
Update version
2024-10-24 20:24:21 +08:00
UncleCode
bcfe83f702
feat: enhance crawler with overlay removal and improved screenshot capabilities
• Add smart overlay removal system for handling popups and modals
• Improve screenshot functionality with configurable timing controls
• Implement URL normalization and enhanced link processing
• Add custom base directory support for cache storage
• Refine external content filtering and social media domain handling
This commit significantly improves the crawler's ability to handle modern
websites by automatically removing intrusive overlays and providing better
screenshot capabilities. URL handling is now more robust with proper
normalization and duplicate detection. The cache system is more flexible
with customizable base directory support.
Breaking changes: None
Issue numbers: None
2024-10-24 20:22:47 +08:00
UncleCode
32f57c49d6
Merge pull request #194 from IdrisHanafi/feat/customize-crawl-base-directory
Support for custom crawl base directory
2024-10-24 13:09:27 +02:00