UncleCode
b654c49e55
Update .gitignore to exclude additional scripts and files
2024-11-19 19:32:06 +08:00
UncleCode
fbcff85ecb
Remove test files
2024-11-19 19:03:23 +08:00
UncleCode
788c67c29a
Merge branch 'main' of https://github.com/unclecode/crawl4ai
2024-11-19 19:02:44 +08:00
UncleCode
2f19d38693
Update .gitignore to include .gitboss/ and todo_executor.md
2024-11-19 19:02:41 +08:00
ntohidikplay
3aae30ed2a
test1: trying to push to main
2024-11-19 11:57:07 +01:00
ntohidikplay
593c7ad307
test: trying to push to main
2024-11-19 11:45:26 +01:00
UncleCode
38044d4afe
Merge pull request #255 from maheshpec/feature/configure-cache-directory
...
feat(config): Adding a configurable way of setting the cache directory for constrained environments
2024-11-13 09:43:29 +01:00
Mahesh
00026b5f8b
feat(config): Adding a configurable way of setting the cache directory for constrained environments
2024-11-12 14:52:51 -07:00
UncleCode
8c22396d8b
Merge pull request #234 from devatnull/patch-1
...
Fix typo: scrapper → scraper
2024-11-12 08:37:14 +01:00
UncleCode
a098483cbb
Update Roadmap
2024-11-09 20:40:30 +08:00
UncleCode
f9a297e08d
Add Docker example script for testing Crawl4AI functionality
2024-11-08 19:39:05 +08:00
UncleCode
bcdd80911f
Remove some old files.
2024-11-08 19:08:58 +08:00
UncleCode
b120965b6a
Fix Manage Browser issues: it could not connect to the user directory or create new pages within the Manage Browser context.
2024-11-07 20:15:03 +08:00
UncleCode
16f918621f
Merge branch 'main' of https://github.com/unclecode/crawl4ai
2024-11-07 19:30:22 +08:00
UncleCode
f7574230a1
Update API server request object, text_docker file, and README
2024-11-07 19:29:31 +08:00
devatnull
2879344d9c
Update README.md
2024-11-06 17:36:46 +03:00
UncleCode
9f5eef1f38
Refactor the CustomHTML2Text class in content_scrapping_strategy.py: the handling logic for header tags (h1-h6) is now commented out, which improves readability and reduces maintenance overhead.
2024-11-06 21:50:09 +08:00
UncleCode
c5aa1bec18
Merge pull request #229 from bizrockman/main
...
Prevent "NoneType has no attribute get" errors
2024-11-06 07:31:07 +01:00
UncleCode
b51263664e
feat(api): add CORS support and static file serving, update root redirect
2024-11-05 21:02:47 +08:00
UncleCode
1e7db0d293
docs(README): update release notes for version 0.3.73 with new features and improvements
2024-11-05 20:12:20 +08:00
UncleCode
2a54f3c048
refactor(core): remove main_v0.py file and associated functionality
2024-11-05 20:11:07 +08:00
UncleCode
1c20b815b3
docs(README): update Docker usage instructions and add deployment options
2024-11-05 20:10:24 +08:00
UncleCode
43a2b26f63
Merge branch 'main' of https://github.com/unclecode/crawl4ai
2024-11-05 20:08:20 +08:00
UncleCode
3cf19a1bc2
chore(version): bump version to 0.3.73
2024-11-05 20:05:58 +08:00
UncleCode
67a23c3182
feat(core): Release v0.3.73 with Browser Takeover and Docker Support
...
Major changes:
- Add browser takeover feature using CDP for authentic browsing
- Implement Docker support with full API server documentation
- Enhance Markdown with tag preservation system
- Improve parallel crawling performance
This release focuses on authenticity and scalability, introducing the ability
to use users' own browsers while providing containerized deployment options.
Breaking changes include modified browser handling and API response structure.
See CHANGELOG.md for detailed migration guide.
2024-11-05 20:04:18 +08:00
bizrockman
796dbaf08c
Rename episode_11_3_Extraction_Strategies:_Cosine.md to episode_11_3_Extraction_Strategies_Cosine.md
...
Name that will work in Windows
2024-11-04 20:19:43 +01:00
bizrockman
3a3c88a2d0
Rename episode_11_2_Extraction_Strategies:_LLM.md to episode_11_2_Extraction_Strategies_LLM.md
...
Name that will work in Windows
2024-11-04 20:19:20 +01:00
bizrockman
870296fa7e
Rename episode_11_1_Extraction_Strategies:_JSON_CSS.md to episode_11_1_Extraction_Strategies_JSON_CSS.md
...
Name that will work in Windows
2024-11-04 20:18:58 +01:00
bizrockman
a28046c233
Rename episode_08_Media_Handling:_Images,_Videos,_and_Audio.md to episode_08_Media_Handling_Images_Videos_and_Audio.md
...
Name that will work in Windows
2024-11-04 20:18:26 +01:00
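The four renames above strip the colon (and, in one case, commas) so the tutorial files can be checked out on Windows. A minimal sketch of that kind of cleanup follows; the helper name `windows_safe_name` and the exact character set are assumptions for illustration, not code from the repository.

```python
import re

# Characters Windows rejects in file names, plus the comma. Commas are
# legal on Windows; dropping them is an assumption made here only to
# match the renamed files listed above.
_UNSAFE = re.compile(r'[<>:"/\\|?*,]')


def windows_safe_name(name: str) -> str:
    """Return `name` with the unsafe characters removed (hypothetical helper)."""
    return _UNSAFE.sub("", name)


if __name__ == "__main__":
    print(windows_safe_name("episode_11_3_Extraction_Strategies:_Cosine.md"))
```

Removing the characters, rather than replacing them with an underscore, reproduces the new names because each colon or comma in the old names was already followed by an underscore.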
bizrockman
0bba0e074f
Prevent "NoneType has no attribute get" errors
...
Sometimes the list contains Tag elements that do not have attrs set, which causes this error.
2024-11-04 20:12:24 +01:00
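The failure described in this commit is the usual result of calling `.get` on an `attrs` value that is `None`. A minimal sketch of a defensive lookup is shown here; `safe_attr` and the stand-in `FakeTag` class are hypothetical names used for illustration, and the actual patch in the repository may guard the lookup differently.

```python
def safe_attr(element, key, default=None):
    """Look up an attribute on a tag-like object whose `attrs` may be None.

    Hypothetical helper: falls back to an empty dict when `attrs` is
    missing or None, so `.get` is never called on None.
    """
    attrs = getattr(element, "attrs", None) or {}
    return attrs.get(key, default)


class FakeTag:
    """Stand-in for a parsed tag; only the `attrs` field matters here."""

    def __init__(self, attrs=None):
        self.attrs = attrs


if __name__ == "__main__":
    tags = [FakeTag({"src": "a.png"}), FakeTag(None)]
    print([safe_attr(tag, "src") for tag in tags])
```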
UncleCode
c4c6227962
Creating the API server component
2024-11-04 20:33:15 +08:00
UncleCode
e6c914d2fa
Refactor version management and remove deprecated gitignore.dev file
2024-11-04 16:51:59 +08:00
UncleCode
be8f4fc59a
Merge branch '0.3.73' of https://github.com/unclecode/crawl4ai into 0.3.73
2024-11-04 14:12:07 +08:00
unclecode
fbdf870fbf
Update CHANGELOG
2024-11-04 14:10:27 +08:00
UncleCode
7b0cca41b4
Update gitignore
2024-11-04 13:48:26 +08:00
UncleCode
33d0e9ec8c
Update dev gitignore
2024-11-04 13:42:37 +08:00
UncleCode
42f1c67ca8
Merge branch '0.3.73' of https://github.com/unclecode/crawl4ai into 0.3.73
2024-11-04 13:39:39 +08:00
UncleCode
e28c49a8fe
Refactor .gitignore.dev file: Add ignore patterns for various files and directories
2024-11-04 13:39:38 +08:00
unclecode
54d5a3a259
Improved database management and error handling, updated README instructions, refined .gitignore, enhanced async web crawling capabilities, and updated dependencies.
2024-11-04 13:22:13 +08:00
UncleCode
de6b43f334
Merge pull request #215 from mjvankampen/build/flexible-requirements
...
build: make requirements more flexible
2024-11-03 08:30:06 +01:00
UncleCode
07f508bd0c
Merge pull request #218 from timoa/main
...
chore(docs): fix documentation links + markdown lint fix
2024-11-03 06:59:30 +01:00
UncleCode
62a86dbe8d
Refactor mission section in README and add mission diagram
2024-10-31 16:38:56 +08:00
UncleCode
492ada0ed4
Add mission diagram to MISSION.md
2024-10-31 15:26:43 +08:00
UncleCode
d8eef02867
Add link to mission statement in README
2024-10-31 15:23:58 +08:00
UncleCode
6c7235d6a7
Add mission.md file
2024-10-31 15:22:00 +08:00
Damien Laureaux
0a09d78fa5
chore(docs): fix documentation links + markdown lint
2024-10-31 05:50:22 +01:00
UncleCode
19c3f3efb2
Refactor tutorial markdown files: Update numbering and formatting
2024-10-30 20:58:07 +08:00
UncleCode
e97e8df6ba
Update README: Fix typo in project name
2024-10-30 20:45:20 +08:00
UncleCode
cb6f5323ae
Update README
2024-10-30 20:44:57 +08:00
UncleCode
47464cedec
Update README
2024-10-30 20:42:27 +08:00