ayrisdev/crawl4ai
dbb751c8f09f76ffce4046784c2cd2b0021de7d0
crawl4ai/tests
UncleCode
dbb751c8f0
This commit introduces MarkdownGenerationStrategy, a new abstraction that lets us add future strategies for generating better markdown. Raw markdown is still generated as before. We have a new BM25-based algorithm for fitting markdown, and markdown can now be refined into a citation form: each link is extracted and replaced by a citation reference number, and a references section at the very end lists every link with its description. This format is better suited to large language models. When links do not need to be passed, the markdown shrinks significantly, and the list of references can be attached to a large language model as a separate file. This commit contains the changes in this direction.
2024-11-21 18:21:43 +08:00
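The citation form described in the commit message can be illustrated with a minimal sketch. This is not the project's implementation: the helper name `to_citations`, the regular expression, and the `[n] url: description` reference layout are all assumptions made for illustration. It only handles inline links of the form `[text](url "title")`.

```python
import re

# Matches inline markdown links such as [text](url "optional title").
# Images (![alt](src)) are skipped by the negative lookbehind.
LINK_RE = re.compile(r'(?<!!)\[([^\]]+)\]\(([^)\s]+)(?:\s+"([^"]*)")?\)')


def to_citations(markdown: str) -> tuple[str, str]:
    """Hypothetical helper: return (body_with_citations, references_section)."""
    numbers: dict[str, int] = {}       # url -> citation number, in first-seen order
    descriptions: dict[str, str] = {}  # url -> link title, or link text if no title

    def replace(match: re.Match) -> str:
        text, url, title = match.group(1), match.group(2), match.group(3)
        if url not in numbers:
            numbers[url] = len(numbers) + 1
            descriptions[url] = title or text
        # Keep the visible text and swap the URL for its reference number.
        return f"{text}[{numbers[url]}]"

    body = LINK_RE.sub(replace, markdown)
    references = "\n".join(
        f"[{n}] {url}: {descriptions[url]}" for url, n in numbers.items()
    )
    return body, references


if __name__ == "__main__":
    sample = (
        'See [docs](https://example.com/docs "Project docs") and '
        "[home](https://example.com). Again [docs](https://example.com/docs)."
    )
    body, references = to_citations(sample)
    print(body)
    print(references)
```

Returning the references separately from the body mirrors the two uses named in the commit message: append them as a closing references section, or drop them from the prompt and attach them as a separate file.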
async
2024-11-21 18:21:43 +08:00
__init__.py
- Test all methods
2024-05-14 21:27:41 +08:00
docker_example.py
feat(cache): introduce CacheMode and CacheContext for enhanced caching behavior
2024-11-17 15:30:56 +08:00
test_docker.py
Update the API server request object, the test_docker file, and the README
2024-11-07 19:29:31 +08:00
test_main.py
Creating the API server component
2024-11-04 20:33:15 +08:00
test_web_crawler.py
- Test all methods
2024-05-14 21:27:41 +08:00