unclecode
144cfa0eda
Switch to ChromeDriverManager due to some issues with downloading the Chrome driver
2024-06-26 13:00:17 +08:00
unclecode
8c77a760fc
Fixed:
...
- Redirect "/" to mkdocs
2024-06-22 20:54:32 +08:00
unclecode
b9bf8ac9d7
Fix mounting of "/" to the mkdocs site folder
2024-06-22 20:41:39 +08:00
unclecode
d6182bedd7
chore:
...
- Add demo page to the new mkdocs
- Set website home page to mkdocs
2024-06-22 20:36:01 +08:00
unclecode
e7705e661a
Add MkDocs
2024-06-21 17:56:54 +08:00
unclecode
b3a0edaa6d
- User agent
...
- Extract Links
- Extract Metadata
- Update Readme
- Update REST API document
2024-06-08 17:59:42 +08:00
unclecode
8e73a482a2
feat: Add screenshot functionality to crawl_urls
...
The code changes in this commit add the `screenshot` parameter to the `crawl_urls` function in `main.py`. This allows users to specify whether they want to take a screenshot of the page during the crawling process. The default value is `False`.
This commit message follows the established convention of starting with a type (feat for feature) and providing a concise and descriptive summary of the changes made.
2024-06-07 15:23:32 +08:00
unclecode
0533aeb814
v0.2.3:
...
- Extract all media tags
- Take screenshot of the page
2024-06-07 15:23:13 +08:00
UncleCode
7381fa95e6
Merge pull request #3 from QIN2DIM/main
...
fix(main): UnicodeDecodeError
2024-05-23 09:29:28 +08:00
Unclecode
53d1176d53
chore: Update extraction strategy to support GPU, MPS, and CPU, add batch processing for CPU devices
2024-05-19 16:18:58 +00:00
QIN2DIM
5cee084340
fix(main): UnicodeDecodeError
...
File "T:\_GitHubProjects\Forks\crawl4ai\main.py", line 70, in read_index
partials[filename[:-5]] = file.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa4 in position 149: illegal multibyte sequence
2024-05-18 23:31:11 +08:00
Unclecode
bf00c26a83
chore: Update Dockerfile to install chromium-chromedriver and spacy library
2024-05-18 09:16:52 +00:00
unclecode
d7b37e849d
chore: Update CrawlRequest model to use NoExtractionStrategy as default
2024-05-17 16:50:38 +08:00
unclecode
5b80be956d
Update:
...
- Debug
- Refactor code for new version
2024-05-16 17:31:44 +08:00
unclecode
f6e59157bf
- Test all methods
...
- Update index.html
- Update Readme
- Resolve some bugs
2024-05-14 21:27:41 +08:00
ntohidi
aa126e436b
Add CORS middleware for allowing all origins to make requests
2024-05-10 12:27:40 +02:00
unclecode
3ff1d15702
Change the project folder name from crawler to crawl4ai
2024-05-09 22:16:28 +08:00
unclecode
181250cb93
chore: Add function to clear the database
2024-05-09 19:42:43 +08:00
unclecode
b8e743cd8d
Initial Commit
2024-05-09 19:10:25 +08:00