15 Commits

Author SHA1 Message Date
unclecode
e7705e661a ADD MKDocs 2024-06-21 17:56:54 +08:00
unclecode
b3a0edaa6d - User agent
- Extract Links
- Extract Metadata
- Update Readme
- Update REST API document
2024-06-08 17:59:42 +08:00
unclecode
8e73a482a2 feat: Add screenshot functionality to crawl_urls
The code changes in this commit add the `screenshot` parameter to the `crawl_urls` function in `main.py`. This allows users to specify whether they want to take a screenshot of the page during the crawling process. The default value is `False`.

This commit message follows the established convention of starting with a type (feat for feature) and providing a concise and descriptive summary of the changes made.
2024-06-07 15:23:32 +08:00
unclecode
0533aeb814 v0.2.3:
- Extract all media tags
- Take screenshot of the page
2024-06-07 15:23:13 +08:00
UncleCode
7381fa95e6 Merge pull request #3 from QIN2DIM/main
fix(main): UnicodeDecodeError
2024-05-23 09:29:28 +08:00
Unclecode
53d1176d53 chore: Update extraction strategy to support GPU, MPS, and CPU, add batch processing for CPU devices 2024-05-19 16:18:58 +00:00
QIN2DIM
5cee084340 fix(main): UnicodeDecodeError
File "T:\_GitHubProjects\Forks\crawl4ai\main.py", line 70, in read_index
    partials[filename[:-5]] = file.read()

UnicodeDecodeError: 'gbk' codec can't decode byte 0xa4 in position 149: illegal multibyte sequence
2024-05-18 23:31:11 +08:00
Unclecode
bf00c26a83 chore: Update Dockerfile to install chromium-chromedriver and spacy library 2024-05-18 09:16:52 +00:00
unclecode
d7b37e849d chore: Update CrawlRequest model to use NoExtractionStrategy as default 2024-05-17 16:50:38 +08:00
unclecode
5b80be956d Update:
- Debug
- Refactor code for new version
2024-05-16 17:31:44 +08:00
unclecode
f6e59157bf - Test all methods
- Update index.html
- Update Readme
- Resolve some bugs
2024-05-14 21:27:41 +08:00
ntohidi
aa126e436b Add CORS middleware for allowing all origins to make requests 2024-05-10 12:27:40 +02:00
unclecode
3ff1d15702 Change the project folder name from crawler to crawl4ai 2024-05-09 22:16:28 +08:00
unclecode
181250cb93 chore: Add function to clear the database 2024-05-09 19:42:43 +08:00
unclecode
b8e743cd8d Initial Commit 2024-05-09 19:10:25 +08:00
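
Note on 5cee084340: the traceback in that commit shows `file.read()` failing under the `gbk` codec, which is what Python's `open()` falls back to when no encoding is given and the system locale is GBK. The usual remedy is to pass `encoding="utf-8"` explicitly. The sketch below is an illustration of that remedy, not the actual patch: the function name `read_partials`, the `*.html` glob, and the assumption that `[:-5]` strips a `.html` suffix are all hypothetical.

```python
from pathlib import Path
from typing import Dict


def read_partials(directory: Path) -> Dict[str, str]:
    """Read every *.html file in `directory`, keyed by file name without '.html'.

    Hypothetical helper: only the `partials[filename[:-5]] = file.read()`
    line is taken from the traceback; everything else is assumed.
    """
    partials: Dict[str, str] = {}
    for path in sorted(directory.glob("*.html")):
        # Pass the encoding explicitly. Without it, open() uses the locale's
        # preferred encoding (gbk in the traceback), which can fail on
        # UTF-8 encoded files.
        with open(path, "r", encoding="utf-8") as file:
            partials[path.name[:-5]] = file.read()
    return partials
```

Calling `read_partials(Path("pages"))` on a directory of UTF-8 HTML files (the directory name is a placeholder) would return the same contents regardless of the machine's locale.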