unclecode
|
4a50781453
|
chore: Remove local and .files folders from .gitignore
|
2024-06-17 15:57:34 +08:00 |
|
unclecode
|
8b8683f22e
|
Add research assistant example using Chainlit
|
2024-06-04 22:43:09 +08:00 |
|
QIN2DIM
|
5cee084340
|
fix(main): UnicodeDecodeError
File "T:\_GitHubProjects\Forks\crawl4ai\main.py", line 70, in read_index
partials[filename[:-5]] = file.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa4 in position 149: illegal multibyte sequence
|
2024-05-18 23:31:11 +08:00 |
|
Unclecode
|
bf00c26a83
|
chore: Update Dockerfile to install chromium-chromedriver and spacy library
|
2024-05-18 09:16:52 +00:00 |
|
unclecode
|
199c66114c
|
chore: Update pip installation command and requirements, add new dependencies
|
2024-05-16 20:58:36 +08:00 |
|
unclecode
|
f6e59157bf
|
- Test all methods
- Update index.html
- Update Readme
- Resolve some bugs
|
2024-05-14 21:27:41 +08:00 |
|
unclecode
|
82706129f5
|
Update:
- Text Categorization
- Crawler, Extraction, and Chunking strategies
- Clustering for semantic segmentation
|
2024-05-12 22:37:21 +08:00 |
|
unclecode
|
7039e3c1ee
|
- Issue resolved: each <pre> tag's HTML content is replaced with its inner text, to handle cases such as syntax highlighters that wrap every character in its own <span>. This prevents the minimum word threshold from discarding such blocks.
|
2024-05-12 14:08:22 +08:00 |
|
unclecode
|
181250cb93
|
chore: Add function to clear the database
|
2024-05-09 19:42:43 +08:00 |
|
unclecode
|
c71adb29ce
|
chore: Update .gitignore and README.md
|
2024-05-09 19:25:25 +08:00 |
|
unclecode
|
b8e743cd8d
|
Initial Commit
|
2024-05-09 19:10:25 +08:00 |
|