22 Commits

Author SHA1 Message Date
unclecode
539263a8ba chore: Update configuration values for chunk token threshold, overlap rate, and minimum word threshold. Create a new example for LLMExtraction Strategy, update Dockerfile, and README 2024-06-19 18:32:20 +08:00
unclecode
194050705d chore: Add pillow library to requirements.txt 2024-06-10 23:03:32 +08:00
unclecode
51f26d12fe Update for v0.2.2
- Support multiple JS scripts
- Fixed some bugs
- Resolved a few issues related to Colab installation
2024-06-02 15:40:18 +08:00
UncleCode
d9753b6349 Update requirements.txt
Remove tokenizer version from requirements.txt
2024-05-24 14:49:48 +08:00
UncleCode
a554c0b143 Update requirements.txt 2024-05-23 12:52:31 +08:00
unclecode
13a3b21d19 - Add ONNX embedding model for CPU devices, update the similarity threshold, improve the embedding speed. 2024-05-19 22:30:10 +08:00
unclecode
468dad6169 chore: Update Dockerfile to install chromium-chromedriver and spacy library 2024-05-17 23:15:39 +08:00
UncleCode
57e5decb55 Update requirements.txt 2024-05-17 22:02:08 +08:00
UncleCode
0a902f562f Update requirements.txt: add spaCy 2024-05-17 21:41:35 +08:00
unclecode
e7bb76f19b chore: Update torch dependency to version 2.3.0 2024-05-17 15:52:39 +08:00
unclecode
593b928967 Update requirements.txt to include latest versions of dependencies 2024-05-17 15:48:14 +08:00
unclecode
bb3d37face chore: Update requirements.txt to include latest versions of dependencies 2024-05-17 15:32:37 +08:00
unclecode
a5f9d07dbf Remove dependency on Spacy model. 2024-05-17 15:08:03 +08:00
unclecode
199c66114c chore: Update pip installation command and requirements, add new dependencies 2024-05-16 20:58:36 +08:00
unclecode
7e0682e0de chore: Update dependencies and installation process 2024-05-16 20:22:50 +08:00
unclecode
8e28eb9efb Add model loader, update requirements.txt 2024-05-16 20:08:21 +08:00
unclecode
c8589f8da3 Update:
- Fix Spacy model issue
- Update Readme and requirements.txt
2024-05-16 19:50:20 +08:00
unclecode
5b80be956d Update:
- Debug
- Refactor code for new version
2024-05-16 17:31:44 +08:00
unclecode
5fea6c064b Improve libraries import 2024-05-13 02:46:35 +08:00
unclecode
b38bf64490 Exclude spaCy from requirements.txt 2024-05-12 22:59:26 +08:00
unclecode
82706129f5 Update:
- Text Categorization
- Crawler, Extraction, and Chunking strategies
- Clustering for semantic segmentation
2024-05-12 22:37:21 +08:00
unclecode
b8e743cd8d Initial Commit 2024-05-09 19:10:25 +08:00