Commit Graph

11 Commits

Author SHA1 Message Date
unclecode
13a3b21d19 - Add ONNX embedding model for CPU devices, update the similarity threshold, improve the embedding speed. 2024-05-19 22:30:10 +08:00
unclecode
eb6423875f chore: Update Selenium options in crawler_strategy.py and add verbose logging in CosineStrategy 2024-05-18 14:13:06 +08:00
unclecode
b6319c6f6e chore: Add support for GPU, MPS, and CPU 2024-05-17 21:56:13 +08:00
unclecode
957a2458b1 chore: Update web crawler URLs to use NBC News business section 2024-05-17 18:11:13 +08:00
unclecode
32c87f0388 chore: Update NlpSentenceChunking constructor parameters to None
The NlpSentenceChunking constructor parameters have been updated to None to simplify usage of the class. This change removes the need to specify the spaCy model for sentence detection, making the code more concise and easier to understand.
2024-05-17 17:00:43 +08:00
unclecode
1cc67df301 chore: Update pip installation command and requirements, add new dependencies 2024-05-17 16:53:03 +08:00
unclecode
a5f9d07dbf Remove dependency on spaCy model. 2024-05-17 15:08:03 +08:00
UncleCode
6fcaf26b4f Update quickstart.py: Add counting items 2024-05-16 22:49:12 +08:00
unclecode
c8589f8da3 Update:
- Fix spaCy model issue
- Update Readme and requirements.txt
2024-05-16 19:50:20 +08:00
unclecode
5b80be956d Update:
- Debug
- Refactor code for new version
2024-05-16 17:31:44 +08:00
unclecode
f6e59157bf - Test all methods
- Update index.html
- Update Readme
- Resolve some bugs
2024-05-14 21:27:41 +08:00