unclecode
fb2a6d0d04
chore: Update documentation link in README.md
2024-06-21 18:05:18 +08:00
unclecode
c1413e6916
chore: Update documentation link in README.md
2024-06-21 17:57:47 +08:00
unclecode
e7705e661a
Add MkDocs
2024-06-21 17:56:54 +08:00
unclecode
1fcb573909
chore: Update table of contents in README.md
2024-06-19 18:53:22 +08:00
unclecode
0f6c5f5453
chore: Update configuration values, create new example, and update Dockerfile and README
2024-06-19 18:50:58 +08:00
unclecode
350ca1511b
chore: Update configuration values, create new example, and update Dockerfile and README
2024-06-19 18:48:20 +08:00
unclecode
539263a8ba
chore: Update configuration values for chunk token threshold, overlap rate, and minimum word threshold. Create a new example for LLMExtraction Strategy, update Dockerfile, and README
2024-06-19 18:32:20 +08:00
unclecode
3f0e265baf
Merge branch 'format-inline-tags'
2024-06-19 00:48:38 +08:00
unclecode
480902bd66
Update README
2024-06-18 20:02:21 +08:00
unclecode
853b9d59d8
feat: Add hooks for enhanced control over Selenium drivers
...
- Added five hooks: on_driver_created, before_get_url, after_get_url, before_return_html, on_user_agent_updated.
- Included example usage in quickstart.py.
- Updated README and changelog.
2024-06-18 20:00:51 +08:00
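To illustrate the kind of control the hooks named in this commit can give, here is a minimal, self-contained sketch of a hook registry. It is not Crawl4AI's actual implementation: `HookRegistry`, `set_hook`, `run`, and `fake_crawl` are hypothetical names chosen for illustration, and no Selenium driver is launched. Only the five hook names come from the commit message.

```python
from typing import Any, Callable, Dict

# Hook names as listed in the commit message.
HOOK_NAMES = (
    "on_driver_created",
    "before_get_url",
    "after_get_url",
    "before_return_html",
    "on_user_agent_updated",
)


class HookRegistry:
    """Hypothetical registry; an assumption, not the library's real class."""

    def __init__(self) -> None:
        self._hooks: Dict[str, Callable[[Any], Any]] = {}

    def set_hook(self, name: str, func: Callable[[Any], Any]) -> None:
        # Reject unknown names so that typos fail loudly.
        if name not in HOOK_NAMES:
            raise ValueError(f"unknown hook: {name}")
        self._hooks[name] = func

    def run(self, name: str, value: Any) -> Any:
        # Pass the value through the hook if one is registered,
        # otherwise return it unchanged.
        hook = self._hooks.get(name)
        return hook(value) if hook else value


def fake_crawl(url: str, hooks: HookRegistry) -> str:
    """Stand-in for a crawl step; a plain dict plays the role of the driver."""
    driver = hooks.run("on_driver_created", {"headers": {}})
    driver = hooks.run("before_get_url", driver)
    html = f"<html><body>{url}</body></html>"  # pretend page source
    driver = hooks.run("after_get_url", driver)
    return hooks.run("before_return_html", html)


hooks = HookRegistry()
hooks.set_hook("before_return_html", lambda html: html.upper())
print(fake_crawl("https://example.com", hooks))
```

Each hook receives a value at a fixed point in the crawl and returns a (possibly modified) value, so callers can customize behavior without editing the crawl logic itself. For the real hook signatures, see the example in quickstart.py mentioned in the commit.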
unclecode
52daf3936a
Fix typo in README
2024-06-17 15:15:37 +08:00
unclecode
413595542a
Enhancement: Replaced inline HTML tags with textual format for better LLM context handling #24
2024-06-17 15:14:34 +08:00
unclecode
42a5da854d
Update version and change log.
2024-06-17 14:47:58 +08:00
unclecode
989f8c91c8
Update README
2024-06-08 18:50:35 +08:00
unclecode
edba5fb5e9
Update README
2024-06-08 18:48:21 +08:00
unclecode
faa1defa5c
Update README
2024-06-08 18:47:23 +08:00
unclecode
b3a0edaa6d
- User agent
...
- Extract Links
- Extract Metadata
- Update Readme
- Update REST API document
2024-06-08 17:59:42 +08:00
unclecode
a19379aa58
Add recipe images, update README, and REST API example
2024-06-07 20:43:50 +08:00
unclecode
57a00ec677
Update Readme
2024-06-07 16:25:30 +08:00
unclecode
aeb2114170
Add example of REST API call
2024-06-07 16:24:40 +08:00
unclecode
b32013cb97
Fix README file hyperlink
2024-06-07 15:37:05 +08:00
unclecode
226a62a3c0
feat: Add screenshot functionality to crawl_urls
2024-06-07 15:33:15 +08:00
unclecode
8e73a482a2
feat: Add screenshot functionality to crawl_urls
...
The code changes in this commit add the `screenshot` parameter to the `crawl_urls` function in `main.py`. This allows users to specify whether they want to take a screenshot of the page during the crawling process. The default value is `False`.
This commit message follows the established convention of starting with a type (feat for feature) and providing a concise and descriptive summary of the changes made.
2024-06-07 15:23:32 +08:00
Gökhan Geyik
8f44db6499
Update README.md
2024-06-05 17:16:02 +03:00
unclecode
e5d401c67c
Update generated code sample
2024-06-02 16:06:43 +08:00
unclecode
ae77589a98
Update Readme
2024-06-02 15:42:13 +08:00
unclecode
ad373c0e19
Update Readme
2024-06-02 15:41:24 +08:00
unclecode
51f26d12fe
Update for v0.2.2
...
- Support multiple JS scripts
- Fixed some bugs
- Resolved a few issues related to Colab installation
2024-06-02 15:40:18 +08:00
Unclecode
bf00c26a83
chore: Update Dockerfile to install chromium-chromedriver and spacy library
2024-05-18 09:16:52 +00:00
unclecode
e3524a10a7
chore: Update REST API base URL in README.md
2024-05-17 23:28:29 +08:00
unclecode
ce052a4eb5
Update README
2024-05-17 18:29:59 +08:00
unclecode
b43d77a56b
Update README
2024-05-17 18:28:39 +08:00
unclecode
1635a92218
chore: Update Crawl4AI quickstart script in README.md
2024-05-17 18:25:32 +08:00
unclecode
2a8a1b27e1
chore: Update Readme
2024-05-17 18:24:47 +08:00
unclecode
f5f3cce2c8
Merge new-release-0.0.2-no-spacy into main for v0.2.0 release
2024-05-17 18:23:27 +08:00
unclecode
6f96dcd649
chore: Update README
2024-05-17 18:12:50 +08:00
unclecode
957a2458b1
chore: Update web crawler URLs to use NBC News business section
2024-05-17 18:11:13 +08:00
unclecode
32c87f0388
chore: Update NlpSentenceChunking constructor parameters to None
...
The NlpSentenceChunking constructor parameters have been updated to None in order to simplify the usage of the class. This change removes the need for specifying the spaCy model for sentence detection, making the code more concise and easier to understand.
2024-05-17 17:00:43 +08:00
unclecode
647cfda225
chore: Update Crawl4AI quickstart script in README.md
...
This commit updates the Crawl4AI quickstart script in the README.md file. The script is now properly formatted and aligned, making it easier to read and understand. The unnecessary indentation has been removed, and the script is now more concise and efficient.
2024-05-17 16:55:34 +08:00
unclecode
1cc67df301
chore: Update pip installation command and requirements, add new dependencies
2024-05-17 16:53:03 +08:00
unclecode
f85df91ca6
chore: Update README.md with Colab badge
2024-05-17 00:21:16 +08:00
unclecode
ea16dec587
Improve library loading
2024-05-16 21:19:02 +08:00
unclecode
45569d058d
chore: Update pip installation command and requirements for Crawl4AI
2024-05-16 20:42:53 +08:00
unclecode
5bb0b0b378
chore: Update pip installation command and requirements for Crawl4AI
2024-05-16 20:36:29 +08:00
unclecode
c8589f8da3
Update:
...
- Fix spaCy model issue
- Update Readme and requirements.txt
2024-05-16 19:50:20 +08:00
unclecode
6a6365ae0a
Refactor code to exclude the extraction of semantic blocks of text from the HTML
2024-05-16 18:10:55 +08:00
unclecode
5b80be956d
Update:
...
- Debug
- Refactor code for new version
2024-05-16 17:31:44 +08:00
UncleCode
4a2e17447b
Update README.md
2024-05-16 08:57:58 +08:00
unclecode
f6e59157bf
- Test all methods
...
- Update index.html
- Update Readme
- Resolve some bugs
2024-05-14 21:27:41 +08:00
unclecode
8e536b9717
chore: Refactor README.md and project structure
2024-05-12 12:41:42 +08:00