chore: Update pip installation command and requirements for Crawl4AI

unclecode
2024-05-16 20:42:53 +08:00
parent 5bb0b0b378
commit 45569d058d
2 changed files with 31 additions and 3 deletions


@@ -24,12 +24,30 @@ Crawl4AI makes even complex web crawling tasks simple and intuitive. Below is an
**Example Task:**
1. Instantiate a WebCrawler object.
2. Execute custom JavaScript to click a "Load More" button.
3. Filter the data to include only content related to "technology".
4. Use a CSS selector to extract only paragraphs (`<p>` tags).
**Example Code:**
First, install the package:
```bash
virtualenv venv
source venv/bin/activate
# Install the required packages
pip install transformers torch chromedriver_autoinstaller
# Install Crawl4AI
pip install git+https://github.com/unclecode/crawl4ai.git
```
Optionally, run the following command to download the required models. This improves the crawler's performance and speed, and you only need to do it once.
```bash
crawl4ai-download-models
```
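Before moving on, it can help to confirm that the installation steps above took effect. The following is a minimal sketch using only the Python standard library. The helper names `missing_modules` and `cli_available` are hypothetical and not part of Crawl4AI. The import names are assumed to match the package names used in the commands above.

```python
import importlib.util
import shutil

# Assumption: these top-level import names match the packages installed above.
REQUIRED_MODULES = ["transformers", "torch", "chromedriver_autoinstaller", "crawl4ai"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def cli_available(command="crawl4ai-download-models"):
    """Return True if the given command is found on PATH."""
    return shutil.which(command) is not None


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
    if not cli_available():
        print("crawl4ai-download-models was not found on PATH.")
```

Run it inside the activated virtual environment so that the check sees the same packages that `pip` installed.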
Now, you can run the following code:
```python
# Import necessary modules
from crawl4ai import WebCrawler
@@ -123,6 +141,10 @@ pip install transformers torch chromedriver_autoinstaller
pip install git+https://github.com/unclecode/crawl4ai.git
```
💡 It is recommended to run the following CLI command to download the required models. This is optional, but it improves the crawler's performance and speed. You only need to do this once.
crawl4ai-download-models
2. Alternatively, you can clone the repository and install the package locally:
```bash
virtualenv venv


@@ -33,6 +33,12 @@ pip install git+https://github.com/unclecode/crawl4ai.git
pip install transformers torch chromedriver_autoinstaller
</code></pre>
</li>
<li class="mb-4">
Optionally, run the following command to download the required models. This improves the crawler's performance and speed, and you only need to do it once.
<pre
class="bg-zinc-800 p-4 rounded mt-2 text-zinc-100"
><code>crawl4ai-download-models</code></pre>
</li>
<li class="mb-4">
Alternatively, you can clone the repository and install the package locally:
<pre <pre