Merge pull request #218 from timoa/main
chore(docs): fix documentation links + markdown lint fix
This commit is contained in:
13
README.md
13
README.md
@@ -12,7 +12,9 @@
|
||||
Crawl4AI simplifies asynchronous web crawling and data extraction, making it accessible for large language models (LLMs) and AI applications. 🆓🌐
|
||||
|
||||
## 🌟 Meet the Crawl4AI Assistant: Your Copilot for Crawling
|
||||
|
||||
Use the [Crawl4AI GPT Assistant](https://tinyurl.com/crawl4ai-gpt) as your AI-powered copilot! With this assistant, you can:
|
||||
|
||||
- 🧑💻 Generate code for complex crawling and extraction tasks
|
||||
- 💡 Get tailored support and examples
|
||||
- 📘 Learn Crawl4AI faster with step-by-step guidance
|
||||
@@ -81,11 +83,13 @@ By default, this will install the asynchronous version of Crawl4AI, using Playwr
|
||||
👉 Note: When you install Crawl4AI, the setup script should automatically install and set up Playwright. However, if you encounter any Playwright-related errors, you can manually install it using one of these methods:
|
||||
|
||||
1. Through the command line:
|
||||
|
||||
```bash
|
||||
playwright install
|
||||
```
|
||||
|
||||
2. If the above doesn't work, try this more specific command:
|
||||
|
||||
```bash
|
||||
python -m playwright install chromium
|
||||
```
|
||||
@@ -114,7 +118,7 @@ pip install -e .
|
||||
|
||||
We're in the process of creating Docker images and pushing them to Docker Hub. This will provide an easy way to run Crawl4AI in a containerized environment. Stay tuned for updates!
|
||||
|
||||
For more detailed installation instructions and options, please refer to our [Installation Guide](https://crawl4ai.com/mkdocs/installation).
|
||||
For more detailed installation instructions and options, please refer to our [Installation Guide](https://crawl4ai.com/mkdocs/basic/installation/).
|
||||
|
||||
## Quick Start 🚀
|
||||
|
||||
@@ -244,7 +248,7 @@ if __name__ == "__main__":
|
||||
asyncio.run(extract_news_teasers())
|
||||
```
|
||||
|
||||
For more advanced usage examples, check out our [Examples](https://crawl4ai.com/mkdocs/full_details/advanced_jsoncss_extraction.md) section in the documentation.
|
||||
For more advanced usage examples, check out our [Examples](https://crawl4ai.com/mkdocs/extraction/css-advanced/) section in the documentation.
|
||||
|
||||
### Extracting Structured Data with OpenAI
|
||||
|
||||
@@ -347,7 +351,7 @@ if __name__ == "__main__":
|
||||
|
||||
This example demonstrates Crawl4AI's ability to handle complex scenarios where content is loaded asynchronously. It crawls multiple pages of GitHub commits, executing JavaScript to load new content and using custom hooks to ensure data is loaded before proceeding.
|
||||
|
||||
For more advanced usage examples, check out our [Examples](https://crawl4ai.com/mkdocs/full_details/session_based_crawling.md) section in the documentation.
|
||||
For more advanced usage examples, check out our [Examples](https://crawl4ai.com/mkdocs/tutorial/episode_12_Session-Based_Crawling_for_Dynamic_Websites/) section in the documentation.
|
||||
|
||||
|
||||
## Speed Comparison 🚀
|
||||
@@ -356,7 +360,7 @@ Crawl4AI is designed with speed as a primary focus. Our goal is to provide the f
|
||||
|
||||
We've conducted a speed comparison between Crawl4AI and Firecrawl, a paid service. The results demonstrate Crawl4AI's superior performance:
|
||||
|
||||
```
|
||||
```bash
|
||||
Firecrawl:
|
||||
Time taken: 7.02 seconds
|
||||
Content length: 42074 characters
|
||||
@@ -374,6 +378,7 @@ Images found: 89
|
||||
```
|
||||
|
||||
As you can see, Crawl4AI outperforms Firecrawl significantly:
|
||||
|
||||
- Simple crawl: Crawl4AI is over 4 times faster than Firecrawl.
|
||||
- With JavaScript execution: Even when executing JavaScript to load more content (doubling the number of images found), Crawl4AI is still faster than Firecrawl's simple crawl.
|
||||
|
||||
|
||||
Reference in New Issue
Block a user