From 0a09d78fa5a92c90139dd6c1bc5ff0faf2dde710 Mon Sep 17 00:00:00 2001 From: Damien Laureaux Date: Thu, 31 Oct 2024 05:50:22 +0100 Subject: [PATCH] chore(docs): fix documentation links + markdown lint --- README.md | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 6c8a5a2a..b87d9977 100644 --- a/README.md +++ b/README.md @@ -12,7 +12,9 @@ Crawl4AI simplifies asynchronous web crawling and data extraction, making it accessible for large language models (LLMs) and AI applications. 🆓🌐 ## 🌟 Meet the Crawl4AI Assistant: Your Copilot for Crawling + Use the [Crawl4AI GPT Assistant](https://tinyurl.com/crawl4ai-gpt) as your AI-powered copilot! With this assistant, you can: + - 🧑‍💻 Generate code for complex crawling and extraction tasks - 💡 Get tailored support and examples - 📘 Learn Crawl4AI faster with step-by-step guidance @@ -81,11 +83,13 @@ By default, this will install the asynchronous version of Crawl4AI, using Playwr 👉 Note: When you install Crawl4AI, the setup script should automatically install and set up Playwright. However, if you encounter any Playwright-related errors, you can manually install it using one of these methods: 1. Through the command line: + ```bash playwright install ``` 2. If the above doesn't work, try this more specific command: + ```bash python -m playwright install chromium ``` @@ -114,7 +118,7 @@ pip install -e . We're in the process of creating Docker images and pushing them to Docker Hub. This will provide an easy way to run Crawl4AI in a containerized environment. Stay tuned for updates! -For more detailed installation instructions and options, please refer to our [Installation Guide](https://crawl4ai.com/mkdocs/installation). +For more detailed installation instructions and options, please refer to our [Installation Guide](https://crawl4ai.com/mkdocs/basic/installation/). ## Quick Start 🚀 @@ -244,7 +248,7 @@ if __name__ == "__main__": asyncio.run(extract_news_teasers()) ``` -For more advanced usage examples, check out our [Examples](https://crawl4ai.com/mkdocs/full_details/advanced_jsoncss_extraction.md) section in the documentation. +For more advanced usage examples, check out our [Examples](https://crawl4ai.com/mkdocs/extraction/css-advanced/) section in the documentation. ### Extracting Structured Data with OpenAI @@ -347,7 +351,7 @@ if __name__ == "__main__": This example demonstrates Crawl4AI's ability to handle complex scenarios where content is loaded asynchronously. It crawls multiple pages of GitHub commits, executing JavaScript to load new content and using custom hooks to ensure data is loaded before proceeding. -For more advanced usage examples, check out our [Examples](https://crawl4ai.com/mkdocs/full_details/session_based_crawling.md) section in the documentation. +For more advanced usage examples, check out our [Examples](https://crawl4ai.com/mkdocs/tutorial/episode_12_Session-Based_Crawling_for_Dynamic_Websites/) section in the documentation. ## Speed Comparison 🚀 @@ -356,7 +360,7 @@ Crawl4AI is designed with speed as a primary focus. Our goal is to provide the f We've conducted a speed comparison between Crawl4AI and Firecrawl, a paid service. The results demonstrate Crawl4AI's superior performance: -``` +```bash Firecrawl: Time taken: 7.02 seconds Content length: 42074 characters @@ -374,6 +378,7 @@ Images found: 89 ``` As you can see, Crawl4AI outperforms Firecrawl significantly: + - Simple crawl: Crawl4AI is over 4 times faster than Firecrawl. - With JavaScript execution: Even when executing JavaScript to load more content (doubling the number of images found), Crawl4AI is still faster than Firecrawl's simple crawl.