feat: add Script Builder to Chrome Extension and reorganize LLM context files

This commit introduces significant enhancements to the Crawl4AI ecosystem:

  Chrome Extension - Script Builder (Alpha):
  - Add recording functionality to capture user interactions (clicks, typing, scrolling)
  - Implement smart event grouping for cleaner script generation
  - Support export to both JavaScript and C4A script formats
  - Add timeline view for visualizing and editing recorded actions
  - Include wait commands (time-based and element-based)
  - Add saved flows functionality for reusing automation scripts
  - Update UI with consistent dark terminal theme (Dank Mono font, green/pink accents)
  - Release new extension versions: v1.1.0, v1.2.0, v1.2.1

  LLM Context Builder Improvements:
  - Reorganize context files from llmtxt/ to llm.txt/ with better structure
  - Separate diagram templates from text content (diagrams/ and txt/ subdirectories)
  - Add comprehensive context files for all major Crawl4AI components
  - Improve file naming convention for better discoverability

  Documentation Updates:
  - Update apps index page to match main documentation theme
  - Standardize color scheme: "Available" tags use primary color (#50ffff)
  - Change "Coming Soon" tags to dark gray for better visual hierarchy
  - Add interactive two-column layout for extension landing page
  - Include code examples for both Schema Builder and Script Builder features

  Technical Improvements:
  - Enhance event capture mechanism with better element selection
  - Add support for contenteditable elements and complex form interactions
  - Implement proper scroll event handling for both window and element scrolling
  - Add meta key support for keyboard shortcuts
  - Improve selector generation for more reliable element targeting

  The Script Builder is released as Alpha, acknowledging potential bugs while providing
  early access to this powerful automation recording feature.
UncleCode
2025-06-08 22:02:12 +08:00
parent 926592649e
commit 40640badad
72 changed files with 28600 additions and 100986 deletions


@@ -0,0 +1,231 @@
## Installation
Multiple installation options for different environments and use cases.
### Basic Installation
```bash
# Install core library
pip install crawl4ai
# Initial setup (installs Playwright browsers)
crawl4ai-setup
# Verify installation
crawl4ai-doctor
```
### Quick Verification
```python
import asyncio

from crawl4ai import AsyncWebCrawler


async def main():
    # Crawl a single page and print the first 300 characters of its Markdown.
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun("https://example.com")
        print(result.markdown[:300])


if __name__ == "__main__":
    asyncio.run(main())
```
**📖 Learn more:** [Basic Usage Guide](https://docs.crawl4ai.com/core/quickstart.md)
### Advanced Features (Optional)
```bash
# PyTorch-based features (text clustering, semantic chunking)
pip install "crawl4ai[torch]"
crawl4ai-setup
# Transformers (Hugging Face models)
pip install "crawl4ai[transformer]"
crawl4ai-setup
# All features (large download); quotes keep shells such as zsh from expanding the brackets
pip install "crawl4ai[all]"
crawl4ai-setup
# Pre-download models (optional)
crawl4ai-download-models
```
**📖 Learn more:** [Advanced Features Documentation](https://docs.crawl4ai.com/extraction/llm-strategies.md)
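
To confirm which optional extras actually ended up in the current environment, a small check like the following may help. It is a minimal sketch: the mapping from extra name to importable module (`torch`, `transformers`) is an assumption about what those extras pull in, not something stated above, and the helper names are hypothetical.

```python
import importlib.util

# Assumption: each extra installs the top-level module named here.
ASSUMED_EXTRA_MODULES = {
    "torch": "torch",
    "transformer": "transformers",
}


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def available_extras() -> dict:
    """Map each assumed extra name to whether its module is importable."""
    return {
        extra: is_installed(module)
        for extra, module in ASSUMED_EXTRA_MODULES.items()
    }
```

Calling `available_extras()` returns a dictionary such as `{"torch": ..., "transformer": ...}` with a boolean per extra, which can be printed before deciding whether to rerun `pip install`.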
### Docker Deployment
```bash
# Pull pre-built image (specify platform for consistency)
docker pull --platform linux/amd64 unclecode/crawl4ai:latest
# For ARM (M1/M2 Macs): docker pull --platform linux/arm64 unclecode/crawl4ai:latest
# Setup environment for LLM support
cat > .llm.env << EOL
OPENAI_API_KEY=sk-your-key
ANTHROPIC_API_KEY=your-anthropic-key
EOL
# Run with LLM support (specify platform)
docker run -d \
  --platform linux/amd64 \
  -p 11235:11235 \
  --name crawl4ai \
  --env-file .llm.env \
  --shm-size=1g \
  unclecode/crawl4ai:latest
# For ARM Macs, use: --platform linux/arm64
# Basic run (no LLM)
docker run -d \
  --platform linux/amd64 \
  -p 11235:11235 \
  --name crawl4ai \
  --shm-size=1g \
  unclecode/crawl4ai:latest
```
**📖 Learn more:** [Complete Docker Guide](https://docs.crawl4ai.com/core/docker-deployment.md)
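
The `--platform` value above depends on the host CPU. As a rough sketch, the choice between `linux/amd64` and `linux/arm64` could be derived from Python's `platform.machine()`; the helper name `docker_platform` is hypothetical, and treating every non-ARM machine as `linux/amd64` is a simplifying assumption.

```python
import platform
from typing import Optional


def docker_platform(machine: Optional[str] = None) -> str:
    """Pick a --platform value from a machine string (defaults to this host)."""
    name = (machine or platform.machine()).lower()
    # Apple Silicon reports "arm64"; most ARM Linux hosts report "aarch64".
    if name in {"arm64", "aarch64"}:
        return "linux/arm64"
    # Assumption: anything else is treated as x86-64.
    return "linux/amd64"
```

For example, `docker_platform("arm64")` yields `linux/arm64`, which can be substituted into the `docker pull` and `docker run` commands shown above.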
### Docker Compose
```bash
# Clone repository
git clone https://github.com/unclecode/crawl4ai.git
cd crawl4ai
# Copy environment template
cp deploy/docker/.llm.env.example .llm.env
# Edit .llm.env with your API keys
# Run pre-built image
IMAGE=unclecode/crawl4ai:latest docker compose up -d
# Build and run locally
docker compose up --build -d
# Build with all features
INSTALL_TYPE=all docker compose up --build -d
# Stop service
docker compose down
```
**📖 Learn more:** [Docker Compose Configuration](https://docs.crawl4ai.com/core/docker-deployment.md#option-2-using-docker-compose)
### Manual Docker Build
```bash
# Build image for a single target platform (buildx; specify platform)
docker buildx build --platform linux/amd64 -t crawl4ai-local:latest --load .
# For ARM: docker buildx build --platform linux/arm64 -t crawl4ai-local:latest --load .
# Build with specific features
docker buildx build \
  --platform linux/amd64 \
  --build-arg INSTALL_TYPE=all \
  --build-arg ENABLE_GPU=false \
  -t crawl4ai-local:latest --load .
# Run custom build (specify platform)
docker run -d \
  --platform linux/amd64 \
  -p 11235:11235 \
  --name crawl4ai-custom \
  --env-file .llm.env \
  --shm-size=1g \
  crawl4ai-local:latest
```
**📖 Learn more:** [Manual Build Guide](https://docs.crawl4ai.com/core/docker-deployment.md#option-3-manual-local-build--run)
### Google Colab
```python
# Install in Colab
!pip install crawl4ai
!crawl4ai-setup
# If setup fails, manually install Playwright browsers
!playwright install chromium
# Install with all features (may take 5-10 minutes)
!pip install crawl4ai[all]
!crawl4ai-setup
!crawl4ai-download-models
# If still having issues, force Playwright install
!playwright install chromium --force
# Quick test
import asyncio
from crawl4ai import AsyncWebCrawler

async def test_crawl():
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun("https://example.com")
        print("✅ Installation successful!")
        print(f"Content length: {len(result.markdown)}")

# Run test in Colab (notebook cells allow top-level await)
await test_crawl()
```
**📖 Learn more:** [Colab Examples Notebook](https://colab.research.google.com/github/unclecode/crawl4ai/blob/main/docs/examples/quickstart.ipynb)
### Docker API Usage
```python
# Using the Crawl4AI Docker client SDK
import asyncio

from crawl4ai.docker_client import Crawl4aiDockerClient
from crawl4ai import BrowserConfig, CrawlerRunConfig, CacheMode


async def main():
    # Connect to the container started in the Docker sections above.
    async with Crawl4aiDockerClient(base_url="http://localhost:11235") as client:
        results = await client.crawl(
            ["https://example.com"],
            browser_config=BrowserConfig(headless=True),
            crawler_config=CrawlerRunConfig(cache_mode=CacheMode.BYPASS),
        )
        for result in results:
            print(f"Success: {result.success}, Length: {len(result.markdown)}")


asyncio.run(main())
```
**📖 Learn more:** [Docker Client API](https://docs.crawl4ai.com/core/docker-deployment.md#python-sdk)
### Direct API Calls
```python
# REST API example
import requests
payload = {
    "urls": ["https://example.com"],
    "browser_config": {"type": "BrowserConfig", "params": {"headless": True}},
    "crawler_config": {"type": "CrawlerRunConfig", "params": {"cache_mode": "bypass"}},
}
response = requests.post("http://localhost:11235/crawl", json=payload)
print(response.json())
```
**📖 Learn more:** [REST API Reference](https://docs.crawl4ai.com/core/docker-deployment.md#rest-api-examples)
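
The request body above wraps each config as a `{"type": ..., "params": ...}` object. When several requests share that shape, a small builder keeps it in one place. This is a sketch only: `build_crawl_payload` is a hypothetical helper, and it covers just the two parameters that appear in the example above.

```python
def build_crawl_payload(urls, headless=True, cache_mode="bypass"):
    """Build a /crawl request body in the same shape as the example above."""
    return {
        "urls": list(urls),
        "browser_config": {
            "type": "BrowserConfig",
            "params": {"headless": headless},
        },
        "crawler_config": {
            "type": "CrawlerRunConfig",
            "params": {"cache_mode": cache_mode},
        },
    }
```

The result can be passed directly as `json=build_crawl_payload(["https://example.com"])` to `requests.post`.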
### Health Check
```bash
# Check Docker service
curl http://localhost:11235/health
# Access playground
open http://localhost:11235/playground  # macOS; on Linux use xdg-open or paste the URL into a browser
# View metrics
curl http://localhost:11235/metrics
```
**📖 Learn more:** [Monitoring & Metrics](https://docs.crawl4ai.com/core/docker-deployment.md#metrics--monitoring)
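
A freshly started container may take a moment before `/health` responds, so scripts often poll it before sending crawl requests. The following is a minimal sketch using only the standard library. The helper names, the retry count, and the delay are assumptions, and it checks only for an HTTP 200 status because the response body format is not described here.

```python
import http.client
import time
import urllib.request
from typing import Callable


def probe_health(url: str = "http://localhost:11235/health",
                 timeout: float = 2.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


def wait_until_healthy(probe: Callable[[], bool] = probe_health,
                       attempts: int = 10,
                       delay: float = 1.0) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_healthy()` after `docker run` returns `True` once the service is reachable, or `False` if it never answers within the allotted attempts. The `probe` parameter is injectable so the retry loop can be exercised without a running container.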