Add Crawl4AI Assistant Chrome Extension
- Created manifest.json for the Crawl4AI Assistant extension. - Added popup HTML, CSS, and JS files for the extension interface. - Included icons and favicon for the extension. - Implemented functionality for schema capture and code generation. - Updated index.md to reflect the availability of the new extension. - Enhanced LLM Context Builder layout and styles for consistency. - Adjusted global styles for better branding and responsiveness.
This commit is contained in:
123
docs/md_v2/apps/crawl4ai-assistant/README.md
Normal file
123
docs/md_v2/apps/crawl4ai-assistant/README.md
Normal file
@@ -0,0 +1,123 @@
|
||||
# Crawl4AI Chrome Extension
|
||||
|
||||
Visual schema and script builder for Crawl4AI - Build extraction schemas by clicking on webpage elements!
|
||||
|
||||
## 🚀 Features
|
||||
|
||||
- **Visual Schema Builder**: Click on elements to build extraction schemas
|
||||
- **Smart Element Selection**: Container and field selection with visual feedback
|
||||
- **Code Generation**: Generates complete Python code with LLM integration
|
||||
- **Beautiful Dark UI**: Consistent with Crawl4AI's design language
|
||||
- **One-Click Download**: Get your generated code instantly
|
||||
|
||||
## 📦 Installation
|
||||
|
||||
### Method 1: Load Unpacked Extension (Recommended for Development)
|
||||
|
||||
1. Open Chrome and navigate to `chrome://extensions/`
|
||||
2. Enable "Developer mode" in the top right corner
|
||||
3. Click "Load unpacked"
|
||||
4. Select the `crawl4ai-assistant` folder
|
||||
5. The extension icon (🚀🤖) will appear in your toolbar
|
||||
|
||||
### Method 2: Generate Icons First
|
||||
|
||||
If you want proper icons:
|
||||
|
||||
1. Open `icons/generate_icons.html` in your browser
|
||||
2. Right-click each canvas and save as:
|
||||
- `icon-16.png`
|
||||
- `icon-48.png`
|
||||
- `icon-128.png`
|
||||
3. Then follow Method 1 above
|
||||
|
||||
## 🎯 How to Use
|
||||
|
||||
### Building a Schema
|
||||
|
||||
1. **Navigate to any website** you want to extract data from
|
||||
2. **Click the Crawl4AI extension icon** in your toolbar
|
||||
3. **Click "Schema Builder"** to start the capture mode
|
||||
4. **Select a container element**:
|
||||
- Hover over elements (they'll highlight in blue)
|
||||
- Click on a repeating container (e.g., product card, article block)
|
||||
5. **Select fields within the container**:
|
||||
- Elements will now highlight in green
|
||||
- Click on each piece of data you want to extract
|
||||
- Name each field (e.g., "title", "price", "description")
|
||||
6. **Generate the code**:
|
||||
- Click "Generate Code" in the extension popup
|
||||
- A Python file will automatically download
|
||||
|
||||
### Running the Generated Code
|
||||
|
||||
The downloaded Python file contains:
|
||||
|
||||
```python
|
||||
# 1. The HTML snippet of your selected container
|
||||
HTML_SNIPPET = """..."""
|
||||
|
||||
# 2. The extraction query based on your selections
|
||||
EXTRACTION_QUERY = """..."""
|
||||
|
||||
# 3. Functions to generate and test the schema
|
||||
async def generate_schema():
|
||||
# Generates the extraction schema using LLM
|
||||
|
||||
async def test_extraction():
|
||||
# Tests the schema on the actual website
|
||||
```
|
||||
|
||||
To use it:
|
||||
|
||||
1. Install Crawl4AI: `pip install crawl4ai`
|
||||
2. Run the script: `python crawl4ai_schema_*.py`
|
||||
3. The script will generate a `generated_schema.json` file
|
||||
4. Use this schema in your Crawl4AI projects!
|
||||
|
||||
## 🎨 Visual Feedback
|
||||
|
||||
- **Blue dashed outline**: Container selection mode
|
||||
- **Green dashed outline**: Field selection mode
|
||||
- **Solid blue outline**: Selected container
|
||||
- **Solid green outline**: Selected fields
|
||||
- **Floating toolbar**: Shows current mode and selection status
|
||||
|
||||
## ⌨️ Keyboard Shortcuts
|
||||
|
||||
- **ESC**: Cancel current capture session
|
||||
|
||||
## 🔧 Technical Details
|
||||
|
||||
- Built with Manifest V3 for security and performance
|
||||
- Pure client-side - no data sent to external servers
|
||||
- Generates code that uses Crawl4AI's LLM integration
|
||||
- Smart selector generation prioritizes stable attributes
|
||||
|
||||
## 🐛 Troubleshooting
|
||||
|
||||
### Extension doesn't load
|
||||
- Make sure you're in Developer Mode
|
||||
- Check the console for any errors
|
||||
- Ensure all files are in the correct directories
|
||||
|
||||
### Can't select elements
|
||||
- Some websites may block extensions
|
||||
- Try refreshing the page
|
||||
- Make sure you clicked "Schema Builder" first
|
||||
|
||||
### Generated code doesn't work
|
||||
- Ensure you have Crawl4AI installed
|
||||
- Check that you have an LLM API key configured
|
||||
- Make sure the website structure hasn't changed
|
||||
|
||||
## 🤝 Contributing
|
||||
|
||||
This extension is part of the Crawl4AI project. Contributions are welcome!
|
||||
|
||||
- Report issues: [GitHub Issues](https://github.com/unclecode/crawl4ai/issues)
|
||||
- Join discussion: [Discord](https://discord.gg/crawl4ai)
|
||||
|
||||
## 📄 License
|
||||
|
||||
Same as Crawl4AI - see main project for details.
|
||||
Reference in New Issue
Block a user