✨ New Features: - Click2Crawl: Visual element selection with markdown conversion - Ctrl/Cmd+Click to select multiple elements - Visual text mode for WYSIWYG extraction - Real-time markdown preview with syntax highlighting - Export to .md file or clipboard - Schema Builder Enhancement: Instant data extraction without LLMs - Test schemas directly in browser - See JSON results immediately - Export data or Python code - Cloud deployment ready (coming soon) - Modular Architecture: - Separated into schemaBuilder.js, scriptBuilder.js, click2CrawlBuilder.js - Added contentAnalyzer.js and markdownConverter.js modules - Shared utilities and CSS reset system - Integrated marked.js for markdown rendering 🎨 UI/UX Improvements: - Added edgy cloud announcement banner with seamless shimmer animation - Direct, technical copy: "You don't need Puppeteer. You need Crawl4AI Cloud." - Enhanced feature cards with emojis - Fixed CSS conflicts with targeted reset approach - Improved badge hover effects (red on hover) - Added wrap toggle for code preview 📚 Documentation Updates: - Split extraction diagrams into LLM and no-LLM versions - Updated llms-full.txt with latest content - Added versioned LLM context (v0.1.1) 🔧 Technical Enhancements: - Refactored 3464 lines of monolithic content.js into modules - Added proper event handling and cleanup - Improved z-index management - Better scroll position tracking for badges - Enhanced error handling throughout This release transforms the Chrome Extension from a simple tool into a powerful visual data extraction suite, making web scraping accessible to everyone.
Crawl4AI Chrome Extension
Visual schema and script builder for Crawl4AI - Build extraction schemas by clicking on webpage elements!
🚀 Features
- Visual Schema Builder: Click on elements to build extraction schemas
- Smart Element Selection: Container and field selection with visual feedback
- Code Generation: Generates complete Python code with LLM integration
- Beautiful Dark UI: Consistent with Crawl4AI's design language
- One-Click Download: Get your generated code instantly
📦 Installation
Method 1: Load Unpacked Extension (Recommended for Development)
- Open Chrome and navigate to
chrome://extensions/ - Enable "Developer mode" in the top right corner
- Click "Load unpacked"
- Select the
crawl4ai-assistantfolder - The extension icon (🚀🤖) will appear in your toolbar
Method 2: Generate Icons First
If you want proper icons:
- Open
icons/generate_icons.htmlin your browser - Right-click each canvas and save as:
icon-16.pngicon-48.pngicon-128.png
- Then follow Method 1 above
🎯 How to Use
Building a Schema
- Navigate to any website you want to extract data from
- Click the Crawl4AI extension icon in your toolbar
- Click "Schema Builder" to start the capture mode
- Select a container element:
- Hover over elements (they'll highlight in blue)
- Click on a repeating container (e.g., product card, article block)
- Select fields within the container:
- Elements will now highlight in green
- Click on each piece of data you want to extract
- Name each field (e.g., "title", "price", "description")
- Generate the code:
- Click "Generate Code" in the extension popup
- A Python file will automatically download
Running the Generated Code
The downloaded Python file contains:
# 1. The HTML snippet of your selected container
HTML_SNIPPET = """..."""
# 2. The extraction query based on your selections
EXTRACTION_QUERY = """..."""
# 3. Functions to generate and test the schema
async def generate_schema():
# Generates the extraction schema using LLM
async def test_extraction():
# Tests the schema on the actual website
To use it:
- Install Crawl4AI:
pip install crawl4ai - Run the script:
python crawl4ai_schema_*.py - The script will generate a
generated_schema.jsonfile - Use this schema in your Crawl4AI projects!
🎨 Visual Feedback
- Blue dashed outline: Container selection mode
- Green dashed outline: Field selection mode
- Solid blue outline: Selected container
- Solid green outline: Selected fields
- Floating toolbar: Shows current mode and selection status
⌨️ Keyboard Shortcuts
- ESC: Cancel current capture session
🔧 Technical Details
- Built with Manifest V3 for security and performance
- Pure client-side - no data sent to external servers
- Generates code that uses Crawl4AI's LLM integration
- Smart selector generation prioritizes stable attributes
🐛 Troubleshooting
Extension doesn't load
- Make sure you're in Developer Mode
- Check the console for any errors
- Ensure all files are in the correct directories
Can't select elements
- Some websites may block extensions
- Try refreshing the page
- Make sure you clicked "Schema Builder" first
Generated code doesn't work
- Ensure you have Crawl4AI installed
- Check that you have an LLM API key configured
- Make sure the website structure hasn't changed
🤝 Contributing
This extension is part of the Crawl4AI project. Contributions are welcome!
- Report issues: GitHub Issues
- Join discussion: Discord
📄 License
Same as Crawl4AI - see main project for details.