Transform any website into structured data with just a few clicks! The Crawl4AI Assistant Chrome Extension provides three powerful tools for web scraping and data extraction.
Schema Builder
Extract data instantly without LLMs - see results in real-time!
Script Builder (Alpha)
Record browser actions to create automation scripts
Click2Crawl (New!)
Select multiple elements to extract clean markdown "as you see"
Quick Start
Download the Extension
Get the latest release from GitHub or use the button below
Download Extension (v1.2.1)
Load in Chrome
Navigate to chrome://extensions/ and enable Developer Mode
Load Unpacked
Click "Load unpacked" and select the extracted extension folder
Explore Our Tools
Schema Builder
Visual data extraction
Script Builder
Browser automation
Click2Crawl
Markdown extraction
📊 Schema Builder
No LLM needed - Extract data instantly!
Select Container
Click on any repeating element like product cards or articles. Use up/down navigation to fine-tune selection!
Click Fields to Extract
Click on data fields inside the container - choose text, links, images, or attributes
Test & Extract Data NOW!
🎉 Click "Test Schema" to extract ALL matching data instantly - no coding required!
🔴 Script Builder
Record actions, generate automation
Hit Record
Start capturing your browser interactions
Interact Naturally
Click, type, scroll - everything is captured
Export Script
Get JavaScript for Crawl4AI's js_code parameter
📝 Click2Crawl
Select multiple elements to extract clean markdown
Ctrl/Cmd + Click
Hold Ctrl/Cmd and click multiple elements you want to extract
Enable Visual Text Mode
Extract content "as you see" - clean text without complex HTML structures
Export Clean Markdown
Get beautifully formatted markdown ready for documentation or LLMs
See the Generated Code & Extracted Data
#!/usr/bin/env python3
"""
🎉 NO LLM NEEDED! Direct extraction with CSS selectors
Generated by Crawl4AI Chrome Extension
"""
import asyncio
import json
from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig
from crawl4ai.extraction_strategy import JsonCssExtractionStrategy

# The EXACT schema from your visual clicks - no guessing!
EXTRACTION_SCHEMA = {
    "name": "Product Catalog",
    "baseSelector": "div.product-card",  # The container you selected
    "fields": [
        {
            "name": "title",
            "selector": "h3.product-title",
            "type": "text"
        },
        {
            "name": "price",
            "selector": "span.price",
            "type": "text"
        },
        {
            "name": "image",
            "selector": "img.product-img",
            "type": "attribute",
            "attribute": "src"
        },
        {
            "name": "link",
            "selector": "a.product-link",
            "type": "attribute",
            "attribute": "href"
        }
    ]
}

async def extract_data(url: str):
    # Direct extraction - no LLM API calls!
    extraction_strategy = JsonCssExtractionStrategy(schema=EXTRACTION_SCHEMA)

    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(
            url=url,
            config=CrawlerRunConfig(extraction_strategy=extraction_strategy)
        )

        if result.success:
            data = json.loads(result.extracted_content)
            print(f"✅ Extracted {len(data)} items instantly!")

            # Save to file
            with open('products.json', 'w') as f:
                json.dump(data, f, indent=2)

            return data

# Run extraction on any similar page!
data = asyncio.run(extract_data("https://example.com/products"))
# 🎯 Result: Clean JSON data, no LLM costs, instant results!
// 🎉 Instantly extracted from the page - no coding required!
[
  {
    "title": "Wireless Bluetooth Headphones",
    "price": "$79.99",
    "image": "https://example.com/images/headphones-bt-01.jpg",
    "link": "/products/wireless-bluetooth-headphones"
  },
  {
    "title": "Smart Watch Pro 2024",
    "price": "$299.00",
    "image": "https://example.com/images/smartwatch-pro.jpg",
    "link": "/products/smart-watch-pro-2024"
  },
  {
    "title": "4K Webcam for Streaming",
    "price": "$149.99",
    "image": "https://example.com/images/webcam-4k.jpg",
    "link": "/products/4k-webcam-streaming"
  },
  {
    "title": "Mechanical Gaming Keyboard RGB",
    "price": "$129.99",
    "image": "https://example.com/images/keyboard-gaming.jpg",
    "link": "/products/mechanical-gaming-keyboard"
  },
  {
    "title": "USB-C Hub 7-in-1",
    "price": "$45.99",
    "image": "https://example.com/images/usbc-hub.jpg",
    "link": "/products/usb-c-hub-7in1"
  }
]
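The extracted values above are raw strings: prices still carry the currency symbol and the links are relative. A minimal post-processing sketch, assuming you want numeric prices and absolute URLs, could look like this. The `normalize_item` helper and the base URL are illustrative assumptions, not part of the extension's generated code.

```python
from decimal import Decimal
from urllib.parse import urljoin


def normalize_item(item: dict, base_url: str) -> dict:
    """Return a copy of one extracted item with a numeric price and absolute URLs.

    Hypothetical helper: assumes prices look like "$79.99" or "$1,399".
    """
    price_text = item["price"].replace("$", "").replace(",", "")
    return {
        **item,
        "price": Decimal(price_text),
        # urljoin leaves absolute URLs untouched and resolves relative ones
        "image": urljoin(base_url, item["image"]),
        "link": urljoin(base_url, item["link"]),
    }


# One record copied from the sample output above
sample = [
    {
        "title": "USB-C Hub 7-in-1",
        "price": "$45.99",
        "image": "https://example.com/images/usbc-hub.jpg",
        "link": "/products/usb-c-hub-7in1",
    }
]

cleaned = [normalize_item(item, "https://example.com/products") for item in sample]
print(cleaned[0]["price"], cleaned[0]["link"])
```

Using `Decimal` rather than `float` avoids rounding surprises if the prices are later summed or compared.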
import asyncio
from crawl4ai import AsyncWebCrawler, CrawlerRunConfig

# JavaScript generated from your recorded actions
js_script = """
// Search for products
document.querySelector('button.search-toggle').click();
await new Promise(r => setTimeout(r, 500));

// Type search query
const searchInput = document.querySelector('input#search');
searchInput.value = 'wireless headphones';
searchInput.dispatchEvent(new Event('input', {bubbles: true}));

// Submit search
searchInput.dispatchEvent(new KeyboardEvent('keydown', {
    key: 'Enter', keyCode: 13, bubbles: true
}));

// Wait for results
await new Promise(r => setTimeout(r, 2000));

// Click first product
document.querySelector('.product-item:first-child').click();

// Wait for product page
await new Promise(r => setTimeout(r, 1000));

// Add to cart
document.querySelector('button.add-to-cart').click();
"""

async def automate_shopping():
    config = CrawlerRunConfig(
        js_code=js_script,
        wait_for="css:.cart-confirmation",
        screenshot=True
    )

    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(
            url="https://shop.example.com",
            config=config
        )
        print(f"✓ Automation complete: {result.url}")
        return result

asyncio.run(automate_shopping())
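To make the idea of turning recorded actions into a script more concrete, here is a rough sketch of how a list of recorded steps could be translated into a JavaScript string like the one above. The `RecordedStep` structure and the translation rules are assumptions made for illustration; the extension's actual recording format is not documented here.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecordedStep:
    """One recorded browser action (hypothetical structure, for illustration)."""
    action: str                      # "click", "type", or "wait"
    selector: Optional[str] = None   # CSS selector for "click" and "type"
    value: Optional[str] = None      # text for "type", milliseconds for "wait"


def step_to_js(step: RecordedStep) -> str:
    """Translate a single recorded step into one line of JavaScript."""
    if step.action == "click":
        return f"document.querySelector({json.dumps(step.selector)}).click();"
    if step.action == "type":
        # json.dumps produces safely quoted JavaScript string literals
        sel = json.dumps(step.selector)
        val = json.dumps(step.value)
        return (
            f"{{ const el = document.querySelector({sel}); "
            f"el.value = {val}; "
            "el.dispatchEvent(new Event('input', {bubbles: true})); }"
        )
    if step.action == "wait":
        return f"await new Promise(r => setTimeout(r, {int(step.value)}));"
    raise ValueError(f"Unsupported action: {step.action!r}")


def steps_to_js(steps: List[RecordedStep]) -> str:
    """Join all translated steps into one script, one statement per line."""
    return "\n".join(step_to_js(step) for step in steps)


steps = [
    RecordedStep("click", "button.search-toggle"),
    RecordedStep("wait", value="500"),
    RecordedStep("type", "input#search", "wireless headphones"),
]
js_script = steps_to_js(steps)
print(js_script)
```

The resulting string can then be passed as `js_code` in the same way as the hand-written script above.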
# Extracted from Hacker News with Visual Text Mode 👁️
1. **Show HN: I built a tool to find and reach out to YouTubers** (hellosimply.io)
84 points by erickim 2 hours ago | hide | 31 comments
2. **The 24 Hour Restaurant** (logicmag.io)
124 points by helsinkiandrew 5 hours ago | hide | 52 comments
3. **Building a Better Bloom Filter in Rust** (carlmastrangelo.com)
89 points by carlmastrangelo 3 hours ago | hide | 27 comments
---
### Article: The 24 Hour Restaurant
In New York City, the 24-hour restaurant is becoming extinct. What we lose when we can no longer eat whenever we want.
When I first moved to New York, I loved that I could get a full meal at 3 AM. Not just pizza or fast food, but a proper sit-down dinner with table service and a menu that ran for pages. The city that never sleeps had restaurants that matched its rhythm.
Today, finding a 24-hour restaurant in Manhattan requires genuine effort. The pandemic accelerated a decline that was already underway, but the roots go deeper: rising rents, changing labor laws, and shifting cultural patterns have all contributed to the death of round-the-clock dining.
---
### Product Review: Framework Laptop 16
**Specifications:**
- Display: 16" 2560×1600 165Hz
- Processor: AMD Ryzen 7 7840HS
- Memory: 32GB DDR5-5600
- Storage: 2TB NVMe Gen4
- Price: Starting at $1,399
**Pros:**
- Fully modular and repairable
- Excellent Linux support
- Great keyboard and trackpad
- Expansion card system
**Cons:**
- Battery life could be better
- Slightly heavier than competitors
- Fan noise under load
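When feeding the exported markdown to an LLM or a documentation pipeline, it can help to split it into one chunk per selected element. A small sketch follows, assuming (as in the sample above) that selections are separated by a line containing only `---`. The function names are illustrative and not part of Crawl4AI.

```python
import re
from typing import List, Optional


def split_sections(markdown: str) -> List[str]:
    """Split exported markdown on separator lines that contain only '---'."""
    parts = re.split(r"^[ \t]*---[ \t]*$", markdown, flags=re.MULTILINE)
    return [part.strip() for part in parts if part.strip()]


def section_title(section: str) -> Optional[str]:
    """Return the text of the first markdown heading in a section, if any."""
    match = re.search(r"^#{1,6}\s+(.+)$", section, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


# Shortened excerpt of the sample export above
sample = """1. **The 24 Hour Restaurant** (logicmag.io)
124 points by helsinkiandrew 5 hours ago | hide | 52 comments
---
### Article: The 24 Hour Restaurant
In New York City, the 24-hour restaurant is becoming extinct.
---
### Product Review: Framework Laptop 16
- Price: Starting at $1,399
"""

sections = split_sections(sample)
titles = [section_title(section) for section in sections]
print(len(sections), titles)
```

Sections without a heading, such as the Hacker News listing, return `None` as their title, so callers can fall back to the first line or an index.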
More Features Coming Soon
We're continuously expanding the Crawl4AI Assistant with powerful new features:
Get CrawlResult Without Code
Skip the code generation entirely! Get extracted data directly in the extension as a CrawlResult object, ready to download as JSON.
📊 One-click extraction • No Python needed • Export to JSON/CSV
Smart Schema Suggestions
AI-powered field detection that automatically suggests the most likely data fields on any page, making schema building even faster.
🤖 Auto-detect fields • Smart naming • Pattern recognition
🚀 Stay tuned for updates! Follow our GitHub for the latest releases.