crawl4ai/docs/examples/c4a_script/script_samples/data_extraction.c4a
UncleCode 08a2cdae53 Add C4A-Script support and documentation
- Generate OneShot JS code generator
- Introduced a new C4A-Script tutorial example for login flow using Blockly.
- Updated index.html to include Blockly theme and event editor modal for script editing.
- Created a test HTML file for testing Blockly integration.
- Added comprehensive C4A-Script API reference documentation covering commands, syntax, and examples.
- Developed core documentation for C4A-Script, detailing its features, commands, and real-world examples.
- Updated mkdocs.yml to include new C4A-Script documentation in navigation.
2025-06-07 23:07:19 +08:00

# Data extraction example
# Scrapes product information from an e-commerce site

# Navigate to products page
GO https://shop.example.com/products
WAIT `.product-list` 10

# Scroll to load lazy-loaded content
SCROLL DOWN 500
WAIT 1
SCROLL DOWN 500
WAIT 1
SCROLL DOWN 500
WAIT 2

# Extract product data
EVAL `
  // Extract all product information
  const products = Array.from(document.querySelectorAll('.product-card')).map((card, index) => {
    return {
      id: index + 1,
      name: card.querySelector('.product-title')?.textContent?.trim() || 'N/A',
      price: card.querySelector('.price')?.textContent?.trim() || 'N/A',
      rating: card.querySelector('.rating')?.textContent?.trim() || 'N/A',
      availability: card.querySelector('.in-stock') ? 'In Stock' : 'Out of Stock',
      image: card.querySelector('img')?.src || 'N/A'
    };
  });

  // Log results
  console.log('=== Product Extraction Results ===');
  console.log('Total products found:', products.length);
  console.log(JSON.stringify(products, null, 2));

  // Save to localStorage for retrieval
  localStorage.setItem('scraped_products', JSON.stringify(products));
`

# Optional: Click on first product for details
CLICK `.product-card:first-child`
WAIT `.product-details` 5

# Extract detailed information
EVAL `
  const details = {
    description: document.querySelector('.product-description')?.textContent?.trim(),
    specifications: Array.from(document.querySelectorAll('.spec-item')).map(spec => ({
      label: spec.querySelector('.spec-label')?.textContent,
      value: spec.querySelector('.spec-value')?.textContent
    })),
    reviews: document.querySelector('.review-count')?.textContent
  };

  console.log('=== Product Details ===');
  console.log(JSON.stringify(details, null, 2));
`
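
# Optional sketch (not part of the original sample): read back the product
# list that the first EVAL saved under the 'scraped_products' key.
# Assumption: the details page is served from the same origin as the listing
# page, since localStorage is scoped per origin. If the key is missing, the
# fallback '[]' yields an empty array instead of a parse error.
EVAL `
  const saved = JSON.parse(localStorage.getItem('scraped_products') || '[]');
  console.log('=== Previously Saved Products ===');
  console.log('Products saved earlier:', saved.length);
`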