crawl4ai/docs/examples/website-to-api/app.py
Soham Kukreti b1dff5a4d3 feat: Add comprehensive website to API example with frontend
This commit adds a complete web scraping API example that demonstrates how to extract structured data from any website and query it like an API, using the crawl4ai library and a minimalist frontend interface.

Core Functionality
- AI-powered web scraping with plain English queries
- Dual scraping approaches: Schema-based (faster) and LLM-based (flexible)
- Intelligent schema caching for improved performance
- Custom LLM model support with API key management
- Automatic duplicate request prevention
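
The schema caching idea above can be sketched as follows. This is a minimal illustration, not the example's actual implementation: `SchemaCache`, `cache_key`, the on-disk layout, and the choice to key the cache on a hash of URL plus query are all assumptions.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def cache_key(url: str, query: str) -> str:
    """Derive a stable file name from the URL + query pair (hypothetical scheme)."""
    raw = f"{url.strip()}\n{query.strip().lower()}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


class SchemaCache:
    """Hypothetical file-backed cache: one JSON file per URL + query pair."""

    def __init__(self, cache_dir: Path) -> None:
        self.cache_dir = cache_dir
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def get_or_create(self, url: str, query: str, generate: Callable[[], dict]) -> dict:
        path = self.cache_dir / f"{cache_key(url, query)}.json"
        if path.exists():
            # Cache hit: skip the (presumably expensive) schema generation.
            return json.loads(path.read_text(encoding="utf-8"))
        schema = generate()
        path.write_text(json.dumps(schema, indent=2), encoding="utf-8")
        return schema
```

With a cache like this, only the first request for a given URL and query pays for schema generation; later requests reuse the stored schema, which is presumably why the schema-based path is described as the faster one.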

Modern Frontend Interface
- Minimalist black-and-white design inspired by modern web apps
- Responsive layout with smooth animations and transitions
- Three main pages: Scrape Data, Models Management, API Request History
- Real-time results display with JSON formatting
- Copy-to-clipboard functionality for extracted data
- Toast notifications for user feedback
- Auto-scroll to results when scraping starts

Model Management System
- Web-based model configuration interface
- Support for any LLM provider (OpenAI, Gemini, Anthropic, etc.)
- Simplified configuration requiring only provider and API token
- Add, list, and delete model configurations
- Secure storage of API keys in local JSON files
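
A minimal sketch of add, list, and delete over local JSON files, assuming one file per model. `StoredModel` and `ModelStore` are hypothetical names (the commit's own `ModelConfig` class may look different), and restricting file permissions is an added precaution, not something the commit describes.

```python
import json
import os
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import List


@dataclass
class StoredModel:
    name: str
    provider: str   # e.g. "openai/gpt-4o-mini"; the exact format is an assumption
    api_token: str


class ModelStore:
    """Hypothetical store that keeps each model configuration in its own JSON file."""

    def __init__(self, models_dir: Path) -> None:
        self.models_dir = models_dir
        self.models_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, name: str) -> Path:
        # Keep only safe characters so a model name cannot escape models_dir.
        safe = "".join(c for c in name if c.isalnum() or c in "-_")
        if not safe:
            raise ValueError("model name must contain letters or digits")
        return self.models_dir / f"{safe}.json"

    def add(self, model: StoredModel) -> None:
        path = self._path(model.name)
        path.write_text(json.dumps(asdict(model), indent=2), encoding="utf-8")
        # Owner-only read/write; has limited effect on Windows.
        os.chmod(path, 0o600)

    def list_names(self) -> List[str]:
        return sorted(p.stem for p in self.models_dir.glob("*.json"))

    def delete(self, name: str) -> bool:
        path = self._path(name)
        if path.exists():
            path.unlink()
            return True
        return False
```

Note that the token is still stored in plain text here; file permissions only limit which local users can read it.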

API Request History
- Automatic saving of all API requests and responses
- Display of request history with URL, query, and cURL commands
- Duplicate prevention (same URL + query combinations)
- Request deletion functionality
- Clean, simplified display focusing on essential information
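
The duplicate prevention and cURL display described above could look roughly like this. It is an in-memory sketch under stated assumptions: `RequestHistory`, `to_curl`, and the `/scrape` endpoint used in the usage note are hypothetical, and the real example persists history to files.

```python
import json
import shlex
from typing import Any, Dict, List, Tuple


def to_curl(endpoint: str, url: str, query: str) -> str:
    """Build a shell-safe cURL command that replays a scrape request."""
    payload = json.dumps({"url": url, "query": query})
    return (
        f"curl -X POST {shlex.quote(endpoint)} "
        "-H 'Content-Type: application/json' "
        f"-d {shlex.quote(payload)}"
    )


class RequestHistory:
    """Hypothetical history keyed on the (url, query) pair to prevent duplicates."""

    def __init__(self, endpoint: str) -> None:
        self.endpoint = endpoint
        self._entries: Dict[Tuple[str, str], Dict[str, Any]] = {}

    def save(self, url: str, query: str, response: Any) -> bool:
        key = (url, query)
        if key in self._entries:
            return False  # same URL + query already recorded
        self._entries[key] = {
            "url": url,
            "query": query,
            "response": response,
            "curl": to_curl(self.endpoint, url, query),
        }
        return True

    def delete(self, url: str, query: str) -> bool:
        return self._entries.pop((url, query), None) is not None

    def entries(self) -> List[Dict[str, Any]]:
        return list(self._entries.values())
```

For instance, `RequestHistory("http://localhost:8000/scrape")` would record each unique request once, and every entry carries a `curl` string that can be pasted into a terminal.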

Technical Implementation

Backend (FastAPI)
- RESTful API with comprehensive endpoints
- Pydantic models for request/response validation
- Async web scraping with crawl4ai library
- Error handling with detailed error messages
- File-based storage for models and request history
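
As an illustration of request/response validation with Pydantic, here is a small sketch. The model names, field names, and the `handle_payload` helper are assumptions made for this example; the actual scraping step is stubbed out and the real endpoints may use different shapes.

```python
from typing import Any, Optional

from pydantic import BaseModel, Field, HttpUrl, ValidationError


class ScrapeRequest(BaseModel):
    url: HttpUrl                          # rejects strings that are not valid URLs
    query: str = Field(..., min_length=1)  # plain-English query, must be non-empty
    llm_name: Optional[str] = None        # hypothetical: which saved model to use


class ScrapeResponse(BaseModel):
    success: bool
    data: Any = None
    error: Optional[str] = None


def handle_payload(payload: dict) -> ScrapeResponse:
    """Validate raw input and turn validation failures into a detailed error."""
    try:
        request = ScrapeRequest(**payload)
    except ValidationError as exc:
        return ScrapeResponse(success=False, error=str(exc))
    # Placeholder for the real scraping call, which is out of scope here.
    return ScrapeResponse(success=True, data={"query": request.query})
```

When models like these are used as FastAPI endpoint parameters, FastAPI performs the same validation automatically and returns a 422 response for invalid bodies.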

Frontend (Vanilla JS/CSS/HTML)
- No framework dependencies - pure HTML, CSS, JavaScript
- Modern CSS Grid and Flexbox layouts
- Custom dropdown styling with SVG arrows
- Responsive design for mobile and desktop
- Smooth scrolling and animations

Core Library Integration
- WebScraperAgent class for orchestration
- ModelConfig class for LLM configuration management
- Schema generation and caching system
- LLM extraction strategy support
- Browser configuration with headless mode
2025-08-24 18:52:37 +05:30


#!/usr/bin/env python3
"""
Startup script for the Web Scraper API with frontend interface.
"""
import os
import sys
import uvicorn
from pathlib import Path


def main():
    # Check if static directory exists
    static_dir = Path("static")
    if not static_dir.exists():
        print("❌ Static directory not found!")
        print("Please make sure the 'static' directory exists with the frontend files.")
        sys.exit(1)

    # Check if required frontend files exist
    required_files = ["index.html", "styles.css", "script.js"]
    missing_files = []
    for file in required_files:
        if not (static_dir / file).exists():
            missing_files.append(file)

    if missing_files:
        print(f"❌ Missing frontend files: {', '.join(missing_files)}")
        print("Please make sure all frontend files are present in the static directory.")
        sys.exit(1)

    print("🚀 Starting Web Scraper API with Frontend Interface")
    print("=" * 50)
    print("📁 Static files found and ready to serve")
    print("🌐 Frontend will be available at: http://localhost:8000")
    print("🔌 API endpoints available at: http://localhost:8000/docs")
    print("=" * 50)

    # Start the server
    uvicorn.run(
        "api_server:app",
        host="0.0.0.0",
        port=8000,
        reload=True,
        log_level="info"
    )


if __name__ == "__main__":
    main()
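
The frontend file check in the script above can also be written as a small pure function, which makes it easy to test without starting the server. This is an optional sketch; `find_missing_files` is a hypothetical helper that is not part of the original script.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing_files(static_dir: Path, required: Iterable[str]) -> List[str]:
    """Return the required file names that are absent from static_dir."""
    return [name for name in required if not (static_dir / name).is_file()]


# Possible use inside main(), resolving the directory relative to the script
# rather than the current working directory (a design choice, not the
# original behavior):
#
#     static_dir = Path(__file__).resolve().parent / "static"
#     missing = find_missing_files(static_dir, ["index.html", "styles.css", "script.js"])
```

Using `is_file()` instead of `exists()` also catches the case where a required name exists but is a directory.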