crawl4ai/docs/llm.txt/11_page_interaction.q.md
UncleCode d5ed451299 Enhance crawler capabilities and documentation
- Add llm.txt generator
  - Added SSL certificate extraction in AsyncWebCrawler.
  - Introduced new content filters and chunking strategies for more robust data extraction.
  - Updated documentation.
2024-12-25 21:34:31 +08:00

javascript_execution: Execute single or multiple JavaScript commands in webpage | js code, javascript commands, browser execution | CrawlerRunConfig(js_code="window.scrollTo(0, document.body.scrollHeight);")
css_wait: Wait for specific CSS elements to appear on page | css selector, element waiting, dynamic content | CrawlerRunConfig(wait_for="css:.dynamic-content")
js_wait_condition: Define custom JavaScript wait conditions for dynamic content | javascript waiting, conditional wait, custom conditions | CrawlerRunConfig(wait_for="js:() => document.querySelectorAll('.item').length > 10")
infinite_scroll: Handle infinite scroll and load more buttons | pagination, dynamic loading, scroll handling | CrawlerRunConfig(js_code="window.scrollTo(0, document.body.scrollHeight);")
form_interaction: Fill and submit forms using JavaScript | form handling, input filling, form submission | CrawlerRunConfig(js_code="document.querySelector('#search').value = 'search term';")
timing_control: Set page timeouts and delays before content capture | page timing, delays, timeouts | CrawlerRunConfig(page_timeout=60000, delay_before_return_html=2.0)
session_management: Maintain browser session for multiple interactions | session handling, browser state, session cleanup | crawler.crawler_strategy.kill_session(session_id)
cookie_consent: Handle cookie consent popups and notifications | cookie handling, popup management | CrawlerRunConfig(js_code="document.querySelector('.cookie-accept')?.click();")
extraction_combination: Combine page interactions with structured data extraction | data extraction, content parsing | JsonCssExtractionStrategy(schema), LLMExtractionStrategy(schema)
dynamic_content_loading: Wait for and verify dynamic content loading | content verification, dynamic loading | wait_for="js:() => document.querySelector('.content').innerText.length > 100"
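
The entries above each show one option in isolation. As a rough sketch of how several of them might be combined in a single crawl, the snippet below gathers the `js_code`, `wait_for`, `page_timeout`, and `delay_before_return_html` values from the entries into one dictionary and passes it to `CrawlerRunConfig`. The helper name `build_run_kwargs`, the `main` wrapper, and the `https://example.com` URL are illustrative assumptions, not part of the original file; the selectors are the placeholder ones used in the entries and would need to match the real page.

```python
def build_run_kwargs() -> dict:
    """Collect page-interaction options from the entries above into one dict.

    Hypothetical helper for illustration; the values are copied from the entries.
    """
    return {
        # cookie_consent + infinite_scroll: several JavaScript commands in one list
        "js_code": [
            "document.querySelector('.cookie-accept')?.click();",
            "window.scrollTo(0, document.body.scrollHeight);",
        ],
        # js_wait_condition: proceed once more than 10 '.item' elements exist
        "wait_for": "js:() => document.querySelectorAll('.item').length > 10",
        # timing_control: page timeout and delay before capturing the HTML
        "page_timeout": 60000,
        "delay_before_return_html": 2.0,
    }


async def main(url: str = "https://example.com") -> None:
    # Imported here so the helper above can be inspected without crawl4ai installed.
    from crawl4ai import AsyncWebCrawler, CrawlerRunConfig

    config = CrawlerRunConfig(**build_run_kwargs())
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=url, config=config)
        # Report whether the crawl succeeded and how much HTML came back.
        print(result.success, len(result.html or ""))


# To try it (assumes crawl4ai and a browser backend are installed):
#   import asyncio; asyncio.run(main())
```

Keeping the options in a plain dictionary makes it easy to check or log the exact settings before a browser is launched; passing the same keyword arguments directly to `CrawlerRunConfig` would work equally well.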