Merge branch 'pr-971' into merge-pr971

refactor(logger): Apply the Enumeration for color
feat(linkedin): add prospect-wizard app with scraping and visualization
2025-05-01 18:57:28 +08:00 · 2025-05-01 17:04:44 +08:00 · 2025-04-30 19:38:25 +08:00 · 2025-04-29 23:04:32 +08:00 · 2025-04-26 21:09:50 +08:00 · 2025-04-24 18:36:25 +08:00
34 changed files with 3769 additions and 420 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,7 +5,16 @@ All notable changes to Crawl4AI will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
-## [0.6.0rc1] ‑ 2025‑04‑22
+## [0.6.1] - 2025-04-24
 ### Added
 - New dedicated `tables` field in `CrawlResult` model for better table extraction handling
 - Updated crypto_analysis_example.py to use the new tables field with backward compatibility
 ### Changed
 - Improved playground UI in Docker deployment with better endpoint handling and UI feedback
 ## [0.6.0] ‑ 2025‑04‑22
 ### Added
 - Browser pooling with page pre‑warming and fine‑grained **geolocation, locale, and timezone** controls  
--- a/README.md
+++ b/README.md
@@ -21,9 +21,9 @@
 Crawl4AI is the #1 trending GitHub repository, actively maintained by a vibrant community. It delivers blazing-fast, AI-ready web crawling tailored for LLMs, AI agents, and data pipelines. Open source, flexible, and built for real-time performance, Crawl4AI empowers developers with unmatched speed, precision, and deployment ease.  
-[✨ Check out latest update v0.6.0rc1](#-recent-updates)
+[✨ Check out latest update v0.6.0](#-recent-updates)
-🎉 **Version 0.6.0rc1 is now available!** This release candidate introduces World-aware Crawling with geolocation and locale settings, Table-to-DataFrame extraction, Browser pooling with pre-warming, Network and console traffic capture, MCP integration for AI tools, and a completely revamped Docker deployment! [Read the release notes →](https://docs.crawl4ai.com/blog)
+🎉 **Version 0.6.0 is now available!** This release introduces World-aware Crawling with geolocation and locale settings, Table-to-DataFrame extraction, Browser pooling with pre-warming, Network and console traffic capture, MCP integration for AI tools, and a completely revamped Docker deployment! [Read the release notes →](https://docs.crawl4ai.com/blog)
 <details>
 <summary>🤓 <strong>My Personal Story</strong></summary>
@@ -505,7 +505,7 @@ async def test_news_crawl():
 ## ✨ Recent Updates
-### Version 0.6.0rc1 Release Highlights
+### Version 0.6.0 Release Highlights
 - **🌎 World-aware Crawling**: Set geolocation, language, and timezone for authentic locale-specific content:
  ```python
@@ -575,7 +575,7 @@ async def test_news_crawl():
 - **📱 Multi-stage Build System**: Optimized Dockerfile with platform-specific performance enhancements
-Read the full details in our [0.6.0rc1 Release Notes](https://docs.crawl4ai.com/blog/releases/0.6.0.html) or check the [CHANGELOG](https://github.com/unclecode/crawl4ai/blob/main/CHANGELOG.md).
+Read the full details in our [0.6.0 Release Notes](https://docs.crawl4ai.com/blog/releases/0.6.0.html) or check the [CHANGELOG](https://github.com/unclecode/crawl4ai/blob/main/CHANGELOG.md).
 ### Previous Version: 0.5.0 Major Release Highlights
@@ -606,7 +606,7 @@ We use different suffixes to indicate development stages:
 - `dev` (0.4.3dev1): Development versions, unstable
 - `a` (0.4.3a1): Alpha releases, experimental features
 - `b` (0.4.3b1): Beta releases, feature complete but needs testing
- `rc` (0.4.3rc1): Release candidates, potential final version
+- `rc` (0.4.3rc1): Release candidates, potential final version
 #### Installation
 - Regular installation (stable version):
--- a/crawl4ai/version.py
+++ b/crawl4ai/version.py
@@ -1,3 +1,3 @@
 # crawl4ai/_version.py
-__version__ = "0.6.0"
+__version__ = "0.6.3"
--- a/crawl4ai/async_configs.py
+++ b/crawl4ai/async_configs.py
@@ -427,7 +427,7 @@ class BrowserConfig:
        host: str = "localhost",
    ):
        self.browser_type = browser_type
-        self.headless = headless or True
+        self.headless = headless
        self.browser_mode = browser_mode
        self.use_managed_browser = use_managed_browser
        self.cdp_url = cdp_url
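Review note on the `headless` change above: `headless or True` evaluates to `True` for every input, so a caller passing `headless=False` was silently overridden. A minimal sketch of the difference follows; the function names `old_assign` and `new_assign` are hypothetical and exist only for this illustration.

```python
def old_assign(headless: bool = True) -> bool:
    # Pre-fix expression: `False or True` is True, so the argument is ignored.
    return headless or True


def new_assign(headless: bool = True) -> bool:
    # Post-fix expression: the caller's value is kept unchanged.
    return headless


print(old_assign(False), new_assign(False))
```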
--- a/crawl4ai/async_database.py
+++ b/crawl4ai/async_database.py
@@ -171,7 +171,10 @@ class AsyncDatabaseManager:
                            f"Code context:\n{error_context['code_context']}"
                        )
                        self.logger.error(
-                            message=create_box_message(error_message, type="error"),
+                            message="{error}",
                            tag="ERROR",
                            params={"error": str(error_message)},
                            boxes=["error"],
                        )
                        raise
@@ -189,7 +192,10 @@ class AsyncDatabaseManager:
                f"Code context:\n{error_context['code_context']}"
            )
            self.logger.error(
-                message=create_box_message(error_message, type="error"),
+                message="{error}",
                tag="ERROR",
                params={"error": str(error_message)},
                boxes=["error"],
            )
            raise
        finally:
--- a/crawl4ai/async_logger.py
+++ b/crawl4ai/async_logger.py
@@ -1,10 +1,12 @@
 from abc import ABC, abstractmethod
 from enum import Enum
-from typing import Optional, Dict, Any
+from typing import Optional, Dict, Any, List
 from colorama import Fore, Style, init
 import os
 from datetime import datetime
 from urllib.parse import unquote
 from rich.console import Console
 from rich.text import Text
 from .utils import create_box_message
 class LogLevel(Enum):
@@ -21,6 +23,26 @@ class LogLevel(Enum):
    FATAL = 10
    def __str__(self):
        return self.name.lower()
 class LogColor(str, Enum):
    """Enum for log colors."""
    DEBUG = "lightblack"
    INFO = "cyan"
    SUCCESS = "green"
    WARNING = "yellow"
    ERROR = "red"
    CYAN = "cyan"
    GREEN = "green"
    YELLOW = "yellow"
    MAGENTA = "magenta"
    DIM_MAGENTA = "dim magenta"
    def __str__(self):
        """Automatically convert rich color to string."""
        return self.value
 class AsyncLoggerBase(ABC):
@@ -52,6 +74,7 @@ class AsyncLoggerBase(ABC):
    def error_status(self, url: str, error: str, tag: str = "ERROR", url_length: int = 100):
        pass
 class AsyncLogger(AsyncLoggerBase):
    """
    Asynchronous logger with support for colored console output and file logging.
@@ -79,17 +102,11 @@ class AsyncLogger(AsyncLoggerBase):
    }
    DEFAULT_COLORS = {
-        LogLevel.DEBUG: Fore.LIGHTBLACK_EX,
+        LogLevel.DEBUG: LogColor.DEBUG,
-        LogLevel.INFO: Fore.CYAN,
+        LogLevel.INFO: LogColor.INFO,
-        LogLevel.SUCCESS: Fore.GREEN,
+        LogLevel.SUCCESS: LogColor.SUCCESS,
-        LogLevel.WARNING: Fore.YELLOW,
+        LogLevel.WARNING: LogColor.WARNING,
-        LogLevel.ERROR: Fore.RED,
+        LogLevel.ERROR: LogColor.ERROR,
        LogLevel.CRITICAL: Fore.RED + Style.BRIGHT,
        LogLevel.ALERT: Fore.RED + Style.BRIGHT,
        LogLevel.NOTICE: Fore.BLUE,
        LogLevel.EXCEPTION: Fore.RED + Style.BRIGHT,
        LogLevel.FATAL: Fore.RED + Style.BRIGHT,
        LogLevel.DEFAULT: Fore.WHITE,
    }
    def __init__(
@@ -98,7 +115,7 @@ class AsyncLogger(AsyncLoggerBase):
        log_level: LogLevel = LogLevel.DEBUG,
        tag_width: int = 10,
        icons: Optional[Dict[str, str]] = None,
-        colors: Optional[Dict[LogLevel, str]] = None,
+        colors: Optional[Dict[LogLevel, LogColor]] = None,
        verbose: bool = True,
    ):
        """
@@ -112,13 +129,13 @@ class AsyncLogger(AsyncLoggerBase):
            colors: Custom colors for different log levels
            verbose: Whether to output to console
        """
        init()  # Initialize colorama
        self.log_file = log_file
        self.log_level = log_level
        self.tag_width = tag_width
        self.icons = icons or self.DEFAULT_ICONS
        self.colors = colors or self.DEFAULT_COLORS
        self.verbose = verbose
        self.console = Console()
        # Create log file directory if needed
        if log_file:
@@ -143,16 +160,11 @@ class AsyncLogger(AsyncLoggerBase):
    def _write_to_file(self, message: str):
        """Write a message to the log file if configured."""
        if self.log_file:
            text = Text.from_markup(message)
            plain_text = text.plain
            timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]
            with open(self.log_file, "a", encoding="utf-8") as f:
-                # Strip ANSI color codes for file output
-                clean_message = message.replace(Fore.RESET, "").replace(
-                    Style.RESET_ALL, ""
-                )
-                for color in vars(Fore).values():
-                    if isinstance(color, str):
-                        clean_message = clean_message.replace(color, "")
-                f.write(f"[{timestamp}] {clean_message}\n")
+                f.write(f"[{timestamp}] {plain_text}\n")
    def _log(
        self,
@@ -160,8 +172,9 @@ class AsyncLogger(AsyncLoggerBase):
        message: str,
        tag: str,
        params: Optional[Dict[str, Any]] = None,
-        colors: Optional[Dict[str, str]] = None,
+        colors: Optional[Dict[str, LogColor]] = None,
-        base_color: Optional[str] = None,
+        boxes: Optional[List[str]] = None,
        base_color: Optional[LogColor] = None,
        **kwargs,
    ):
        """
@@ -173,55 +186,44 @@ class AsyncLogger(AsyncLoggerBase):
            tag: Tag for the message
            params: Parameters to format into the message
            colors: Color overrides for specific parameters
            boxes: Box overrides for specific parameters
            base_color: Base color for the entire message
        """
        if level.value < self.log_level.value:
            return
-        # Format the message with parameters if provided
+        # avoid conflict with rich formatting
        parsed_message = message.replace("[", "[[").replace("]", "]]")
        if params:
-            try:
+            # FIXME: Format specs can prevent colors and boxes from being applied.
-                # First format the message with raw parameters
+            # With {value:.2f}, a value of 0.23333 is rendered as 0.23, but the
-                formatted_message = message.format(**params)
+            # replacement below searches for the unformatted string, i.e.
            # replace("0.23333", "[color]0.23333[/color]"), and finds no match.
            formatted_message = parsed_message.format(**params)
            for key, value in params.items():
                # Escape `[` and `]` in the value, mirroring the escaping applied to the message.
                value_str = str(value).replace("[", "[[").replace("]", "]]")
                # Apply a color if one is configured for this key
                if colors and key in colors:
                    color_str = f"[{colors[key]}]{value_str}[/{colors[key]}]"
                    formatted_message = formatted_message.replace(value_str, color_str)
                    value_str = color_str
-                # Then apply colors if specified
+                # Apply a box if one is requested for this key
-                color_map = {
+                if boxes and key in boxes:
-                    "green": Fore.GREEN,
+                    formatted_message = formatted_message.replace(value_str,
-                    "red": Fore.RED,
+                        create_box_message(value_str, type=str(level)))
-                    "yellow": Fore.YELLOW,
-                    "blue": Fore.BLUE,
-                    "cyan": Fore.CYAN,
-                    "magenta": Fore.MAGENTA,
-                    "white": Fore.WHITE,
-                    "black": Fore.BLACK,
-                    "reset": Style.RESET_ALL,
-                }
-                if colors:
-                    for key, color in colors.items():
-                        # Find the formatted value in the message and wrap it with color
-                        if color in color_map:
-                            color = color_map[color]
-                        if key in params:
-                            value_str = str(params[key])
-                            formatted_message = formatted_message.replace(
-                                value_str, f"{color}{value_str}{Style.RESET_ALL}"
-                            )
-            except KeyError as e:
-                formatted_message = (
-                    f"LOGGING ERROR: Missing parameter {e} in message template"
-                )
-                level = LogLevel.ERROR
        else:
-            formatted_message = message
+            formatted_message = parsed_message
        # Construct the full log line
-        color = base_color or self.colors[level]
+        color: LogColor = base_color or self.colors[level]
-        log_line = f"{color}{self._format_tag(tag)} {self._get_icon(tag)} {formatted_message}{Style.RESET_ALL}"
+        log_line = f"[{color}]{self._format_tag(tag)} {self._get_icon(tag)} {formatted_message} [/{color}]"
        # Output to console if verbose
        if self.verbose or kwargs.get("force_verbose", False):
-            print(log_line)
+            self.console.print(log_line)
        # Write to file if configured
        self._write_to_file(log_line)
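Review note on the FIXME in this hunk: the pitfall can be reproduced with plain string operations, without rich. The sketch below is an illustration only; the template and the parameter name `value` are made up for the example and do not come from the codebase.

```python
params = {"value": 0.23333}
template = "score={value:.2f}"

formatted = template.format(**params)   # the format spec rounds the value
raw = str(params["value"])              # the string the logger searches for

# The raw string no longer occurs in the message, so replace() is a no-op
# and the color markup is never inserted.
colored = formatted.replace(raw, f"[green]{raw}[/green]")

print(formatted)
print(colored == formatted)
```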
@@ -292,8 +294,8 @@ class AsyncLogger(AsyncLoggerBase):
                "timing": timing,
            },
            colors={
-                "status": Fore.GREEN if success else Fore.RED,
+                "status": LogColor.SUCCESS if success else LogColor.ERROR,
-                "timing": Fore.YELLOW,
+                "timing": LogColor.WARNING,
            },
        )
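Review note on `LogColor`: because the enum mixes in `str` and overrides `__str__`, a member interpolated into an f-string yields the bare color name, which is what the `[{color}]...[/{color}]` markup in `_log` relies on. A minimal sketch, assuming only a subset of the members from the diff; the `colorize` helper is hypothetical.

```python
from enum import Enum


class LogColor(str, Enum):
    """Subset of the enum from the diff, reproduced for illustration."""

    SUCCESS = "green"
    ERROR = "red"

    def __str__(self) -> str:
        # Return the bare color name so f-strings yield "green",
        # not "LogColor.SUCCESS".
        return self.value


def colorize(text: str, color: LogColor) -> str:
    # Hypothetical helper: wrap text in rich-style markup tags.
    return f"[{color}]{text}[/{color}]"


print(colorize("200 OK", LogColor.SUCCESS))
```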
--- a/crawl4ai/async_webcrawler.py
+++ b/crawl4ai/async_webcrawler.py
@@ -2,7 +2,6 @@ from .__version__ import __version__ as crawl4ai_version
 import os
 import sys
 import time
 from colorama import Fore
 from pathlib import Path
 from typing import Optional, List
 import json
@@ -44,7 +43,6 @@ from .utils import (
    sanitize_input_encode,
    InvalidCSSSelectorError,
    fast_format_html,
    create_box_message,
    get_error_context,
    RobotsParser,
    preprocess_html_for_schema,
@@ -419,7 +417,7 @@ class AsyncWebCrawler:
                self.logger.error_status(
                    url=url,
-                    error=create_box_message(error_message, type="error"),
+                    error=error_message,
                    tag="ERROR",
                )
@@ -496,11 +494,13 @@ class AsyncWebCrawler:
            cleaned_html = sanitize_input_encode(
                result.get("cleaned_html", ""))
            media = result.get("media", {})
            tables = media.pop("tables", []) if isinstance(media, dict) else []
            links = result.get("links", {})
            metadata = result.get("metadata", {})
        else:
            cleaned_html = sanitize_input_encode(result.cleaned_html)
            media = result.media.model_dump()
            tables = media.pop("tables", [])
            links = result.links.model_dump()
            metadata = result.metadata
@@ -627,6 +627,7 @@ class AsyncWebCrawler:
            cleaned_html=cleaned_html,
            markdown=markdown_result,
            media=media,
            tables=tables,                       # NEW
            links=links,
            metadata=metadata,
            screenshot=screenshot_data,
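Review note on the new `tables` field: the change detaches `tables` from `media` with `pop`, so the data ends up in one place only. The sketch below mirrors that expression in isolation; `split_tables` is a hypothetical helper and the sample table layout is an assumption, not the library's schema.

```python
from typing import Any, Dict, List, Tuple


def split_tables(media: Any) -> Tuple[Any, List[Dict[str, Any]]]:
    """Hypothetical helper: detach `tables` from a media dict, if present."""
    tables = media.pop("tables", []) if isinstance(media, dict) else []
    return media, tables


# Sample payload; the table layout here is an assumption for illustration.
media = {"images": [], "tables": [{"headers": ["coin"], "rows": [["BTC"]]}]}
media, tables = split_tables(media)
print(sorted(media), len(tables))
```

Falling back to an empty list keeps the field present for results that carry no tables.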
--- a/crawl4ai/browser_manager.py
+++ b/crawl4ai/browser_manager.py
@@ -5,7 +5,10 @@ import os
 import sys
 import shutil
 import tempfile
 import psutil  
 import signal
 import subprocess
 import shlex
 from playwright.async_api import BrowserContext
 import hashlib
 from .js_snippet import load_js_script
@@ -194,6 +197,45 @@ class ManagedBrowser:
        if self.browser_config.extra_args:
            args.extend(self.browser_config.extra_args)
        # ── make sure no old Chromium instance is owning the same port/profile ──
        try:
            if sys.platform == "win32":
                if psutil is None:
                    raise RuntimeError("psutil not available, cannot clean old browser")
                for p in psutil.process_iter(["pid", "name", "cmdline"]):
                    cl = " ".join(p.info.get("cmdline") or [])
                    if (
                        f"--remote-debugging-port={self.debugging_port}" in cl
                        and f"--user-data-dir={self.user_data_dir}" in cl
                    ):
                        p.kill()
                        p.wait(timeout=5)
            else:  # macOS / Linux
                # kill any process listening on the same debugging port
                pids = (
                    subprocess.check_output(shlex.split(f"lsof -t -i:{self.debugging_port}"))
                    .decode()
                    .strip()
                    .splitlines()
                )
                for pid in pids:
                    try:
                        os.kill(int(pid), signal.SIGTERM)
                    except ProcessLookupError:
                        pass
                # remove Chromium singleton locks, or new launch exits with
                # “Opening in existing browser session.”
                for f in ("SingletonLock", "SingletonSocket", "SingletonCookie"):
                    fp = os.path.join(self.user_data_dir, f)
                    if os.path.exists(fp):
                        os.remove(fp)
        except Exception as _e:
            # non-fatal — we'll try to start anyway, but log what happened
            self.logger.warning(f"pre-launch cleanup failed: {_e}", tag="BROWSER")            
        # Start browser process
        try:
            # Use DETACHED_PROCESS flag on Windows to fully detach the process
@@ -922,7 +964,7 @@ class BrowserManager:
            pages = context.pages
            page = next((p for p in pages if p.url == crawlerRunConfig.url), None)
            if not page:
-                page = await context.new_page()
+                page = context.pages[0] # await context.new_page()
        else:
            # Otherwise, check if we have an existing context for this config
            config_signature = self._make_config_signature(crawlerRunConfig)
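Review note on the singleton-lock cleanup: the step can be exercised on its own against a temporary directory. This is a sketch under assumptions, not the project's implementation: `remove_singleton_locks` is a hypothetical helper, and it uses `os.path.lexists` instead of the `os.path.exists` in the diff, on the assumption that a lock may be a symlink whose target is missing, which `exists` reports as absent.

```python
import os
import tempfile
from typing import List

SINGLETON_FILES = ("SingletonLock", "SingletonSocket", "SingletonCookie")


def remove_singleton_locks(user_data_dir: str) -> List[str]:
    """Delete stale singleton files and return the names that were removed."""
    removed = []
    for name in SINGLETON_FILES:
        path = os.path.join(user_data_dir, name)
        # lexists() is also true for dangling symlinks, which exists() misses.
        if os.path.lexists(path):
            os.remove(path)
            removed.append(name)
    return removed


with tempfile.TemporaryDirectory() as profile_dir:
    open(os.path.join(profile_dir, "SingletonLock"), "w").close()
    print(remove_singleton_locks(profile_dir))
```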
--- a/crawl4ai/browser_profiler.py
+++ b/crawl4ai/browser_profiler.py
@@ -15,12 +15,12 @@ import shutil
 import json
 import subprocess
 import time
-from typing import List, Dict, Optional, Any, Tuple
+from typing import List, Dict, Optional, Any
-from colorama import Fore, Style, init
+from rich.console import Console
 from .async_configs import BrowserConfig
 from .browser_manager import ManagedBrowser
-from .async_logger import AsyncLogger, AsyncLoggerBase
+from .async_logger import AsyncLogger, AsyncLoggerBase, LogColor
 from .utils import get_home_folder
@@ -45,8 +45,8 @@ class BrowserProfiler:
            logger (AsyncLoggerBase, optional): Logger for outputting messages.
                If None, a default AsyncLogger will be created.
        """
-        # Initialize colorama for colorful terminal output
+        # Initialize rich console for colorful input prompts
-        init()
+        self.console = Console()
        # Create a logger if not provided
        if logger is None:
@@ -127,26 +127,30 @@ class BrowserProfiler:
        profile_path = os.path.join(self.profiles_dir, profile_name)
        os.makedirs(profile_path, exist_ok=True)
-        # Print instructions for the user with colorama formatting
+        # Print instructions for the user with rich formatting
-        border = f"{Fore.CYAN}{'='*80}{Style.RESET_ALL}"
+        border = "=" * 80
-        self.logger.info(f"\n{border}", tag="PROFILE")
+        self.logger.info("{border}", tag="PROFILE", params={"border": f"\n{border}"}, colors={"border": LogColor.CYAN})
-        self.logger.info(f"Creating browser profile: {Fore.GREEN}{profile_name}{Style.RESET_ALL}", tag="PROFILE")
+        self.logger.info("Creating browser profile: {profile_name}", tag="PROFILE", params={"profile_name": profile_name}, colors={"profile_name": LogColor.GREEN})
-        self.logger.info(f"Profile directory: {Fore.YELLOW}{profile_path}{Style.RESET_ALL}", tag="PROFILE")
+        self.logger.info("Profile directory: {profile_path}", tag="PROFILE", params={"profile_path": profile_path}, colors={"profile_path": LogColor.YELLOW})
        self.logger.info("\nInstructions:", tag="PROFILE")
        self.logger.info("1. A browser window will open for you to set up your profile.", tag="PROFILE")
-        self.logger.info(f"2. {Fore.CYAN}Log in to websites{Style.RESET_ALL}, configure settings, etc. as needed.", tag="PROFILE")
+        self.logger.info("{segment}, configure settings, etc. as needed.", tag="PROFILE", params={"segment": "2. Log in to websites"}, colors={"segment": LogColor.CYAN})
-        self.logger.info(f"3. When you're done, {Fore.YELLOW}press 'q' in this terminal{Style.RESET_ALL} to close the browser.", tag="PROFILE")
+        self.logger.info("3. When you're done, {segment} to close the browser.", tag="PROFILE", params={"segment": "press 'q' in this terminal"}, colors={"segment": LogColor.YELLOW})
        self.logger.info("4. The profile will be saved and ready to use with Crawl4AI.", tag="PROFILE")
-        self.logger.info(f"{border}\n", tag="PROFILE")
+        self.logger.info("{border}", tag="PROFILE", params={"border": f"{border}\n"}, colors={"border": LogColor.CYAN})
        browser_config.headless = False
        browser_config.user_data_dir = profile_path
        # Create managed browser instance
        managed_browser = ManagedBrowser(
-            browser_type=browser_config.browser_type,
+            browser_config=browser_config,
-            user_data_dir=profile_path,
+            # user_data_dir=profile_path,
-            headless=False,  # Must be visible
+            # headless=False,  # Must be visible
            logger=self.logger,
-            debugging_port=browser_config.debugging_port
+            # debugging_port=browser_config.debugging_port
        )
        # Set up signal handlers to ensure cleanup on interrupt
@@ -181,7 +185,7 @@ class BrowserProfiler:
            import select
            # First output the prompt
-            self.logger.info(f"{Fore.CYAN}Press '{Fore.WHITE}q{Fore.CYAN}' when you've finished using the browser...{Style.RESET_ALL}", tag="PROFILE")
+            self.logger.info("Press 'q' when you've finished using the browser...", tag="PROFILE")
            # Save original terminal settings
            fd = sys.stdin.fileno()
@@ -197,7 +201,7 @@ class BrowserProfiler:
                    if readable:
                        key = sys.stdin.read(1)
                        if key.lower() == 'q':
-                            self.logger.info(f"{Fore.GREEN}Closing browser and saving profile...{Style.RESET_ALL}", tag="PROFILE")
+                            self.logger.info("Closing browser and saving profile...", tag="PROFILE", base_color=LogColor.GREEN)
                            user_done_event.set()
                            return
@@ -223,7 +227,7 @@ class BrowserProfiler:
                self.logger.error("Failed to start browser process.", tag="PROFILE")
                return None
-            self.logger.info(f"Browser launched. {Fore.CYAN}Waiting for you to finish...{Style.RESET_ALL}", tag="PROFILE") 
+            self.logger.info("Browser launched. Waiting for you to finish...", tag="PROFILE") 
            # Start listening for keyboard input
            listener_task = asyncio.create_task(listen_for_quit_command())
@@ -245,10 +249,10 @@ class BrowserProfiler:
                self.logger.info("Terminating browser process...", tag="PROFILE")
                await managed_browser.cleanup()
-            self.logger.success(f"Browser closed. Profile saved at: {Fore.GREEN}{profile_path}{Style.RESET_ALL}", tag="PROFILE")
+            self.logger.success(f"Browser closed. Profile saved at: {profile_path}", tag="PROFILE")
        except Exception as e:
-            self.logger.error(f"Error creating profile: {str(e)}", tag="PROFILE")
+            self.logger.error(f"Error creating profile: {e!s}", tag="PROFILE")
            await managed_browser.cleanup()
            return None
        finally:
@@ -440,25 +444,27 @@ class BrowserProfiler:
            ```
        """
        while True:
-            self.logger.info(f"\n{Fore.CYAN}Profile Management Options:{Style.RESET_ALL}", tag="MENU")
+            self.logger.info("\nProfile Management Options:", tag="MENU")
-            self.logger.info(f"1. {Fore.GREEN}Create a new profile{Style.RESET_ALL}", tag="MENU")
+            self.logger.info("1. Create a new profile", tag="MENU", base_color=LogColor.GREEN)
-            self.logger.info(f"2. {Fore.YELLOW}List available profiles{Style.RESET_ALL}", tag="MENU")
+            self.logger.info("2. List available profiles", tag="MENU", base_color=LogColor.YELLOW)
-            self.logger.info(f"3. {Fore.RED}Delete a profile{Style.RESET_ALL}", tag="MENU")
+            self.logger.info("3. Delete a profile", tag="MENU", base_color=LogColor.ERROR)
            # Only show crawl option if callback provided
            if crawl_callback:
-                self.logger.info(f"4. {Fore.CYAN}Use a profile to crawl a website{Style.RESET_ALL}", tag="MENU")
+                self.logger.info("4. Use a profile to crawl a website", tag="MENU", base_color=LogColor.CYAN)
-                self.logger.info(f"5. {Fore.MAGENTA}Exit{Style.RESET_ALL}", tag="MENU")
+                self.logger.info("5. Exit", tag="MENU", base_color=LogColor.MAGENTA)
                exit_option = "5"
            else:
-                self.logger.info(f"4. {Fore.MAGENTA}Exit{Style.RESET_ALL}", tag="MENU")
+                self.logger.info("4. Exit", tag="MENU", base_color=LogColor.MAGENTA)
                exit_option = "4"
-            choice = input(f"\n{Fore.CYAN}Enter your choice (1-{exit_option}): {Style.RESET_ALL}")
+            self.console.print(f"\n[cyan]Enter your choice (1-{exit_option}): [/cyan]", end="")
            choice = input()
            if choice == "1":
                # Create new profile
-                name = input(f"{Fore.GREEN}Enter a name for the new profile (or press Enter for auto-generated name): {Style.RESET_ALL}")
+                self.console.print("[green]Enter a name for the new profile (or press Enter for auto-generated name): [/green]", end="")
                name = input()
                await self.create_profile(name or None)
            elif choice == "2":
@@ -472,8 +478,8 @@ class BrowserProfiler:
                # Print profile information with colorama formatting
                self.logger.info("\nAvailable profiles:", tag="PROFILES")
                for i, profile in enumerate(profiles):
-                    self.logger.info(f"[{i+1}] {Fore.CYAN}{profile['name']}{Style.RESET_ALL}", tag="PROFILES")
+                    self.logger.info(f"[{i+1}] {profile['name']}", tag="PROFILES")
-                    self.logger.info(f"    Path: {Fore.YELLOW}{profile['path']}{Style.RESET_ALL}", tag="PROFILES")
+                    self.logger.info(f"    Path: {profile['path']}", tag="PROFILES", base_color=LogColor.YELLOW)
                    self.logger.info(f"    Created: {profile['created'].strftime('%Y-%m-%d %H:%M:%S')}", tag="PROFILES")
                    self.logger.info(f"    Browser type: {profile['type']}", tag="PROFILES")
                    self.logger.info("", tag="PROFILES")  # Empty line for spacing
@@ -486,12 +492,13 @@ class BrowserProfiler:
                    continue
                # Display numbered list
-                self.logger.info(f"\n{Fore.YELLOW}Available profiles:{Style.RESET_ALL}", tag="PROFILES")
+                self.logger.info("\nAvailable profiles:", tag="PROFILES", base_color=LogColor.YELLOW)
                for i, profile in enumerate(profiles):
                    self.logger.info(f"[{i+1}] {profile['name']}", tag="PROFILES")
                # Get profile to delete
-                profile_idx = input(f"{Fore.RED}Enter the number of the profile to delete (or 'c' to cancel): {Style.RESET_ALL}")
+                self.console.print("[red]Enter the number of the profile to delete (or 'c' to cancel): [/red]", end="")
                profile_idx = input()
                if profile_idx.lower() == 'c':
                    continue
@@ -499,17 +506,18 @@ class BrowserProfiler:
                    idx = int(profile_idx) - 1
                    if 0 <= idx < len(profiles):
                        profile_name = profiles[idx]["name"]
-                        self.logger.info(f"Deleting profile: {Fore.YELLOW}{profile_name}{Style.RESET_ALL}", tag="PROFILES")
+                        self.logger.info(f"Deleting profile: [yellow]{profile_name}[/yellow]", tag="PROFILES")
                        # Confirm deletion
-                        confirm = input(f"{Fore.RED}Are you sure you want to delete this profile? (y/n): {Style.RESET_ALL}")
+                        self.console.print("[red]Are you sure you want to delete this profile? (y/n): [/red]", end="")
                        confirm = input()
                        if confirm.lower() == 'y':
                            success = self.delete_profile(profiles[idx]["path"])
                            if success:
-                                self.logger.success(f"Profile {Fore.GREEN}{profile_name}{Style.RESET_ALL} deleted successfully", tag="PROFILES")
+                                self.logger.success(f"Profile {profile_name} deleted successfully", tag="PROFILES")
                            else:
-                                self.logger.error(f"Failed to delete profile {Fore.RED}{profile_name}{Style.RESET_ALL}", tag="PROFILES")
+                                self.logger.error(f"Failed to delete profile {profile_name}", tag="PROFILES")
                    else:
                        self.logger.error("Invalid profile number", tag="PROFILES")
                except ValueError:
@@ -523,12 +531,13 @@ class BrowserProfiler:
                    continue
                # Display numbered list
-                self.logger.info(f"\n{Fore.YELLOW}Available profiles:{Style.RESET_ALL}", tag="PROFILES")
+                self.logger.info("\nAvailable profiles:", tag="PROFILES", base_color=LogColor.YELLOW)
                for i, profile in enumerate(profiles):
                    self.logger.info(f"[{i+1}] {profile['name']}", tag="PROFILES")
                # Get profile to use
-                profile_idx = input(f"{Fore.CYAN}Enter the number of the profile to use (or 'c' to cancel): {Style.RESET_ALL}")
+                self.console.print("[cyan]Enter the number of the profile to use (or 'c' to cancel): [/cyan]", end="")
                profile_idx = input()
                if profile_idx.lower() == 'c':
                    continue
@@ -536,7 +545,8 @@ class BrowserProfiler:
                    idx = int(profile_idx) - 1
                    if 0 <= idx < len(profiles):
                        profile_path = profiles[idx]["path"]
-                        url = input(f"{Fore.CYAN}Enter the URL to crawl: {Style.RESET_ALL}")
+                        self.console.print("[cyan]Enter the URL to crawl: [/cyan]", end="")
                        url = input()
                        if url:
                            # Call the provided crawl callback
                            await crawl_callback(profile_path, url)
@@ -599,11 +609,11 @@ class BrowserProfiler:
        # Print initial information
        border = f"{Fore.CYAN}{'='*80}{Style.RESET_ALL}"
        self.logger.info(f"\n{border}", tag="CDP")
-        self.logger.info(f"Launching standalone browser with CDP debugging", tag="CDP")
+        self.logger.info("Launching standalone browser with CDP debugging", tag="CDP")
-        self.logger.info(f"Browser type: {Fore.GREEN}{browser_type}{Style.RESET_ALL}", tag="CDP")
+        self.logger.info("Browser type: {browser_type}", tag="CDP", params={"browser_type": browser_type}, colors={"browser_type": LogColor.CYAN})
-        self.logger.info(f"Profile path: {Fore.YELLOW}{profile_path}{Style.RESET_ALL}", tag="CDP")
+        self.logger.info("Profile path: {profile_path}", tag="CDP", params={"profile_path": profile_path}, colors={"profile_path": LogColor.YELLOW})
-        self.logger.info(f"Debugging port: {Fore.CYAN}{debugging_port}{Style.RESET_ALL}", tag="CDP")
+        self.logger.info(f"Debugging port: {debugging_port}", tag="CDP")
-        self.logger.info(f"Headless mode: {Fore.CYAN}{headless}{Style.RESET_ALL}", tag="CDP")
+        self.logger.info(f"Headless mode: {headless}", tag="CDP")
        # Create managed browser instance
        managed_browser = ManagedBrowser(
@@ -646,7 +656,7 @@ class BrowserProfiler:
            import select
            # First output the prompt
-            self.logger.info(f"{Fore.CYAN}Press '{Fore.WHITE}q{Fore.CYAN}' to stop the browser and exit...{Style.RESET_ALL}", tag="CDP")
+            self.logger.info("Press 'q' to stop the browser and exit...", tag="CDP")
            # Save original terminal settings
            fd = sys.stdin.fileno()
@@ -662,7 +672,7 @@ class BrowserProfiler:
                    if readable:
                        key = sys.stdin.read(1)
                        if key.lower() == 'q':
-                            self.logger.info(f"{Fore.GREEN}Closing browser...{Style.RESET_ALL}", tag="CDP")
+                            self.logger.info("Closing browser...", tag="CDP")
                            user_done_event.set()
                            return
@@ -716,20 +726,20 @@ class BrowserProfiler:
                self.logger.error("Failed to start browser process.", tag="CDP")
                return None
-            self.logger.info(f"Browser launched successfully. Retrieving CDP information...", tag="CDP") 
+            self.logger.info("Browser launched successfully. Retrieving CDP information...", tag="CDP") 
            # Get CDP URL and JSON config
            cdp_url, config_json = await get_cdp_json(debugging_port)
            if cdp_url:
-                self.logger.success(f"CDP URL: {Fore.GREEN}{cdp_url}{Style.RESET_ALL}", tag="CDP")
+                self.logger.success(f"CDP URL: {cdp_url}", tag="CDP")
                if config_json:
                    # Display relevant CDP information
-                    self.logger.info(f"Browser: {Fore.CYAN}{config_json.get('Browser', 'Unknown')}{Style.RESET_ALL}", tag="CDP")
+                    self.logger.info("Browser: {browser}", tag="CDP", params={"browser": config_json.get('Browser', 'Unknown')}, colors={"browser": LogColor.CYAN})
-                    self.logger.info(f"Protocol Version: {config_json.get('Protocol-Version', 'Unknown')}", tag="CDP")
+                    self.logger.info("Protocol Version: {protocol_version}", tag="CDP", params={"protocol_version": config_json.get('Protocol-Version', 'Unknown')}, colors={"protocol_version": LogColor.CYAN})
                    if 'webSocketDebuggerUrl' in config_json:
-                        self.logger.info(f"WebSocket URL: {Fore.GREEN}{config_json['webSocketDebuggerUrl']}{Style.RESET_ALL}", tag="CDP")
+                        self.logger.info("WebSocket URL: {webSocketDebuggerUrl}", tag="CDP", params={"webSocketDebuggerUrl": config_json['webSocketDebuggerUrl']}, colors={"webSocketDebuggerUrl": LogColor.GREEN})
                else:
                    self.logger.warning("Could not retrieve CDP configuration JSON", tag="CDP")
            else:
@@ -757,7 +767,7 @@ class BrowserProfiler:
                self.logger.info("Terminating browser process...", tag="CDP")
                await managed_browser.cleanup()
-            self.logger.success(f"Browser closed.", tag="CDP")
+            self.logger.success("Browser closed.", tag="CDP")
        except Exception as e:
            self.logger.error(f"Error launching standalone browser: {str(e)}", tag="CDP")
@@ -972,3 +982,30 @@ class BrowserProfiler:
            'info': browser_info
        }
 if __name__ == "__main__":
    # Example usage
    profiler = BrowserProfiler()
    # Create a new profile
    import os
    from pathlib import Path
    home_dir = Path.home()
    profile_path = asyncio.run(profiler.create_profile( str(home_dir / ".crawl4ai/profiles/test-profile")))
    # Launch a standalone browser
    asyncio.run(profiler.launch_standalone_browser())
    # List profiles
    profiles = profiler.list_profiles()
    for profile in profiles:
        print(f"Profile: {profile['name']}, Path: {profile['path']}")
    # Delete a profile
    success = profiler.delete_profile("my-profile")
    if success:
        print("Profile deleted successfully")
    else:
        print("Failed to delete profile")
--- a/crawl4ai/content_filter_strategy.py
+++ b/crawl4ai/content_filter_strategy.py
@@ -27,8 +27,7 @@ import json
 import hashlib
 from pathlib import Path
 from concurrent.futures import ThreadPoolExecutor
-from .async_logger import AsyncLogger, LogLevel
+from .async_logger import AsyncLogger, LogLevel, LogColor
 from colorama import Fore, Style
 class RelevantContentFilter(ABC):
@@ -846,8 +845,7 @@ class LLMContentFilter(RelevantContentFilter):
                },
                colors={
                    **AsyncLogger.DEFAULT_COLORS,
-                    LogLevel.INFO: Fore.MAGENTA
-                    + Style.DIM,  # Dimmed purple for LLM ops
+                    LogLevel.INFO: LogColor.DIM_MAGENTA  # Dimmed purple for LLM ops
                },
            )
        else:
@@ -892,7 +890,7 @@ class LLMContentFilter(RelevantContentFilter):
                "Starting LLM markdown content filtering process",
                tag="LLM",
                params={"provider": self.llm_config.provider},
-                colors={"provider": Fore.CYAN},
+                colors={"provider": LogColor.CYAN},
            )
        # Cache handling
@@ -929,7 +927,7 @@ class LLMContentFilter(RelevantContentFilter):
                "LLM markdown: Split content into {chunk_count} chunks",
                tag="CHUNK",
                params={"chunk_count": len(html_chunks)},
-                colors={"chunk_count": Fore.YELLOW},
+                colors={"chunk_count": LogColor.YELLOW},
            )
        start_time = time.time()
@@ -1038,7 +1036,7 @@ class LLMContentFilter(RelevantContentFilter):
                "LLM markdown: Completed processing in {time:.2f}s",
                tag="LLM",
                params={"time": end_time - start_time},
-                colors={"time": Fore.YELLOW},
+                colors={"time": LogColor.YELLOW},
            )
        result = ordered_results if ordered_results else []
--- a/crawl4ai/models.py
+++ b/crawl4ai/models.py
@@ -1,4 +1,4 @@
-from pydantic import BaseModel, HttpUrl, PrivateAttr
+from pydantic import BaseModel, HttpUrl, PrivateAttr, Field
 from typing import List, Dict, Optional, Callable, Awaitable, Union, Any
 from typing import AsyncGenerator
 from typing import Generic, TypeVar
@@ -150,6 +150,7 @@ class CrawlResult(BaseModel):
    redirected_url: Optional[str] = None
    network_requests: Optional[List[Dict[str, Any]]] = None
    console_messages: Optional[List[Dict[str, Any]]] = None
    tables: List[Dict] = Field(default_factory=list)  # NEW – [{headers,rows,caption,summary}]
    class Config:
        arbitrary_types_allowed = True
--- a/crawl4ai/utils.py
+++ b/crawl4ai/utils.py
@@ -20,7 +20,6 @@ from urllib.parse import urljoin
 import requests
 from requests.exceptions import InvalidSchema
 import xxhash
 from colorama import Fore, Style, init
 import textwrap
 import cProfile
 import pstats
@@ -441,14 +440,13 @@ def create_box_message(
        str: A formatted string containing the styled message box.
    """
    init()
    # Define border and text colors for different types
    styles = {
-        "warning": (Fore.YELLOW, Fore.LIGHTYELLOW_EX, "⚠"),
+        "warning": ("yellow", "bright_yellow", "⚠"),
-        "info": (Fore.BLUE, Fore.LIGHTBLUE_EX, "ℹ"),
+        "info": ("blue", "bright_blue", "ℹ"),
-        "success": (Fore.GREEN, Fore.LIGHTGREEN_EX, "✓"),
+        "debug": ("lightblack", "bright_black", "⋯"),
-        "error": (Fore.RED, Fore.LIGHTRED_EX, "×"),
+        "success": ("green", "bright_green", "✓"),
        "error": ("red", "bright_red", "×"),
    }
    border_color, text_color, prefix = styles.get(type.lower(), styles["info"])
@@ -480,12 +478,12 @@ def create_box_message(
    # Create the box with colored borders and lighter text
    horizontal_line = h_line * (width - 1)
    box = [
-        f"{border_color}{tl}{horizontal_line}{tr}",
+        f"[{border_color}]{tl}{horizontal_line}{tr}[/{border_color}]",
        *[
-            f"{border_color}{v_line}{text_color} {line:<{width-2}}{border_color}{v_line}"
+            f"[{border_color}]{v_line}[{text_color}] {line:<{width-2}}[/{text_color}][{border_color}]{v_line}[/{border_color}]"
            for line in formatted_lines
        ],
-        f"{border_color}{bl}{horizontal_line}{br}{Style.RESET_ALL}",
+        f"[{border_color}]{bl}{horizontal_line}{br}[/{border_color}]",
    ]
    result = "\n".join(box)
@@ -2778,4 +2776,3 @@ def preprocess_html_for_schema(html_content, text_threshold=100, attr_value_thre
        # Fallback for parsing errors
        return html_content[:max_size] if len(html_content) > max_size else html_content
--- a/deploy/docker/README.md
+++ b/deploy/docker/README.md
@@ -58,7 +58,7 @@ Pull and run images directly from Docker Hub without building locally.
 #### 1. Pull the Image
-Our latest release candidate is `0.6.0rc1-r1`. Images are built with multi-arch manifests, so Docker automatically pulls the correct version for your system.
+Our latest release is `0.6.0-r1`. Images are built with multi-arch manifests, so Docker automatically pulls the correct version for your system.
 ```bash
 # Pull the release candidate (recommended for latest features)
@@ -124,9 +124,9 @@ docker stop crawl4ai && docker rm crawl4ai
 #### Docker Hub Versioning Explained
 *   **Image Name:** `unclecode/crawl4ai`
-*   **Tag Format:** `LIBRARY_VERSION[-SUFFIX]` (e.g., `0.6.0rc1-r1`)
+*   **Tag Format:** `LIBRARY_VERSION[-SUFFIX]` (e.g., `0.6.0-r1`)
    *   `LIBRARY_VERSION`: The semantic version of the core `crawl4ai` Python library
-    *   `SUFFIX`: Optional tag for release candidates (`rc1`) and revisions (`r1`)
+    *   `SUFFIX`: Optional tag for release candidates (e.g., `rc1`) and revisions (`r1`)
 *   **`latest` Tag:** Points to the most recent stable version
 *   **Multi-Architecture Support:** All images support both `linux/amd64` and `linux/arm64` architectures through a single tag
--- a/deploy/docker/static/playground/index.html
+++ b/deploy/docker/static/playground/index.html
@@ -193,7 +193,48 @@
                <textarea id="urls" class="w-full bg-dark border border-border rounded p-2 h-32 text-sm mb-4"
                    spellcheck="false">https://example.com</textarea>
-                <details class="mb-4">
+                <!-- Specific options for /md endpoint -->
                <details id="md-options" class="mb-4 hidden">
                    <summary class="text-sm text-secondary cursor-pointer">/md Options</summary>
                    <div class="mt-2 space-y-3 p-2 border border-border rounded">
                        <div>
                            <label for="md-filter" class="block text-xs text-secondary mb-1">Filter Type</label>
                            <select id="md-filter" class="bg-dark border border-border rounded px-2 py-1 text-sm w-full">
                                <option value="fit">fit - Adaptive content filtering</option>
                                <option value="raw">raw - No filtering</option>
                                <option value="bm25">bm25 - BM25 keyword relevance</option>
                                <option value="llm">llm - LLM-based filtering</option>
                            </select>
                        </div>
                        <div>
                            <label for="md-query" class="block text-xs text-secondary mb-1">Query (for BM25/LLM filters)</label>
                            <input id="md-query" type="text" placeholder="Enter search terms or instructions" 
                                class="bg-dark border border-border rounded px-2 py-1 text-sm w-full">
                        </div>
                        <div>
                            <label for="md-cache" class="block text-xs text-secondary mb-1">Cache Mode</label>
                            <select id="md-cache" class="bg-dark border border-border rounded px-2 py-1 text-sm w-full">
                                <option value="0">Write-Only (0)</option>
                                <option value="1">Enabled (1)</option>
                            </select>
                        </div>
                    </div>
                </details>
                <!-- Specific options for /llm endpoint -->
                <details id="llm-options" class="mb-4 hidden">
                    <summary class="text-sm text-secondary cursor-pointer">/llm Options</summary>
                    <div class="mt-2 space-y-3 p-2 border border-border rounded">
                        <div>
                            <label for="llm-question" class="block text-xs text-secondary mb-1">Question</label>
                            <input id="llm-question" type="text" value="What is this page about?" 
                                class="bg-dark border border-border rounded px-2 py-1 text-sm w-full">
                        </div>
                    </div>
                </details>
                <!-- Advanced config for /crawl endpoints -->
                <details id="adv-config" class="mb-4">
                    <summary class="text-sm text-secondary cursor-pointer">Advanced Config <span
                        class="text-xs text-primary">(Python → auto‑JSON)</span></summary>
@@ -438,6 +479,33 @@
            document.getElementById('cfg-status').textContent = '';
        });
        // Handle endpoint selection change to show appropriate options
        document.getElementById('endpoint').addEventListener('change', function(e) {
            const endpoint = e.target.value;
            const mdOptions = document.getElementById('md-options');
            const llmOptions = document.getElementById('llm-options');
            const advConfig = document.getElementById('adv-config');
            // Hide all option sections first
            mdOptions.classList.add('hidden');
            llmOptions.classList.add('hidden');
            advConfig.classList.add('hidden');
            // Show the appropriate section based on endpoint
            if (endpoint === 'md') {
                mdOptions.classList.remove('hidden');
                // Auto-open the /md options
                mdOptions.setAttribute('open', '');
            } else if (endpoint === 'llm') {
                llmOptions.classList.remove('hidden');
                // Auto-open the /llm options
                llmOptions.setAttribute('open', '');
            } else {
                // For /crawl endpoints, show the advanced config
                advConfig.classList.remove('hidden');
            }
        });
        async function pyConfigToJson() {
            const code = cm.getValue().trim();
            if (!code) return {};
@@ -494,10 +562,18 @@
        }
        // Generate code snippets
-        function generateSnippets(api, payload) {
+        function generateSnippets(api, payload, method = 'POST') {
            // Python snippet
            const pyCodeEl = document.querySelector('#python-content code');
-            const pySnippet = `import httpx\n\nasync def crawl():\n    async with httpx.AsyncClient() as client:\n        response = await client.post(\n            "${window.location.origin}${api}",\n            json=${JSON.stringify(payload, null, 4).replace(/\n/g, '\n            ')}\n        )\n        return response.json()`;
+            let pySnippet;
            if (method === 'GET') {
                // GET request (for /llm endpoint)
                pySnippet = `import httpx\n\nasync def crawl():\n    async with httpx.AsyncClient() as client:\n        response = await client.get(\n            "${window.location.origin}${api}"\n        )\n        return response.json()`;
            } else {
                // POST request (for /crawl and /md endpoints)
                pySnippet = `import httpx\n\nasync def crawl():\n    async with httpx.AsyncClient() as client:\n        response = await client.post(\n            "${window.location.origin}${api}",\n            json=${JSON.stringify(payload, null, 4).replace(/\n/g, '\n            ')}\n        )\n        return response.json()`;
            }
            pyCodeEl.textContent = pySnippet;
            pyCodeEl.className = 'python hljs'; // Reset classes
@@ -505,7 +581,15 @@
            // cURL snippet
            const curlCodeEl = document.querySelector('#curl-content code');
-            const curlSnippet = `curl -X POST ${window.location.origin}${api} \\\n  -H "Content-Type: application/json" \\\n  -d '${JSON.stringify(payload)}'`;
+            let curlSnippet;
            if (method === 'GET') {
                // GET request (for /llm endpoint)
                curlSnippet = `curl -X GET "${window.location.origin}${api}"`;
            } else {
                // POST request (for /crawl and /md endpoints)
                curlSnippet = `curl -X POST ${window.location.origin}${api} \\\n  -H "Content-Type: application/json" \\\n  -d '${JSON.stringify(payload)}'`;
            }
            curlCodeEl.textContent = curlSnippet;
            curlCodeEl.className = 'bash hljs'; // Reset classes
@@ -536,20 +620,39 @@
            const endpointMap = {
                crawl: '/crawl',
-            };
-            /*const endpointMap = {
-                crawl: '/crawl',
-                crawl_stream: '/crawl/stream',
+                // crawl_stream: '/crawl/stream',
                 md: '/md',
                 llm: '/llm'
-            };*/
+            };
            const api = endpointMap[endpoint];
-            const payload = {
+            let payload;
            // Create appropriate payload based on endpoint type
            if (endpoint === 'md') {
                // Get values from the /md specific inputs
                const filterType = document.getElementById('md-filter').value;
                const query = document.getElementById('md-query').value.trim();
                const cache = document.getElementById('md-cache').value;
                // MD endpoint expects: { url, f, q, c }
                payload = {
                    url: urls[0], // Take first URL
                    f: filterType, // Lowercase filter type as required by server
                    q: query || null, // Use the query if provided, otherwise null
                    c: cache
                };
            } else if (endpoint === 'llm') {
                // LLM endpoint has a different URL pattern and uses query params
                // This will be handled directly in the fetch below
                payload = null;
            } else {
                // Default payload for /crawl and /crawl/stream
                payload = {
                    urls,
                    ...advConfig
                };
            }
            updateStatus('processing');
@@ -557,7 +660,18 @@
                const startTime = performance.now();
                let response, responseData;
-                if (endpoint === 'crawl_stream') {
+                if (endpoint === 'llm') {
                    // Special handling for LLM endpoint which uses URL pattern: /llm/{encoded_url}?q={query}
                    const url = urls[0];
                    const encodedUrl = encodeURIComponent(url);
                    // Get the question from the LLM-specific input
                    const question = document.getElementById('llm-question').value.trim() || "What is this page about?";
                    response = await fetch(`${api}/${encodedUrl}?q=${encodeURIComponent(question)}`, {
                        method: 'GET',
                        headers: { 'Accept': 'application/json' }
                    });
                } else if (endpoint === 'crawl_stream') {
                    // Stream processing
                    response = await fetch(api, {
                        method: 'POST',
@@ -597,7 +711,7 @@
                    document.querySelector('#response-content code').className = 'json hljs'; // Reset classes
                    forceHighlightElement(document.querySelector('#response-content code'));
                } else {
-                    // Regular request
+                    // Regular request (handles /crawl and /md)
                    response = await fetch(api, {
                        method: 'POST',
                        headers: { 'Content-Type': 'application/json' },
@@ -625,7 +739,16 @@
                }
                forceHighlightElement(document.querySelector('#response-content code'));
                // For generateSnippets, handle the LLM case specially
                if (endpoint === 'llm') {
                    const url = urls[0];
                    const encodedUrl = encodeURIComponent(url);
                    const question = document.getElementById('llm-question').value.trim() || "What is this page about?";
                    generateSnippets(`${api}/${encodedUrl}?q=${encodeURIComponent(question)}`, null, 'GET');
                } else {
                    generateSnippets(api, payload);
                }
            } catch (error) {
                console.error('Error:', error);
                updateStatus('error');
@@ -808,8 +931,23 @@
            });
        }
-        // Call this in your DOMContentLoaded or initialization
+        // Function to initialize UI based on selected endpoint
        function initUI() {
            // Trigger the endpoint change handler to set initial UI state
            const endpointSelect = document.getElementById('endpoint');
            const event = new Event('change');
            endpointSelect.dispatchEvent(event);
            // Initialize copy buttons
            initCopyButtons();
        }
        // Initialize exactly once: wait for DOMContentLoaded while the document is
        // still loading; otherwise the DOM is already parsed, so run immediately
        if (document.readyState === 'loading') {
            document.addEventListener('DOMContentLoaded', initUI);
        } else {
            initUI();
        }
    </script>
 </body>
--- a/docs/apps/linkdin/README.md
+++ b/docs/apps/linkdin/README.md
@@ -0,0 +1,126 @@
 # Crawl4AI Prospect‑Wizard – step‑by‑step guide
 A three‑stage demo that goes from **LinkedIn scraping** ➜ **LLM reasoning** ➜ **graph visualisation**.
 ```
 prospect‑wizard/
 ├─ c4ai_discover.py         # Stage 1 – scrape companies + people
 ├─ c4ai_insights.py         # Stage 2 – embeddings, org‑charts, scores
 ├─ graph_view_template.html # Stage 3 – graph viewer (static HTML)
 └─ data/                    # output lands here (*.jsonl / *.json)
 ```
 ---
 ## 1  Install & boot a LinkedIn profile (one‑time)
 ### 1.1  Install dependencies
 ```bash
 pip install crawl4ai openai sentence-transformers networkx pandas vis-network rich
 ```
 ### 1.2  Create / warm a LinkedIn browser profile
 ```bash
 crwl profiler
 ```
 1. The interactive shell shows **New profile** – hit **enter**.
 2. Choose a name, e.g. `profile_linkedin_uc`.
 3. A Chromium window opens – log in to LinkedIn, solve any CAPTCHA that appears, then close the window.
 > Remember the **profile name**. All later runs take it via `--profile-name <your_name>`.
 ---
 ## 2  Discovery – scrape companies & people
 ```bash
 # --geo 102713980 is the Malaysia geoUrn; --title_filters also accepts e.g. "Product,Engineering".
 # --max_companies and --max_people are set small by default for workshops.
 python c4ai_discover.py full \
  --query "health insurance management" \
  --geo 102713980 \
  --title_filters "" \
  --max_companies 10 \
  --max_people 20 \
  --profile-name profile_linkedin_uc \
  --outdir ./data \
  --concurrency 2 \
  --log_level debug
 ```
 **Outputs** in `./data/`:
 * `companies.jsonl` – one JSON per company
 * `people.jsonl` – one JSON per employee
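
Because both files are newline-delimited JSON, they can be read back one record per line. The following is a minimal sketch, assuming the `./data` output directory from the command above; the helper name `read_jsonl` is illustrative and not part of the demo scripts.

```python
import json
from pathlib import Path
from typing import Dict, Iterator


def read_jsonl(path: Path) -> Iterator[Dict]:
    """Yield one dict per non-empty line of a newline-delimited JSON file."""
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)


if __name__ == "__main__":
    data_dir = Path("./data")  # assumed: the --outdir used in Stage 1
    for name in ("companies.jsonl", "people.jsonl"):
        file_path = data_dir / name
        if file_path.exists():
            print(name, sum(1 for _ in read_jsonl(file_path)), "records")
```

Reading lazily keeps memory use flat even when a large crawl produces many records.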
 🛠️  **Dry‑run:** `C4AI_DEMO_DEBUG=1 python c4ai_discover.py full --query coffee` uses bundled HTML snippets, no network.
 ### Handy geoUrn cheatsheet
 | Location | geoUrn |
 |----------|--------|
 | Singapore | **103644278** |
 | Malaysia | **102713980** |
 | United States | **103644922** |
 | United Kingdom | **102221843** |
 | Australia | **101452733** |
 _See more: <https://www.linkedin.com/search/results/companies/?geoUrn=XXX> – the number after `geoUrn=` is what you need._
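
If you copy a search URL from the browser, the number can be pulled out programmatically. This is a small sketch using only the standard library; `geo_urn_from_url` is a hypothetical helper, and it simply takes the first run of digits in the `geoUrn` query value, so it would also tolerate a value wrapped in extra encoded characters.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse


def geo_urn_from_url(url: str) -> Optional[str]:
    """Return the digits of the first geoUrn query parameter, or None."""
    values = parse_qs(urlparse(url).query).get("geoUrn", [])
    if not values:
        return None
    match = re.search(r"\d+", values[0])
    return match.group(0) if match else None


if __name__ == "__main__":
    url = "https://www.linkedin.com/search/results/companies/?geoUrn=102713980"
    print(geo_urn_from_url(url))
```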
 ---
 ## 3  Insights – embeddings, org‑charts, decision makers
 ```bash
 python c4ai_insights.py \
  --in  ./data \
  --out ./data \
  --embed_model all-MiniLM-L6-v2 \
  --top_k 10 \
  --openai_model gpt-4.1 \
  --max_llm_tokens 8024 \
  --llm_temperature 1.0 \
  --workers 4
 ```
 Emits next to the Stage‑1 files:
 * `company_graph.json` – inter‑company similarity graph
 * `org_chart_<handle>.json` – one per company
 * `decision_makers.csv` – hand‑picked ‘who to pitch’ list
 Flags reference (straight from `build_arg_parser()`):
 | Flag | Default | Purpose |
 |------|---------|---------|
 | `--in` | `.` | Stage‑1 output dir |
 | `--out` | `.` | Destination dir |
 | `--embed_model` | `all-MiniLM-L6-v2` | Sentence‑Transformer model |
 | `--top_k` | `10` | Neighbours per company in graph |
 | `--openai_model` | `gpt-4.1` | LLM for scoring decision makers |
 | `--max_llm_tokens` | `8024` | Token budget per LLM call |
 | `--llm_temperature` | `1.0` | Creativity knob |
 | `--stub` | off | Skip OpenAI and fabricate tiny charts |
 | `--workers` | `4` | Parallel LLM workers |
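
For orientation, the table above maps onto an `argparse` parser roughly like the one below. This is a reconstruction from the table only, not the script's actual `build_arg_parser()`; the function name, the `dest` names `in_dir` / `out_dir`, and the argument types are assumptions.

```python
import argparse


def build_arg_parser_sketch() -> argparse.ArgumentParser:
    """Illustrative parser mirroring the flags table (defaults copied from it)."""
    p = argparse.ArgumentParser(description="Stage-2 insights (illustrative sketch)")
    # 'in' is a Python keyword, so an explicit dest is needed to read the value.
    p.add_argument("--in", dest="in_dir", default=".", help="Stage-1 output dir")
    p.add_argument("--out", dest="out_dir", default=".", help="Destination dir")
    p.add_argument("--embed_model", default="all-MiniLM-L6-v2",
                   help="Sentence-Transformer model")
    p.add_argument("--top_k", type=int, default=10,
                   help="Neighbours per company in graph")
    p.add_argument("--openai_model", default="gpt-4.1",
                   help="LLM for scoring decision makers")
    p.add_argument("--max_llm_tokens", type=int, default=8024,
                   help="Token budget per LLM call")
    p.add_argument("--llm_temperature", type=float, default=1.0,
                   help="Creativity knob")
    p.add_argument("--stub", action="store_true",
                   help="Skip OpenAI and fabricate tiny charts")
    p.add_argument("--workers", type=int, default=4, help="Parallel LLM workers")
    return p


if __name__ == "__main__":
    args = build_arg_parser_sketch().parse_args(["--in", "./data", "--stub"])
    print(args.in_dir, args.stub, args.workers)
```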
 ---
 ## 4  Visualise – interactive graph
 After Stage 2 completes, simply open the HTML viewer from the project root:
 ```bash
 open graph_view_template.html   # or serve the folder: Live Server / python -m http.server
 ```
 The page fetches `data/company_graph.json` and the `org_chart_*.json` files automatically; keep the `data/` folder beside the HTML file.
 * Left pane → list of companies (clans).
 * Click a node to load its org‑chart on the right.
 * Chat drawer lets you ask follow‑up questions; context is pulled from `people.jsonl`.
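
Before opening the viewer, it can help to confirm that the files it fetches are actually in place. The check below is a sketch based only on the file names mentioned above; `missing_viewer_inputs` is a hypothetical helper, and it assumes the `data/` folder sits beside the HTML file.

```python
from pathlib import Path
from typing import List


def missing_viewer_inputs(data_dir: Path) -> List[str]:
    """Return the Stage-2 outputs the viewer would not find in data_dir."""
    missing = []
    if not (data_dir / "company_graph.json").is_file():
        missing.append("company_graph.json")
    # At least one per-company org chart is expected.
    if not any(data_dir.glob("org_chart_*.json")):
        missing.append("org_chart_*.json")
    return missing


if __name__ == "__main__":
    problems = missing_viewer_inputs(Path("data"))
    print("ready" if not problems else f"missing: {', '.join(problems)}")
```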
 ---
 ## 5  Common snags
 | Symptom | Fix |
 |---------|-----|
 | Infinite CAPTCHA | Use a residential proxy: `--proxy http://user:pass@ip:port` |
 | 429 Too Many Requests | Lower `--concurrency`, rotate profile, add delay |
 | Blank graph | Check JSON paths, clear `localStorage` in browser |
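
The "add delay" advice for 429 responses is often implemented as capped exponential backoff with jitter. The generator below is a generic sketch of that idea, not something the demo scripts ship; the name `backoff_delays` and its parameters are assumptions, and where to sleep between retries is left to the caller.

```python
import random
from typing import Iterator, Optional


def backoff_delays(attempts: int, base: float = 2.0, cap: float = 60.0,
                   rng: Optional[random.Random] = None) -> Iterator[float]:
    """Yield one wait time in seconds per retry: exponential, capped, fully jittered."""
    rng = rng or random.Random()
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))  # 2s, 4s, 8s, ... up to cap
        yield rng.uniform(0, ceiling)  # jitter spreads out concurrent retries


if __name__ == "__main__":
    for i, delay in enumerate(backoff_delays(5), start=1):
        print(f"retry {i}: wait up to {delay:.1f}s")
```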
 ---
 ### TL;DR
 `crwl profiler` → `c4ai_discover.py` → `c4ai_insights.py` → open `graph_view_template.html`.  
 Live long and `import crawl4ai`.
--- a/docs/apps/linkdin/c4ai_discover.py
+++ b/docs/apps/linkdin/c4ai_discover.py
@@ -0,0 +1,440 @@
 #!/usr/bin/env python3
 """
 c4ai-discover — Stage‑1 Discovery CLI
 Scrapes LinkedIn company search + their people pages and dumps two newline‑delimited
 JSON files: companies.jsonl and people.jsonl.
 Key design rules
 ----------------
 * No BeautifulSoup — Crawl4AI only for network + HTML fetch.
 * JsonCssExtractionStrategy for structured scraping; schema auto‑generated once
  from sample HTML provided by user and then cached under ./schemas/.
 * Defaults are embedded so the file runs inside VS Code debugger without CLI args.
 * If executed as a console script (argv > 1), CLI flags win.
 * Lightweight deps: argparse + Crawl4AI stack.
 Author: Tom @ Kidocode 2025‑04‑26
 """
 from __future__ import annotations
 import warnings, re
 warnings.filterwarnings(
    "ignore",
    message=r"The pseudo class ':contains' is deprecated, ':-soup-contains' should be used.*",
    category=FutureWarning,
    module=r"soupsieve"
 )
 # ───────────────────────────────────────────────────────────────────────────────
 # Imports
 # ───────────────────────────────────────────────────────────────────────────────
 import argparse
 import random
 import asyncio
 import json
 import logging
 import os
 import pathlib
 import sys
 # 3rd-party rich for pretty logging
 from rich.console import Console
 from rich.logging import RichHandler
 from datetime import datetime, UTC
 from itertools import cycle
 from textwrap import dedent
 from types import SimpleNamespace
 from typing import Dict, List, Optional
 from urllib.parse import quote
 from pathlib import Path
 from glob import glob
 from crawl4ai import (
    AsyncWebCrawler,
    BrowserConfig,
    CacheMode,
    CrawlerRunConfig,
    JsonCssExtractionStrategy,
    BrowserProfiler,
    LLMConfig,
 )
 # ───────────────────────────────────────────────────────────────────────────────
 # Constants / paths
 # ───────────────────────────────────────────────────────────────────────────────
 BASE_DIR = pathlib.Path(__file__).resolve().parent
 SCHEMA_DIR = BASE_DIR / "schemas"
 SCHEMA_DIR.mkdir(parents=True, exist_ok=True)
 COMPANY_SCHEMA_PATH = SCHEMA_DIR / "company_card.json"
 PEOPLE_SCHEMA_PATH = SCHEMA_DIR / "people_card.json"
 # ---------- deterministic target JSON examples ----------
 _COMPANY_SCHEMA_EXAMPLE = {
    "handle": "/company/posify/",
    "profile_image": "https://media.licdn.com/dms/image/v2/.../logo.jpg",
    "name": "Management Research Services, Inc. (MRS, Inc)",
    "descriptor": "Insurance • Milwaukee, Wisconsin",
    "about": "Insurance • Milwaukee, Wisconsin",
    "followers": 1000
 }
 _PEOPLE_SCHEMA_EXAMPLE = {
    "profile_url": "https://www.linkedin.com/in/lily-ng/",
    "name": "Lily Ng",
    "headline": "VP Product @ Posify",
    "followers": 890,
    "connection_degree": "2nd",
    "avatar_url": "https://media.licdn.com/dms/image/v2/.../lily.jpg"
 }
 # Provided sample HTML snippets (trimmed) — used exactly once to cold‑generate schema.
 _SAMPLE_COMPANY_HTML = (Path(__file__).resolve().parent / "snippets/company.html").read_text()
 _SAMPLE_PEOPLE_HTML = (Path(__file__).resolve().parent / "snippets/people.html").read_text()
 # --------- tighter schema prompts ----------
 _COMPANY_SCHEMA_QUERY = dedent(
    """
    Using the supplied <li> company-card HTML, build a JsonCssExtractionStrategy schema that,
    for every card, outputs *exactly* the keys shown in the example JSON below.
    JSON spec:
      • handle        – href of the outermost <a> that wraps the logo/title, e.g. "/company/posify/"
      • profile_image – absolute URL of the <img> inside that link
      • name          – text of the <a> inside the <span class*='t-16'>
      • descriptor    – text line with industry • location
      • about         – text of the <div class*='t-normal'> below the name (industry + geo)
      • followers     – integer parsed from the <div> containing 'followers'
    IMPORTANT: Do not target elements by their base64-like (hashed) class names; they are not reliable.
    The <li> elements sit inside the parent "div.search-results-container", which you can use.
    Their <ul> parent has role="list". These two selectors should be enough to target the <li> elements.
    """
 )
 _PEOPLE_SCHEMA_QUERY = dedent(
    """
    Using the supplied <li> people-card HTML, build a JsonCssExtractionStrategy schema that
    outputs exactly the keys in the example JSON below.
    Fields:
      • profile_url        – href of the outermost profile link
      • name               – text inside artdeco-entity-lockup__title
      • headline           – inner text of artdeco-entity-lockup__subtitle
      • followers          – integer parsed from the span inside lt-line-clamp--multi-line
      • connection_degree  – '1st', '2nd', etc. from artdeco-entity-lockup__badge
      • avatar_url         – src of the <img> within artdeco-entity-lockup__image
    IMPORTANT: Do not target elements by their base64-like (hashed) class names; they are not reliable.
    The <li> elements sit inside a parent <div> with the classes "artdeco-card org-people-profile-card__card-spacing org-people__card-margin-bottom".
    """
 )
 # ---------------------------------------------------------------------------
 # Utility helpers
 # ---------------------------------------------------------------------------
 def _load_or_build_schema(
    path: pathlib.Path, 
    sample_html: str, 
    query: str, 
    example_json: Dict,
    force: bool = False,
 ) -> Dict:
    """Load schema from path, else call generate_schema once and persist."""
    if path.exists() and not force:
        return json.loads(path.read_text())
    logging.info("[SCHEMA] Generating schema %s", path.name)
    schema = JsonCssExtractionStrategy.generate_schema(
        html=sample_html,
        llm_config=LLMConfig(
            provider=os.getenv("C4AI_SCHEMA_PROVIDER", "openai/gpt-4o"),
            api_token=os.getenv("OPENAI_API_KEY", "env:OPENAI_API_KEY"),
        ),
        query=query,
        target_json_example=json.dumps(example_json, indent=2),
    )
    path.write_text(json.dumps(schema, indent=2))
    return schema
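 def _openai_friendly_number(text: str) -> Optional[int]:
    """Parse the first number in text like '1K followers' (returns 1000)."""
    import re
    # Capture the digits plus an optional K/M suffix attached to the number itself,
    # so a stray 'k' or 'm' elsewhere in the text does not scale the value.
    m = re.search(r"(\d+(?:\.\d+)?)\s*(?:([kKmM])(?![A-Za-z]))?", text.replace(",", ""))
    if not m:
        return None
    val = float(m.group(1))
    suffix = (m.group(2) or "").lower()
    if suffix == "k":
        val *= 1000
    elif suffix == "m":
        val *= 1_000_000
    return int(round(val))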
 # ---------------------------------------------------------------------------
 # Core async workers
 # ---------------------------------------------------------------------------
 async def crawl_company_search(crawler: AsyncWebCrawler, url: str, schema: Dict, limit: int) -> List[Dict]:
    """Paginate 10-item company search pages until `limit` reached."""
    extraction = JsonCssExtractionStrategy(schema)
    cfg = CrawlerRunConfig(
        extraction_strategy=extraction,
        cache_mode=CacheMode.BYPASS,
        wait_for = ".search-marvel-srp",
        session_id="company_search",
        delay_before_return_html=1,
        magic = True,
        verbose= False,
    )
    companies, page = [], 1
    while len(companies) < max(limit, 10):
        paged_url = f"{url}&page={page}"
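        res = await crawler.arun(paged_url, config=cfg)
        # Stop paginating if the page failed to load or produced no extraction
        if not res[0].success or not res[0].extracted_content:
            break
        batch = json.loads(res[0].extracted_content)
        if not batch:
            break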
        for item in batch:
            name = item.get("name", "").strip()
            handle = item.get("handle", "").strip()
            if not handle or not name:
                continue
            descriptor = item.get("descriptor")
            about = item.get("about")
            followers = _openai_friendly_number(str(item.get("followers", "")))
            companies.append(
                {
                    "handle": handle,
                    "name": name,
                    "descriptor": descriptor,
                    "about": about,
                    "followers": followers,
                    "people_url": f"{handle}people/",
                    # isoformat() on an aware UTC datetime ends in "+00:00"; swap it for "Z"
                    "captured_at": datetime.now(UTC).isoformat(timespec="seconds").replace("+00:00", "Z"),
                }
            )
        # Log the page that was just processed, then advance
        logging.info(
            f"[dim]Page {page}[/] - running total: {len(companies)}/{limit} companies"
        )
        page += 1
    return companies[:max(limit, 10)]
 async def crawl_people_page(
    crawler: AsyncWebCrawler,
    people_url: str,
    schema: Dict,
    limit: int,
    title_kw: str,
 ) -> List[Dict]:
    people_u = f"{people_url}?keywords={quote(title_kw)}"
    extraction = JsonCssExtractionStrategy(schema)
    cfg = CrawlerRunConfig(
        extraction_strategy=extraction,
        # scan_full_page=True,
        cache_mode=CacheMode.BYPASS,
        magic=True,
        wait_for=".org-people-profile-card__card-spacing",
        delay_before_return_html=1,
        session_id="people_search",
    )
    res = await crawler.arun(people_u, config=cfg)
    if not res[0].success:
        return []
    raw = json.loads(res[0].extracted_content)
    people = []
    for p in raw[:limit]:
        followers = _openai_friendly_number(str(p.get("followers", "")))
        people.append(
            {
                "profile_url": p.get("profile_url"),
                "name": p.get("name"),
                "headline": p.get("headline"),
                "followers": followers,
                "connection_degree": p.get("connection_degree"),
                "avatar_url": p.get("avatar_url"),
            }
        )
    return people
 # ---------------------------------------------------------------------------
 # CLI + main
 # ---------------------------------------------------------------------------
 def build_arg_parser() -> argparse.ArgumentParser:
    ap = argparse.ArgumentParser("c4ai-discover — Crawl4AI LinkedIn discovery")
    sub = ap.add_subparsers(dest="cmd", required=False, help="run scope")
    def add_flags(parser: argparse.ArgumentParser):
        parser.add_argument("--query", required=False, help="query keyword(s)")
        parser.add_argument("--geo", required=False, type=int, help="LinkedIn geoUrn")
        parser.add_argument("--title-filters", default="Product,Engineering", help="comma list of job keywords")
        parser.add_argument("--max-companies", type=int, default=1000)
        parser.add_argument("--max-people", type=int, default=500)
        parser.add_argument("--profile-name", default="profile_linkedin_uc", help="name of the saved browser profile")
        parser.add_argument("--outdir", default="./output")
        parser.add_argument("--concurrency", type=int, default=4)
        parser.add_argument("--log-level", default="info", choices=["debug", "info", "warn", "error"])
    add_flags(sub.add_parser("full"))
    add_flags(sub.add_parser("companies"))
    add_flags(sub.add_parser("people"))
    # global flags
    ap.add_argument(
        "--debug",
        action="store_true",
        help="Use built-in demo defaults (same as C4AI_DEMO_DEBUG=1)",
    )
    return ap
 def detect_debug_defaults(force = False) -> SimpleNamespace:
    if not force and sys.gettrace() is None and not os.getenv("C4AI_DEMO_DEBUG"):
        return SimpleNamespace()
    # ----- debug‑friendly defaults -----
    return SimpleNamespace(
        cmd="full",
        query="health insurance management",
        geo=102713980,
        # title_filters="Product,Engineering",
        title_filters="",
        max_companies=10,
        max_people=5,
        profile_name="profile_linkedin_uc",
        outdir="./debug_out",
        concurrency=2,
        log_level="debug",
    )
 async def async_main(opts):
    # ─────────── logging setup ───────────
    console = Console()
    logging.basicConfig(
        level=opts.log_level.upper(),
        format="%(message)s",
        handlers=[RichHandler(console=console, markup=True, rich_tracebacks=True)],
    )
    # -------------------------------------------------------------------
    # Load or build schemas (one‑time LLM call each)
    # -------------------------------------------------------------------
    company_schema = _load_or_build_schema(
        COMPANY_SCHEMA_PATH,
        _SAMPLE_COMPANY_HTML,
        _COMPANY_SCHEMA_QUERY,
        _COMPANY_SCHEMA_EXAMPLE,
        # True
    )
    people_schema = _load_or_build_schema(
        PEOPLE_SCHEMA_PATH,
        _SAMPLE_PEOPLE_HTML,
        _PEOPLE_SCHEMA_QUERY,
        _PEOPLE_SCHEMA_EXAMPLE,
        # True
    )
    outdir = BASE_DIR / pathlib.Path(opts.outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    f_companies = (outdir / "companies.jsonl").open("a", encoding="utf-8")
    f_people = (outdir / "people.jsonl").open("a", encoding="utf-8")
    # -------------------------------------------------------------------
    # Prepare crawler with the saved LinkedIn browser profile
    # -------------------------------------------------------------------
    profiler = BrowserProfiler()
    path = profiler.get_profile_path(opts.profile_name)
    bc = BrowserConfig(
        headless=False,
        verbose=False,
        user_data_dir=path,
        use_managed_browser=True,
        user_agent_mode="random",
        user_agent_generator_config={
            "platforms": "mobile",
            "os": "Android",
        },
    )
    crawler = AsyncWebCrawler(config=bc)
    await crawler.start()
    # Single worker for simplicity; concurrency can be scaled by arun_many if needed.
    # crawler = await next_crawler().start()
    try:
        # Build LinkedIn search URL
        search_url = f"https://www.linkedin.com/search/results/companies/?keywords={quote(opts.query)}&geoUrn={opts.geo}"
        logging.info("Seed URL => %s", search_url)
        companies: List[Dict] = []
        if opts.cmd in ("companies", "full"):
            companies = await crawl_company_search(
                crawler, search_url, company_schema, opts.max_companies
            )
            for c in companies:
                f_companies.write(json.dumps(c, ensure_ascii=False) + "\n")
            logging.info(f"[bold green]✓[/] Companies scraped so far: {len(companies)}")
        if opts.cmd in ("people", "full"):
            if not companies:
                # load from previous run
                src = outdir / "companies.jsonl"
                if not src.exists():
                    logging.error("companies.jsonl missing — run companies/full first")
                    return 10
                companies = [json.loads(l) for l in src.read_text().splitlines()]
            total_people = 0
            title_kw = " ".join([t.strip() for t in opts.title_filters.split(",") if t.strip()]) if opts.title_filters else ""
            for comp in companies:
                people = await crawl_people_page(
                    crawler,
                    comp["people_url"],
                    people_schema,
                    opts.max_people,
                    title_kw,
                )
                for p in people:
                    rec = p | {
                        "company_handle": comp["handle"],
                        "captured_at": datetime.now(UTC).isoformat(timespec="seconds").replace("+00:00", "Z"),
                    }
                    f_people.write(json.dumps(rec, ensure_ascii=False) + "\n")
                total_people += len(people)
                logging.info(
                    f"{comp['name']} — [cyan]{len(people)}[/] people extracted"
                )
                await asyncio.sleep(random.uniform(0.5, 1))
            logging.info("Total people scraped: %d", total_people)
    finally:
        await crawler.close()
        f_companies.close()
        f_people.close()
    return 0
 def main():
    parser = build_arg_parser()
    cli_opts = parser.parse_args()
    # decide on debug defaults
    if cli_opts.debug:
        opts = detect_debug_defaults(force=True)
    else:
        env_defaults = detect_debug_defaults()
        # An empty SimpleNamespace is still truthy, so test its attributes instead
        opts = env_defaults if vars(env_defaults) else cli_opts
    if not getattr(opts, "cmd", None):
        opts.cmd = "full"
    exit_code = asyncio.run(async_main(opts))
    sys.exit(exit_code)
 if __name__ == "__main__":
    main()
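
Stage 1 stamps every record with `captured_at`, built from a timezone-aware `datetime`. The following is a minimal sketch of producing a `Z`-suffixed UTC timestamp; `utc_stamp` is a hypothetical helper, not part of the script. It relies on the fact that `isoformat()` on an aware UTC value already ends in `+00:00`, so the offset is swapped for `Z` rather than having `Z` appended after it.

```python
from datetime import datetime, timezone


def utc_stamp(dt: datetime) -> str:
    """Format an aware datetime as ISO-8601 UTC with a trailing 'Z' (hypothetical helper)."""
    as_utc = dt.astimezone(timezone.utc)
    # isoformat() yields '...+00:00' for aware UTC values; swap the offset for 'Z'
    return as_utc.isoformat(timespec="seconds").replace("+00:00", "Z")


stamp = utc_stamp(datetime(2025, 4, 28, 12, 30, 0, tzinfo=timezone.utc))
print(stamp)
```

Appending `"Z"` directly to the `isoformat()` output of an aware datetime would leave both the offset and the `Z` in the string, which most ISO-8601 parsers reject.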
--- a/docs/apps/linkdin/c4ai_insights.py
+++ b/docs/apps/linkdin/c4ai_insights.py
@@ -0,0 +1,372 @@
 #!/usr/bin/env python3
 """
 Stage-2 Insights builder
 ------------------------
 Reads companies.jsonl & people.jsonl (Stage-1 output) and produces:
 • company_graph.json
 • org_chart_<handle>.json  (one per company)
 • decision_makers.csv
 • graph_view.html          (interactive visualisation)
 Run:
    python c4ai_insights.py --in ./stage1_out --out ./stage2_out
 Author : Tom @ Kidocode, 2025-04-28
 """
 from __future__ import annotations
 # ───────────────────────────────────────────────────────────────────────────────
 # Imports & Third-party
 # ───────────────────────────────────────────────────────────────────────────────
 import argparse, asyncio, json, os, sys, pathlib, random, time, csv
 from datetime import datetime, UTC
 from types import SimpleNamespace
 from pathlib import Path
 from typing import List, Dict, Any
 # Pretty CLI UX
 from rich.console import Console
 from rich.logging import RichHandler
 from rich.progress import Progress, SpinnerColumn, BarColumn, TextColumn, TimeElapsedColumn
 import logging
 from jinja2 import Environment, FileSystemLoader, select_autoescape
 # ───────────────────────────────────────────────────────────────────────────────
 # 3rd-party deps
 # ───────────────────────────────────────────────────────────────────────────────
 import numpy as np
 # from sentence_transformers import SentenceTransformer
 # from sklearn.metrics.pairwise import cosine_similarity
 import pandas as pd
 import hashlib
 from openai import OpenAI                    # same SDK you pre-loaded
 # ───────────────────────────────────────────────────────────────────────────────
 # Utils
 # ───────────────────────────────────────────────────────────────────────────────
 def load_jsonl(path: Path) -> List[Dict[str, Any]]:
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(l) for l in f]
 def dump_json(obj, path: Path):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(obj, f, ensure_ascii=False, indent=2)
 # ───────────────────────────────────────────────────────────────────────────────
 # Constants
 # ───────────────────────────────────────────────────────────────────────────────
 BASE_DIR = pathlib.Path(__file__).resolve().parent
 # ───────────────────────────────────────────────────────────────────────────────
 # Debug defaults   (mirrors Stage-1 trick)
 # ───────────────────────────────────────────────────────────────────────────────
 def dev_defaults() -> SimpleNamespace:
    return SimpleNamespace(
        in_dir="./debug_out",          
        out_dir="./insights_debug",
        embed_model="all-MiniLM-L6-v2",
        top_k=10,
        openai_model="gpt-4.1",
        max_llm_tokens=8000,
        llm_temperature=1.0,
        workers=4,           # parallel processing
        stub=False,          # manual
    )
 # ───────────────────────────────────────────────────────────────────────────────
 # Graph builders
 # ───────────────────────────────────────────────────────────────────────────────
 def embed_descriptions(companies, model_name:str, opts) -> np.ndarray:
    from sentence_transformers import SentenceTransformer
    logging.debug(f"Using embedding model: {model_name}")
    cache_path = BASE_DIR / Path(opts.out_dir) / "embeds_cache.json"
    cache = {}
    if cache_path.exists():
        with open(cache_path) as f:
            cache = json.load(f)
        # flush cache if model differs
        if cache.get("_model") != model_name:
            cache = {}
    model = SentenceTransformer(model_name)
    new_texts, new_indices = [], []
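    # Size the matrix from the model instead of hard-coding 384 (the MiniLM width)
    vectors = np.zeros((len(companies), model.get_sentence_embedding_dimension()), dtype=np.float32)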
    for idx, comp in enumerate(companies):
        text = comp.get("about") or comp.get("descriptor","")
        h = hashlib.sha1(text.encode("utf-8")).hexdigest()
        cached = cache.get(comp["handle"])
        if cached and cached["hash"] == h:
            vectors[idx] = np.array(cached["vector"], dtype=np.float32)
        else:
            new_texts.append(text)
            new_indices.append((idx, comp["handle"], h))
    if new_texts:
        embeds = model.encode(new_texts, show_progress_bar=False, convert_to_numpy=True)
        for vec, (idx, handle, h) in zip(embeds, new_indices):
            vectors[idx] = vec
            cache[handle] = {"hash": h, "vector": vec.tolist()}
        cache["_model"] = model_name
        with open(cache_path, "w") as f:
            json.dump(cache, f)
    return vectors
 def build_company_graph(companies, embeds:np.ndarray, top_k:int) -> Dict[str,Any]:
    from sklearn.metrics.pairwise import cosine_similarity
    sims = cosine_similarity(embeds)
    nodes, edges = [], []
    idx_of = {c["handle"]: i for i,c in enumerate(companies)}
    for i,c in enumerate(companies):
        node = dict(
            id=c["handle"].strip("/"),
            name=c["name"],
            handle=c["handle"],
            about=c.get("about",""),
            people_url=c.get("people_url",""),
            industry=c.get("descriptor","").split("•")[0].strip(),
            geoUrn=c.get("geoUrn"),
            followers=c.get("followers",0),
            # desc_embed=embeds[i].tolist(),
            desc_embed=[],
        )
        nodes.append(node)
        # pick top-k most similar except itself
        top_idx = np.argsort(sims[i])[::-1][1:top_k+1]
        for j in top_idx:
            tgt = companies[j]
            weight = float(sims[i,j])
            if node["industry"] == tgt.get("descriptor","").split("•")[0].strip():
                weight += 0.10
            if node["geoUrn"] == tgt.get("geoUrn"):
                weight += 0.05
            # Treat missing or zero follower counts as 1 so the ratio never divides by zero,
            # without overwriting the values stored on the records themselves
            src_followers = node["followers"] or 1
            tgt_followers = tgt.get("followers") or 1
            follower_ratio = min(src_followers, tgt_followers) / max(src_followers, tgt_followers)
            weight += 0.05 * follower_ratio
            edges.append(dict(
                source=node["id"],
                target=tgt["handle"].strip("/"),
                weight=round(weight,4),
                drivers=dict(
                    embed_sim=round(float(sims[i,j]),4),
                    industry_match=0.10 if node["industry"] == tgt.get("descriptor","").split("•")[0].strip() else 0,
                    geo_overlap=0.05 if node["geoUrn"] == tgt.get("geoUrn") else 0,
                )
            ))
    return {"nodes":nodes,"edges":edges,"meta":{"generated_at":datetime.now(UTC).isoformat()}}
 # ───────────────────────────────────────────────────────────────────────────────
 # Org-chart via LLM
 # ───────────────────────────────────────────────────────────────────────────────
 async def infer_org_chart_llm(company, people, client:OpenAI, model_name:str, max_tokens:int, temperature:float, stub:bool):
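    if stub:
        # Tiny fake org-chart when debugging offline
        if not people:
            # Nothing to pick from: return an empty chart instead of raising
            return {"nodes":[],"edges":[],"meta":{"debug_stub":True,"generated_at":datetime.now(UTC).isoformat()}}
        chief = random.choice(people)
        headline = chief.get("headline") or ""
        nodes = [{
            "id": chief["profile_url"],
            "name": chief["name"],
            "title": headline,
            "dept": headline.split()[0] if headline.strip() else "",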
            "yoe_total": 8,
            "yoe_current": 2,
            "seniority_score": 0.8,
            "decision_score": 0.9,
            "avatar_url": chief.get("avatar_url")
        }]
        return {"nodes":nodes,"edges":[],"meta":{"debug_stub":True,"generated_at":datetime.now(UTC).isoformat()}}
    prompt = [
        {"role":"system","content":"You are an expert B2B org-chart reasoner."},
        {"role":"user","content":f"""Here is the company description:
 <company>
 {json.dumps(company, ensure_ascii=False)}
 </company>
 Here is a JSON list of employees:
 <employees>
 {json.dumps(people, ensure_ascii=False)}
 </employees>
 1) Build a reporting tree (manager -> direct reports)
 2) For each person output a decision_score 0-1 for buying new software
 Return JSON: {{ "nodes":[{{id,name,title,dept,yoe_total,yoe_current,seniority_score,decision_score,avatar_url,profile_url}}], "edges":[{{source,target,type,confidence}}] }}
 """}
    ]
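    # The OpenAI client used here is synchronous; run the call in a worker thread
    # so the event loop can keep several companies in flight at once.
    resp = await asyncio.to_thread(
        client.chat.completions.create,
        model=model_name,
        messages=prompt,
        max_tokens=max_tokens,
        temperature=temperature,
        response_format={"type":"json_object"},
    )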
    chart = json.loads(resp.choices[0].message.content)
    chart["meta"] = dict(model=model_name, generated_at=datetime.now(UTC).isoformat())
    return chart
 # ───────────────────────────────────────────────────────────────────────────────
 # CSV flatten
 # ───────────────────────────────────────────────────────────────────────────────
 def export_decision_makers(charts_dir:Path, csv_path:Path, threshold:float=0.5):
    rows=[]
    for p in charts_dir.glob("org_chart_*.json"):
        data=json.loads(p.read_text())
        comp = p.stem.split("org_chart_")[1]
        for n in data.get("nodes",[]):
            if n.get("decision_score",0)>=threshold:
                rows.append(dict(
                    company=comp,
                    person=n["name"],
                    title=n["title"],
                    decision_score=n["decision_score"],
                    profile_url=n["id"]
                ))
    pd.DataFrame(rows).to_csv(csv_path,index=False)
 # ───────────────────────────────────────────────────────────────────────────────
 # HTML rendering
 # ───────────────────────────────────────────────────────────────────────────────
 def render_html(out:Path, template_dir:Path):
    # Copy graph_view_template.html (as graph_view.html) and ai.js from the template folder into the output folder
    import shutil
    shutil.copy(template_dir/"graph_view_template.html", out / "graph_view.html")
    shutil.copy(template_dir/"ai.js", out)
 # ───────────────────────────────────────────────────────────────────────────────
 # Main async pipeline
 # ───────────────────────────────────────────────────────────────────────────────
 async def run(opts):
    # ── silence SDK noise ──────────────────────────────────────────────────────
    for noisy in ("openai", "httpx", "httpcore"):
        lg = logging.getLogger(noisy)
        lg.setLevel(logging.WARNING)     # or ERROR if you want total silence
        lg.propagate = False             # optional: stop them reaching root
    # ────────────── logging bootstrap ──────────────
    console = Console()
    logging.basicConfig(
        level="INFO",
        format="%(message)s",
        handlers=[RichHandler(console=console, markup=True, rich_tracebacks=True)],
    )
    in_dir  = BASE_DIR / Path(opts.in_dir)
    out_dir = BASE_DIR / Path(opts.out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    companies = load_jsonl(in_dir/"companies.jsonl")
    people    = load_jsonl(in_dir/"people.jsonl")
    logging.info(f"[bold cyan]Loaded[/] {len(companies)} companies, {len(people)} people")
    logging.info("[bold]⇢[/] Embedding company descriptions…")
    # embeds = embed_descriptions(companies, opts.embed_model, opts)
    logging.info("[bold]⇢[/] Building similarity graph")
    # company_graph = build_company_graph(companies, embeds, opts.top_k)
    # dump_json(company_graph, out_dir/"company_graph.json")
    # OpenAI client (only built if not debugging)
    stub = bool(opts.stub)
    client = OpenAI() if not stub else None
    # Filter companies that need processing
    to_process = []
    for comp in companies:
        handle = comp["handle"].strip("/").replace("/","_")
        out_file = out_dir/f"org_chart_{handle}.json"
        if out_file.exists() and False:
            logging.info(f"[green]✓[/] Skipping existing {comp['name']}")
            continue
        to_process.append(comp)
    if not to_process:
        logging.info("[yellow]All companies already processed[/]")
    else:
        workers = getattr(opts, 'workers', 1)
        parallel = workers > 1
        logging.info(f"[bold]⇢[/] Inferring org-charts via LLM {f'(parallel={workers} workers)' if parallel else ''}")
        with Progress(
            SpinnerColumn(),
            BarColumn(),
            TextColumn("[progress.description]{task.description}"),
            TimeElapsedColumn(),
            console=console,
        ) as progress:
            task = progress.add_task("Org charts", total=len(to_process))
            async def process_one(comp):
                handle = comp["handle"].strip("/").replace("/","_")
                persons = [p for p in people if p["company_handle"].strip("/") == comp["handle"].strip("/")]
                chart = await infer_org_chart_llm(
                    comp, persons,
                    client=client if client else OpenAI(api_key="sk-debug"),
                    model_name=opts.openai_model,
                    max_tokens=opts.max_llm_tokens,
                    temperature=opts.llm_temperature,
                    stub=stub,
                )
                chart["meta"]["company"] = comp["name"]
                # Save the result immediately
                dump_json(chart, out_dir/f"org_chart_{handle}.json")
                progress.update(task, advance=1, description=f"{comp['name']} ({len(persons)} ppl)")
            # Create one coroutine per company
            jobs = [process_one(comp) for comp in to_process]
            # Cap the number of companies in flight at the worker count
            semaphore = asyncio.Semaphore(workers)
            async def bounded_process(coro):
                async with semaphore:
                    return await coro
            # Run with concurrency control
            await asyncio.gather(*(bounded_process(job) for job in jobs))
    logging.info("[bold]⇢[/] Flattening decision-makers CSV")
    export_decision_makers(out_dir, out_dir/"decision_makers.csv")
    render_html(out_dir, template_dir=BASE_DIR/"templates")
    console.print(f"[bold green]✓[/] Stage-2 artefacts written to {out_dir}")
 # ───────────────────────────────────────────────────────────────────────────────
 # CLI
 # ───────────────────────────────────────────────────────────────────────────────
 def build_arg_parser():
    p = argparse.ArgumentParser(description="Build graphs & visualisation from Stage-1 output")
    p.add_argument("--in",       dest="in_dir",  required=False, help="Stage-1 output dir", default=".")
    p.add_argument("--out",      dest="out_dir", required=False, help="Destination dir",   default=".")
    p.add_argument("--embed_model", default="all-MiniLM-L6-v2")
    p.add_argument("--top_k", type=int, default=10, help="Top-k neighbours per company")
    p.add_argument("--openai_model", default="gpt-4.1")
    p.add_argument("--max_llm_tokens", type=int, default=8024)
    p.add_argument("--llm_temperature", type=float, default=1.0)
    p.add_argument("--stub", action="store_true", help="Skip OpenAI call and generate tiny fake org charts")
    p.add_argument("--workers", type=int, default=4, help="Number of parallel workers for LLM inference")
    return p
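 def main():
    # Use the debug defaults only under a debugger or when C4AI_DEMO_DEBUG is set
    # (same check as Stage-1); otherwise honour the command-line flags.
    debugging = sys.gettrace() is not None or bool(os.getenv("C4AI_DEMO_DEBUG"))
    opts = dev_defaults() if debugging else build_arg_parser().parse_args()
    asyncio.run(run(opts))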
 if __name__ == "__main__":
    main()
--- a/docs/apps/linkdin/schemas/company_card.json
+++ b/docs/apps/linkdin/schemas/company_card.json
@@ -0,0 +1,39 @@
 {
  "name": "LinkedIn Company Card",
  "baseSelector": "div.search-results-container ul[role='list'] > li",
  "fields": [
    {
      "name": "handle",
      "selector": "a[href*='/company/']",
      "type": "attribute",
      "attribute": "href"
    },
    {
      "name": "profile_image",
      "selector": "a[href*='/company/'] img",
      "type": "attribute",
      "attribute": "src"
    },
    {
      "name": "name",
      "selector": "span[class*='t-16'] a",
      "type": "text"
    },
    {
      "name": "descriptor",
      "selector": "div[class*='t-black t-normal']",
      "type": "text"
    },
    {
      "name": "about",
      "selector": "p[class*='entity-result__summary--2-lines']",
      "type": "text"
    },
    {
      "name": "followers",
      "selector": "div:contains('followers')",
      "type": "regex",
      "pattern": "([\\d.,]+[KkMm]?)\\s*followers"
    }
  ]
 }
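
The `followers` field above comes back as display text such as `1K followers`. One way to normalise that text into an integer is sketched below. `parse_follower_count` is a hypothetical helper, and the suffix rules (K for thousands, M for millions) are an assumption about how the counts are abbreviated.

```python
import re
from typing import Optional

# Digits with an optional decimal part, then an optional K/M suffix that is
# not the first letter of a longer word (so "12 members" is not read as 12M).
_COUNT = re.compile(r"(\d+(?:\.\d+)?)\s*(?:([kKmM])(?![A-Za-z]))?")


def parse_follower_count(text: str) -> Optional[int]:
    """Return the first count found in `text`, or None if there is no number."""
    match = _COUNT.search(text.replace(",", ""))
    if match is None:
        return None
    value = float(match.group(1))
    suffix = (match.group(2) or "").lower()
    scale = {"k": 1_000, "m": 1_000_000}.get(suffix, 1)
    return int(round(value * scale))


for sample in ["1K followers", "1,234 followers", "2.5M followers", "12 members"]:
    print(sample, "->", parse_follower_count(sample))
```

Tying the suffix to the number itself avoids scaling the value just because the letter `k` or `m` appears somewhere else in the text.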
--- a/docs/apps/linkdin/schemas/people_card.json
+++ b/docs/apps/linkdin/schemas/people_card.json
@@ -0,0 +1,38 @@
 {
  "name": "LinkedIn People Card",
  "baseSelector": "li.org-people-profile-card__profile-card-spacing",
  "fields": [
    {
      "name": "profile_url",
      "selector": "a[href*='/in/']",
      "type": "attribute",
      "attribute": "href"
    },
    {
      "name": "name",
      "selector": ".artdeco-entity-lockup__title .lt-line-clamp--single-line",
      "type": "text"
    },
    {
      "name": "headline",
      "selector": ".artdeco-entity-lockup__subtitle .lt-line-clamp--multi-line",
      "type": "text"
    },
    {
      "name": "followers",
      "selector": ".lt-line-clamp--multi-line.t-12",
      "type": "text"
    },
    {
      "name": "connection_degree",
      "selector": ".artdeco-entity-lockup__badge .artdeco-entity-lockup__degree",
      "type": "text"
    },
    {
      "name": "avatar_url",
      "selector": ".artdeco-entity-lockup__image img",
      "type": "attribute",
      "attribute": "src"
    }
  ]
 }
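
The schema-generation prompts ask the LLM to avoid hash-like class names, but a generated schema can still contain them. The sketch below is a rough lint that flags such selectors before a schema is persisted. `find_hashed_selectors` is a hypothetical helper, and the rule it applies (a class token of 20 or more letters with no `-` or `_`) is only a heuristic assumption, not something the library checks.

```python
import re
from typing import Dict, List

# Heuristic (an assumption): a class token of 20+ letters with no '-' or '_'
# looks machine-generated rather than hand-named.
_HASHED_CLASS = re.compile(r"\.([A-Za-z]{20,})(?![\w-])")


def find_hashed_selectors(schema: Dict) -> List[str]:
    """Return the names of schema entries whose selector uses a hash-like class."""
    suspects = []
    if _HASHED_CLASS.search(schema.get("baseSelector", "")):
        suspects.append("baseSelector")
    for field in schema.get("fields", []):
        if _HASHED_CLASS.search(field.get("selector", "")):
            suspects.append(field["name"])
    return suspects


example_schema = {
    "baseSelector": "li.org-people-profile-card__profile-card-spacing",
    "fields": [
        {"name": "profile_url", "selector": "a.eETATgYTipaVsmrBChiBJJvFsdPhNpulhPZUVLHLo"},
        {"name": "name", "selector": ".artdeco-entity-lockup__title .lt-line-clamp--single-line"},
    ],
}
flagged = find_hashed_selectors(example_schema)
print(flagged)
```

A non-empty result could be used to trigger regeneration of the schema or a manual review of the flagged fields.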
--- a/docs/apps/linkdin/snippets/company.html
+++ b/docs/apps/linkdin/snippets/company.html
@@ -0,0 +1,143 @@
 <li class="yCLWzruNprmIzaZzFFonVFBtMrbaVYnuDFA">
    <!----><!---->
    <div class="IxlEPbRZwQYrRltKPvHAyjBmCdIWTAoYo" data-chameleon-result-urn="urn:li:company:362492"
        data-view-name="search-entity-result-universal-template">
        <div class="linked-area flex-1
              cursor-pointer">
            <div class="BAEgVqVuxosMJZodcelsgPoyRcrkiqgVCGHXNQ">
                <div class="afcvrbGzNuyRlhPPQWrWirJtUdHAAtUlqxwvVA">
                    <div class="display-flex align-items-center">
                        <!---->
                        <a class="eETATgYTipaVsmrBChiBJJvFsdPhNpulhPZUVLHLo  scale-down " aria-hidden="true"
                            tabindex="-1" href="https://www.linkedin.com/company/managment-research-services-inc./"
                            data-test-app-aware-link="">
                            <div class="ivm-image-view-model   ">
                                <div class="ivm-view-attr__img-wrapper
            ">
                                    <!---->
                                    <!----> <img width="48"
                                        src="https://media.licdn.com/dms/image/v2/C560BAQFWpusEOgW-ww/company-logo_100_100/company-logo_100_100/0/1630583697877/managment_research_services_inc_logo?e=1750896000&amp;v=beta&amp;t=Ch9vyEZdfng-1D1m_XqP5kjNpVXUBKkk9cNhMZUhx0E"
                                        loading="lazy" height="48" alt="Management Research Services, Inc. (MRS, Inc)"
                                        id="ember28"
                                        class="ivm-view-attr__img--centered EntityPhoto-square-3   evi-image lazy-image ember-view">
                                </div>
                            </div>
                        </a>
                    </div>
                </div>
                <div
                    class="wympnVuDByXHvafWrMGJLZuchDmCRqLmWPwg MmzCPRicJimZvjJhvqTzDcDbdHhWPzspERzA pt3 pb3 t-12 t-black--light">
                    <div class="mb1">
                        <div class="t-roman t-sans">
                            <div class="display-flex">
                                <span class="TikBXjihYvcNUoIzkslUaEjfIuLmYxfs OoHEyXgsiIqGADjcOtTmfdpoYVXrLKTvkwI ">
                                    <span class="CgaWLOzmXNuKbRIRARSErqCJcBPYudEKo
                t-16">
                                        <a class="eETATgYTipaVsmrBChiBJJvFsdPhNpulhPZUVLHLo "
                                            href="https://www.linkedin.com/company/managment-research-services-inc./"
                                            data-test-app-aware-link="">
                                            <!---->Management Research Services, Inc. (MRS, Inc)<!---->
                                            <!----> </a>
                                        <!----> </span>
                                </span>
                                <!---->
                            </div>
                        </div>
                        <div class="LjmdKCEqKITHihFOiQsBAQylkdnsWhqZii
              t-14 t-black t-normal">
                            <!---->Insurance • Milwaukee, Wisconsin<!---->
                        </div>
                        <div class="cTPhJiHyNLmxdQYFlsEOutjznmqrVHUByZwZ
              t-14 t-normal">
                            <!---->1K followers<!---->
                        </div>
                    </div>
                    <!---->
                    <p class="yWzlqwKNlvCWVNoKqmzoDDEnBMUuyynaLg
                    entity-result__summary--2-lines
                    t-12 t-black--light
                    ">
                        <!---->MRS combines 30 years of experience supporting the Life,<span class="white-space-pre">
                        </span><strong><!---->Health<!----></strong><span class="white-space-pre"> </span>and
                        Annuities<span class="white-space-pre"> </span><strong><!---->Insurance<!----></strong><span
                            class="white-space-pre"> </span>Industry with customized<span class="white-space-pre">
                        </span><strong><!---->insurance<!----></strong><span class="white-space-pre">
                        </span>underwriting solutions that efficiently support clients’ workflows. Supported by the
                        Agenium Platform (www.agenium.ai) our innovative underwriting solutions are guaranteed to
                        optimize requirements...<!---->
                    </p>
                    <!---->
                </div>
                <div class="qXxdnXtzRVFTnTnetmNpssucBwQBsWlUuk MmzCPRicJimZvjJhvqTzDcDbdHhWPzspERzA">
                    <!---->
                    <div>
                        <button aria-label="Follow Management Research Services, Inc. (MRS, Inc)" id="ember61"
                            class="artdeco-button artdeco-button--2 artdeco-button--secondary ember-view"
                            type="button"><!---->
                            <span class="artdeco-button__text">
                                Follow
                            </span></button>
                        <!---->
                        <!---->
                    </div>
                </div>
            </div>
        </div>
    </div>
 </li>
--- a/docs/apps/linkdin/snippets/people.html
+++ b/docs/apps/linkdin/snippets/people.html
@@ -0,0 +1,94 @@
 <li class="grid grid__col--lg-8 block org-people-profile-card__profile-card-spacing">
    <div>
        <section class="artdeco-card full-width qQdPErXQkSAbwApNgNfuxukTIPPykttCcZGOHk">
            <!---->
            <img width="210" src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
                ariarole="presentation" loading="lazy" height="210" alt="" id="ember96"
                class="evi-image lazy-image ghost-default ember-view org-people-profile-card__cover-photo org-people-profile-card__cover-photo--people">
            <div class="org-people-profile-card__profile-info">
                <div id="ember97"
                    class="artdeco-entity-lockup artdeco-entity-lockup--stacked-center artdeco-entity-lockup--size-7 ember-view">
                    <div id="ember98"
                        class="artdeco-entity-lockup__image artdeco-entity-lockup__image--type-circle ember-view"
                        type="circle">
                        <a class="eETATgYTipaVsmrBChiBJJvFsdPhNpulhPZUVLHLo "
                            id="org-people-profile-card__profile-image-0"
                            href="https://www.linkedin.com/in/speakerrayna?miniProfileUrn=urn%3Ali%3Afs_miniProfile%3AACoAABsqUBoBr5x071PuGGpNtK3NlvSARiVXPIs"
                            data-test-app-aware-link="">
                            <img width="104"
                                src="https://media.licdn.com/dms/image/v2/D5603AQGs2Vyju4xZ7A/profile-displayphoto-shrink_100_100/profile-displayphoto-shrink_100_100/0/1681741067031?e=1750896000&amp;v=beta&amp;t=Hvj--IrrmpVIH7pec7-l_PQok8vsS__CGeUqBWOw7co"
                                loading="lazy" height="104" alt="Dr. Rayna S." id="ember99"
                                class="evi-image lazy-image ember-view">
                        </a>
                    </div>
                    <div id="ember100" class="artdeco-entity-lockup__content ember-view">
                        <div id="ember101" class="artdeco-entity-lockup__title ember-view">
                            <a class="eETATgYTipaVsmrBChiBJJvFsdPhNpulhPZUVLHLo  link-without-visited-state"
                                aria-label="View Dr. Rayna S.’s profile"
                                href="https://www.linkedin.com/in/speakerrayna?miniProfileUrn=urn%3Ali%3Afs_miniProfile%3AACoAABsqUBoBr5x071PuGGpNtK3NlvSARiVXPIs"
                                data-test-app-aware-link="">
                                <div id="ember103" class="ember-view lt-line-clamp lt-line-clamp--single-line AGabuksChUpCmjWshSnaZryLKSthOKkwclxY
          t-black" style="">
                                    Dr. Rayna S.
                                    <!---->
                                </div>
                            </a>
                        </div>
                        <div id="ember104" class="artdeco-entity-lockup__badge ember-view"> <span class="a11y-text">3rd+
                                degree connection</span>
                            <span class="artdeco-entity-lockup__degree" aria-hidden="true">
                                ·&nbsp;3rd
                            </span>
                            <!----><!---->
                        </div>
                        <div id="ember105" class="artdeco-entity-lockup__subtitle ember-view">
                            <div class="t-14 t-black--light t-normal">
                                <div id="ember107" class="ember-view lt-line-clamp lt-line-clamp--multi-line"
                                    style="-webkit-line-clamp: 2">
                                    Leadership and Talent Development Consultant and Professional Speaker
                                    <!---->
                                </div>
                            </div>
                        </div>
                        <div id="ember108" class="artdeco-entity-lockup__caption ember-view"></div>
                    </div>
                </div>
                <span class="text-align-center">
                    <span id="ember110"
                        class="ember-view lt-line-clamp lt-line-clamp--multi-line t-12 t-black--light mt2"
                        style="-webkit-line-clamp: 3">
                        727 followers
                        <!----> </span>
                </span>
            </div>
            <footer class="ph3 pb3">
                <button aria-label="Follow Dr. Rayna S." id="ember111"
                    class="artdeco-button artdeco-button--2 artdeco-button--secondary ember-view full-width"
                    type="button"><!---->
                    <span class="artdeco-button__text">
                        Follow
                    </span></button>
            </footer>
        </section>
    </div>
 </li>
--- a/docs/apps/linkdin/templates/ai.js
+++ b/docs/apps/linkdin/templates/ai.js
@@ -0,0 +1,50 @@
 // ==== File: ai.js ====
 class ApiHandler {
    constructor(apiKey = null) {
      this.apiKey = apiKey || localStorage.getItem("openai_api_key") || "";
      console.log("ApiHandler ready");
    }
    setApiKey(k) {
      this.apiKey = k.trim();
      if (this.apiKey) localStorage.setItem("openai_api_key", this.apiKey);
    }
    async *chatStream(messages, {model = "gpt-4o", temperature = 0.7} = {}) {
      if (!this.apiKey) throw new Error("OpenAI API key missing");
      const payload = {model, messages, temperature, stream: true, max_tokens: 1024};
      const controller = new AbortController();
      const res = await fetch("https://api.openai.com/v1/chat/completions", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${this.apiKey}`,
        },
        body: JSON.stringify(payload),
        signal: controller.signal,
      });
      if (!res.ok) throw new Error(`OpenAI: ${res.statusText}`);
      const reader = res.body.getReader();
      const dec = new TextDecoder();
      let buf = "";
      while (true) {
        const {done, value} = await reader.read();
        if (done) break;
        buf += dec.decode(value, {stream: true});
        const lines = buf.split("\n");
        buf = lines.pop(); // keep the trailing partial line for the next chunk
        for (const line of lines) {
          if (!line.startsWith("data: ")) continue;
          if (line.slice(6).trim() === "[DONE]") return;
          const json = JSON.parse(line.slice(6));
          const delta = json.choices?.[0]?.delta?.content;
          if (delta) yield delta;
        }
      }
    }
  }
  window.API = new ApiHandler();
--- a/docs/apps/linkdin/templates/graph_view_template.html
+++ b/docs/apps/linkdin/templates/graph_view_template.html
--- a/docs/codebase/browser.md
+++ b/docs/codebase/browser.md
@@ -0,0 +1,51 @@
 ### browser_manager.py
 | Function | What it does |
 |---|---|
 | `ManagedBrowser.build_browser_flags` | Returns baseline Chromium CLI flags, disables GPU and sandbox, plugs locale, timezone, stealth tweaks, and any extras from `BrowserConfig`. |
 | `ManagedBrowser.__init__` | Stores config and logger, creates temp dir, preps internal state. |
 | `ManagedBrowser.start` | Spawns or connects to the Chromium process, returns its CDP endpoint plus the `subprocess.Popen` handle. |
 | `ManagedBrowser._initial_startup_check` | Pings the CDP endpoint once to be sure the browser is alive, raises if not. |
 | `ManagedBrowser._monitor_browser_process` | Async-loops on the subprocess, logs exits or crashes, restarts if policy allows. |
 | `ManagedBrowser._get_browser_path_WIP` | Old helper that maps OS + browser type to an executable path. |
 | `ManagedBrowser._get_browser_path` | Current helper, checks env vars, Playwright cache, and OS defaults for the real executable. |
 | `ManagedBrowser._get_browser_args` | Builds the final CLI arg list by merging user flags, stealth flags, and defaults. |
 | `ManagedBrowser.cleanup` | Terminates the browser, stops monitors, deletes the temp dir. |
 | `ManagedBrowser.create_profile` | Opens a visible browser so a human can log in, then zips the resulting user-data-dir to `~/.crawl4ai/profiles/<name>`. |
 | `ManagedBrowser.list_profiles` | Thin wrapper, now forwarded to `BrowserProfiler.list_profiles()`. |
 | `ManagedBrowser.delete_profile` | Thin wrapper, now forwarded to `BrowserProfiler.delete_profile()`. |
 | `BrowserManager.__init__` | Holds the global Playwright instance, browser handle, config signature cache, session map, and logger. |
 | `BrowserManager.start` | Boots the underlying `ManagedBrowser`, then spins up the default Playwright browser context with stealth patches. |
 | `BrowserManager._build_browser_args` | Translates `CrawlerRunConfig` (proxy, UA, timezone, headless flag, etc.) into Playwright `launch_args`. |
 | `BrowserManager.setup_context` | Applies locale, geolocation, permissions, cookies, and UA overrides on a fresh context. |
 | `BrowserManager.create_browser_context` | Internal helper that actually calls `browser.new_context(**options)` after running `setup_context`. |
 | `BrowserManager._make_config_signature` | Hashes the non-ephemeral parts of `CrawlerRunConfig` so contexts can be reused safely. |
 | `BrowserManager.get_page` | Returns a ready `Page` for a given session id, reusing an existing one or creating a new context/page, injects helper scripts, updates `last_used`. |
 | `BrowserManager.kill_session` | Force-closes a context/page for a session and removes it from the session map. |
 | `BrowserManager._cleanup_expired_sessions` | Periodic sweep that drops sessions idle longer than `ttl_seconds`. |
 | `BrowserManager.close` | Gracefully shuts down all contexts, the browser, Playwright, and background tasks. |
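
The context-reuse idea behind `BrowserManager._make_config_signature` can be sketched in a few lines. This is only an illustration of the technique (hash the stable fields, ignore the per-run ones), not the project's actual implementation; the `EPHEMERAL_KEYS` set and the dictionary-based config are assumptions made for the example.

```python
import hashlib
import json

# Hypothetical set of per-run fields that should not affect context reuse.
EPHEMERAL_KEYS = frozenset({"url", "session_id", "cache_mode"})


def make_config_signature(config: dict) -> str:
    """Return a stable hash of the fields that shape a browser context."""
    stable = {k: v for k, v in config.items() if k not in EPHEMERAL_KEYS}
    # sort_keys makes the JSON text independent of dict insertion order.
    payload = json.dumps(stable, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Same context-relevant settings, different URL and key order.
cfg_a = {"locale": "en-US", "timezone_id": "America/Los_Angeles", "url": "https://example.com/a"}
cfg_b = {"timezone_id": "America/Los_Angeles", "locale": "en-US", "url": "https://example.com/b"}
# Different locale and timezone, so a different context is needed.
cfg_c = {"locale": "de-DE", "timezone_id": "Europe/Berlin", "url": "https://example.com/a"}

sig_a = make_config_signature(cfg_a)
sig_b = make_config_signature(cfg_b)
sig_c = make_config_signature(cfg_c)
```

Two configs that differ only in ephemeral fields map to the same signature, so a cache keyed on the signature can hand back an existing context instead of creating a new one.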
 ---
 ### browser_profiler.py
 | Function | What it does |
 |---|---|
 | `BrowserProfiler.__init__` | Sets up profile folder paths, async logger, and signal handlers. |
 | `BrowserProfiler.create_profile` | Launches a visible browser with a new user-data-dir for manual login, on exit compresses and stores it as a named profile. |
 | `BrowserProfiler.cleanup_handler` | General SIGTERM/SIGINT cleanup wrapper that kills child processes. |
 | `BrowserProfiler.sigint_handler` | Handles Ctrl-C during an interactive session, makes sure the browser shuts down cleanly. |
 | `BrowserProfiler.listen_for_quit_command` | Async REPL that exits when the user types `q`. |
 | `BrowserProfiler.list_profiles` | Enumerates `~/.crawl4ai/profiles`, prints profile name, browser type, size, and last modified. |
 | `BrowserProfiler.get_profile_path` | Returns the absolute path of a profile given its name, or `None` if missing. |
 | `BrowserProfiler.delete_profile` | Removes a profile folder or a direct path from disk, with optional confirmation prompt. |
 | `BrowserProfiler.interactive_manager` | Text UI loop for listing, creating, deleting, or launching profiles. |
 | `BrowserProfiler.launch_standalone_browser` | Starts a non-headless Chromium with remote debugging enabled and keeps it alive for manual tests. |
 | `BrowserProfiler.get_cdp_json` | Pulls `/json/version` from a CDP endpoint and returns the parsed JSON. |
 | `BrowserProfiler.launch_builtin_browser` | Spawns a headless Chromium in the background, saves `{wsEndpoint, pid, started_at}` to `~/.crawl4ai/builtin_browser.json`. |
 | `BrowserProfiler.get_builtin_browser_info` | Reads that JSON file, verifies the PID, and returns browser status info. |
 | `BrowserProfiler._is_browser_running` | Cross-platform helper that checks if a PID is still alive. |
 | `BrowserProfiler.kill_builtin_browser` | Terminates the background builtin browser and removes its status file. |
 | `BrowserProfiler.get_builtin_browser_status` | Returns `{running: bool, wsEndpoint, pid, started_at}` for quick health checks. |
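
The status-file pattern described for the builtin browser (write `{wsEndpoint, pid, started_at}`, later read it back and verify the PID) could look roughly like the sketch below. The helper names, the temporary path, and the endpoint string are hypothetical, and the liveness probe shown is POSIX-only; the real `_is_browser_running` is described as cross-platform and may work differently.

```python
import json
import os
import tempfile
import time
from pathlib import Path


def is_pid_running(pid: int) -> bool:
    """POSIX-only probe: signal 0 checks that a PID exists without signalling it."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def write_status(path: Path, ws_endpoint: str, pid: int) -> None:
    """Persist the fields the table lists for the builtin browser."""
    info = {"wsEndpoint": ws_endpoint, "pid": pid, "started_at": time.time()}
    path.write_text(json.dumps(info))


def read_status(path: Path) -> dict:
    """Return the stored info plus a `running` flag, or just running=False."""
    if not path.exists():
        return {"running": False}
    info = json.loads(path.read_text())
    info["running"] = is_pid_running(info["pid"])
    return info


# Demo against a throwaway file, using this process's own PID as the "browser".
status_path = Path(tempfile.mkdtemp()) / "builtin_browser.json"
missing = read_status(status_path)
write_status(status_path, "ws://127.0.0.1:9222/devtools/browser/demo", os.getpid())
current = read_status(status_path)
```

Checking the PID on every read keeps a stale file (left behind by a crashed browser) from being reported as a live instance.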
--- a/docs/codebase/cli.md
+++ b/docs/codebase/cli.md
@@ -0,0 +1,40 @@
 ### `cli.py` command surface
 | Command | Inputs / flags | What it does |
 |---|---|---|
 | **profiles** | *(none)* | Opens the interactive profile manager, lets you list, create, delete saved browser profiles that live in `~/.crawl4ai/profiles`. |
 | **browser status** | – | Prints whether the always-on *builtin* browser is running, shows its CDP URL, PID, start time. |
 | **browser stop** | – | Kills the builtin browser and deletes its status file. |
 | **browser view** | `--url, -u` URL *(optional)* | Pops a visible window of the builtin browser, navigates to `URL` or `about:blank`. |
 | **config list** | – | Dumps every global setting, showing current value, default, and description. |
 | **config get** | `key` | Prints the value of a single setting, falls back to default if unset. |
 | **config set** | `key value` | Persists a new value in the global config (stored under `~/.crawl4ai/config.yml`). |
 | **examples** | – | Just spits out real-world CLI usage samples. |
 | **crawl** | `url` *(positional)*<br>`--browser-config,-B` path<br>`--crawler-config,-C` path<br>`--filter-config,-f` path<br>`--extraction-config,-e` path<br>`--json-extract,-j` [desc]\*<br>`--schema,-s` path<br>`--browser,-b` k=v list<br>`--crawler,-c` k=v list<br>`--output,-o` all,json,markdown,md,markdown-fit,md-fit *(default all)*<br>`--output-file,-O` path<br>`--bypass-cache,-b` *(flag, default true; note flag reuse)*<br>`--question,-q` str<br>`--verbose,-v` *(flag)*<br>`--profile,-p` profile-name | One-shot crawl + extraction. Builds `BrowserConfig` and `CrawlerRunConfig` from inline flags or separate YAML/JSON files, runs `AsyncWebCrawler.arun()`, can route through a named saved profile and pipe the result to stdout or a file. |
 | **(default)** | Same flags as **crawl**, plus `--example` | Shortcut so you can type just `crwl https://site.com`. When first arg is not a known sub-command, it falls through to *crawl*. |
 \* `--json-extract/-j` with no value turns on LLM-based JSON extraction using an auto schema, supplying a string lets you prompt-engineer the field descriptions.
 > Quick mental model  
 > `profiles` = manage identities,  
 > `browser ...` = control long-running headless Chrome that all crawls can piggy-back on,  
 > `crawl` = do the actual work,  
 > `config` = tweak global defaults,  
 > everything else is sugar.
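
The "(default)" row in the command table describes a fall-through: if the first argument is not a known sub-command, the call is treated as `crawl`. A minimal sketch of that dispatch rule, assuming a plain argument list and a hand-written command set (the real CLI may implement this differently):

```python
# Sub-command names taken from the tables in this document.
KNOWN_COMMANDS = {"profiles", "browser", "config", "examples", "crawl", "cdp"}


def resolve_command(argv: list[str]) -> list[str]:
    """Prepend 'crawl' when the first argument is not a known sub-command."""
    if argv and argv[0] not in KNOWN_COMMANDS:
        return ["crawl", *argv]
    return argv


implicit = resolve_command(["https://site.com", "-p", "my-profile"])
explicit = resolve_command(["crawl", "https://site.com", "-p", "my-profile"])
other = resolve_command(["profiles"])
```

With this rule `crwl https://site.com -p my-profile` and `crwl crawl https://site.com -p my-profile` resolve to the same argument list, which matches the "identical to default alias" note in the cheatsheet.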
 ### Quick-fire “profile” usage cheatsheet
 | Scenario | Command (copy-paste ready) | Notes |
 |---|---|---|
 | **Launch interactive Profile Manager UI** | `crwl profiles` | Opens TUI with options: 1 List, 2 Create, 3 Delete, 4 Use-to-crawl, 5 Exit. |
 | **Create a fresh profile** | `crwl profiles` → choose **2** → name it → browser opens → log in → press **q** in terminal | Saves to `~/.crawl4ai/profiles/<name>`. |
 | **List saved profiles** | `crwl profiles` → choose **1** | Shows name, browser type, size, last-modified. |
 | **Delete a profile** | `crwl profiles` → choose **3** → pick the profile index → confirm | Removes the folder. |
 | **Crawl with a profile (default alias)** | `crwl https://site.com/dashboard -p my-profile` | Keeps login cookies, sets `use_managed_browser=true` under the hood. |
 | **Crawl + verbose JSON output** | `crwl https://site.com -p my-profile -o json -v` | Any other `crawl` flags work the same. |
 | **Crawl with extra browser tweaks** | `crwl https://site.com -p my-profile -b "headless=true,viewport_width=1680"` | CLI overrides go on top of the profile. |
 | **Same but via explicit sub-command** | `crwl crawl https://site.com -p my-profile` | Identical to default alias. |
 | **Use profile from inside Profile Manager** | `crwl profiles` → choose **4** → pick profile → enter URL → follow prompts | Handy when demo-ing to non-CLI folks. |
 | **One-off crawl with a profile folder path (no name lookup)** | `crwl https://site.com -b "user_data_dir=$HOME/.crawl4ai/profiles/my-profile,use_managed_browser=true"` | Bypasses registry, useful for CI scripts. |
 | **Launch a dev browser on CDP port with the same identity** | `crwl cdp -d $HOME/.crawl4ai/profiles/my-profile -P 9223` | Lets Puppeteer/Playwright attach for debugging. |
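
Several rows above pass overrides as a `k=v` list, for example `-b "headless=true,viewport_width=1680"`. The sketch below shows one way such a string could be turned into a dictionary. The function names and the type-coercion rules are assumptions for illustration, not the CLI's documented behaviour, and values that themselves contain commas are not handled.

```python
def coerce(raw: str):
    """Best-effort conversion of a string to bool, int, or float."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    try:
        return int(raw)
    except ValueError:
        pass
    try:
        return float(raw)
    except ValueError:
        return raw  # leave anything else as a plain string


def parse_kv_list(text: str) -> dict:
    """Parse 'a=1,b=true' into {'a': 1, 'b': True}."""
    result = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, raw = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        result[key.strip()] = coerce(raw.strip())
    return result


overrides = parse_kv_list("headless=true,viewport_width=1680")
```

Layering the parsed dictionary over the values loaded from a saved profile would give the "CLI overrides go on top of the profile" behaviour the table mentions.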
--- a/docs/examples/crypto_analysis_example.py
+++ b/docs/examples/crypto_analysis_example.py
@@ -391,12 +391,14 @@ async def main():
        # Process results
        raw_df = pd.DataFrame()
        for result in results:
-            if result.success and result.media["tables"]:
+            # Use the new tables field, falling back to media["tables"] for backward compatibility
            tables = result.tables if hasattr(result, "tables") and result.tables else result.media.get("tables", [])
            if result.success and tables:
                # Extract primary market table
                # DataFrame
                raw_df = pd.DataFrame(
-                    result.media["tables"][0]["rows"],
+                    tables[0]["rows"],
-                    columns=result.media["tables"][0]["headers"],
+                    columns=tables[0]["headers"],
                )
                break
--- a/docs/examples/docker/demo_docker_api.py
+++ b/docs/examples/docker/demo_docker_api.py
--- a/docs/examples/hello_world.py
+++ b/docs/examples/hello_world.py
@@ -31,7 +31,7 @@ async def example_cdp():
 async def main():
-    browser_config = BrowserConfig(headless=True, verbose=True)
+    browser_config = BrowserConfig(headless=False, verbose=True)
    async with AsyncWebCrawler(config=browser_config) as crawler:
        crawler_config = CrawlerRunConfig(
            cache_mode=CacheMode.BYPASS,
--- a/docs/md_v2/assets/layout.css
+++ b/docs/md_v2/assets/layout.css
@@ -412,17 +412,41 @@ footer {
    background-color: var(--primary-dimmed-color, #09b5a5);
    color: var(--background-color, #070708);
    border: none;
-    padding: 4px 8px;
+    padding: 6px 10px;
    font-size: 0.8em;
    border-radius: 4px;
    cursor: pointer;
-    box-shadow: 0 2px 5px rgba(0, 0, 0, 0.3);
+    box-shadow: 0 3px 8px rgba(0, 0, 0, 0.3);
-    transition: background-color 0.2s ease;
+    transition: background-color 0.2s ease, transform 0.15s ease;
    white-space: nowrap;
    display: flex;
    align-items: center;
    font-weight: 500;
    animation: askAiButtonAppear 0.2s ease-out;
 }
@keyframes askAiButtonAppear {
    from {
        opacity: 0;
        transform: scale(0.9);
    }
    to {
        opacity: 1;
        transform: scale(1);
    }
 }
 .ask-ai-selection-button:hover {
    background-color: var(--primary-color, #50ffff);
    transform: scale(1.05);
 }
 /* Mobile styles for Ask AI button */
@media screen and (max-width: 768px) {
    .ask-ai-selection-button {
        padding: 8px 12px; /* Larger touch target on mobile */
        font-size: 0.9em; /* Slightly larger text */
    }
 }
 /* ==== File: docs/assets/layout.css (Additions) ==== */
--- a/docs/md_v2/assets/selection_ask_ai.js
+++ b/docs/md_v2/assets/selection_ask_ai.js
@@ -8,12 +8,32 @@ document.addEventListener('DOMContentLoaded', () => {
        const button = document.createElement('button');
        button.id = 'ask-ai-selection-btn';
        button.className = 'ask-ai-selection-button';
-        button.textContent = 'Ask AI'; // Or use an icon
+        
        // Add icon and text for better visibility
        button.innerHTML = `
            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="12" height="12" fill="currentColor" style="margin-right: 4px; vertical-align: middle;">
                <path d="M20 2H4c-1.1 0-2 .9-2 2v12c0 1.1.9 2 2 2h14l4 4V4c0-1.1-.9-2-2-2z"/>
            </svg>
            <span>Ask AI</span>
        `;
        // Common styles
        button.style.display = 'none'; // Initially hidden
        button.style.position = 'absolute';
        button.style.zIndex = '1500'; // Ensure it's on top
-        document.body.appendChild(button);
+        button.style.boxShadow = '0 3px 8px rgba(0, 0, 0, 0.4)'; // More pronounced shadow
        button.style.transition = 'transform 0.15s ease, background-color 0.2s ease'; // Smooth hover effect
        // Add transform on hover
        button.addEventListener('mouseover', () => {
            button.style.transform = 'scale(1.05)';
        });
        button.addEventListener('mouseout', () => {
            button.style.transform = 'scale(1)';
        });
        document.body.appendChild(button);
        button.addEventListener('click', handleAskAiClick);
        return button;
    }
@@ -43,11 +63,38 @@ document.addEventListener('DOMContentLoaded', () => {
        const range = selection.getRangeAt(0);
        const rect = range.getBoundingClientRect();
-        // Calculate position: top-right of the selection
+        // Get viewport dimensions
        const viewportWidth = window.innerWidth;
        const viewportHeight = window.innerHeight;
        // Calculate position based on selection
        const scrollX = window.scrollX;
        const scrollY = window.scrollY;
-        const buttonTop = rect.top + scrollY - askAiButton.offsetHeight - 5; // 5px above
+        
-        const buttonLeft = rect.right + scrollX + 5; // 5px to the right
+        // Default position (top-right of selection)
        let buttonTop = rect.top + scrollY - askAiButton.offsetHeight - 5; // 5px above
        let buttonLeft = rect.right + scrollX + 5; // 5px to the right
        // Check if we're on mobile (which we define as less than 768px)
        const isMobile = viewportWidth <= 768;
        if (isMobile) {
            // On mobile, position centered above selection to avoid edge issues
            buttonTop = rect.top + scrollY - askAiButton.offsetHeight - 10; // 10px above on mobile
            buttonLeft = rect.left + scrollX + (rect.width / 2) - (askAiButton.offsetWidth / 2); // Centered
        } else {
            // For desktop, ensure the button doesn't go off screen
            // Check right edge
            if (buttonLeft + askAiButton.offsetWidth > scrollX + viewportWidth) {
                buttonLeft = scrollX + viewportWidth - askAiButton.offsetWidth - 10; // 10px from right edge
            }
        }
        // Check top edge (for all devices)
        if (buttonTop < scrollY) {
            // If would go above viewport, position below selection instead
            buttonTop = rect.bottom + scrollY + 5; // 5px below
        }
        askAiButton.style.top = `${buttonTop}px`;
        askAiButton.style.left = `${buttonLeft}px`;
@@ -77,8 +124,8 @@ document.addEventListener('DOMContentLoaded', () => {
    // --- Event Listeners ---
-    // Show button on mouse up after selection
+    // Function to handle selection events (both mouse and touch)
-    document.addEventListener('mouseup', (event) => {
+    function handleSelectionEvent(event) {
        // Slight delay to ensure selection is registered
        setTimeout(() => {
            const selectedText = getSafeSelectedText();
@@ -86,7 +133,7 @@ document.addEventListener('DOMContentLoaded', () => {
                if (!askAiButton) {
                    askAiButton = createAskAiButton();
                }
-                // Don't position if the click was ON the button itself
+                // Don't position if the event was ON the button itself
                 if (!askAiButton.contains(event.target)) {
                     positionButton(event);
                }
@@ -94,16 +141,46 @@ document.addEventListener('DOMContentLoaded', () => {
                hideButton();
            }
        }, 10); // Small delay
    }
    // Mouse selection events (desktop)
    document.addEventListener('mouseup', handleSelectionEvent);
    // Touch selection events (mobile)
    document.addEventListener('touchend', handleSelectionEvent);
    document.addEventListener('selectionchange', () => {
        // This helps with mobile selection which can happen without mouseup/touchend
        setTimeout(() => {
            const selectedText = getSafeSelectedText();
            if (selectedText && askAiButton) {
                positionButton();
            }
        }, 300); // Longer delay for selection change
    });
-    // Hide button on scroll or click elsewhere
+    // Hide button on various events
     document.addEventListener('mousedown', (event) => {
         // Hide if clicking anywhere EXCEPT the button or its icon/label children
         if (askAiButton && !askAiButton.contains(event.target)) {
             hideButton();
         }
     });
     document.addEventListener('touchstart', (event) => {
         // Same for touch events: only hide if the touch is outside the button
         if (askAiButton && !askAiButton.contains(event.target)) {
             hideButton();
         }
     });
    document.addEventListener('scroll', hideButton, true); // Capture scroll events
    // Also hide when pressing Escape key
    document.addEventListener('keydown', (event) => {
        if (event.key === 'Escape') {
            hideButton();
        }
    });
    console.log("Selection Ask AI script loaded.");
 });
--- a/docs/md_v2/blog/index.md
+++ b/docs/md_v2/blog/index.md
@@ -4,6 +4,32 @@ Welcome to the Crawl4AI blog! Here you'll find detailed release notes, technical
 ## Latest Release
 ---
 ### [Crawl4AI v0.6.0 – World-Aware Crawling, Pre-Warmed Browsers, and the MCP API](releases/0.6.0.md)
 *April 23, 2025*
 Crawl4AI v0.6.0 is our most powerful release yet. This update brings major architectural upgrades including world-aware crawling (set geolocation, locale, and timezone), real-time traffic capture, and a memory-efficient crawler pool with pre-warmed pages.  
 The Docker server now exposes a full-featured MCP socket + SSE interface, supports streaming, and comes with a new Playground UI. Plus, table extraction is now native, and the new stress-test framework supports crawling 1,000+ URLs.  
 Other key changes:  
 * Native support for `result.media["tables"]` to export DataFrames  
 * Full network + console logs and MHTML snapshot per crawl  
 * Browser pooling and pre-warming for faster cold starts  
 * New streaming endpoints via MCP API and Playground  
 * Robots.txt support, proxy rotation, and improved session handling  
 * Deprecated old markdown names, legacy modules cleaned up  
 * Massive repo cleanup: ~36K insertions, ~5K deletions across 121 files
 [Read full release notes →](releases/0.6.0.md)
 ---
 ### [Crawl4AI v0.5.0: Deep Crawling, Scalability, and a New CLI!](releases/0.5.0.md)
--- a/docs/md_v2/blog/releases/0.6.0.md
+++ b/docs/md_v2/blog/releases/0.6.0.md
@@ -1,51 +1,143 @@
-# Crawl4AI 0.6.0
+# Crawl4AI v0.6.0 Release Notes
-*Release date: 2025‑04‑22*
+We're excited to announce the release of **Crawl4AI v0.6.0**, our biggest and most feature-rich update yet. This version introduces major architectural upgrades, brand-new capabilities for geo-aware crawling, high-efficiency scraping, and real-time streaming support for scalable deployments.
 0.6.0 is the **biggest jump** since the 0.5 series, packing a smarter browser core, pool‑based crawlers, and a ton of DX candy. Expect faster runs, lower RAM burn, and richer diagnostics.
 ---
-## 🚀 Key upgrades
+## Highlights
-| Area | What changed |
+### 1. **World-Aware Crawlers**
-|------|--------------|
+Crawl as if you’re anywhere in the world. With v0.6.0, each crawl can simulate:
-| **Browser** | New **Browser** management with pooling, page pre‑warm, geolocation + locale + timezone switches |
+- Specific GPS coordinates
-| **Crawler** | Console and network log capture, MHTML snapshots, safer `get_page` API |
+- Browser locale
-| **Server & API** | **Crawler Pool Manager** endpoint, MCP socket + SSE support |
+- Timezone
-| **Docs** | v2 layout, floating Ask‑AI helper, GitHub stats badge, copy‑code buttons, Docker API demo |
+
-| **Tests** | Memory + load benchmarks, 90+ new cases covering MCP and Docker |
+Example:
 ```python
 CrawlerRunConfig(
    url="https://browserleaks.com/geo",
    locale="en-US",
    timezone_id="America/Los_Angeles",
    geolocation=GeolocationConfig(
        latitude=34.0522,
        longitude=-118.2437,
        accuracy=10.0
    )
 )
 ```
 Great for accessing region-specific content or testing global behavior.
 ---
-## ⚠️ Breaking changes
+### 2. **Native Table Extraction**
 Extract HTML tables directly into usable formats like Pandas DataFrames or CSV with zero parsing hassle. All table data is available under `result.media["tables"]`.
-1. **`get_page` signature** – returns `(html, metadata)` instead of plain html.
+Example:
-2. **Docker** – new Chromium base layer, rebuild images.
+```python
 raw_df = pd.DataFrame(
    result.media["tables"][0]["rows"],
    columns=result.media["tables"][0]["headers"]
 )
 ```
 This makes it ideal for scraping financial data, pricing pages, or anything tabular.
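
For the CSV route mentioned above, the standard library is enough. The sketch assumes each table is a dictionary with `headers` and `rows` keys, as in the DataFrame example; the sample values are made up for illustration.

```python
import csv
import io


def table_to_csv(table: dict) -> str:
    """Serialise a {'headers': [...], 'rows': [[...], ...]} table to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(table["headers"])
    writer.writerows(table["rows"])
    return buf.getvalue()


# Illustrative table; a real one would come from a crawl result.
sample_table = {
    "headers": ["Name", "Value"],
    "rows": [["alpha", "1"], ["beta", "2,5"]],
}
csv_text = table_to_csv(sample_table)
```

The `csv` writer quotes cells that contain commas, so values such as `2,5` survive a round trip.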
 ---
-## How to upgrade
+### 3. **Browser Pooling & Pre-Warming**
 We've overhauled browser management. Now, multiple browser instances can be pooled and pages pre-warmed for ultra-fast launches:
 - Reduces cold-start latency
 - Lowers memory spikes
 - Enhances parallel crawling stability
 This powers the new **Docker Playground** experience and streamlines heavy-load crawling.
 ---
 ### 4. **Traffic & Snapshot Capture**
 Need full visibility? You can now capture:
 - Full network traffic logs
 - Console output
 - MHTML page snapshots for post-crawl audits and debugging
 No more guesswork on what happened during your crawl.
 ---
 ### 5. **MCP API and Streaming Support**
 We’re exposing **MCP socket and SSE endpoints**, allowing:
 - Live streaming of crawl results
 - Real-time integration with agents or frontends
 - A new Playground UI for interactive crawling
 This is a major step towards making Crawl4AI real-time ready.
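
On the client side, an SSE stream is plain text: `data:` lines accumulate into one event, and a blank line ends the event. The sketch below parses that framing from any iterable of lines. It is a generic reader for the SSE wire format and makes no assumption about the payloads Crawl4AI sends; the function name and sample lines are hypothetical.

```python
def iter_sse_data(lines):
    """Yield the data payload of each server-sent event in `lines`."""
    data_lines = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                yield "\n".join(data_lines)
                data_lines = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            data_lines.append(value)


sample_stream = ['data: {"a": 1}', "", ": ping", "data: x", "data: y", ""]
events = list(iter_sse_data(sample_stream))
```

Each yielded string can then be passed to `json.loads` if the server sends JSON payloads.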
 ---
 ### 6. **Stress-Test Framework**
 Want to test performance under heavy load? v0.6.0 includes a new memory stress-test suite that supports 1,000+ URL workloads. Ideal for:
 - Load testing
 - Performance benchmarking
 - Validating memory efficiency
 ---
 ## Core Improvements
 - Robots.txt compliance
 - Proxy rotation support
 - Improved URL normalization and session reuse
 - Shared data across crawler hooks
 - New page routing logic
 ---
 ## Breaking Changes & Deprecations
 - Legacy `crawl4ai/browser/*` modules are removed. Update imports accordingly.
 - `AsyncPlaywrightCrawlerStrategy.get_page` now uses a new function signature.
 - Deprecated markdown generator aliases now point to `DefaultMarkdownGenerator` and emit a warning.
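
For readers unfamiliar with the alias-plus-warning pattern, a generic sketch follows. The `DefaultMarkdownGenerator` class here is a local stand-in defined for the example, and `LegacyMarkdownGenerator` is a hypothetical old name. Neither reflects the library's actual code, and the real aliases may be wired differently.

```python
import warnings


class DefaultMarkdownGenerator:
    """Local stand-in for the real class, defined only for this sketch."""

    def generate(self, html: str) -> str:
        return html.strip()


class LegacyMarkdownGenerator(DefaultMarkdownGenerator):
    """Hypothetical deprecated name kept as an alias for the new class."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "LegacyMarkdownGenerator is deprecated; use DefaultMarkdownGenerator.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this file
        )
        super().__init__(*args, **kwargs)


# Old code keeps working but surfaces a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    gen = LegacyMarkdownGenerator()

print(type(gen).__name__, "->", caught[0].category.__name__)
```

Running your test suite with `python -W error::DeprecationWarning` is one way to find remaining uses of old names before they are removed.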
 ---
 ## Miscellaneous Updates
 - FastAPI validators replaced custom validation logic
 - Docker build now based on a Chromium layer
 - Repo-wide cleanup: ~36,000 insertions, ~5,000 deletions
 ---
 ## New Examples Included
 - Geo-location crawling
 - Network + console log capture
 - Docker MCP API usage
 - Markdown selector usage
 - Crypto project data extraction
 ---
 ## Watch the Release Video
 Want a visual walkthrough of all these updates? Watch the video:
 🔗 https://youtu.be/9x7nVcjOZks
 If you're new to Crawl4AI, start here:
 🔗 https://www.youtube.com/watch?v=xo3qK6Hg9AA&t=15s
 ---
 ## Join the Community
 We’ve just opened our **Discord** to the public. Join us to:
 - Ask questions
 - Share your projects
 - Get help or contribute
 💬 https://discord.gg/wpYFACrHR4
 ---
 ## Install or Upgrade
 ```bash
-pip install -U crawl4ai==0.6.0
+pip install -U crawl4ai
 ```
 ---
-## Full changelog
+Live long and import crawl4ai. 🖖
 The diff between `main` and `next` spans **36 k insertions, 4.9 k deletions** over 121 files. Read the [compare view](https://github.com/unclecode/crawl4ai/compare/0.5.0.post8...0.6.0) or see `CHANGELOG.md` for the granular list.
 ---
 ## Upgrade tips
 * Using the Docker API? Pull `unclecode/crawl4ai:0.6.0`. New args are documented in `/deploy/docker/README.md`.
 * Stress‑test your stack with `tests/memory/run_benchmark.py` before production rollout.
 * Markdown generators are renamed but aliased. Update when convenient; the warnings will remind you.
 ---
 Happy crawling! Ping `@unclecode` on X for questions or memes.
--- a/docs/md_v2/core/docker-deployment.md
+++ b/docs/md_v2/core/docker-deployment.md
@@ -58,7 +58,7 @@ Pull and run images directly from Docker Hub without building locally.
 #### 1. Pull the Image
-Our latest release candidate is `0.6.0rc1-r2`. Images are built with multi-arch manifests, so Docker automatically pulls the correct version for your system.
+Our latest release candidate is `0.6.0-r2`. Images are built with multi-arch manifests, so Docker automatically pulls the correct version for your system.
 ```bash
 # Pull the release candidate (recommended for latest features)
@@ -124,9 +124,9 @@ docker stop crawl4ai && docker rm crawl4ai
 #### Docker Hub Versioning Explained
 *   **Image Name:** `unclecode/crawl4ai`
-*   **Tag Format:** `LIBRARY_VERSION[-SUFFIX]` (e.g., `0.6.0rc1-r2`)
+*   **Tag Format:** `LIBRARY_VERSION[-SUFFIX]` (e.g., `0.6.0-r2`)
    *   `LIBRARY_VERSION`: The semantic version of the core `crawl4ai` Python library
-    *   `SUFFIX`: Optional tag for release candidates (`rc1`) and revisions (`r1`)
+    *   `SUFFIX`: Optional tag for release candidates (e.g., `rc1`) and revisions (`r1`)
 *   **`latest` Tag:** Points to the most recent stable version
 *   **Multi-Architecture Support:** All images support both `linux/amd64` and `linux/arm64` architectures through a single tag
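
The tag format above can be split mechanically. The following is a minimal sketch, assuming the version and suffix are separated by the first hyphen. `parse_tag` is a hypothetical helper, not something shipped with the image or the library.

```python
from typing import NamedTuple, Optional


class ImageTag(NamedTuple):
    version: Optional[str]  # LIBRARY_VERSION, or None for "latest"
    suffix: Optional[str]   # SUFFIX such as "r2", or None if absent


def parse_tag(tag: str) -> ImageTag:
    """Split a tag of the form LIBRARY_VERSION[-SUFFIX] at the first hyphen."""
    if tag == "latest":
        return ImageTag(version=None, suffix=None)
    version, _, suffix = tag.partition("-")
    if not version:
        raise ValueError(f"tag has no version part: {tag!r}")
    return ImageTag(version=version, suffix=suffix or None)


for tag in ("0.6.0-r2", "0.6.0", "latest"):
    print(tag, "->", parse_tag(tag))
```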
--- a/tests/profiler/test_crteate_profile.py
+++ b/tests/profiler/test_crteate_profile.py
@@ -0,0 +1,32 @@
 from crawl4ai import BrowserProfiler
 import asyncio
 if __name__ == "__main__":
    # Example usage
    profiler = BrowserProfiler()
    # Create a new profile
    import os
    from pathlib import Path
    home_dir = Path.home()
    profile_path = asyncio.run(profiler.create_profile( str(home_dir / ".crawl4ai/profiles/test-profile")))
    print(f"Profile created at: {profile_path}")
    # # Launch a standalone browser
    # asyncio.run(profiler.launch_standalone_browser())
    # # List profiles
    # profiles = profiler.list_profiles()
    # for profile in profiles:
    #     print(f"Profile: {profile['name']}, Path: {profile['path']}")
    # # Delete a profile
    # success = profiler.delete_profile("my-profile")
    # if success:
    #     print("Profile deleted successfully")
    # else:
    #     print("Failed to delete profile")
Author	SHA1	Message	Date
UncleCode	0e5d672763	Merge branch 'pr-971' into merge-pr971	2025-05-01 18:57:28 +08:00
wakaka6	cd2b490b40	refactor(logger): Apply the Enumeration for color	2025-05-01 17:04:44 +08:00
UncleCode	50f0b83fcd	feat(linkedin): add prospect-wizard app with scraping and visualization Add new LinkedIn prospect discovery tool with three main components: - c4ai_discover.py for company and people scraping - c4ai_insights.py for org chart and decision maker analysis - Interactive graph visualization with company/people exploration Features include: - Configurable LinkedIn search and scraping - Org chart generation with decision maker scoring - Interactive network graph visualization - Company similarity analysis - Chat interface for data exploration Requires: crawl4ai, openai, sentence-transformers, networkx	2025-04-30 19:38:25 +08:00
UncleCode	9499164d3c	feat(browser): improve browser profile management and cleanup Enhance browser profile handling with better process cleanup and documentation: - Add process cleanup for existing Chromium instances on Windows/Unix - Fix profile creation by passing complete browser config - Add comprehensive documentation for browser and CLI components - Add initial profile creation test - Bump version to 0.6.3 This change improves reliability when managing browser profiles and provides better documentation for developers.	2025-04-29 23:04:32 +08:00
UncleCode	2140d9aca4	fix(browser): correct headless mode default behavior Modify BrowserConfig to respect explicit headless parameter setting instead of forcing True. Update version to 0.6.2 and clean up code formatting in examples. BREAKING CHANGE: BrowserConfig no longer defaults to headless=True when explicitly set to False	2025-04-26 21:09:50 +08:00
UncleCode	ccec40ed17	feat(models): add dedicated tables field to CrawlResult - Add tables field to CrawlResult model while maintaining backward compatibility - Update async_webcrawler.py to extract tables from media and pass to tables field - Update crypto_analysis_example.py to use the new tables field - Add /config/dump examples to demo_docker_api.py - Bump version to 0.6.1	2025-04-24 18:36:25 +08:00
UncleCode	ad4dfb21e1	Remoce "rc1"	2025-04-23 21:00:00 +08:00
UncleCode	7784b2468e	feat(docs): enhance Ask AI button UX and add v0.6.0 release notes Improve Ask AI button with better mobile support, animations, and positioning: - Add button animations and hover effects - Improve mobile responsiveness - Add icon to button - Fix positioning logic for different viewport sizes - Add keyboard (Escape) support Add comprehensive v0.6.0 release documentation: - Create detailed release notes - Update blog index with latest release - Document all major features and breaking changes BREAKING CHANGE: Documentation structure updated with new v0.6.0 section	2025-04-23 20:07:03 +08:00
wakaka6	b2f3cb0dfa	WIP: logger migriate to rich	2025-04-11 00:44:43 +08:00
`@@ -1,3 +1,3 @@`
	`# crawl4ai/_version.py`	`# crawl4ai/_version.py`
	`__version__ = "0.6.0"`	`__version__ = "0.6.3"`