Merge branch 'pr-971' into merge-pr971

refactor(logger): Apply the Enumeration for color
feat(linkedin): add prospect-wizard app with scraping and visualization
2025-05-01 18:57:28 +08:00 · 2025-05-01 17:04:44 +08:00 · 2025-04-30 19:38:25 +08:00 · 2025-04-29 23:04:32 +08:00 · 2025-04-26 21:09:50 +08:00 · 2025-04-24 18:36:25 +08:00
34 changed files with 3769 additions and 420 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,7 +5,16 @@ All notable changes to Crawl4AI will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

-## [0.6.0rc1] ‑ 2025‑04‑22
+## [0.6.1] - 2025-04-24
+
+### Added
+- New dedicated `tables` field in `CrawlResult` model for better table extraction handling
+- Updated crypto_analysis_example.py to use the new tables field with backward compatibility
+
+### Changed
+- Improved playground UI in Docker deployment with better endpoint handling and UI feedback
+
+## [0.6.0] ‑ 2025‑04‑22

 ### Added
 - Browser pooling with page pre‑warming and fine‑grained **geolocation, locale, and timezone** controls  
--- a/README.md
+++ b/README.md
@@ -21,9 +21,9 @@

 Crawl4AI is the #1 trending GitHub repository, actively maintained by a vibrant community. It delivers blazing-fast, AI-ready web crawling tailored for LLMs, AI agents, and data pipelines. Open source, flexible, and built for real-time performance, Crawl4AI empowers developers with unmatched speed, precision, and deployment ease.  

-[✨ Check out latest update v0.6.0rc1](#-recent-updates)
+[✨ Check out latest update v0.6.0](#-recent-updates)

-🎉 **Version 0.6.0rc1 is now available!** This release candidate introduces World-aware Crawling with geolocation and locale settings, Table-to-DataFrame extraction, Browser pooling with pre-warming, Network and console traffic capture, MCP integration for AI tools, and a completely revamped Docker deployment! [Read the release notes →](https://docs.crawl4ai.com/blog)
+🎉 **Version 0.6.0 is now available!** This release introduces World-aware Crawling with geolocation and locale settings, Table-to-DataFrame extraction, Browser pooling with pre-warming, Network and console traffic capture, MCP integration for AI tools, and a completely revamped Docker deployment! [Read the release notes →](https://docs.crawl4ai.com/blog)

 <details>
 <summary>🤓 <strong>My Personal Story</strong></summary>
@@ -505,7 +505,7 @@ async def test_news_crawl():

 ## ✨ Recent Updates

-### Version 0.6.0rc1 Release Highlights
+### Version 0.6.0 Release Highlights

 - **🌎 World-aware Crawling**: Set geolocation, language, and timezone for authentic locale-specific content:
  ```python
@@ -575,7 +575,7 @@ async def test_news_crawl():

 - **📱 Multi-stage Build System**: Optimized Dockerfile with platform-specific performance enhancements

-Read the full details in our [0.6.0rc1 Release Notes](https://docs.crawl4ai.com/blog/releases/0.6.0.html) or check the [CHANGELOG](https://github.com/unclecode/crawl4ai/blob/main/CHANGELOG.md).
+Read the full details in our [0.6.0 Release Notes](https://docs.crawl4ai.com/blog/releases/0.6.0.html) or check the [CHANGELOG](https://github.com/unclecode/crawl4ai/blob/main/CHANGELOG.md).

 ### Previous Version: 0.5.0 Major Release Highlights

@@ -606,7 +606,7 @@ We use different suffixes to indicate development stages:
 - `dev` (0.4.3dev1): Development versions, unstable
 - `a` (0.4.3a1): Alpha releases, experimental features
 - `b` (0.4.3b1): Beta releases, feature complete but needs testing
- `rc` (0.4.3rc1): Release candidates, potential final version
+- `rc` (0.4.3rc1): Release candidates, potential final version

 #### Installation
 - Regular installation (stable version):
--- a/crawl4ai/version.py
+++ b/crawl4ai/version.py
@@ -1,3 +1,3 @@
 # crawl4ai/_version.py
-__version__ = "0.6.0"
+__version__ = "0.6.3"

--- a/crawl4ai/async_configs.py
+++ b/crawl4ai/async_configs.py
@@ -427,7 +427,7 @@ class BrowserConfig:
        host: str = "localhost",
    ):
        self.browser_type = browser_type
-        self.headless = headless or True
+        self.headless = headless
        self.browser_mode = browser_mode
        self.use_managed_browser = use_managed_browser
        self.cdp_url = cdp_url
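
The `headless` change in this hunk is easiest to see in isolation. A minimal sketch, using two hypothetical helper functions rather than the real `BrowserConfig`, of why `headless or True` can never produce `False`:

```python
def resolve_headless_old(headless: bool = True) -> bool:
    # `False or True` evaluates to True, so an explicit False is lost.
    return headless or True


def resolve_headless_new(headless: bool = True) -> bool:
    # Plain assignment keeps whatever the caller passed.
    return headless


print(resolve_headless_old(False))  # True
print(resolve_headless_new(False))  # False
```

With the old expression, a caller asking for a visible browser would still get a headless one.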
--- a/crawl4ai/async_database.py
+++ b/crawl4ai/async_database.py
@@ -171,7 +171,10 @@ class AsyncDatabaseManager:
                            f"Code context:\n{error_context['code_context']}"
                        )
                        self.logger.error(
-                            message=create_box_message(error_message, type="error"),
+                            message="{error}",
+                            tag="ERROR",
+                            params={"error": str(error_message)},
+                            boxes=["error"],
                        )

                        raise
@@ -189,7 +192,10 @@ class AsyncDatabaseManager:
                f"Code context:\n{error_context['code_context']}"
            )
            self.logger.error(
-                message=create_box_message(error_message, type="error"),
+                message="{error}",
+                tag="ERROR",
+                params={"error": str(error_message)},
+                boxes=["error"],
            )
            raise
        finally:
--- a/crawl4ai/async_logger.py
+++ b/crawl4ai/async_logger.py
@@ -1,10 +1,12 @@
 from abc import ABC, abstractmethod
 from enum import Enum
-from typing import Optional, Dict, Any
-from colorama import Fore, Style, init
+from typing import Optional, Dict, Any, List
 import os
 from datetime import datetime
 from urllib.parse import unquote
+from rich.console import Console
+from rich.text import Text
+from .utils import create_box_message


 class LogLevel(Enum):
@@ -21,6 +23,26 @@ class LogLevel(Enum):
    FATAL = 10
    

+    def __str__(self):
+        return self.name.lower()
+
+class LogColor(str, Enum):
+    """Enum for log colors."""
+
+    DEBUG = "bright_black"
+    INFO = "cyan"
+    SUCCESS = "green"
+    WARNING = "yellow"
+    ERROR = "red"
+    CYAN = "cyan"
+    GREEN = "green"
+    YELLOW = "yellow"
+    MAGENTA = "magenta"
+    DIM_MAGENTA = "dim magenta"
+
+    def __str__(self):
+        """Automatically convert rich color to string."""
+        return self.value


 class AsyncLoggerBase(ABC):
@@ -52,6 +74,7 @@ class AsyncLoggerBase(ABC):
    def error_status(self, url: str, error: str, tag: str = "ERROR", url_length: int = 100):
        pass

+
 class AsyncLogger(AsyncLoggerBase):
    """
    Asynchronous logger with support for colored console output and file logging.
@@ -79,17 +102,11 @@ class AsyncLogger(AsyncLoggerBase):
    }

    DEFAULT_COLORS = {
-        LogLevel.DEBUG: Fore.LIGHTBLACK_EX,
-        LogLevel.INFO: Fore.CYAN,
-        LogLevel.SUCCESS: Fore.GREEN,
-        LogLevel.WARNING: Fore.YELLOW,
-        LogLevel.ERROR: Fore.RED,
-        LogLevel.CRITICAL: Fore.RED + Style.BRIGHT,
-        LogLevel.ALERT: Fore.RED + Style.BRIGHT,
-        LogLevel.NOTICE: Fore.BLUE,
-        LogLevel.EXCEPTION: Fore.RED + Style.BRIGHT,
-        LogLevel.FATAL: Fore.RED + Style.BRIGHT,
-        LogLevel.DEFAULT: Fore.WHITE,
+        LogLevel.DEBUG: LogColor.DEBUG,
+        LogLevel.INFO: LogColor.INFO,
+        LogLevel.SUCCESS: LogColor.SUCCESS,
+        LogLevel.WARNING: LogColor.WARNING,
+        LogLevel.ERROR: LogColor.ERROR,
    }

    def __init__(
@@ -98,7 +115,7 @@ class AsyncLogger(AsyncLoggerBase):
        log_level: LogLevel = LogLevel.DEBUG,
        tag_width: int = 10,
        icons: Optional[Dict[str, str]] = None,
-        colors: Optional[Dict[LogLevel, str]] = None,
+        colors: Optional[Dict[LogLevel, LogColor]] = None,
        verbose: bool = True,
    ):
        """
@@ -112,13 +129,13 @@ class AsyncLogger(AsyncLoggerBase):
            colors: Custom colors for different log levels
            verbose: Whether to output to console
        """
-        init()  # Initialize colorama
        self.log_file = log_file
        self.log_level = log_level
        self.tag_width = tag_width
        self.icons = icons or self.DEFAULT_ICONS
        self.colors = colors or self.DEFAULT_COLORS
        self.verbose = verbose
+        self.console = Console()

        # Create log file directory if needed
        if log_file:
@@ -143,16 +160,11 @@ class AsyncLogger(AsyncLoggerBase):
    def _write_to_file(self, message: str):
        """Write a message to the log file if configured."""
        if self.log_file:
+            text = Text.from_markup(message)
+            plain_text = text.plain
            timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]
            with open(self.log_file, "a", encoding="utf-8") as f:
-                # Strip ANSI color codes for file output
-                clean_message = message.replace(Fore.RESET, "").replace(
-                    Style.RESET_ALL, ""
-                )
-                for color in vars(Fore).values():
-                    if isinstance(color, str):
-                        clean_message = clean_message.replace(color, "")
-                f.write(f"[{timestamp}] {clean_message}\n")
+                f.write(f"[{timestamp}] {plain_text}\n")

    def _log(
        self,
@@ -160,8 +172,9 @@ class AsyncLogger(AsyncLoggerBase):
        message: str,
        tag: str,
        params: Optional[Dict[str, Any]] = None,
-        colors: Optional[Dict[str, str]] = None,
-        base_color: Optional[str] = None,
+        colors: Optional[Dict[str, LogColor]] = None,
+        boxes: Optional[List[str]] = None,
+        base_color: Optional[LogColor] = None,
        **kwargs,
    ):
        """
@@ -173,55 +186,44 @@ class AsyncLogger(AsyncLoggerBase):
            tag: Tag for the message
            params: Parameters to format into the message
            colors: Color overrides for specific parameters
+            boxes: Box overrides for specific parameters
            base_color: Base color for the entire message
        """
        if level.value < self.log_level.value:
            return

-        # Format the message with parameters if provided
+        # avoid conflict with rich formatting
+        parsed_message = message.replace("[", "[[").replace("]", "]]")
        if params:
-            try:
-                # First format the message with raw parameters
-                formatted_message = message.format(**params)
+            # FIXME: Format specs can break color and box replacement.
+            # With {value:.2f} and value = 0.23333 the message contains "0.23",
+            # but the replace() below searches for str(value) == "0.23333",
+            # so no match is found and the color or box is not applied.
+            formatted_message = parsed_message.format(**params)
+            for key, value in params.items():
+                # rich may drop `[` and `]` from the value, so escape them here.
+                value_str = str(value).replace("[", "[[").replace("]", "]]")
+                # Apply the color override for this parameter, if any
+                if colors and key in colors:
+                    color_str = f"[{colors[key]}]{value_str}[/{colors[key]}]"
+                    formatted_message = formatted_message.replace(value_str, color_str)
+                    value_str = color_str

-                # Then apply colors if specified
-                color_map = {
-                    "green": Fore.GREEN,
-                    "red": Fore.RED,
-                    "yellow": Fore.YELLOW,
-                    "blue": Fore.BLUE,
-                    "cyan": Fore.CYAN,
-                    "magenta": Fore.MAGENTA,
-                    "white": Fore.WHITE,
-                    "black": Fore.BLACK,
-                    "reset": Style.RESET_ALL,
-                }
-                if colors:
-                    for key, color in colors.items():
-                        # Find the formatted value in the message and wrap it with color
-                        if color in color_map:
-                            color = color_map[color]
-                        if key in params:
-                            value_str = str(params[key])
-                            formatted_message = formatted_message.replace(
-                                value_str, f"{color}{value_str}{Style.RESET_ALL}"
-                            )
+                # Apply the box override for this parameter, if any
+                if boxes and key in boxes:
+                    formatted_message = formatted_message.replace(value_str,
+                        create_box_message(value_str, type=str(level)))

-            except KeyError as e:
-                formatted_message = (
-                    f"LOGGING ERROR: Missing parameter {e} in message template"
-                )
-                level = LogLevel.ERROR
        else:
-            formatted_message = message
+            formatted_message = parsed_message

        # Construct the full log line
-        color = base_color or self.colors[level]
-        log_line = f"{color}{self._format_tag(tag)} {self._get_icon(tag)} {formatted_message}{Style.RESET_ALL}"
+        color: LogColor = base_color or self.colors[level]
+        log_line = f"[{color}]{self._format_tag(tag)} {self._get_icon(tag)} {formatted_message} [/{color}]"

        # Output to console if verbose
        if self.verbose or kwargs.get("force_verbose", False):
-            print(log_line)
+            self.console.print(log_line)

        # Write to file if configured
        self._write_to_file(log_line)
@@ -292,8 +294,8 @@ class AsyncLogger(AsyncLoggerBase):
                "timing": timing,
            },
            colors={
-                "status": Fore.GREEN if success else Fore.RED,
-                "timing": Fore.YELLOW,
+                "status": LogColor.SUCCESS if success else LogColor.ERROR,
+                "timing": LogColor.WARNING,
            },
        )
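
The FIXME in `_log` describes a mismatch between the formatted value and the raw value that is searched for afterwards. One possible way around it, sketched here with a hypothetical `format_with_colors` helper that is not part of this PR, is to wrap each field in its color tag while formatting instead of searching the finished string. It assumes simple named fields such as `{timing}` or `{timing:.2f}` and does not handle attribute access, indexing, or nested format specs.

```python
from string import Formatter
from typing import Any, Dict


def format_with_colors(template: str, params: Dict[str, Any], colors: Dict[str, str]) -> str:
    """Format `template`, wrapping colored fields in [color]...[/color] tags."""
    formatter = Formatter()
    parts = []
    for literal, field, spec, conversion in formatter.parse(template):
        parts.append(literal)
        if field is None:  # trailing literal text with no placeholder
            continue
        value = params[field]  # simple field names only
        if conversion:
            value = formatter.convert_field(value, conversion)
        # Apply the format spec first, then wrap the already formatted text.
        rendered = format(value, spec or "")
        if field in colors:
            rendered = f"[{colors[field]}]{rendered}[/{colors[field]}]"
        parts.append(rendered)
    return "".join(parts)


print(format_with_colors("took {timing:.2f}s", {"timing": 0.23333}, {"timing": "yellow"}))
# took [yellow]0.23[/yellow]s
```

Because the tag is attached to the text that actually appears in the message, the `.2f` case from the FIXME is colored as expected.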

--- a/crawl4ai/async_webcrawler.py
+++ b/crawl4ai/async_webcrawler.py
@@ -2,7 +2,6 @@ from .__version__ import __version__ as crawl4ai_version
 import os
 import sys
 import time
-from colorama import Fore
 from pathlib import Path
 from typing import Optional, List
 import json
@@ -44,7 +43,6 @@ from .utils import (
    sanitize_input_encode,
    InvalidCSSSelectorError,
    fast_format_html,
-    create_box_message,
    get_error_context,
    RobotsParser,
    preprocess_html_for_schema,
@@ -419,7 +417,7 @@ class AsyncWebCrawler:

                self.logger.error_status(
                    url=url,
-                    error=create_box_message(error_message, type="error"),
+                    error=error_message,
                    tag="ERROR",
                )

@@ -496,11 +494,13 @@ class AsyncWebCrawler:
            cleaned_html = sanitize_input_encode(
                result.get("cleaned_html", ""))
            media = result.get("media", {})
+            tables = media.pop("tables", []) if isinstance(media, dict) else []
            links = result.get("links", {})
            metadata = result.get("metadata", {})
        else:
            cleaned_html = sanitize_input_encode(result.cleaned_html)
            media = result.media.model_dump()
+            tables = media.pop("tables", [])
            links = result.links.model_dump()
            metadata = result.metadata

@@ -627,6 +627,7 @@ class AsyncWebCrawler:
            cleaned_html=cleaned_html,
            markdown=markdown_result,
            media=media,
+            tables=tables,                       # NEW
            links=links,
            metadata=metadata,
            screenshot=screenshot_data,
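
Since `tables` is now popped out of `media` and passed separately, code that reads results may want to support both layouts. A minimal consumer-side sketch, assuming a hypothetical `get_tables` helper and stand-in result objects; the `headers`/`rows` shape of a table entry is an assumption made for illustration:

```python
from types import SimpleNamespace
from typing import Any, List


def get_tables(result: Any) -> List[dict]:
    """Prefer the dedicated `tables` field, fall back to media['tables']."""
    tables = getattr(result, "tables", None)
    if tables:
        return tables
    media = getattr(result, "media", None) or {}
    return media.get("tables", []) if isinstance(media, dict) else []


# Stand-ins for a new-style and an old-style result object.
new_style = SimpleNamespace(tables=[{"headers": ["a"], "rows": [["1"]]}], media={})
old_style = SimpleNamespace(media={"tables": [{"headers": ["b"], "rows": [["2"]]}]})

print(get_tables(new_style)[0]["headers"])  # ['a']
print(get_tables(old_style)[0]["headers"])  # ['b']
```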
--- a/crawl4ai/browser_manager.py
+++ b/crawl4ai/browser_manager.py
@@ -5,7 +5,10 @@ import os
 import sys
 import shutil
 import tempfile
+import psutil  
+import signal
 import subprocess
+import shlex
 from playwright.async_api import BrowserContext
 import hashlib
 from .js_snippet import load_js_script
@@ -193,6 +196,45 @@ class ManagedBrowser:
        
        if self.browser_config.extra_args:
            args.extend(self.browser_config.extra_args)
+            
+
+        # ── make sure no old Chromium instance is owning the same port/profile ──
+        try:
+            if sys.platform == "win32":
+                if psutil is None:
+                    raise RuntimeError("psutil not available, cannot clean old browser")
+                for p in psutil.process_iter(["pid", "name", "cmdline"]):
+                    cl = " ".join(p.info.get("cmdline") or [])
+                    if (
+                        f"--remote-debugging-port={self.debugging_port}" in cl
+                        and f"--user-data-dir={self.user_data_dir}" in cl
+                    ):
+                        p.kill()
+                        p.wait(timeout=5)
+            else:  # macOS / Linux
+                # kill any process listening on the same debugging port
+                pids = (
+                    subprocess.check_output(shlex.split(f"lsof -t -i:{self.debugging_port}"))
+                    .decode()
+                    .strip()
+                    .splitlines()
+                )
+                for pid in pids:
+                    try:
+                        os.kill(int(pid), signal.SIGTERM)
+                    except ProcessLookupError:
+                        pass
+
+                # remove Chromium singleton locks, or new launch exits with
+                # “Opening in existing browser session.”
+                for f in ("SingletonLock", "SingletonSocket", "SingletonCookie"):
+                    fp = os.path.join(self.user_data_dir, f)
+                    if os.path.exists(fp):
+                        os.remove(fp)
+        except Exception as _e:
+            # non-fatal — we'll try to start anyway, but log what happened
+            self.logger.warning(f"pre-launch cleanup failed: {_e}", tag="BROWSER")            
+            

        # Start browser process
        try:
@@ -922,7 +964,7 @@ class BrowserManager:
            pages = context.pages
            page = next((p for p in pages if p.url == crawlerRunConfig.url), None)
            if not page:
-                page = await context.new_page()
+                page = context.pages[0] # await context.new_page()
        else:
            # Otherwise, check if we have an existing context for this config
            config_signature = self._make_config_signature(crawlerRunConfig)
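
The lock cleanup in the pre-launch block relies on `os.path.exists`, which returns `False` for a symlink whose target is missing. A minimal sketch of the same cleanup using `os.path.lexists`, assuming (this is not stated in the PR) that a stale lock can be such a dangling symlink; `remove_singleton_locks` is a hypothetical helper and the example targets POSIX systems:

```python
import os
import tempfile

LOCK_NAMES = ("SingletonLock", "SingletonSocket", "SingletonCookie")


def remove_singleton_locks(user_data_dir: str) -> list:
    """Remove Chromium singleton lock entries and return the names removed."""
    removed = []
    for name in LOCK_NAMES:
        path = os.path.join(user_data_dir, name)
        # lexists() is True for a symlink even when its target is gone.
        if os.path.lexists(path):
            os.remove(path)
            removed.append(name)
    return removed


with tempfile.TemporaryDirectory() as profile_dir:
    lock_path = os.path.join(profile_dir, "SingletonLock")
    # Simulate a stale lock: a symlink pointing at a target that does not exist.
    os.symlink("host-12345", lock_path)
    print(os.path.exists(lock_path))            # False
    print(remove_singleton_locks(profile_dir))  # ['SingletonLock']
```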
--- a/crawl4ai/browser_profiler.py
+++ b/crawl4ai/browser_profiler.py
@@ -15,12 +15,12 @@ import shutil
 import json
 import subprocess
 import time
-from typing import List, Dict, Optional, Any, Tuple
-from colorama import Fore, Style, init
+from typing import List, Dict, Optional, Any
+from rich.console import Console

 from .async_configs import BrowserConfig
 from .browser_manager import ManagedBrowser
-from .async_logger import AsyncLogger, AsyncLoggerBase
+from .async_logger import AsyncLogger, AsyncLoggerBase, LogColor
 from .utils import get_home_folder


@@ -45,8 +45,8 @@ class BrowserProfiler:
            logger (AsyncLoggerBase, optional): Logger for outputting messages.
                If None, a default AsyncLogger will be created.
        """
-        # Initialize colorama for colorful terminal output
-        init()
+        # Initialize rich console for colorful input prompts
+        self.console = Console()
        
        # Create a logger if not provided
        if logger is None:
@@ -127,26 +127,30 @@ class BrowserProfiler:
        profile_path = os.path.join(self.profiles_dir, profile_name)
        os.makedirs(profile_path, exist_ok=True)
        
-        # Print instructions for the user with colorama formatting
-        border = f"{Fore.CYAN}{'='*80}{Style.RESET_ALL}"
-        self.logger.info(f"\n{border}", tag="PROFILE")
-        self.logger.info(f"Creating browser profile: {Fore.GREEN}{profile_name}{Style.RESET_ALL}", tag="PROFILE")
-        self.logger.info(f"Profile directory: {Fore.YELLOW}{profile_path}{Style.RESET_ALL}", tag="PROFILE")
+        # Print instructions for the user with rich formatting
+        border = "=" * 80
+        self.logger.info("{border}", tag="PROFILE", params={"border": f"\n{border}"}, colors={"border": LogColor.CYAN})
+        self.logger.info("Creating browser profile: {profile_name}", tag="PROFILE", params={"profile_name": profile_name}, colors={"profile_name": LogColor.GREEN})
+        self.logger.info("Profile directory: {profile_path}", tag="PROFILE", params={"profile_path": profile_path}, colors={"profile_path": LogColor.YELLOW})
        
        self.logger.info("\nInstructions:", tag="PROFILE")
        self.logger.info("1. A browser window will open for you to set up your profile.", tag="PROFILE")
-        self.logger.info(f"2. {Fore.CYAN}Log in to websites{Style.RESET_ALL}, configure settings, etc. as needed.", tag="PROFILE")
-        self.logger.info(f"3. When you're done, {Fore.YELLOW}press 'q' in this terminal{Style.RESET_ALL} to close the browser.", tag="PROFILE")
+        self.logger.info("{segment}, configure settings, etc. as needed.", tag="PROFILE", params={"segment": "2. Log in to websites"}, colors={"segment": LogColor.CYAN})
+        self.logger.info("3. When you're done, {segment} to close the browser.", tag="PROFILE", params={"segment": "press 'q' in this terminal"}, colors={"segment": LogColor.YELLOW})
        self.logger.info("4. The profile will be saved and ready to use with Crawl4AI.", tag="PROFILE")
-        self.logger.info(f"{border}\n", tag="PROFILE")
+        self.logger.info("{border}", tag="PROFILE", params={"border": f"{border}\n"}, colors={"border": LogColor.CYAN})
+        
+        browser_config.headless = False
+        browser_config.user_data_dir = profile_path
+        
        
        # Create managed browser instance
        managed_browser = ManagedBrowser(
-            browser_type=browser_config.browser_type,
-            user_data_dir=profile_path,
-            headless=False,  # Must be visible
+            browser_config=browser_config,
+            # user_data_dir=profile_path,
+            # headless=False,  # Must be visible
            logger=self.logger,
-            debugging_port=browser_config.debugging_port
+            # debugging_port=browser_config.debugging_port
        )
        
        # Set up signal handlers to ensure cleanup on interrupt
@@ -181,7 +185,7 @@ class BrowserProfiler:
            import select
            
            # First output the prompt
-            self.logger.info(f"{Fore.CYAN}Press '{Fore.WHITE}q{Fore.CYAN}' when you've finished using the browser...{Style.RESET_ALL}", tag="PROFILE")
+            self.logger.info("Press 'q' when you've finished using the browser...", tag="PROFILE")
            
            # Save original terminal settings
            fd = sys.stdin.fileno()
@@ -197,7 +201,7 @@ class BrowserProfiler:
                    if readable:
                        key = sys.stdin.read(1)
                        if key.lower() == 'q':
-                            self.logger.info(f"{Fore.GREEN}Closing browser and saving profile...{Style.RESET_ALL}", tag="PROFILE")
+                            self.logger.info("Closing browser and saving profile...", tag="PROFILE", base_color=LogColor.GREEN)
                            user_done_event.set()
                            return
                    
@@ -223,7 +227,7 @@ class BrowserProfiler:
                self.logger.error("Failed to start browser process.", tag="PROFILE")
                return None
            
-            self.logger.info(f"Browser launched. {Fore.CYAN}Waiting for you to finish...{Style.RESET_ALL}", tag="PROFILE") 
+            self.logger.info("Browser launched. Waiting for you to finish...", tag="PROFILE") 
            
            # Start listening for keyboard input
            listener_task = asyncio.create_task(listen_for_quit_command())
@@ -245,10 +249,10 @@ class BrowserProfiler:
                self.logger.info("Terminating browser process...", tag="PROFILE")
                await managed_browser.cleanup()
            
-            self.logger.success(f"Browser closed. Profile saved at: {Fore.GREEN}{profile_path}{Style.RESET_ALL}", tag="PROFILE")
+            self.logger.success(f"Browser closed. Profile saved at: {profile_path}", tag="PROFILE")
                
        except Exception as e:
-            self.logger.error(f"Error creating profile: {str(e)}", tag="PROFILE")
+            self.logger.error(f"Error creating profile: {e!s}", tag="PROFILE")
            await managed_browser.cleanup()
            return None
        finally:
@@ -440,25 +444,27 @@ class BrowserProfiler:
            ```
        """
        while True:
-            self.logger.info(f"\n{Fore.CYAN}Profile Management Options:{Style.RESET_ALL}", tag="MENU")
-            self.logger.info(f"1. {Fore.GREEN}Create a new profile{Style.RESET_ALL}", tag="MENU")
-            self.logger.info(f"2. {Fore.YELLOW}List available profiles{Style.RESET_ALL}", tag="MENU")
-            self.logger.info(f"3. {Fore.RED}Delete a profile{Style.RESET_ALL}", tag="MENU")
+            self.logger.info("\nProfile Management Options:", tag="MENU")
+            self.logger.info("1. Create a new profile", tag="MENU", base_color=LogColor.GREEN)
+            self.logger.info("2. List available profiles", tag="MENU", base_color=LogColor.YELLOW)
+            self.logger.info("3. Delete a profile", tag="MENU", base_color=LogColor.ERROR)
            
            # Only show crawl option if callback provided
            if crawl_callback:
-                self.logger.info(f"4. {Fore.CYAN}Use a profile to crawl a website{Style.RESET_ALL}", tag="MENU")
-                self.logger.info(f"5. {Fore.MAGENTA}Exit{Style.RESET_ALL}", tag="MENU")
+                self.logger.info("4. Use a profile to crawl a website", tag="MENU", base_color=LogColor.CYAN)
+                self.logger.info("5. Exit", tag="MENU", base_color=LogColor.MAGENTA)
                exit_option = "5"
            else:
-                self.logger.info(f"4. {Fore.MAGENTA}Exit{Style.RESET_ALL}", tag="MENU")
+                self.logger.info("4. Exit", tag="MENU", base_color=LogColor.MAGENTA)
                exit_option = "4"
            
-            choice = input(f"\n{Fore.CYAN}Enter your choice (1-{exit_option}): {Style.RESET_ALL}")
+            self.console.print(f"\n[cyan]Enter your choice (1-{exit_option}): [/cyan]", end="")
+            choice = input()
            
            if choice == "1":
                # Create new profile
-                name = input(f"{Fore.GREEN}Enter a name for the new profile (or press Enter for auto-generated name): {Style.RESET_ALL}")
+                self.console.print("[green]Enter a name for the new profile (or press Enter for auto-generated name): [/green]", end="")
+                name = input()
                await self.create_profile(name or None)
                
            elif choice == "2":
@@ -472,8 +478,8 @@ class BrowserProfiler:
                # Print profile information with colorama formatting
                self.logger.info("\nAvailable profiles:", tag="PROFILES")
                for i, profile in enumerate(profiles):
-                    self.logger.info(f"[{i+1}] {Fore.CYAN}{profile['name']}{Style.RESET_ALL}", tag="PROFILES")
-                    self.logger.info(f"    Path: {Fore.YELLOW}{profile['path']}{Style.RESET_ALL}", tag="PROFILES")
+                    self.logger.info(f"[{i+1}] {profile['name']}", tag="PROFILES")
+                    self.logger.info(f"    Path: {profile['path']}", tag="PROFILES", base_color=LogColor.YELLOW)
                    self.logger.info(f"    Created: {profile['created'].strftime('%Y-%m-%d %H:%M:%S')}", tag="PROFILES")
                    self.logger.info(f"    Browser type: {profile['type']}", tag="PROFILES")
                    self.logger.info("", tag="PROFILES")  # Empty line for spacing
@@ -486,12 +492,13 @@ class BrowserProfiler:
                    continue
                    
                # Display numbered list
-                self.logger.info(f"\n{Fore.YELLOW}Available profiles:{Style.RESET_ALL}", tag="PROFILES")
+                self.logger.info("\nAvailable profiles:", tag="PROFILES", base_color=LogColor.YELLOW)
                for i, profile in enumerate(profiles):
                    self.logger.info(f"[{i+1}] {profile['name']}", tag="PROFILES")
                    
                # Get profile to delete
-                profile_idx = input(f"{Fore.RED}Enter the number of the profile to delete (or 'c' to cancel): {Style.RESET_ALL}")
+                self.console.print("[red]Enter the number of the profile to delete (or 'c' to cancel): [/red]", end="")
+                profile_idx = input()
                if profile_idx.lower() == 'c':
                    continue
                    
@@ -499,17 +506,18 @@ class BrowserProfiler:
                    idx = int(profile_idx) - 1
                    if 0 <= idx < len(profiles):
                        profile_name = profiles[idx]["name"]
-                        self.logger.info(f"Deleting profile: {Fore.YELLOW}{profile_name}{Style.RESET_ALL}", tag="PROFILES")
+                        self.logger.info("Deleting profile: {profile_name}", tag="PROFILES", params={"profile_name": profile_name}, colors={"profile_name": LogColor.YELLOW})
                        
                        # Confirm deletion
-                        confirm = input(f"{Fore.RED}Are you sure you want to delete this profile? (y/n): {Style.RESET_ALL}")
+                        self.console.print("[red]Are you sure you want to delete this profile? (y/n): [/red]", end="")
+                        confirm = input()
                        if confirm.lower() == 'y':
                            success = self.delete_profile(profiles[idx]["path"])
                            
                            if success:
-                                self.logger.success(f"Profile {Fore.GREEN}{profile_name}{Style.RESET_ALL} deleted successfully", tag="PROFILES")
+                                self.logger.success(f"Profile {profile_name} deleted successfully", tag="PROFILES")
                            else:
-                                self.logger.error(f"Failed to delete profile {Fore.RED}{profile_name}{Style.RESET_ALL}", tag="PROFILES")
+                                self.logger.error(f"Failed to delete profile {profile_name}", tag="PROFILES")
                    else:
                        self.logger.error("Invalid profile number", tag="PROFILES")
                except ValueError:
@@ -523,12 +531,13 @@ class BrowserProfiler:
                    continue
                    
                # Display numbered list
-                self.logger.info(f"\n{Fore.YELLOW}Available profiles:{Style.RESET_ALL}", tag="PROFILES")
+                self.logger.info("\nAvailable profiles:", tag="PROFILES", base_color=LogColor.YELLOW)
                for i, profile in enumerate(profiles):
                    self.logger.info(f"[{i+1}] {profile['name']}", tag="PROFILES")
                    
                # Get profile to use
-                profile_idx = input(f"{Fore.CYAN}Enter the number of the profile to use (or 'c' to cancel): {Style.RESET_ALL}")
+                self.console.print("[cyan]Enter the number of the profile to use (or 'c' to cancel): [/cyan]", end="")
+                profile_idx = input()
                if profile_idx.lower() == 'c':
                    continue
                    
@@ -536,7 +545,8 @@ class BrowserProfiler:
                    idx = int(profile_idx) - 1
                    if 0 <= idx < len(profiles):
                        profile_path = profiles[idx]["path"]
-                        url = input(f"{Fore.CYAN}Enter the URL to crawl: {Style.RESET_ALL}")
+                        self.console.print("[cyan]Enter the URL to crawl: [/cyan]", end="")
+                        url = input()
                        if url:
                            # Call the provided crawl callback
                            await crawl_callback(profile_path, url)
@@ -599,11 +609,11 @@ class BrowserProfiler:
        # Print initial information
        border = f"{Fore.CYAN}{'='*80}{Style.RESET_ALL}"
        self.logger.info(f"\n{border}", tag="CDP")
-        self.logger.info(f"Launching standalone browser with CDP debugging", tag="CDP")
-        self.logger.info(f"Browser type: {Fore.GREEN}{browser_type}{Style.RESET_ALL}", tag="CDP")
-        self.logger.info(f"Profile path: {Fore.YELLOW}{profile_path}{Style.RESET_ALL}", tag="CDP")
-        self.logger.info(f"Debugging port: {Fore.CYAN}{debugging_port}{Style.RESET_ALL}", tag="CDP")
-        self.logger.info(f"Headless mode: {Fore.CYAN}{headless}{Style.RESET_ALL}", tag="CDP")
+        self.logger.info("Launching standalone browser with CDP debugging", tag="CDP")
+        self.logger.info("Browser type: {browser_type}", tag="CDP", params={"browser_type": browser_type}, colors={"browser_type": LogColor.CYAN})
+        self.logger.info("Profile path: {profile_path}", tag="CDP", params={"profile_path": profile_path}, colors={"profile_path": LogColor.YELLOW})
+        self.logger.info(f"Debugging port: {debugging_port}", tag="CDP")
+        self.logger.info(f"Headless mode: {headless}", tag="CDP")
        
        # Create managed browser instance
        managed_browser = ManagedBrowser(
@@ -646,7 +656,7 @@ class BrowserProfiler:
            import select
            
            # First output the prompt
-            self.logger.info(f"{Fore.CYAN}Press '{Fore.WHITE}q{Fore.CYAN}' to stop the browser and exit...{Style.RESET_ALL}", tag="CDP")
+            self.logger.info("Press 'q' to stop the browser and exit...", tag="CDP")
            
            # Save original terminal settings
            fd = sys.stdin.fileno()
@@ -662,7 +672,7 @@ class BrowserProfiler:
                    if readable:
                        key = sys.stdin.read(1)
                        if key.lower() == 'q':
-                            self.logger.info(f"{Fore.GREEN}Closing browser...{Style.RESET_ALL}", tag="CDP")
+                            self.logger.info("Closing browser...", tag="CDP")
                            user_done_event.set()
                            return
                    
@@ -716,20 +726,20 @@ class BrowserProfiler:
                self.logger.error("Failed to start browser process.", tag="CDP")
                return None
            
-            self.logger.info(f"Browser launched successfully. Retrieving CDP information...", tag="CDP") 
+            self.logger.info("Browser launched successfully. Retrieving CDP information...", tag="CDP") 
            
            # Get CDP URL and JSON config
            cdp_url, config_json = await get_cdp_json(debugging_port)
            
            if cdp_url:
-                self.logger.success(f"CDP URL: {Fore.GREEN}{cdp_url}{Style.RESET_ALL}", tag="CDP")
+                self.logger.success(f"CDP URL: {cdp_url}", tag="CDP")
                
                if config_json:
                    # Display relevant CDP information
-                    self.logger.info(f"Browser: {Fore.CYAN}{config_json.get('Browser', 'Unknown')}{Style.RESET_ALL}", tag="CDP")
-                    self.logger.info(f"Protocol Version: {config_json.get('Protocol-Version', 'Unknown')}", tag="CDP")
+                    self.logger.info("Browser: {browser}", tag="CDP", params={"browser": config_json.get('Browser', 'Unknown')}, colors={"browser": LogColor.CYAN})
+                    self.logger.info("Protocol Version: {protocol_version}", tag="CDP", params={"protocol_version": config_json.get('Protocol-Version', 'Unknown')}, colors={"protocol_version": LogColor.CYAN})
                    if 'webSocketDebuggerUrl' in config_json:
-                        self.logger.info(f"WebSocket URL: {Fore.GREEN}{config_json['webSocketDebuggerUrl']}{Style.RESET_ALL}", tag="CDP")
+                        self.logger.info("WebSocket URL: {webSocketDebuggerUrl}", tag="CDP", params={"webSocketDebuggerUrl": config_json['webSocketDebuggerUrl']}, colors={"webSocketDebuggerUrl": LogColor.GREEN})
                else:
                    self.logger.warning("Could not retrieve CDP configuration JSON", tag="CDP")
            else:
@@ -757,7 +767,7 @@ class BrowserProfiler:
                self.logger.info("Terminating browser process...", tag="CDP")
                await managed_browser.cleanup()
            
-            self.logger.success(f"Browser closed.", tag="CDP")
+            self.logger.success("Browser closed.", tag="CDP")
                
        except Exception as e:
            self.logger.error(f"Error launching standalone browser: {str(e)}", tag="CDP")
@@ -972,3 +982,30 @@ class BrowserProfiler:
            'info': browser_info
        }

+
+if __name__ == "__main__":
+    # Example usage
+    profiler = BrowserProfiler()
+    
+    # Create a new profile
+    import os
+    from pathlib import Path
+    home_dir = Path.home()
+    profile_path = asyncio.run(profiler.create_profile(str(home_dir / ".crawl4ai/profiles/test-profile")))
+
+        
+            
+    # Launch a standalone browser
+    asyncio.run(profiler.launch_standalone_browser())
+    
+    # List profiles
+    profiles = profiler.list_profiles()
+    for profile in profiles:
+        print(f"Profile: {profile['name']}, Path: {profile['path']}")
+    
+    # Delete a profile
+    success = profiler.delete_profile("my-profile")
+    if success:
+        print("Profile deleted successfully")
+    else:
+        print("Failed to delete profile")
--- a/crawl4ai/content_filter_strategy.py
+++ b/crawl4ai/content_filter_strategy.py
@@ -27,8 +27,7 @@ import json
 import hashlib
 from pathlib import Path
 from concurrent.futures import ThreadPoolExecutor
-from .async_logger import AsyncLogger, LogLevel
-from colorama import Fore, Style
+from .async_logger import AsyncLogger, LogLevel, LogColor


 class RelevantContentFilter(ABC):
@@ -846,8 +845,7 @@ class LLMContentFilter(RelevantContentFilter):
                },
                colors={
                    **AsyncLogger.DEFAULT_COLORS,
-                    LogLevel.INFO: Fore.MAGENTA
-                    + Style.DIM,  # Dimmed purple for LLM ops
+                    LogLevel.INFO: LogColor.DIM_MAGENTA  # Dimmed purple for LLM ops
                },
            )
        else:
@@ -892,7 +890,7 @@ class LLMContentFilter(RelevantContentFilter):
                "Starting LLM markdown content filtering process",
                tag="LLM",
                params={"provider": self.llm_config.provider},
-                colors={"provider": Fore.CYAN},
+                colors={"provider": LogColor.CYAN},
            )

        # Cache handling
@@ -929,7 +927,7 @@ class LLMContentFilter(RelevantContentFilter):
                "LLM markdown: Split content into {chunk_count} chunks",
                tag="CHUNK",
                params={"chunk_count": len(html_chunks)},
-                colors={"chunk_count": Fore.YELLOW},
+                colors={"chunk_count": LogColor.YELLOW},
            )

        start_time = time.time()
@@ -1038,7 +1036,7 @@ class LLMContentFilter(RelevantContentFilter):
                "LLM markdown: Completed processing in {time:.2f}s",
                tag="LLM",
                params={"time": end_time - start_time},
-                colors={"time": Fore.YELLOW},
+                colors={"time": LogColor.YELLOW},
            )

        result = ordered_results if ordered_results else []
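
Reviewer note on the `colors=` changes in this file: the hunks above swap colorama escape constants for `LogColor` members, keyed by the same placeholder names used in `params=`. The enum is defined in `async_logger.py`, which is not part of this section, so the following is only a minimal sketch of the pattern. It assumes a string-valued enum whose values are Rich style names, and `render` is a hypothetical helper; the real `AsyncLogger` may format messages differently.

```python
import string
from enum import Enum
from typing import Any, Dict, Optional


class LogColor(str, Enum):
    """Hypothetical stand-in for crawl4ai's LogColor (values assumed to be Rich styles)."""
    CYAN = "cyan"
    YELLOW = "yellow"
    DIM_MAGENTA = "dim magenta"


def render(message: str,
           params: Optional[Dict[str, Any]] = None,
           colors: Optional[Dict[str, LogColor]] = None) -> str:
    """Fill a str.format-style template, wrapping coloured fields in Rich markup."""
    params = params or {}
    colors = colors or {}
    parts = []
    for literal, field, spec, _conversion in string.Formatter().parse(message):
        parts.append(literal)
        if field is None:
            continue  # trailing literal text with no placeholder
        # Apply the format spec first (e.g. ".2f"), then wrap the resulting text.
        text = format(params[field], spec or "")
        color = colors.get(field)
        parts.append(f"[{color.value}]{text}[/]" if color else text)
    return "".join(parts)


line = render(
    "LLM markdown: Completed processing in {time:.2f}s",
    params={"time": 1.2345},
    colors={"time": LogColor.YELLOW},
)
```

Formatting the value before wrapping it matters for templates such as `{time:.2f}`: applying a float format spec to an already-wrapped string would raise a `ValueError`.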
--- a/crawl4ai/models.py
+++ b/crawl4ai/models.py
@@ -1,4 +1,4 @@
-from pydantic import BaseModel, HttpUrl, PrivateAttr
+from pydantic import BaseModel, HttpUrl, PrivateAttr, Field
 from typing import List, Dict, Optional, Callable, Awaitable, Union, Any
 from typing import AsyncGenerator
 from typing import Generic, TypeVar
@@ -150,6 +150,7 @@ class CrawlResult(BaseModel):
    redirected_url: Optional[str] = None
    network_requests: Optional[List[Dict[str, Any]]] = None
    console_messages: Optional[List[Dict[str, Any]]] = None
+    tables: List[Dict] = Field(default_factory=list)  # NEW – [{headers,rows,caption,summary}]

    class Config:
        arbitrary_types_allowed = True
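
Reviewer note on the new `tables` field: it defaults to an empty list, and the inline comment says each entry is a dict with `headers`, `rows`, `caption` and `summary`. The sketch below shows one way a caller might consume an entry. It assumes `rows` is a list of lists aligned with `headers` (the diff does not show the exact shape), and `table_to_records` is a hypothetical helper, not part of crawl4ai.

```python
from typing import Any, Dict, List


def table_to_records(table: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn one table dict into a list of {header: cell} records."""
    headers = table.get("headers", [])
    # zip() stops at the shorter sequence, so ragged rows are truncated rather than failing.
    return [dict(zip(headers, row)) for row in table.get("rows", [])]


# Hand-written sample in the assumed shape; not real crawl output.
sample_table = {
    "headers": ["coin", "price"],
    "rows": [["BTC", "1"], ["ETH", "2"]],
    "caption": "",
    "summary": "",
}
records = table_to_records(sample_table)
```

Because the field uses `default_factory=list`, iterating over `result.tables` should be safe even when a page contains no tables.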
--- a/crawl4ai/utils.py
+++ b/crawl4ai/utils.py
@@ -20,7 +20,6 @@ from urllib.parse import urljoin
 import requests
 from requests.exceptions import InvalidSchema
 import xxhash
-from colorama import Fore, Style, init
 import textwrap
 import cProfile
 import pstats
@@ -441,14 +440,13 @@ def create_box_message(
        str: A formatted string containing the styled message box.
    """

-    init()
-
    # Define border and text colors for different types
    styles = {
-        "warning": (Fore.YELLOW, Fore.LIGHTYELLOW_EX, "⚠"),
-        "info": (Fore.BLUE, Fore.LIGHTBLUE_EX, "ℹ"),
-        "success": (Fore.GREEN, Fore.LIGHTGREEN_EX, "✓"),
-        "error": (Fore.RED, Fore.LIGHTRED_EX, "×"),
+        "warning": ("yellow", "bright_yellow", "⚠"),
+        "info": ("blue", "bright_blue", "ℹ"),
+        "debug": ("bright_black", "bright_black", "⋯"),
+        "success": ("green", "bright_green", "✓"),
+        "error": ("red", "bright_red", "×"),
    }

    border_color, text_color, prefix = styles.get(type.lower(), styles["info"])
@@ -480,12 +478,12 @@ def create_box_message(
    # Create the box with colored borders and lighter text
    horizontal_line = h_line * (width - 1)
    box = [
-        f"{border_color}{tl}{horizontal_line}{tr}",
+        f"[{border_color}]{tl}{horizontal_line}{tr}[/{border_color}]",
        *[
-            f"{border_color}{v_line}{text_color} {line:<{width-2}}{border_color}{v_line}"
+            f"[{border_color}]{v_line}[{text_color}] {line:<{width-2}}[/{text_color}][{border_color}]{v_line}[/{border_color}]"
            for line in formatted_lines
        ],
-        f"{border_color}{bl}{horizontal_line}{br}{Style.RESET_ALL}",
+        f"[{border_color}]{bl}{horizontal_line}{br}[/{border_color}]",
    ]

    result = "\n".join(box)
@@ -2778,4 +2776,3 @@ def preprocess_html_for_schema(html_content, text_threshold=100, attr_value_thre
        # Fallback for parsing errors
        return html_content[:max_size] if len(html_content) > max_size else html_content
    
-
--- a/deploy/docker/README.md
+++ b/deploy/docker/README.md
@@ -58,7 +58,7 @@ Pull and run images directly from Docker Hub without building locally.

 #### 1. Pull the Image

-Our latest release candidate is `0.6.0rc1-r1`. Images are built with multi-arch manifests, so Docker automatically pulls the correct version for your system.
+Our latest release is `0.6.0-r1`. Images are built with multi-arch manifests, so Docker automatically pulls the correct version for your system.

 ```bash
 # Pull the release candidate (recommended for latest features)
@@ -124,9 +124,9 @@ docker stop crawl4ai && docker rm crawl4ai
 #### Docker Hub Versioning Explained

 *   **Image Name:** `unclecode/crawl4ai`
-*   **Tag Format:** `LIBRARY_VERSION[-SUFFIX]` (e.g., `0.6.0rc1-r1`)
+*   **Tag Format:** `LIBRARY_VERSION[-SUFFIX]` (e.g., `0.6.0-r1`)
    *   `LIBRARY_VERSION`: The semantic version of the core `crawl4ai` Python library
-    *   `SUFFIX`: Optional tag for release candidates (`rc1`) and revisions (`r1`)
+    *   `SUFFIX`: Optional tag for release candidates (e.g., `rc1`) and revisions (e.g., `r1`)
 *   **`latest` Tag:** Points to the most recent stable version
 *   **Multi-Architecture Support:** All images support both `linux/amd64` and `linux/arm64` architectures through a single tag

--- a/deploy/docker/static/playground/index.html
+++ b/deploy/docker/static/playground/index.html
@@ -193,7 +193,48 @@
                <textarea id="urls" class="w-full bg-dark border border-border rounded p-2 h-32 text-sm mb-4"
                    spellcheck="false">https://example.com</textarea>

-                <details class="mb-4">
+                <!-- Specific options for /md endpoint -->
+                <details id="md-options" class="mb-4 hidden">
+                    <summary class="text-sm text-secondary cursor-pointer">/md Options</summary>
+                    <div class="mt-2 space-y-3 p-2 border border-border rounded">
+                        <div>
+                            <label for="md-filter" class="block text-xs text-secondary mb-1">Filter Type</label>
+                            <select id="md-filter" class="bg-dark border border-border rounded px-2 py-1 text-sm w-full">
+                                <option value="fit">fit - Adaptive content filtering</option>
+                                <option value="raw">raw - No filtering</option>
+                                <option value="bm25">bm25 - BM25 keyword relevance</option>
+                                <option value="llm">llm - LLM-based filtering</option>
+                            </select>
+                        </div>
+                        <div>
+                            <label for="md-query" class="block text-xs text-secondary mb-1">Query (for BM25/LLM filters)</label>
+                            <input id="md-query" type="text" placeholder="Enter search terms or instructions" 
+                                class="bg-dark border border-border rounded px-2 py-1 text-sm w-full">
+                        </div>
+                        <div>
+                            <label for="md-cache" class="block text-xs text-secondary mb-1">Cache Mode</label>
+                            <select id="md-cache" class="bg-dark border border-border rounded px-2 py-1 text-sm w-full">
+                                <option value="0">Write-Only (0)</option>
+                                <option value="1">Enabled (1)</option>
+                            </select>
+                        </div>
+                    </div>
+                </details>
+
+                <!-- Specific options for /llm endpoint -->
+                <details id="llm-options" class="mb-4 hidden">
+                    <summary class="text-sm text-secondary cursor-pointer">/llm Options</summary>
+                    <div class="mt-2 space-y-3 p-2 border border-border rounded">
+                        <div>
+                            <label for="llm-question" class="block text-xs text-secondary mb-1">Question</label>
+                            <input id="llm-question" type="text" value="What is this page about?" 
+                                class="bg-dark border border-border rounded px-2 py-1 text-sm w-full">
+                        </div>
+                    </div>
+                </details>
+
+                <!-- Advanced config for /crawl endpoints -->
+                <details id="adv-config" class="mb-4">
                    <summary class="text-sm text-secondary cursor-pointer">Advanced Config <span
                        class="text-xs text-primary">(Python → auto‑JSON)</span></summary>

@@ -437,6 +478,33 @@
            cm.setValue(TEMPLATES[e.target.value]);
            document.getElementById('cfg-status').textContent = '';
        });
+        
+        // Handle endpoint selection change to show appropriate options
+        document.getElementById('endpoint').addEventListener('change', function(e) {
+            const endpoint = e.target.value;
+            const mdOptions = document.getElementById('md-options');
+            const llmOptions = document.getElementById('llm-options');
+            const advConfig = document.getElementById('adv-config');
+            
+            // Hide all option sections first
+            mdOptions.classList.add('hidden');
+            llmOptions.classList.add('hidden');
+            advConfig.classList.add('hidden');
+            
+            // Show the appropriate section based on endpoint
+            if (endpoint === 'md') {
+                mdOptions.classList.remove('hidden');
+                // Auto-open the /md options
+                mdOptions.setAttribute('open', '');
+            } else if (endpoint === 'llm') {
+                llmOptions.classList.remove('hidden');
+                // Auto-open the /llm options
+                llmOptions.setAttribute('open', '');
+            } else {
+                // For /crawl endpoints, show the advanced config
+                advConfig.classList.remove('hidden');
+            }
+        });

        async function pyConfigToJson() {
            const code = cm.getValue().trim();
@@ -494,10 +562,18 @@
        }

        // Generate code snippets
-        function generateSnippets(api, payload) {
+        function generateSnippets(api, payload, method = 'POST') {
            // Python snippet
            const pyCodeEl = document.querySelector('#python-content code');
-            const pySnippet = `import httpx\n\nasync def crawl():\n    async with httpx.AsyncClient() as client:\n        response = await client.post(\n            "${window.location.origin}${api}",\n            json=${JSON.stringify(payload, null, 4).replace(/\n/g, '\n            ')}\n        )\n        return response.json()`;
+            let pySnippet;
+            
+            if (method === 'GET') {
+                // GET request (for /llm endpoint)
+                pySnippet = `import httpx\n\nasync def crawl():\n    async with httpx.AsyncClient() as client:\n        response = await client.get(\n            "${window.location.origin}${api}"\n        )\n        return response.json()`;
+            } else {
+                // POST request (for /crawl and /md endpoints)
+                pySnippet = `import httpx\n\nasync def crawl():\n    async with httpx.AsyncClient() as client:\n        response = await client.post(\n            "${window.location.origin}${api}",\n            json=${JSON.stringify(payload, null, 4).replace(/\n/g, '\n            ')}\n        )\n        return response.json()`;
+            }

            pyCodeEl.textContent = pySnippet;
            pyCodeEl.className = 'python hljs'; // Reset classes
@@ -505,7 +581,15 @@

            // cURL snippet
            const curlCodeEl = document.querySelector('#curl-content code');
-            const curlSnippet = `curl -X POST ${window.location.origin}${api} \\\n  -H "Content-Type: application/json" \\\n  -d '${JSON.stringify(payload)}'`;
+            let curlSnippet;
+            
+            if (method === 'GET') {
+                // GET request (for /llm endpoint)
+                curlSnippet = `curl -X GET "${window.location.origin}${api}"`;
+            } else {
+                // POST request (for /crawl and /md endpoints)
+                curlSnippet = `curl -X POST ${window.location.origin}${api} \\\n  -H "Content-Type: application/json" \\\n  -d '${JSON.stringify(payload)}'`;
+            }

            curlCodeEl.textContent = curlSnippet;
            curlCodeEl.className = 'bash hljs'; // Reset classes
@@ -536,20 +620,39 @@

            const endpointMap = {
                crawl: '/crawl',
-            };
-
-            /*const endpointMap = {
-                crawl: '/crawl',
-                crawl_stream: '/crawl/stream',
+                // crawl_stream: '/crawl/stream',
                md: '/md',
                llm: '/llm'
-            };*/
+            };

            const api = endpointMap[endpoint];
-            const payload = {
-                urls,
-                ...advConfig
-            };
+            let payload;
+            
+            // Create appropriate payload based on endpoint type
+            if (endpoint === 'md') {
+                // Get values from the /md specific inputs
+                const filterType = document.getElementById('md-filter').value;
+                const query = document.getElementById('md-query').value.trim();
+                const cache = document.getElementById('md-cache').value;
+                
+                // MD endpoint expects: { url, f, q, c }
+                payload = {
+                    url: urls[0], // Take first URL
+                    f: filterType, // Lowercase filter type as required by server
+                    q: query || null, // Use the query if provided, otherwise null
+                    c: cache
+                };
+            } else if (endpoint === 'llm') {
+                // LLM endpoint has a different URL pattern and uses query params
+                // This will be handled directly in the fetch below
+                payload = null;
+            } else {
+                // Default payload for /crawl and /crawl/stream
+                payload = {
+                    urls,
+                    ...advConfig
+                };
+            }

            updateStatus('processing');

@@ -557,7 +660,18 @@
                const startTime = performance.now();
                let response, responseData;

-                if (endpoint === 'crawl_stream') {
+                if (endpoint === 'llm') {
+                    // Special handling for LLM endpoint which uses URL pattern: /llm/{encoded_url}?q={query}
+                    const url = urls[0];
+                    const encodedUrl = encodeURIComponent(url);
+                    // Get the question from the LLM-specific input
+                    const question = document.getElementById('llm-question').value.trim() || "What is this page about?";
+                    
+                    response = await fetch(`${api}/${encodedUrl}?q=${encodeURIComponent(question)}`, {
+                        method: 'GET',
+                        headers: { 'Accept': 'application/json' }
+                    });
+                } else if (endpoint === 'crawl_stream') {
                    // Stream processing
                    response = await fetch(api, {
                        method: 'POST',
@@ -597,7 +711,7 @@
                    document.querySelector('#response-content code').className = 'json hljs'; // Reset classes
                    forceHighlightElement(document.querySelector('#response-content code'));
                } else {
-                    // Regular request
+                    // Regular request (handles /crawl and /md)
                    response = await fetch(api, {
                        method: 'POST',
                        headers: { 'Content-Type': 'application/json' },
@@ -625,7 +739,16 @@
                }

                forceHighlightElement(document.querySelector('#response-content code'));
-                generateSnippets(api, payload);
+                
+                // For generateSnippets, handle the LLM case specially
+                if (endpoint === 'llm') {
+                    const url = urls[0];
+                    const encodedUrl = encodeURIComponent(url);
+                    const question = document.getElementById('llm-question').value.trim() || "What is this page about?";
+                    generateSnippets(`${api}/${encodedUrl}?q=${encodeURIComponent(question)}`, null, 'GET');
+                } else {
+                    generateSnippets(api, payload);
+                }
            } catch (error) {
                console.error('Error:', error);
                updateStatus('error');
@@ -807,9 +930,24 @@
                });
            });
        }
+        
+        // Function to initialize UI based on selected endpoint
+        function initUI() {
+            // Trigger the endpoint change handler to set initial UI state
+            const endpointSelect = document.getElementById('endpoint');
+            const event = new Event('change');
+            endpointSelect.dispatchEvent(event);
+            
+            // Initialize copy buttons
+            initCopyButtons();
+        }

-        // Call this in your DOMContentLoaded or initialization
-        initCopyButtons();
+        // Initialize exactly once: wait for DOMContentLoaded if the DOM is still loading, otherwise run now
+        if (document.readyState === 'loading') {
+            document.addEventListener('DOMContentLoaded', initUI);
+        } else {
+            initUI();
+        }

    </script>
 </body>
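
Reviewer note on the playground request shapes: the script above now builds different requests for `/md` and `/llm`. The sketch below reproduces those shapes in Python, based only on the comments in the script (`/md` expects `{url, f, q, c}`; `/llm` uses `/llm/{encoded_url}?q={query}`). The helper names and the base URL are placeholders for illustration, not part of the server API.

```python
from typing import Any, Dict, Optional
from urllib.parse import quote

# Punctuation that JavaScript's encodeURIComponent leaves unescaped.
_JS_UNRESERVED = "-_.!~*'()"


def build_md_payload(url: str, filter_type: str = "fit",
                     query: Optional[str] = None, cache: str = "0") -> Dict[str, Any]:
    """JSON body for POST /md, mirroring the playground's { url, f, q, c }."""
    return {"url": url, "f": filter_type, "q": query or None, "c": cache}


def build_llm_url(base: str, url: str,
                  question: str = "What is this page about?") -> str:
    """GET URL for /llm: target URL as an encoded path segment, question as ?q=."""
    encoded_url = quote(url, safe=_JS_UNRESERVED)
    encoded_question = quote(question, safe=_JS_UNRESERVED)
    return f"{base}/llm/{encoded_url}?q={encoded_question}"


# "http://localhost:11235" is a placeholder base URL.
md_payload = build_md_payload("https://example.com", "bm25", "pricing")
llm_url = build_llm_url("http://localhost:11235", "https://example.com")
```

Passing an explicit `safe` set matters here: `quote` leaves `/` unescaped by default, which would split the target URL across several path segments.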
--- a/docs/apps/linkdin/README.md
+++ b/docs/apps/linkdin/README.md
@@ -0,0 +1,126 @@
+# Crawl4AI Prospect‑Wizard – step‑by‑step guide
+
+A three‑stage demo that goes from **LinkedIn scraping** ➜ **LLM reasoning** ➜ **graph visualisation**.
+
+```
+prospect‑wizard/
+├─ c4ai_discover.py         # Stage 1 – scrape companies + people
+├─ c4ai_insights.py         # Stage 2 – embeddings, org‑charts, scores
+├─ graph_view_template.html # Stage 3 – graph viewer (static HTML)
+└─ data/                    # output lands here (*.jsonl / *.json)
+```
+
+---
+
+## 1  Install & boot a LinkedIn profile (one‑time)
+
+### 1.1  Install dependencies
+```bash
+pip install crawl4ai openai sentence-transformers networkx pandas vis-network rich
+```
+
+### 1.2  Create / warm a LinkedIn browser profile
+```bash
+crwl profiler
+```
+1. The interactive shell shows **New profile** – hit **enter**.
+2. Choose a name, e.g. `profile_linkedin_uc`.
+3. A Chromium window opens – log in to LinkedIn, solve any CAPTCHA that appears, then close the window.
+
+> Remember the **profile name**. All future runs take `--profile-name <your_name>`.
+
+---
+
+## 2  Discovery – scrape companies & people
+
+```bash
+# --geo 102713980 is Malaysia's geoUrn; --title_filters also accepts e.g. "Product,Engineering".
+# --max_companies and --max_people are kept small here for workshops.
+python c4ai_discover.py full \
+  --query "health insurance management" \
+  --geo 102713980 --title_filters "" \
+  --max_companies 10 --max_people 20 \
+  --profile-name profile_linkedin_uc \
+  --outdir ./data \
+  --concurrency 2 \
+  --log_level debug
+```
+**Outputs** in `./data/`:
+* `companies.jsonl` – one JSON per company
+* `people.jsonl` – one JSON per employee
+
+🛠️  **Dry‑run:** `C4AI_DEMO_DEBUG=1 python c4ai_discover.py full --query coffee` uses bundled HTML snippets, no network.
+
+### Handy geoUrn cheatsheet
+| Location | geoUrn |
+|----------|--------|
+| Singapore | **103644278** |
+| Malaysia | **102713980** |
+| United States | **103644922** |
+| United Kingdom | **102221843** |
+| Australia | **101452733** |
+_See more: <https://www.linkedin.com/search/results/companies/?geoUrn=XXX> – the number after `geoUrn=` is what you need._
+
+---
+
+## 3  Insights – embeddings, org‑charts, decision makers
+
+```bash
+python c4ai_insights.py \
+  --in  ./data \
+  --out ./data \
+  --embed_model all-MiniLM-L6-v2 \
+  --top_k 10 \
+  --openai_model gpt-4.1 \
+  --max_llm_tokens 8024 \
+  --llm_temperature 1.0 \
+  --workers 4
+```
+Emits next to the Stage‑1 files:
+* `company_graph.json` – inter‑company similarity graph
+* `org_chart_<handle>.json` – one per company
+* `decision_makers.csv` – hand‑picked ‘who to pitch’ list
+
+Flags reference (straight from `build_arg_parser()`):
+| Flag | Default | Purpose |
+|------|---------|---------|
+| `--in` | `.` | Stage‑1 output dir |
+| `--out` | `.` | Destination dir |
+| `--embed_model` | `all-MiniLM-L6-v2` | Sentence‑Transformer model |
+| `--top_k` | `10` | Neighbours per company in graph |
+| `--openai_model` | `gpt-4.1` | LLM for scoring decision makers |
+| `--max_llm_tokens` | `8024` | Token budget per LLM call |
+| `--llm_temperature` | `1.0` | Creativity knob |
+| `--stub` | off | Skip OpenAI and fabricate tiny charts |
+| `--workers` | `4` | Parallel LLM workers |
+
+---
+
+## 4  Visualise – interactive graph
+
+After Stage 2 completes, simply open the HTML viewer from the project root:
+```bash
+open graph_view_template.html   # or serve the folder: Live Server / python -m http.server
+```
+The page fetches `data/company_graph.json` and the `org_chart_*.json` files automatically; keep the `data/` folder beside the HTML file.
+
+* Left pane → list of companies (clans).
+* Click a node to load its org‑chart on the right.
+* Chat drawer lets you ask follow‑up questions; context is pulled from `people.jsonl`.
+
+---
+
+## 5  Common snags
+
+| Symptom | Fix |
+|---------|-----|
+| Infinite CAPTCHA | Use a residential proxy: `--proxy http://user:pass@ip:port` |
+| 429 Too Many Requests | Lower `--concurrency`, rotate profile, add delay |
+| Blank graph | Check JSON paths, clear `localStorage` in browser |
+
+---
+
+### TL;DR
+`crwl profiler` → `c4ai_discover.py` → `c4ai_insights.py` → open `graph_view_template.html`.  
+Live long and `import crawl4ai`.
+
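
Reviewer note on the Stage 1 outputs: the README says `companies.jsonl` and `people.jsonl` hold one JSON object per line. The sketch below shows one way to read such a file for inspection before running Stage 2. `read_jsonl` is a hypothetical helper, not something shipped with the demo.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator


def read_jsonl(path: Path) -> Iterator[Dict[str, Any]]:
    """Yield one parsed object per non-empty line of a newline-delimited JSON file."""
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines, e.g. a trailing newline
                yield json.loads(line)
```

For example, `companies = list(read_jsonl(Path("data/companies.jsonl")))` would load the company records into memory, assuming Stage 1 was run with `--outdir ./data`.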
--- a/docs/apps/linkdin/c4ai_discover.py
+++ b/docs/apps/linkdin/c4ai_discover.py
@@ -0,0 +1,440 @@
+#!/usr/bin/env python3
+"""
+c4ai-discover — Stage‑1 Discovery CLI
+
+Scrapes LinkedIn company search + their people pages and dumps two newline‑delimited
+JSON files: companies.jsonl and people.jsonl.
+
+Key design rules
+----------------
+* No BeautifulSoup — Crawl4AI only for network + HTML fetch.
+* JsonCssExtractionStrategy for structured scraping; schema auto‑generated once
+  from sample HTML provided by user and then cached under ./schemas/.
+* Defaults are embedded so the file runs inside VS Code debugger without CLI args.
+* If executed as a console script (argv > 1), CLI flags win.
+* Lightweight deps: argparse + Crawl4AI stack.
+
+Author: Tom @ Kidocode 2025‑04‑26
+"""
+from __future__ import annotations
+
+import warnings, re
+warnings.filterwarnings(
+    "ignore",
+    message=r"The pseudo class ':contains' is deprecated, ':-soup-contains' should be used.*",
+    category=FutureWarning,
+    module=r"soupsieve"
+)
+
+
+# ───────────────────────────────────────────────────────────────────────────────
+# Imports
+# ───────────────────────────────────────────────────────────────────────────────
+import argparse
+import random
+import asyncio
+import json
+import logging
+import os
+import pathlib
+import sys
+# 3rd-party rich for pretty logging
+from rich.console import Console
+from rich.logging import RichHandler
+
+from datetime import datetime, UTC
+from itertools import cycle
+from textwrap import dedent
+from types import SimpleNamespace
+from typing import Dict, List, Optional
+from urllib.parse import quote
+from pathlib import Path
+from glob import glob
+
+from crawl4ai import (
+    AsyncWebCrawler,
+    BrowserConfig,
+    CacheMode,
+    CrawlerRunConfig,
+    JsonCssExtractionStrategy,
+    BrowserProfiler,
+    LLMConfig,
+)
+
+# ───────────────────────────────────────────────────────────────────────────────
+# Constants / paths
+# ───────────────────────────────────────────────────────────────────────────────
+BASE_DIR = pathlib.Path(__file__).resolve().parent
+SCHEMA_DIR = BASE_DIR / "schemas"
+SCHEMA_DIR.mkdir(parents=True, exist_ok=True)
+COMPANY_SCHEMA_PATH = SCHEMA_DIR / "company_card.json"
+PEOPLE_SCHEMA_PATH = SCHEMA_DIR / "people_card.json"
+
+# ---------- deterministic target JSON examples ----------
+_COMPANY_SCHEMA_EXAMPLE = {
+    "handle": "/company/posify/",
+    "profile_image": "https://media.licdn.com/dms/image/v2/.../logo.jpg",
+    "name": "Management Research Services, Inc. (MRS, Inc)",
+    "descriptor": "Insurance • Milwaukee, Wisconsin",
+    "about": "Insurance • Milwaukee, Wisconsin",
+    "followers": 1000
+}
+
+_PEOPLE_SCHEMA_EXAMPLE = {
+    "profile_url": "https://www.linkedin.com/in/lily-ng/",
+    "name": "Lily Ng",
+    "headline": "VP Product @ Posify",
+    "followers": 890,
+    "connection_degree": "2nd",
+    "avatar_url": "https://media.licdn.com/dms/image/v2/.../lily.jpg"
+}
+
+# Provided sample HTML snippets (trimmed) — used exactly once to cold‑generate schema.
+_SAMPLE_COMPANY_HTML = (Path(__file__).resolve().parent / "snippets/company.html").read_text()
+_SAMPLE_PEOPLE_HTML = (Path(__file__).resolve().parent / "snippets/people.html").read_text()
+
+# --------- tighter schema prompts ----------
+_COMPANY_SCHEMA_QUERY = dedent(
+    """
+    Using the supplied <li> company-card HTML, build a JsonCssExtractionStrategy schema that,
+    for every card, outputs *exactly* the keys shown in the example JSON below.
+    JSON spec:
+      • handle        – href of the outermost <a> that wraps the logo/title, e.g. "/company/posify/"
+      • profile_image – absolute URL of the <img> inside that link
+      • name          – text of the <a> inside the <span class*='t-16'>
+      • descriptor    – text line with industry • location
+      • about         – text of the <div class*='t-normal'> below the name (industry + geo)
+      • followers     – integer parsed from the <div> containing 'followers'
+      
+    IMPORTANT: Do not target elements by their hashed (base64-like) class names; they are not reliable.
+    The <li> elements sit inside the parent "div.search-results-container"; you can use it as an anchor.
+    The <ul> parent has role="list". These two selectors should be enough to target the <li> elements.
+    """
+)
+
+_PEOPLE_SCHEMA_QUERY = dedent(
+    """
+    Using the supplied <li> people-card HTML, build a JsonCssExtractionStrategy schema that
+    outputs exactly the keys in the example JSON below.
+    Fields:
+      • profile_url        – href of the outermost profile link
+      • name               – text inside artdeco-entity-lockup__title
+      • headline           – inner text of artdeco-entity-lockup__subtitle
+      • followers          – integer parsed from the span inside lt-line-clamp--multi-line
+      • connection_degree  – '1st', '2nd', etc. from artdeco-entity-lockup__badge
+      • avatar_url         – src of the <img> within artdeco-entity-lockup__image
+      
+    IMPORTANT: Do not target elements by their hashed (base64-like) class names; they are not reliable.
+    The <li> elements sit inside a parent <div> with the classes "artdeco-card org-people-profile-card__card-spacing org-people__card-margin-bottom".
+    """
+)
+
+# ---------------------------------------------------------------------------
+# Utility helpers
+# ---------------------------------------------------------------------------
+
+def _load_or_build_schema(
+    path: pathlib.Path, 
+    sample_html: str, 
+    query: str, 
+    example_json: Dict,
+    force: bool = False,
+) -> Dict:
+    """Load schema from path, else call generate_schema once and persist."""
+    if path.exists() and not force:
+        return json.loads(path.read_text())
+
+    logging.info("[SCHEMA] Generating schema %s", path.name)
+    schema = JsonCssExtractionStrategy.generate_schema(
+        html=sample_html,
+        llm_config=LLMConfig(
+            provider=os.getenv("C4AI_SCHEMA_PROVIDER", "openai/gpt-4o"),
+            api_token=os.getenv("OPENAI_API_KEY", "env:OPENAI_API_KEY"),
+        ),
+        query=query,
+        target_json_example=json.dumps(example_json, indent=2),
+    )
+    path.write_text(json.dumps(schema, indent=2))
+    return schema
+
+
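+def _openai_friendly_number(text: str) -> Optional[int]:
+    """Extract the first number from text like '1.2K followers' (returns 1200)."""
+    import re
+
+    # Match the number and an optional K/M suffix directly after it, so stray
+    # letters elsewhere in the text (e.g. "members") are not read as multipliers.
+    m = re.search(r"(\d+(?:\.\d+)?)\s*([km])?\b", text.replace(",", ""), re.I)
+    if not m:
+        return None
+    val = float(m.group(1))
+    suffix = (m.group(2) or "").lower()
+    multiplier = {"k": 1_000, "m": 1_000_000}.get(suffix, 1)
+    return round(val * multiplier)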
+
+# ---------------------------------------------------------------------------
+# Core async workers
+# ---------------------------------------------------------------------------
+async def crawl_company_search(crawler: AsyncWebCrawler, url: str, schema: Dict, limit: int) -> List[Dict]:
+    """Paginate 10-item company search pages until `limit` reached."""
+    extraction = JsonCssExtractionStrategy(schema)
+    cfg = CrawlerRunConfig(
+        extraction_strategy=extraction,
+        cache_mode=CacheMode.BYPASS,
+        wait_for = ".search-marvel-srp",
+        session_id="company_search",
+        delay_before_return_html=1,
+        magic = True,
+        verbose= False,
+    )
+    companies, page = [], 1
+    while len(companies) < max(limit, 10):
+        paged_url = f"{url}&page={page}"
+        res = await crawler.arun(paged_url, config=cfg)
+        batch = json.loads(res[0].extracted_content or "[]")
+        if not batch:
+            break
+        for item in batch:
+            name = item.get("name", "").strip()
+            handle = item.get("handle", "").strip()
+            if not handle or not name:
+                continue
+            descriptor = item.get("descriptor")
+            about = item.get("about")
+            followers = _openai_friendly_number(str(item.get("followers", "")))
+            companies.append(
+                {
+                    "handle": handle,
+                    "name": name,
+                    "descriptor": descriptor,
+                    "about": about,
+                    "followers": followers,
+                    "people_url": f"{handle.rstrip('/')}/people/",
+                    "captured_at": datetime.now(UTC).isoformat(timespec="seconds").replace("+00:00", "Z"),
+                }
+            )
+        logging.info(
+            f"[dim]Page {page}[/] — running total: {len(companies)}/{limit} companies"
+        )
+        page += 1
+
+    return companies[:max(limit, 10)]
+
+
+async def crawl_people_page(
+    crawler: AsyncWebCrawler,
+    people_url: str,
+    schema: Dict,
+    limit: int,
+    title_kw: str,
+) -> List[Dict]:
+    people_u = f"{people_url}?keywords={quote(title_kw)}"
+    extraction = JsonCssExtractionStrategy(schema)
+    cfg = CrawlerRunConfig(
+        extraction_strategy=extraction,
+        # scan_full_page=True,
+        cache_mode=CacheMode.BYPASS,
+        magic=True,
+        wait_for=".org-people-profile-card__card-spacing",
+        delay_before_return_html=1,
+        session_id="people_search",
+    )
+    res = await crawler.arun(people_u, config=cfg)
+    if not res[0].success:
+        return []
+    raw = json.loads(res[0].extracted_content)
+    people = []
+    for p in raw[:limit]:
+        followers = _openai_friendly_number(str(p.get("followers", "")))
+        people.append(
+            {
+                "profile_url": p.get("profile_url"),
+                "name": p.get("name"),
+                "headline": p.get("headline"),
+                "followers": followers,
+                "connection_degree": p.get("connection_degree"),
+                "avatar_url": p.get("avatar_url"),
+            }
+        )
+    return people
+
+# ---------------------------------------------------------------------------
+# CLI + main
+# ---------------------------------------------------------------------------
+
+def build_arg_parser() -> argparse.ArgumentParser:
+    ap = argparse.ArgumentParser("c4ai-discover — Crawl4AI LinkedIn discovery")
+    sub = ap.add_subparsers(dest="cmd", required=False, help="run scope")
+
+    def add_flags(parser: argparse.ArgumentParser):
+        parser.add_argument("--query", required=False, help="query keyword(s)")
+        parser.add_argument("--geo", required=False, type=int, help="LinkedIn geoUrn")
+        parser.add_argument("--title-filters", default="Product,Engineering", help="comma list of job keywords")
+        parser.add_argument("--max-companies", type=int, default=1000)
+        parser.add_argument("--max-people", type=int, default=500)
+        parser.add_argument("--profile-name", default="profile_linkedin_uc", help="name of the saved browser profile")
+        parser.add_argument("--outdir", default="./output")
+        parser.add_argument("--concurrency", type=int, default=4)
+        parser.add_argument("--log-level", default="info", choices=["debug", "info", "warn", "error"])
+
+    add_flags(sub.add_parser("full"))
+    add_flags(sub.add_parser("companies"))
+    add_flags(sub.add_parser("people"))
+
+    # global flags
+    ap.add_argument(
+        "--debug",
+        action="store_true",
+        help="Use built-in demo defaults (same as C4AI_DEMO_DEBUG=1)",
+    )
+    return ap
+
+
+def detect_debug_defaults(force = False) -> SimpleNamespace:
+    if not force and sys.gettrace() is None and not os.getenv("C4AI_DEMO_DEBUG"):
+        return SimpleNamespace()
+    # ----- debug‑friendly defaults -----
+    return SimpleNamespace(
+        cmd="full",
+        query="health insurance management",
+        geo=102713980,
+        # title_filters="Product,Engineering",
+        title_filters="",
+        max_companies=10,
+        max_people=5,
+        profile_name="profile_linkedin_uc",
+        outdir="./debug_out",
+        concurrency=2,
+        log_level="debug",
+    )
+
+
+async def async_main(opts):
+    # ─────────── logging setup ───────────
+    console = Console()
+    logging.basicConfig(
+        level=opts.log_level.upper(),
+        format="%(message)s",
+        handlers=[RichHandler(console=console, markup=True, rich_tracebacks=True)],
+    )
+
+    # -------------------------------------------------------------------
+    # Load or build schemas (one‑time LLM call each)
+    # -------------------------------------------------------------------
+    company_schema = _load_or_build_schema(
+        COMPANY_SCHEMA_PATH,
+        _SAMPLE_COMPANY_HTML,
+        _COMPANY_SCHEMA_QUERY,
+        _COMPANY_SCHEMA_EXAMPLE,
+        # True
+    )
+    people_schema = _load_or_build_schema(
+        PEOPLE_SCHEMA_PATH,
+        _SAMPLE_PEOPLE_HTML,
+        _PEOPLE_SCHEMA_QUERY,
+        _PEOPLE_SCHEMA_EXAMPLE,
+        # True
+    )
+
+    outdir = BASE_DIR / pathlib.Path(opts.outdir)
+    outdir.mkdir(parents=True, exist_ok=True)
+    f_companies = (outdir / "companies.jsonl").open("a", encoding="utf-8")
+    f_people = (outdir / "people.jsonl").open("a", encoding="utf-8")
+
+    # -------------------------------------------------------------------
+    # Prepare crawler with a persistent managed-browser profile
+    # -------------------------------------------------------------------
+    profiler = BrowserProfiler()
+    path = profiler.get_profile_path(opts.profile_name)
+    bc = BrowserConfig(
+        headless=False,
+        verbose=False,
+        user_data_dir=path,
+        use_managed_browser=True,
+        # Randomize the user agent within the given platform/OS constraints
+        user_agent_mode="random",
+        user_agent_generator_config={
+            "platforms": "mobile",
+            "os": "Android",
+        },
+    )
+    crawler = AsyncWebCrawler(config=bc)
+    
+    await crawler.start()
+
+    # Single worker for simplicity; concurrency can be scaled by arun_many if needed.
+    # crawler = await next_crawler().start()
+    try:
+        # Build LinkedIn search URL
+        search_url = f"https://www.linkedin.com/search/results/companies/?keywords={quote(opts.query or '')}&geoUrn={opts.geo}"
+        logging.info("Seed URL => %s", search_url)
+
+        companies: List[Dict] = []
+        if opts.cmd in ("companies", "full"):
+            companies = await crawl_company_search(
+                crawler, search_url, company_schema, opts.max_companies
+            )
+            for c in companies:
+                f_companies.write(json.dumps(c, ensure_ascii=False) + "\n")
+            logging.info(f"[bold green]✓[/] Companies scraped so far: {len(companies)}")
+
+        if opts.cmd in ("people", "full"):
+            if not companies:
+                # load from previous run
+                src = outdir / "companies.jsonl"
+                if not src.exists():
+                    logging.error("companies.jsonl missing — run companies/full first")
+                    return 10
+                companies = [json.loads(line) for line in src.read_text(encoding="utf-8").splitlines() if line.strip()]
+            total_people = 0
+            title_kw = " ".join([t.strip() for t in opts.title_filters.split(",") if t.strip()]) if opts.title_filters else ""
+            for comp in companies:
+                people = await crawl_people_page(
+                    crawler,
+                    comp["people_url"],
+                    people_schema,
+                    opts.max_people,
+                    title_kw,
+                )
+                for p in people:
+                    rec = p | {
+                        "company_handle": comp["handle"],
+                        # UTC timestamp in ISO 8601 form with a "Z" suffix
+                        "captured_at": datetime.now(UTC).isoformat(timespec="seconds").replace("+00:00", "Z"),
+                    }
+                    f_people.write(json.dumps(rec, ensure_ascii=False) + "\n")
+                total_people += len(people)
+                logging.info(
+                    f"{comp['name']} — [cyan]{len(people)}[/] people extracted"
+                )
+                await asyncio.sleep(random.uniform(0.5, 1))
+            logging.info("Total people scraped: %d", total_people)
+    finally:
+        await crawler.close()
+        f_companies.close()
+        f_people.close()
+
+    return 0
+
+
+def main():
+    parser = build_arg_parser()
+    cli_opts = parser.parse_args()
+
+    # decide on debug defaults
+    if cli_opts.debug:
+        opts = detect_debug_defaults(force=True)
+    else:
+        # Debug defaults apply under a debugger or when C4AI_DEMO_DEBUG is set
+        env_defaults = detect_debug_defaults()
+        opts = env_defaults if vars(env_defaults) else cli_opts
+
+    if not getattr(opts, "cmd", None):
+        opts.cmd = "full"
+
+    exit_code = asyncio.run(async_main(opts))
+    sys.exit(exit_code)
+
+
+if __name__ == "__main__":
+    main()
--- a/docs/apps/linkdin/c4ai_insights.py
+++ b/docs/apps/linkdin/c4ai_insights.py
@@ -0,0 +1,372 @@
+#!/usr/bin/env python3
+"""
+Stage-2 Insights builder
+------------------------
+Reads companies.jsonl & people.jsonl (Stage-1 output) and produces:
+ • company_graph.json
+ • org_chart_<handle>.json  (one per company)
+ • decision_makers.csv
+ • graph_view.html          (interactive visualisation)
+
+Run:
+    python c4ai_insights.py --in ./stage1_out --out ./stage2_out
+
+Author : Tom @ Kidocode, 2025-04-28
+"""
+
+from __future__ import annotations
+
+# ───────────────────────────────────────────────────────────────────────────────
+# Imports & Third-party
+# ───────────────────────────────────────────────────────────────────────────────
+
+import argparse, asyncio, json, os, sys, pathlib, random, time, csv
+from datetime import datetime, UTC
+from types import SimpleNamespace
+from pathlib import Path
+from typing import List, Dict, Any
+# Pretty CLI UX
+from rich.console import Console
+from rich.logging import RichHandler
+from rich.progress import Progress, SpinnerColumn, BarColumn, TextColumn, TimeElapsedColumn
+import logging
+from jinja2 import Environment, FileSystemLoader, select_autoescape
+
+BASE_DIR = pathlib.Path(__file__).resolve().parent
+
+# ───────────────────────────────────────────────────────────────────────────────
+# 3rd-party deps
+# ───────────────────────────────────────────────────────────────────────────────
+import numpy as np
+# from sentence_transformers import SentenceTransformer
+# from sklearn.metrics.pairwise import cosine_similarity
+import pandas as pd
+import hashlib
+
+from openai import OpenAI                    # same SDK you pre-loaded
+
+# ───────────────────────────────────────────────────────────────────────────────
+# Utils
+# ───────────────────────────────────────────────────────────────────────────────
+def load_jsonl(path: Path) -> List[Dict[str, Any]]:
+    with open(path, "r", encoding="utf-8") as f:
+        return [json.loads(l) for l in f]
+
+def dump_json(obj, path: Path):
+    with open(path, "w", encoding="utf-8") as f:
+        json.dump(obj, f, ensure_ascii=False, indent=2)
+
+# ───────────────────────────────────────────────────────────────────────────────
+# Constants
+# ───────────────────────────────────────────────────────────────────────────────
+BASE_DIR = pathlib.Path(__file__).resolve().parent
+
+# ───────────────────────────────────────────────────────────────────────────────
+# Debug defaults   (mirrors Stage-1 trick)
+# ───────────────────────────────────────────────────────────────────────────────
+def dev_defaults() -> SimpleNamespace:
+    return SimpleNamespace(
+        in_dir="./debug_out",          
+        out_dir="./insights_debug",
+        embed_model="all-MiniLM-L6-v2",
+        top_k=10,
+        openai_model="gpt-4.1",
+        max_llm_tokens=8000,
+        llm_temperature=1.0,
+        workers=4,           # parallel processing
+        stub=False,          # manual
+    )
+
+# ───────────────────────────────────────────────────────────────────────────────
+# Graph builders
+# ───────────────────────────────────────────────────────────────────────────────
+def embed_descriptions(companies, model_name:str, opts) -> np.ndarray:
+    from sentence_transformers import SentenceTransformer
+    
+    logging.debug(f"Using embedding model: {model_name}")
+    cache_path = BASE_DIR / Path(opts.out_dir) / "embeds_cache.json"
+    cache = {}
+    if cache_path.exists():
+        with open(cache_path) as f:
+            cache = json.load(f)
+        # flush cache if model differs
+        if cache.get("_model") != model_name:
+            cache = {}
+
+    model = SentenceTransformer(model_name)
+    new_texts, new_indices = [], []
+    vectors = np.zeros((len(companies), model.get_sentence_embedding_dimension()), dtype=np.float32)
+
+    for idx, comp in enumerate(companies):
+        text = comp.get("about") or comp.get("descriptor","")
+        h = hashlib.sha1(text.encode("utf-8")).hexdigest()
+        cached = cache.get(comp["handle"])
+        if cached and cached["hash"] == h:
+            vectors[idx] = np.array(cached["vector"], dtype=np.float32)
+        else:
+            new_texts.append(text)
+            new_indices.append((idx, comp["handle"], h))
+
+    if new_texts:
+        embeds = model.encode(new_texts, show_progress_bar=False, convert_to_numpy=True)
+        for vec, (idx, handle, h) in zip(embeds, new_indices):
+            vectors[idx] = vec
+            cache[handle] = {"hash": h, "vector": vec.tolist()}
+        cache["_model"] = model_name
+        with open(cache_path, "w") as f:
+            json.dump(cache, f)
+
+    return vectors
+
+def build_company_graph(companies, embeds:np.ndarray, top_k:int) -> Dict[str,Any]:
+    from sklearn.metrics.pairwise import cosine_similarity
+    sims = cosine_similarity(embeds)
+    nodes, edges = [], []
+    idx_of = {c["handle"]: i for i,c in enumerate(companies)}
+    for i,c in enumerate(companies):
+        node = dict(
+            id=c["handle"].strip("/"),
+            name=c["name"],
+            handle=c["handle"],
+            about=c.get("about",""),
+            people_url=c.get("people_url",""),
+            industry=c.get("descriptor","").split("•")[0].strip(),
+            geoUrn=c.get("geoUrn"),
+            followers=c.get("followers",0),
+            # desc_embed=embeds[i].tolist(),
+            desc_embed=[],
+        )
+        nodes.append(node)
+        # pick top-k most similar except itself
+        top_idx = np.argsort(sims[i])[::-1][1:top_k+1]
+        for j in top_idx:
+            tgt = companies[j]
+            weight = float(sims[i,j])
+            if node["industry"] == tgt.get("descriptor","").split("•")[0].strip():
+                weight += 0.10
+            if node["geoUrn"] == tgt.get("geoUrn"):
+                weight += 0.05
+            src_followers = node["followers"] or 1
+            tgt_followers = tgt.get("followers") or 1
+            follower_ratio = min(src_followers, tgt_followers) / max(src_followers, tgt_followers)
+            weight += 0.05 * follower_ratio
+            edges.append(dict(
+                source=node["id"],
+                target=tgt["handle"].strip("/"),
+                weight=round(weight,4),
+                drivers=dict(
+                    embed_sim=round(float(sims[i,j]),4),
+                    industry_match=0.10 if node["industry"] == tgt.get("descriptor","").split("•")[0].strip() else 0,
+                    geo_overlap=0.05 if node["geoUrn"] == tgt.get("geoUrn") else 0,
+                )
+            ))
+    # Timestamp the graph so consumers can tell when it was generated
+    return {"nodes":nodes,"edges":edges,"meta":{"generated_at":datetime.now(UTC).isoformat()}}
+
+# ───────────────────────────────────────────────────────────────────────────────
+# Org-chart via LLM
+# ───────────────────────────────────────────────────────────────────────────────
+async def infer_org_chart_llm(company, people, client:OpenAI, model_name:str, max_tokens:int, temperature:float, stub:bool):
+    if stub:
+        # Tiny fake org-chart when debugging offline
+        chief = random.choice(people)
+        nodes = [{
+            "id": chief["profile_url"],
+            "name": chief["name"],
+            "title": chief["headline"],
+            "dept": next(iter((chief.get("headline") or "").split()), ""),
+            "yoe_total": 8,
+            "yoe_current": 2,
+            "seniority_score": 0.8,
+            "decision_score": 0.9,
+            "avatar_url": chief.get("avatar_url")
+        }]
+        return {"nodes":nodes,"edges":[],"meta":{"debug_stub":True,"generated_at":datetime.now(UTC).isoformat()}}
+    
+    prompt = [
+        {"role":"system","content":"You are an expert B2B org-chart reasoner."},
+        {"role":"user","content":f"""Here is the company description:
+         
+<company>
+{json.dumps(company, ensure_ascii=False)}
+</company>
+                
+Here is a JSON list of employees:
+<employees>
+{json.dumps(people, ensure_ascii=False)}
+</employees>
+
+1) Build a reporting tree (manager -> direct reports)
+2) For each person output a decision_score 0-1 for buying new software
+
+Return JSON: {{ "nodes":[{{id,name,title,dept,yoe_total,yoe_current,seniority_score,decision_score,avatar_url,profile_url}}], "edges":[{{source,target,type,confidence}}] }}
+"""}
+    ]
+    resp = await asyncio.to_thread(  # sync SDK call; run it off the event loop
+        client.chat.completions.create,
+        model=model_name,
+        messages=prompt,
+        max_tokens=max_tokens, temperature=temperature,
+        response_format={"type":"json_object"}
+    )
+    chart = json.loads(resp.choices[0].message.content)
+    chart["meta"] = dict(model=model_name, generated_at=datetime.now(UTC).isoformat())
+    return chart
+
+# ───────────────────────────────────────────────────────────────────────────────
+# CSV flatten
+# ───────────────────────────────────────────────────────────────────────────────
+def export_decision_makers(charts_dir:Path, csv_path:Path, threshold:float=0.5):
+    rows=[]
+    for p in charts_dir.glob("org_chart_*.json"):
+        data=json.loads(p.read_text())
+        comp = p.stem.split("org_chart_")[1]
+        for n in data.get("nodes",[]):
+            if n.get("decision_score",0)>=threshold:
+                rows.append(dict(
+                    company=comp,
+                    person=n["name"],
+                    title=n["title"],
+                    decision_score=n["decision_score"],
+                    profile_url=n["id"]
+                ))
+    pd.DataFrame(rows).to_csv(csv_path,index=False)
+
+# ───────────────────────────────────────────────────────────────────────────────
+# HTML rendering
+# ───────────────────────────────────────────────────────────────────────────────
+def render_html(out:Path, template_dir:Path):
+    # Copy graph_view_template.html (as graph_view.html) and ai.js from the template folder into the output folder
+    import shutil
+    shutil.copy(template_dir/"graph_view_template.html", out / "graph_view.html")
+    shutil.copy(template_dir/"ai.js", out)
+
+
+# ───────────────────────────────────────────────────────────────────────────────
+# Main async pipeline
+# ───────────────────────────────────────────────────────────────────────────────
+async def run(opts):
+    # ── silence SDK noise ──────────────────────────────────────────────────────
+    for noisy in ("openai", "httpx", "httpcore"):
+        lg = logging.getLogger(noisy)
+        lg.setLevel(logging.WARNING)     # or ERROR if you want total silence
+        lg.propagate = False             # optional: stop them reaching root
+
+    # ────────────── logging bootstrap ──────────────
+    console = Console()
+    logging.basicConfig(
+        level="INFO",
+        format="%(message)s",
+        handlers=[RichHandler(console=console, markup=True, rich_tracebacks=True)],
+    )
+
+    in_dir  = BASE_DIR / Path(opts.in_dir)
+    out_dir = BASE_DIR / Path(opts.out_dir)
+    out_dir.mkdir(parents=True, exist_ok=True)
+
+    companies = load_jsonl(in_dir/"companies.jsonl")
+    people    = load_jsonl(in_dir/"people.jsonl")
+
+    logging.info(f"[bold cyan]Loaded[/] {len(companies)} companies, {len(people)} people")
+
+    logging.info("[bold]⇢[/] Embedding company descriptions…")
+    # embeds = embed_descriptions(companies, opts.embed_model, opts)
+    
+    logging.info("[bold]⇢[/] Building similarity graph")
+    # company_graph = build_company_graph(companies, embeds, opts.top_k)
+    # dump_json(company_graph, out_dir/"company_graph.json")
+
+    # OpenAI client (only built if not debugging)
+    stub = bool(opts.stub)
+    client = OpenAI() if not stub else None
+
+    # Filter companies that need processing
+    to_process = []
+    for comp in companies:
+        handle = comp["handle"].strip("/").replace("/","_")
+        out_file = out_dir/f"org_chart_{handle}.json"
+        if out_file.exists():
+            logging.info(f"[green]✓[/] Skipping existing {comp['name']}")
+            continue
+        to_process.append(comp)
+    
+    
+    if not to_process:
+        logging.info("[yellow]All companies already processed[/]")
+    else:
+        workers = getattr(opts, 'workers', 1)
+        parallel = workers > 1
+        
+        logging.info(f"[bold]⇢[/] Inferring org-charts via LLM {f'(parallel={workers} workers)' if parallel else ''}")
+        
+        with Progress(
+            SpinnerColumn(),
+            BarColumn(),
+            TextColumn("[progress.description]{task.description}"),
+            TimeElapsedColumn(),
+            console=console,
+        ) as progress:
+            task = progress.add_task("Org charts", total=len(to_process))
+            
+            async def process_one(comp):
+                handle = comp["handle"].strip("/").replace("/","_")
+                persons = [p for p in people if p["company_handle"].strip("/") == comp["handle"].strip("/")]
+                
+                chart = await infer_org_chart_llm(
+                    comp, persons,
+                    client=client if client else OpenAI(api_key="sk-debug"),
+                    model_name=opts.openai_model,
+                    max_tokens=opts.max_llm_tokens,
+                    temperature=opts.llm_temperature,
+                    stub=stub,
+                )
+                chart["meta"]["company"] = comp["name"]
+                
+                # Save the result immediately
+                dump_json(chart, out_dir/f"org_chart_{handle}.json")
+                
+                progress.update(task, advance=1, description=f"{comp['name']} ({len(persons)} ppl)")
+            
+            # Create one coroutine per company
+            coros = [process_one(comp) for comp in to_process]
+
+            # Cap the number of in-flight LLM calls at the worker count
+            semaphore = asyncio.Semaphore(workers)
+
+            async def bounded_process(coro):
+                async with semaphore:
+                    return await coro
+
+            # Run with concurrency control
+            await asyncio.gather(*(bounded_process(coro) for coro in coros))
+
+    logging.info("[bold]⇢[/] Flattening decision-makers CSV")
+    export_decision_makers(out_dir, out_dir/"decision_makers.csv")
+        
+    render_html(out_dir, template_dir=BASE_DIR/"templates")
+    # stdlib logging has no "success" level, so print the final status via the Rich console
+    console.print(f"[bold green]✓[/] Stage-2 artefacts written to {out_dir}")
+
+# ───────────────────────────────────────────────────────────────────────────────
+# CLI
+# ───────────────────────────────────────────────────────────────────────────────
+def build_arg_parser():
+    p = argparse.ArgumentParser(description="Build graphs & visualisation from Stage-1 output")
+    p.add_argument("--in",       dest="in_dir",  required=False, help="Stage-1 output dir", default=".")
+    p.add_argument("--out",      dest="out_dir", required=False, help="Destination dir",   default=".")
+    p.add_argument("--embed_model", default="all-MiniLM-L6-v2")
+    p.add_argument("--top_k", type=int, default=10, help="Top-k neighbours per company")
+    p.add_argument("--openai_model", default="gpt-4.1")
+    p.add_argument("--max_llm_tokens", type=int, default=8000)
+    p.add_argument("--llm_temperature", type=float, default=1.0)
+    p.add_argument("--stub", action="store_true", help="Skip OpenAI call and generate tiny fake org charts")
+    p.add_argument("--workers", type=int, default=4, help="Number of parallel workers for LLM inference")
+    return p
+
+def main():
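+    # Use debug defaults under a debugger or when C4AI_DEMO_DEBUG is set; otherwise parse CLI flags
+    opts = dev_defaults() if (sys.gettrace() or os.getenv("C4AI_DEMO_DEBUG")) else build_arg_parser().parse_args()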
+    asyncio.run(run(opts))
+
+if __name__ == "__main__":
+    main()
--- a/docs/apps/linkdin/schemas/company_card.json
+++ b/docs/apps/linkdin/schemas/company_card.json
@@ -0,0 +1,39 @@
+{
+  "name": "LinkedIn Company Card",
+  "baseSelector": "div.search-results-container ul[role='list'] > li",
+  "fields": [
+    {
+      "name": "handle",
+      "selector": "a[href*='/company/']",
+      "type": "attribute",
+      "attribute": "href"
+    },
+    {
+      "name": "profile_image",
+      "selector": "a[href*='/company/'] img",
+      "type": "attribute",
+      "attribute": "src"
+    },
+    {
+      "name": "name",
+      "selector": "span[class*='t-16'] a",
+      "type": "text"
+    },
+    {
+      "name": "descriptor",
+      "selector": "div[class*='t-black t-normal']",
+      "type": "text"
+    },
+    {
+      "name": "about",
+      "selector": "p[class*='entity-result__summary--2-lines']",
+      "type": "text"
+    },
+    {
+      "name": "followers",
+      "selector": "div:contains('followers')",
+      "type": "regex",
+      "pattern": "(\\d+)\\s*followers"
+    }
+  ]
+}
--- a/docs/apps/linkdin/schemas/people_card.json
+++ b/docs/apps/linkdin/schemas/people_card.json
@@ -0,0 +1,38 @@
+{
+  "name": "LinkedIn People Card",
+  "baseSelector": "li.org-people-profile-card__profile-card-spacing",
+  "fields": [
+    {
+      "name": "profile_url",
+      "selector": "a[href*='/in/']",
+      "type": "attribute",
+      "attribute": "href"
+    },
+    {
+      "name": "name",
+      "selector": ".artdeco-entity-lockup__title .lt-line-clamp--single-line",
+      "type": "text"
+    },
+    {
+      "name": "headline",
+      "selector": ".artdeco-entity-lockup__subtitle .lt-line-clamp--multi-line",
+      "type": "text"
+    },
+    {
+      "name": "followers",
+      "selector": ".lt-line-clamp--multi-line.t-12",
+      "type": "text"
+    },
+    {
+      "name": "connection_degree",
+      "selector": ".artdeco-entity-lockup__badge .artdeco-entity-lockup__degree",
+      "type": "text"
+    },
+    {
+      "name": "avatar_url",
+      "selector": ".artdeco-entity-lockup__image img",
+      "type": "attribute",
+      "attribute": "src"
+    }
+  ]
+}
--- a/docs/apps/linkdin/snippets/company.html
+++ b/docs/apps/linkdin/snippets/company.html
@@ -0,0 +1,143 @@
+<li class="yCLWzruNprmIzaZzFFonVFBtMrbaVYnuDFA">
+    <!----><!---->
+
+
+
+    <div class="IxlEPbRZwQYrRltKPvHAyjBmCdIWTAoYo" data-chameleon-result-urn="urn:li:company:362492"
+        data-view-name="search-entity-result-universal-template">
+
+
+
+
+        <div class="linked-area flex-1
+              cursor-pointer">
+
+            <div class="BAEgVqVuxosMJZodcelsgPoyRcrkiqgVCGHXNQ">
+                <div class="afcvrbGzNuyRlhPPQWrWirJtUdHAAtUlqxwvVA">
+                    <div class="display-flex align-items-center">
+                        <!---->
+
+                        <a class="eETATgYTipaVsmrBChiBJJvFsdPhNpulhPZUVLHLo  scale-down " aria-hidden="true"
+                            tabindex="-1" href="https://www.linkedin.com/company/managment-research-services-inc./"
+                            data-test-app-aware-link="">
+
+                            <div class="ivm-image-view-model   ">
+
+                                <div class="ivm-view-attr__img-wrapper
+            
+            ">
+                                    <!---->
+                                    <!----> <img width="48"
+                                        src="https://media.licdn.com/dms/image/v2/C560BAQFWpusEOgW-ww/company-logo_100_100/company-logo_100_100/0/1630583697877/managment_research_services_inc_logo?e=1750896000&amp;v=beta&amp;t=Ch9vyEZdfng-1D1m_XqP5kjNpVXUBKkk9cNhMZUhx0E"
+                                        loading="lazy" height="48" alt="Management Research Services, Inc. (MRS, Inc)"
+                                        id="ember28"
+                                        class="ivm-view-attr__img--centered EntityPhoto-square-3   evi-image lazy-image ember-view">
+                                </div>
+
+                            </div>
+
+                        </a>
+
+
+                    </div>
+                </div>
+                <div
+                    class="wympnVuDByXHvafWrMGJLZuchDmCRqLmWPwg MmzCPRicJimZvjJhvqTzDcDbdHhWPzspERzA pt3 pb3 t-12 t-black--light">
+                    <div class="mb1">
+
+                        <div class="t-roman t-sans">
+
+
+
+                            <div class="display-flex">
+                                <span class="TikBXjihYvcNUoIzkslUaEjfIuLmYxfs OoHEyXgsiIqGADjcOtTmfdpoYVXrLKTvkwI ">
+                                    <span class="CgaWLOzmXNuKbRIRARSErqCJcBPYudEKo
+                t-16">
+                                        <a class="eETATgYTipaVsmrBChiBJJvFsdPhNpulhPZUVLHLo "
+                                            href="https://www.linkedin.com/company/managment-research-services-inc./"
+                                            data-test-app-aware-link="">
+                                            <!---->Management Research Services, Inc. (MRS, Inc)<!---->
+                                            <!----> </a>
+                                        <!----> </span>
+                                </span>
+                                <!---->
+                            </div>
+
+
+
+                        </div>
+
+
+
+                        <div class="LjmdKCEqKITHihFOiQsBAQylkdnsWhqZii
+              t-14 t-black t-normal">
+                            <!---->Insurance • Milwaukee, Wisconsin<!---->
+                        </div>
+
+                        <div class="cTPhJiHyNLmxdQYFlsEOutjznmqrVHUByZwZ
+              t-14 t-normal">
+                            <!---->1K followers<!---->
+                        </div>
+
+
+
+
+
+                    </div>
+
+                    <!---->
+                    <p class="yWzlqwKNlvCWVNoKqmzoDDEnBMUuyynaLg
+                    entity-result__summary--2-lines
+                    t-12 t-black--light
+                    ">
+                        <!---->MRS combines 30 years of experience supporting the Life,<span class="white-space-pre">
+                        </span><strong><!---->Health<!----></strong><span class="white-space-pre"> </span>and
+                        Annuities<span class="white-space-pre"> </span><strong><!---->Insurance<!----></strong><span
+                            class="white-space-pre"> </span>Industry with customized<span class="white-space-pre">
+                        </span><strong><!---->insurance<!----></strong><span class="white-space-pre">
+                        </span>underwriting solutions that efficiently support clients’ workflows. Supported by the
+                        Agenium Platform (www.agenium.ai) our innovative underwriting solutions are guaranteed to
+                        optimize requirements...<!---->
+                    </p>
+
+                    <!---->
+                </div>
+                <div class="qXxdnXtzRVFTnTnetmNpssucBwQBsWlUuk MmzCPRicJimZvjJhvqTzDcDbdHhWPzspERzA">
+                    <!---->
+
+
+                    <div>
+
+
+
+
+                        <button aria-label="Follow Management Research Services, Inc. (MRS, Inc)" id="ember61"
+                            class="artdeco-button artdeco-button--2 artdeco-button--secondary ember-view"
+                            type="button"><!---->
+                            <span class="artdeco-button__text">
+                                Follow
+                            </span></button>
+
+
+
+                        <!---->
+                        <!---->
+
+
+                    </div>
+
+
+
+                </div>
+            </div>
+
+        </div>
+
+
+
+
+    </div>
+
+
+
+</li>
--- a/docs/apps/linkdin/snippets/people.html
+++ b/docs/apps/linkdin/snippets/people.html
@@ -0,0 +1,94 @@
+<li class="grid grid__col--lg-8 block org-people-profile-card__profile-card-spacing">
+    <div>
+
+
+        <section class="artdeco-card full-width qQdPErXQkSAbwApNgNfuxukTIPPykttCcZGOHk">
+            <!---->
+
+            <img width="210" src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
+                ariarole="presentation" loading="lazy" height="210" alt="" id="ember96"
+                class="evi-image lazy-image ghost-default ember-view org-people-profile-card__cover-photo org-people-profile-card__cover-photo--people">
+
+            <div class="org-people-profile-card__profile-info">
+                <div id="ember97"
+                    class="artdeco-entity-lockup artdeco-entity-lockup--stacked-center artdeco-entity-lockup--size-7 ember-view">
+                    <div id="ember98"
+                        class="artdeco-entity-lockup__image artdeco-entity-lockup__image--type-circle ember-view"
+                        type="circle">
+
+                        <a class="eETATgYTipaVsmrBChiBJJvFsdPhNpulhPZUVLHLo "
+                            id="org-people-profile-card__profile-image-0"
+                            href="https://www.linkedin.com/in/speakerrayna?miniProfileUrn=urn%3Ali%3Afs_miniProfile%3AACoAABsqUBoBr5x071PuGGpNtK3NlvSARiVXPIs"
+                            data-test-app-aware-link="">
+                            <img width="104"
+                                src="https://media.licdn.com/dms/image/v2/D5603AQGs2Vyju4xZ7A/profile-displayphoto-shrink_100_100/profile-displayphoto-shrink_100_100/0/1681741067031?e=1750896000&amp;v=beta&amp;t=Hvj--IrrmpVIH7pec7-l_PQok8vsS__CGeUqBWOw7co"
+                                loading="lazy" height="104" alt="Dr. Rayna S." id="ember99"
+                                class="evi-image lazy-image ember-view">
+                        </a>
+
+
+                    </div>
+                    <div id="ember100" class="artdeco-entity-lockup__content ember-view">
+                        <div id="ember101" class="artdeco-entity-lockup__title ember-view">
+                            <a class="eETATgYTipaVsmrBChiBJJvFsdPhNpulhPZUVLHLo  link-without-visited-state"
+                                aria-label="View Dr. Rayna S.’s profile"
+                                href="https://www.linkedin.com/in/speakerrayna?miniProfileUrn=urn%3Ali%3Afs_miniProfile%3AACoAABsqUBoBr5x071PuGGpNtK3NlvSARiVXPIs"
+                                data-test-app-aware-link="">
+                                <div id="ember103" class="ember-view lt-line-clamp lt-line-clamp--single-line AGabuksChUpCmjWshSnaZryLKSthOKkwclxY
+          t-black" style="">
+                                    Dr. Rayna S.
+
+                                    <!---->
+                                </div>
+
+                            </a>
+
+                        </div>
+                        <div id="ember104" class="artdeco-entity-lockup__badge ember-view"> <span class="a11y-text">3rd+
+                                degree connection</span>
+                            <span class="artdeco-entity-lockup__degree" aria-hidden="true">
+                                ·&nbsp;3rd
+                            </span>
+                            <!----><!---->
+                        </div>
+                        <div id="ember105" class="artdeco-entity-lockup__subtitle ember-view">
+                            <div class="t-14 t-black--light t-normal">
+                                <div id="ember107" class="ember-view lt-line-clamp lt-line-clamp--multi-line"
+                                    style="-webkit-line-clamp: 2">
+                                    Leadership and Talent Development Consultant and Professional Speaker
+
+                                    <!---->
+                                </div>
+
+                            </div>
+                        </div>
+                        <div id="ember108" class="artdeco-entity-lockup__caption ember-view"></div>
+                    </div>
+
+                </div>
+                <span class="text-align-center">
+                    <span id="ember110"
+                        class="ember-view lt-line-clamp lt-line-clamp--multi-line t-12 t-black--light mt2"
+                        style="-webkit-line-clamp: 3">
+                        727 followers
+
+                        <!----> </span>
+
+                </span>
+            </div>
+
+            <footer class="ph3 pb3">
+                <button aria-label="Follow Dr. Rayna S." id="ember111"
+                    class="artdeco-button artdeco-button--2 artdeco-button--secondary ember-view full-width"
+                    type="button"><!---->
+                    <span class="artdeco-button__text">
+                        Follow
+                    </span></button>
+            </footer>
+
+        </section>
+
+
+    </div>
+
+</li>
--- a/docs/apps/linkdin/templates/ai.js
+++ b/docs/apps/linkdin/templates/ai.js
@@ -0,0 +1,50 @@
+// ==== File: ai.js ====
+
+class ApiHandler {
+    constructor(apiKey = null) {
+      this.apiKey = apiKey || localStorage.getItem("openai_api_key") || "";
+      console.log("ApiHandler ready");
+    }
+  
+    setApiKey(k) {
+      this.apiKey = k.trim();
+      if (this.apiKey) localStorage.setItem("openai_api_key", this.apiKey);
+    }
+  
+    async *chatStream(messages, {model = "gpt-4o", temperature = 0.7} = {}) {
+      if (!this.apiKey) throw new Error("OpenAI API key missing");
+      const payload = {model, messages, temperature, stream: true, max_tokens: 1024};
+      const controller = new AbortController();
+  
+      const res = await fetch("https://api.openai.com/v1/chat/completions", {
+        method: "POST",
+        headers: {
+          "Content-Type": "application/json",
+          Authorization: `Bearer ${this.apiKey}`,
+        },
+        body: JSON.stringify(payload),
+        signal: controller.signal,
+      });
+      if (!res.ok) throw new Error(`OpenAI: ${res.statusText}`);
+      const reader = res.body.getReader();
+      const dec = new TextDecoder();
+  
+      let buf = "";
+      while (true) {
+        const {done, value} = await reader.read();
+        if (done) break;
+        buf += dec.decode(value, {stream: true});
+        const lines = buf.split("\n");
+        buf = lines.pop(); // keep the trailing partial line for the next chunk
+        for (const line of lines) {
+          if (!line.startsWith("data: ")) continue;
+          if (line.includes("[DONE]")) return;
+          const delta = JSON.parse(line.slice(6)).choices?.[0]?.delta?.content;
+          if (delta) yield delta;
+        }
+      }
+    }
+  }
+  
+  window.API = new ApiHandler();
+  
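Note on the stream parsing in `chatStream`: a network chunk can end in the middle of a `data:` line, so the parser has to carry the unfinished tail over to the next read and only parse complete lines. A minimal Python sketch of that buffering, assuming the `data: {...}` / `data: [DONE]` framing seen above; `iter_sse_deltas` is a hypothetical helper name and is not part of this PR.

```python
import json
from typing import Iterable, Iterator


def iter_sse_deltas(chunks: Iterable[str]) -> Iterator[str]:
    """Yield content deltas from an iterable of already-decoded text chunks."""
    buf = ""
    for chunk in chunks:
        buf += chunk
        # Everything before the last newline is complete; the tail may be partial.
        *lines, buf = buf.split("\n")
        for line in lines:
            if not line.startswith("data: "):
                continue
            data = line[len("data: "):]
            if data.strip() == "[DONE]":
                return
            choices = json.loads(data).get("choices") or [{}]
            delta = choices[0].get("delta", {}).get("content")
            if delta:
                yield delta
```

Feeding it chunks that split a JSON payload across two reads should still produce each delta exactly once, which is the property the buffering is meant to guarantee.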
--- a/docs/apps/linkdin/templates/graph_view_template.html
+++ b/docs/apps/linkdin/templates/graph_view_template.html
--- a/docs/codebase/browser.md
+++ b/docs/codebase/browser.md
@@ -0,0 +1,51 @@
+### browser_manager.py
+
+| Function | What it does |
+|---|---|
+| `ManagedBrowser.build_browser_flags` | Returns baseline Chromium CLI flags, disables GPU and sandbox, plugs locale, timezone, stealth tweaks, and any extras from `BrowserConfig`. |
+| `ManagedBrowser.__init__` | Stores config and logger, creates temp dir, preps internal state. |
+| `ManagedBrowser.start` | Spawns or connects to the Chromium process, returns its CDP endpoint plus the `subprocess.Popen` handle. |
+| `ManagedBrowser._initial_startup_check` | Pings the CDP endpoint once to be sure the browser is alive, raises if not. |
+| `ManagedBrowser._monitor_browser_process` | Async-loops on the subprocess, logs exits or crashes, restarts if policy allows. |
+| `ManagedBrowser._get_browser_path_WIP` | Old helper that maps OS + browser type to an executable path. |
+| `ManagedBrowser._get_browser_path` | Current helper, checks env vars, Playwright cache, and OS defaults for the real executable. |
+| `ManagedBrowser._get_browser_args` | Builds the final CLI arg list by merging user flags, stealth flags, and defaults. |
+| `ManagedBrowser.cleanup` | Terminates the browser, stops monitors, deletes the temp dir. |
+| `ManagedBrowser.create_profile` | Opens a visible browser so a human can log in, then zips the resulting user-data-dir to `~/.crawl4ai/profiles/<name>`. |
+| `ManagedBrowser.list_profiles` | Thin wrapper, now forwarded to `BrowserProfiler.list_profiles()`. |
+| `ManagedBrowser.delete_profile` | Thin wrapper, now forwarded to `BrowserProfiler.delete_profile()`. |
+| `BrowserManager.__init__` | Holds the global Playwright instance, browser handle, config signature cache, session map, and logger. |
+| `BrowserManager.start` | Boots the underlying `ManagedBrowser`, then spins up the default Playwright browser context with stealth patches. |
+| `BrowserManager._build_browser_args` | Translates `CrawlerRunConfig` (proxy, UA, timezone, headless flag, etc.) into Playwright `launch_args`. |
+| `BrowserManager.setup_context` | Applies locale, geolocation, permissions, cookies, and UA overrides on a fresh context. |
+| `BrowserManager.create_browser_context` | Internal helper that actually calls `browser.new_context(**options)` after running `setup_context`. |
+| `BrowserManager._make_config_signature` | Hashes the non-ephemeral parts of `CrawlerRunConfig` so contexts can be reused safely. |
+| `BrowserManager.get_page` | Returns a ready `Page` for a given session id, reusing an existing one or creating a new context/page, injects helper scripts, updates `last_used`. |
+| `BrowserManager.kill_session` | Force-closes a context/page for a session and removes it from the session map. |
+| `BrowserManager._cleanup_expired_sessions` | Periodic sweep that drops sessions idle longer than `ttl_seconds`. |
+| `BrowserManager.close` | Gracefully shuts down all contexts, the browser, Playwright, and background tasks. |
+
+---
+
+### browser_profiler.py
+
+| Function | What it does |
+|---|---|
+| `BrowserProfiler.__init__` | Sets up profile folder paths, async logger, and signal handlers. |
+| `BrowserProfiler.create_profile` | Launches a visible browser with a new user-data-dir for manual login, on exit compresses and stores it as a named profile. |
+| `BrowserProfiler.cleanup_handler` | General SIGTERM/SIGINT cleanup wrapper that kills child processes. |
+| `BrowserProfiler.sigint_handler` | Handles Ctrl-C during an interactive session, makes sure the browser shuts down cleanly. |
+| `BrowserProfiler.listen_for_quit_command` | Async REPL that exits when the user types `q`. |
+| `BrowserProfiler.list_profiles` | Enumerates `~/.crawl4ai/profiles`, prints profile name, browser type, size, and last modified. |
+| `BrowserProfiler.get_profile_path` | Returns the absolute path of a profile given its name, or `None` if missing. |
+| `BrowserProfiler.delete_profile` | Removes a profile folder or a direct path from disk, with optional confirmation prompt. |
+| `BrowserProfiler.interactive_manager` | Text UI loop for listing, creating, deleting, or launching profiles. |
+| `BrowserProfiler.launch_standalone_browser` | Starts a non-headless Chromium with remote debugging enabled and keeps it alive for manual tests. |
+| `BrowserProfiler.get_cdp_json` | Pulls `/json/version` from a CDP endpoint and returns the parsed JSON. |
+| `BrowserProfiler.launch_builtin_browser` | Spawns a headless Chromium in the background, saves `{wsEndpoint, pid, started_at}` to `~/.crawl4ai/builtin_browser.json`. |
+| `BrowserProfiler.get_builtin_browser_info` | Reads that JSON file, verifies the PID, and returns browser status info. |
+| `BrowserProfiler._is_browser_running` | Cross-platform helper that checks if a PID is still alive. |
+| `BrowserProfiler.kill_builtin_browser` | Terminates the background builtin browser and removes its status file. |
+| `BrowserProfiler.get_builtin_browser_status` | Returns `{running: bool, wsEndpoint, pid, started_at}` for quick health checks. |
+
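The context-reuse idea described for `BrowserManager._make_config_signature` can be illustrated with a small sketch: hash only the settings that shape the browser context, so that two configs differing only in per-request fields map to the same signature. The key names in `EPHEMERAL_KEYS` and the helper `config_signature` are assumptions made for illustration, not the library's actual implementation.

```python
import hashlib
import json

# Hypothetical per-request fields that should not force a new context.
EPHEMERAL_KEYS = {"url", "session_id", "cache_mode"}


def config_signature(config: dict) -> str:
    """Return a stable SHA-256 hex digest of the non-ephemeral config entries."""
    stable = {k: v for k, v in config.items() if k not in EPHEMERAL_KEYS}
    # sort_keys gives a canonical ordering; default=str covers non-JSON values.
    payload = json.dumps(stable, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Sorting keys before hashing matters here: without a canonical ordering, two equal configs built in a different key order would produce different signatures and defeat reuse.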
--- a/docs/codebase/cli.md
+++ b/docs/codebase/cli.md
@@ -0,0 +1,40 @@
+### `cli.py` command surface
+
+| Command | Inputs / flags | What it does |
+|---|---|---|
+| **profiles** | *(none)* | Opens the interactive profile manager, lets you list, create, delete saved browser profiles that live in `~/.crawl4ai/profiles`. |
+| **browser status** | – | Prints whether the always-on *builtin* browser is running, shows its CDP URL, PID, start time. |
+| **browser stop** | – | Kills the builtin browser and deletes its status file. |
+| **browser view** | `--url, -u` URL *(optional)* | Pops a visible window of the builtin browser, navigates to `URL` or `about:blank`. |
+| **config list** | – | Dumps every global setting, showing current value, default, and description. |
+| **config get** | `key` | Prints the value of a single setting, falls back to default if unset. |
+| **config set** | `key value` | Persists a new value in the global config (stored under `~/.crawl4ai/config.yml`). |
+| **examples** | – | Just spits out real-world CLI usage samples. |
+| **crawl** | `url` *(positional)*<br>`--browser-config,-B` path<br>`--crawler-config,-C` path<br>`--filter-config,-f` path<br>`--extraction-config,-e` path<br>`--json-extract,-j` [desc]\*<br>`--schema,-s` path<br>`--browser,-b` k=v list<br>`--crawler,-c` k=v list<br>`--output,-o` all,json,markdown,md,markdown-fit,md-fit *(default all)*<br>`--output-file,-O` path<br>`--bypass-cache,-b` *(flag, default true; note that `-b` is also listed for `--browser`)*<br>`--question,-q` str<br>`--verbose,-v` *(flag)*<br>`--profile,-p` profile-name | One-shot crawl + extraction. Builds `BrowserConfig` and `CrawlerRunConfig` from inline flags or separate YAML/JSON files, runs `AsyncWebCrawler.arun()`, can route through a named saved profile and pipe the result to stdout or a file. |
+| **(default)** | Same flags as **crawl**, plus `--example` | Shortcut so you can type just `crwl https://site.com`. When first arg is not a known sub-command, it falls through to *crawl*. |
+
+\* `--json-extract/-j` with no value turns on LLM-based JSON extraction using an auto schema, supplying a string lets you prompt-engineer the field descriptions.
+
+> Quick mental model  
+> `profiles` = manage identities,  
+> `browser ...` = control long-running headless Chrome that all crawls can piggy-back on,  
+> `crawl` = do the actual work,  
+> `config` = tweak global defaults,  
+> everything else is sugar.
+
+### Quick-fire “profile” usage cheatsheet
+
+| Scenario | Command (copy-paste ready) | Notes |
+|---|---|---|
+| **Launch interactive Profile Manager UI** | `crwl profiles` | Opens TUI with options: 1 List, 2 Create, 3 Delete, 4 Use-to-crawl, 5 Exit. |
+| **Create a fresh profile** | `crwl profiles` → choose **2** → name it → browser opens → log in → press **q** in terminal | Saves to `~/.crawl4ai/profiles/<name>`. |
+| **List saved profiles** | `crwl profiles` → choose **1** | Shows name, browser type, size, last-modified. |
+| **Delete a profile** | `crwl profiles` → choose **3** → pick the profile index → confirm | Removes the folder. |
+| **Crawl with a profile (default alias)** | `crwl https://site.com/dashboard -p my-profile` | Keeps login cookies, sets `use_managed_browser=true` under the hood. |
+| **Crawl + verbose JSON output** | `crwl https://site.com -p my-profile -o json -v` | Any other `crawl` flags work the same. |
+| **Crawl with extra browser tweaks** | `crwl https://site.com -p my-profile -b "headless=true,viewport_width=1680"` | CLI overrides go on top of the profile. |
+| **Same but via explicit sub-command** | `crwl crawl https://site.com -p my-profile` | Identical to default alias. |
+| **Use profile from inside Profile Manager** | `crwl profiles` → choose **4** → pick profile → enter URL → follow prompts | Handy when demo-ing to non-CLI folks. |
+| **One-off crawl with a profile folder path (no name lookup)** | `crwl https://site.com -b "user_data_dir=$HOME/.crawl4ai/profiles/my-profile,use_managed_browser=true"` | Bypasses registry, useful for CI scripts. |
+| **Launch a dev browser on CDP port with the same identity** | `crwl cdp -d $HOME/.crawl4ai/profiles/my-profile -P 9223` | Lets Puppeteer/Playwright attach for debugging. |
+
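The `-b` / `-c` options in the tables above take a comma-separated `k=v` list such as `headless=true,viewport_width=1680`. A minimal sketch of how such a string could be turned into typed keyword arguments; `parse_kv_list` is a hypothetical helper and the coercion rules are assumptions, not the CLI's actual parser.

```python
def parse_kv_list(text: str) -> dict:
    """Parse 'k=v,k2=v2' into a dict, coercing booleans, ints, and floats."""
    result = {}
    for pair in filter(None, (p.strip() for p in text.split(","))):
        key, sep, raw = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        raw = raw.strip()
        lowered = raw.lower()
        if lowered in ("true", "false"):
            value = lowered == "true"
        else:
            try:
                value = int(raw)
            except ValueError:
                try:
                    value = float(raw)
                except ValueError:
                    value = raw  # leave anything else as a plain string
        result[key.strip()] = value
    return result
```

This sketch splits on every comma, so a value that itself contains a comma would need quoting or a different separator.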
--- a/docs/examples/crypto_analysis_example.py
+++ b/docs/examples/crypto_analysis_example.py
@@ -391,12 +391,14 @@ async def main():
        # Process results
        raw_df = pd.DataFrame()
        for result in results:
-            if result.success and result.media["tables"]:
+            # Use the new tables field, falling back to media["tables"] for backward compatibility
+            tables = result.tables if hasattr(result, "tables") and result.tables else result.media.get("tables", [])
+            if result.success and tables:
                # Extract primary market table
                # DataFrame
                raw_df = pd.DataFrame(
-                    result.media["tables"][0]["rows"],
-                    columns=result.media["tables"][0]["headers"],
+                    tables[0]["rows"],
+                    columns=tables[0]["headers"],
                )
                break

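The backward-compatible lookup in this hunk can be factored into a small helper. A sketch using stand-in objects (`SimpleNamespace`) rather than real `CrawlResult` instances; `get_tables` is a hypothetical name and not part of this PR.

```python
from types import SimpleNamespace


def get_tables(result) -> list:
    """Prefer result.tables; fall back to result.media['tables'] on older versions."""
    tables = getattr(result, "tables", None)
    if tables:
        return tables
    media = getattr(result, "media", None) or {}
    return media.get("tables", [])


# Stand-ins for a new-style and an old-style result.
new_style = SimpleNamespace(tables=[{"headers": ["coin"], "rows": [["BTC"]]}], media={})
old_style = SimpleNamespace(media={"tables": [{"headers": ["coin"], "rows": [["ETH"]]}]})
```

With a helper like this, the loop body reduces to `tables = get_tables(result)` followed by the same `pd.DataFrame(tables[0]["rows"], columns=tables[0]["headers"])` call.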
--- a/docs/examples/docker/demo_docker_api.py
+++ b/docs/examples/docker/demo_docker_api.py
--- a/docs/examples/hello_world.py
+++ b/docs/examples/hello_world.py
@@ -31,7 +31,7 @@ async def example_cdp():
                   

 async def main():
-    browser_config = BrowserConfig(headless=True, verbose=True)
+    browser_config = BrowserConfig(headless=False, verbose=True)
    async with AsyncWebCrawler(config=browser_config) as crawler:
        crawler_config = CrawlerRunConfig(
            cache_mode=CacheMode.BYPASS,
--- a/docs/md_v2/assets/layout.css
+++ b/docs/md_v2/assets/layout.css
@@ -412,17 +412,41 @@ footer {
    background-color: var(--primary-dimmed-color, #09b5a5);
    color: var(--background-color, #070708);
    border: none;
-    padding: 4px 8px;
+    padding: 6px 10px;
    font-size: 0.8em;
    border-radius: 4px;
    cursor: pointer;
-    box-shadow: 0 2px 5px rgba(0, 0, 0, 0.3);
-    transition: background-color 0.2s ease;
+    box-shadow: 0 3px 8px rgba(0, 0, 0, 0.3);
+    transition: background-color 0.2s ease, transform 0.15s ease;
    white-space: nowrap;
+    display: flex;
+    align-items: center;
+    font-weight: 500;
+    animation: askAiButtonAppear 0.2s ease-out;
+}
+
+@keyframes askAiButtonAppear {
+    from {
+        opacity: 0;
+        transform: scale(0.9);
+    }
+    to {
+        opacity: 1;
+        transform: scale(1);
+    }
 }

 .ask-ai-selection-button:hover {
    background-color: var(--primary-color, #50ffff);
+    transform: scale(1.05);
+}
+
+/* Mobile styles for Ask AI button */
+@media screen and (max-width: 768px) {
+    .ask-ai-selection-button {
+        padding: 8px 12px; /* Larger touch target on mobile */
+        font-size: 0.9em; /* Slightly larger text */
+    }
 }

 /* ==== File: docs/assets/layout.css (Additions) ==== */
--- a/docs/md_v2/assets/selection_ask_ai.js
+++ b/docs/md_v2/assets/selection_ask_ai.js
@@ -8,12 +8,32 @@ document.addEventListener('DOMContentLoaded', () => {
        const button = document.createElement('button');
        button.id = 'ask-ai-selection-btn';
        button.className = 'ask-ai-selection-button';
-        button.textContent = 'Ask AI'; // Or use an icon
+        
+        // Add icon and text for better visibility
+        button.innerHTML = `
+            <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="12" height="12" fill="currentColor" style="margin-right: 4px; vertical-align: middle;">
+                <path d="M20 2H4c-1.1 0-2 .9-2 2v12c0 1.1.9 2 2 2h14l4 4V4c0-1.1-.9-2-2-2z"/>
+            </svg>
+            <span>Ask AI</span>
+        `;
+        
+        // Common styles
        button.style.display = 'none'; // Initially hidden
        button.style.position = 'absolute';
        button.style.zIndex = '1500'; // Ensure it's on top
+        button.style.boxShadow = '0 3px 8px rgba(0, 0, 0, 0.4)'; // More pronounced shadow
+        button.style.transition = 'transform 0.15s ease, background-color 0.2s ease'; // Smooth hover effect
+        
+        // Add transform on hover
+        button.addEventListener('mouseover', () => {
+            button.style.transform = 'scale(1.05)';
+        });
+        
+        button.addEventListener('mouseout', () => {
+            button.style.transform = 'scale(1)';
+        });
+        
        document.body.appendChild(button);
-
        button.addEventListener('click', handleAskAiClick);
        return button;
    }
@@ -43,11 +63,38 @@ document.addEventListener('DOMContentLoaded', () => {
        const range = selection.getRangeAt(0);
        const rect = range.getBoundingClientRect();

-        // Calculate position: top-right of the selection
+        // Get viewport dimensions
+        const viewportWidth = window.innerWidth;
+        const viewportHeight = window.innerHeight;
+        
+        // Calculate position based on selection
        const scrollX = window.scrollX;
        const scrollY = window.scrollY;
-        const buttonTop = rect.top + scrollY - askAiButton.offsetHeight - 5; // 5px above
-        const buttonLeft = rect.right + scrollX + 5; // 5px to the right
+        
+        // Default position (top-right of selection)
+        let buttonTop = rect.top + scrollY - askAiButton.offsetHeight - 5; // 5px above
+        let buttonLeft = rect.right + scrollX + 5; // 5px to the right
+        
+        // Check if we're on mobile (which we define as a viewport of 768px or narrower)
+        const isMobile = viewportWidth <= 768;
+        
+        if (isMobile) {
+            // On mobile, position centered above selection to avoid edge issues
+            buttonTop = rect.top + scrollY - askAiButton.offsetHeight - 10; // 10px above on mobile
+            buttonLeft = rect.left + scrollX + (rect.width / 2) - (askAiButton.offsetWidth / 2); // Centered
+        } else {
+            // For desktop, ensure the button doesn't go off screen
+            // Check right edge
+            if (buttonLeft + askAiButton.offsetWidth > scrollX + viewportWidth) {
+                buttonLeft = scrollX + viewportWidth - askAiButton.offsetWidth - 10; // 10px from right edge
+            }
+        }
+        
+        // Check top edge (for all devices)
+        if (buttonTop < scrollY) {
+            // If would go above viewport, position below selection instead
+            buttonTop = rect.bottom + scrollY + 5; // 5px below
+        }

        askAiButton.style.top = `${buttonTop}px`;
        askAiButton.style.left = `${buttonLeft}px`;
@@ -77,8 +124,8 @@ document.addEventListener('DOMContentLoaded', () => {

    // --- Event Listeners ---

-    // Show button on mouse up after selection
-    document.addEventListener('mouseup', (event) => {
+    // Function to handle selection events (both mouse and touch)
+    function handleSelectionEvent(event) {
        // Slight delay to ensure selection is registered
        setTimeout(() => {
            const selectedText = getSafeSelectedText();
@@ -86,7 +133,7 @@ document.addEventListener('DOMContentLoaded', () => {
                if (!askAiButton) {
                    askAiButton = createAskAiButton();
                }
-                // Don't position if the click was ON the button itself
+                // Don't position if the event was ON the button itself
                if (event.target !== askAiButton) {
                     positionButton(event);
                }
@@ -94,16 +141,46 @@ document.addEventListener('DOMContentLoaded', () => {
                hideButton();
            }
        }, 10); // Small delay
+    }
+
+    // Mouse selection events (desktop)
+    document.addEventListener('mouseup', handleSelectionEvent);
+
+    // Touch selection events (mobile)
+    document.addEventListener('touchend', handleSelectionEvent);
+    document.addEventListener('selectionchange', () => {
+        // This helps with mobile selection which can happen without mouseup/touchend
+        setTimeout(() => {
+            const selectedText = getSafeSelectedText();
+            if (selectedText && askAiButton) {
+                positionButton();
+            }
+        }, 300); // Longer delay for selection change
    });

-    // Hide button on scroll or click elsewhere
+    // Hide button on various events
    document.addEventListener('mousedown', (event) => {
        // Hide if clicking anywhere EXCEPT the button itself
        if (askAiButton && event.target !== askAiButton) {
            hideButton();
        }
    });
+    
+    document.addEventListener('touchstart', (event) => {
+        // Same for touch events, but only hide if not on the button
+        if (askAiButton && event.target !== askAiButton) {
+            hideButton();
+        }
+    });
+    
    document.addEventListener('scroll', hideButton, true); // Capture scroll events
+    
+    // Also hide when pressing Escape key
+    document.addEventListener('keydown', (event) => {
+        if (event.key === 'Escape') {
+            hideButton();
+        }
+    });

    console.log("Selection Ask AI script loaded.");
 });
--- a/docs/md_v2/blog/index.md
+++ b/docs/md_v2/blog/index.md
@@ -4,6 +4,32 @@ Welcome to the Crawl4AI blog! Here you'll find detailed release notes, technical

 ## Latest Release

+---
+
+### [Crawl4AI v0.6.0 – World-Aware Crawling, Pre-Warmed Browsers, and the MCP API](releases/0.6.0.md)
+*April 23, 2025*
+
+Crawl4AI v0.6.0 is our most powerful release yet. This update brings major architectural upgrades including world-aware crawling (set geolocation, locale, and timezone), real-time traffic capture, and a memory-efficient crawler pool with pre-warmed pages.  
+
+The Docker server now exposes a full-featured MCP socket + SSE interface, supports streaming, and comes with a new Playground UI. Plus, table extraction is now native, and the new stress-test framework supports crawling 1,000+ URLs.  
+
+Other key changes:  
+
+* Native support for `result.media["tables"]` to export DataFrames  
+* Full network + console logs and MHTML snapshot per crawl  
+* Browser pooling and pre-warming for faster cold starts  
+* New streaming endpoints via MCP API and Playground  
+* Robots.txt support, proxy rotation, and improved session handling  
+* Deprecated old markdown names, legacy modules cleaned up  
+* Massive repo cleanup: ~36K insertions, ~5K deletions across 121 files
+
+[Read full release notes →](releases/0.6.0.md)
+
+---
+

 ### [Crawl4AI v0.5.0: Deep Crawling, Scalability, and a New CLI!](releases/0.5.0.md)

--- a/docs/md_v2/blog/releases/0.6.0.md
+++ b/docs/md_v2/blog/releases/0.6.0.md
@@ -1,51 +1,143 @@
-# Crawl4AI 0.6.0
+# Crawl4AI v0.6.0 Release Notes

-*Release date: 2025‑04‑22*
-
-0.6.0 is the **biggest jump** since the 0.5 series, packing a smarter browser core, pool‑based crawlers, and a ton of DX candy. Expect faster runs, lower RAM burn, and richer diagnostics.
+We're excited to announce the release of **Crawl4AI v0.6.0**, our biggest and most feature-rich update yet. This version introduces major architectural upgrades, brand-new capabilities for geo-aware crawling, high-efficiency scraping, and real-time streaming support for scalable deployments.

 ---

-## 🚀 Key upgrades
+## Highlights

-| Area | What changed |
-|------|--------------|
-| **Browser** | New **Browser** management with pooling, page pre‑warm, geolocation + locale + timezone switches |
-| **Crawler** | Console and network log capture, MHTML snapshots, safer `get_page` API |
-| **Server & API** | **Crawler Pool Manager** endpoint, MCP socket + SSE support |
-| **Docs** | v2 layout, floating Ask‑AI helper, GitHub stats badge, copy‑code buttons, Docker API demo |
-| **Tests** | Memory + load benchmarks, 90+ new cases covering MCP and Docker |
+### 1. **World-Aware Crawlers**
+Crawl as if you’re anywhere in the world. With v0.6.0, each crawl can simulate:
+- Specific GPS coordinates
+- Browser locale
+- Timezone
+
+Example:
+```python
+from crawl4ai import CrawlerRunConfig, GeolocationConfig
+
+run_config = CrawlerRunConfig(
+    url="https://browserleaks.com/geo",
+    locale="en-US",
+    timezone_id="America/Los_Angeles",
+    geolocation=GeolocationConfig(
+        latitude=34.0522,
+        longitude=-118.2437,
+        accuracy=10.0
+    )
+)
+```
+Great for accessing region-specific content or testing global behavior.

 ---

-## ⚠️ Breaking changes
+### 2. **Native Table Extraction**
+Extract HTML tables directly into usable formats like Pandas DataFrames or CSV with zero parsing hassle. All table data is available under `result.media["tables"]`.

-1. **`get_page` signature** – returns `(html, metadata)` instead of plain html.
-2. **Docker** – new Chromium base layer, rebuild images.
+Example:
+```python
+import pandas as pd
+
+# `result` is the CrawlResult returned by an earlier crawl
+raw_df = pd.DataFrame(
+    result.media["tables"][0]["rows"],
+    columns=result.media["tables"][0]["headers"]
+)
+```
+This makes it ideal for scraping financial data, pricing pages, or anything tabular.

 ---

-## How to upgrade
+### 3. **Browser Pooling & Pre-Warming**
+We've overhauled browser management. Now, multiple browser instances can be pooled and pages pre-warmed for ultra-fast launches:
+- Reduces cold-start latency
+- Lowers memory spikes
+- Enhances parallel crawling stability

+This powers the new **Docker Playground** experience and streamlines heavy-load crawling.
+
+---
+
+### 4. **Traffic & Snapshot Capture**
+Need full visibility? You can now capture:
+- Full network traffic logs
+- Console output
+- MHTML page snapshots for post-crawl audits and debugging
+
+No more guesswork on what happened during your crawl.
+
+---
+
+### 5. **MCP API and Streaming Support**
+We’re exposing **MCP socket and SSE endpoints**, allowing:
+- Live streaming of crawl results
+- Real-time integration with agents or frontends
+- A new Playground UI for interactive crawling
+
+This is a major step towards making Crawl4AI real-time ready.
+
+---
+
+### 6. **Stress-Test Framework**
+Want to test performance under heavy load? v0.6.0 includes a new memory stress-test suite that supports 1,000+ URL workloads. Ideal for:
+- Load testing
+- Performance benchmarking
+- Validating memory efficiency
+
+---
+
+## Core Improvements
+- Robots.txt compliance
+- Proxy rotation support
+- Improved URL normalization and session reuse
+- Shared data across crawler hooks
+- New page routing logic
+
+---
+
+## Breaking Changes & Deprecations
+- Legacy `crawl4ai/browser/*` modules are removed. Update imports accordingly.
+- `AsyncPlaywrightCrawlerStrategy.get_page` now uses a new function signature.
+- Deprecated markdown generator aliases now point to `DefaultMarkdownGenerator` and emit a warning.
+
+---
+
+## Miscellaneous Updates
+- FastAPI validators replaced custom validation logic
+- Docker build now based on a Chromium layer
+- Repo-wide cleanup: ~36,000 insertions, ~5,000 deletions
+
+---
+
+## New Examples Included
+- Geo-location crawling
+- Network + console log capture
+- Docker MCP API usage
+- Markdown selector usage
+- Crypto project data extraction
+
+---
+
+## Watch the Release Video
+Want a visual walkthrough of all these updates? Watch the video:
+🔗 https://youtu.be/9x7nVcjOZks
+
+If you're new to Crawl4AI, start here:
+🔗 https://www.youtube.com/watch?v=xo3qK6Hg9AA&t=15s
+
+---
+
+## Join the Community
+We’ve just opened up our **Discord** for the public. Join us to:
+- Ask questions
+- Share your projects
+- Get help or contribute
+
+💬 https://discord.gg/wpYFACrHR4
+
+---
+
+## Install or Upgrade
 ```bash
-pip install -U crawl4ai==0.6.0
+pip install -U crawl4ai
 ```

 ---

-## Full changelog
-
-The diff between `main` and `next` spans **36 k insertions, 4.9 k deletions** over 121 files. Read the [compare view](https://github.com/unclecode/crawl4ai/compare/0.5.0.post8...0.6.0) or see `CHANGELOG.md` for the granular list.
-
---
-
-## Upgrade tips
-
-* Using the Docker API? Pull `unclecode/crawl4ai:0.6.0`, new args are documented in `/deploy/docker/README.md`.
-* Stress‑test your stack with `tests/memory/run_benchmark.py` before production rollout.
-* Markdown generators renamed but aliased, update when convenient, warnings will remind you.
-
---
-
-Happy crawling, ping `@unclecode` on X for questions or memes.
+Live long and import crawl4ai. 🖖

--- a/docs/md_v2/core/docker-deployment.md
+++ b/docs/md_v2/core/docker-deployment.md
@@ -58,7 +58,7 @@ Pull and run images directly from Docker Hub without building locally.

 #### 1. Pull the Image

-Our latest release candidate is `0.6.0rc1-r2`. Images are built with multi-arch manifests, so Docker automatically pulls the correct version for your system.
+Our latest release is `0.6.0-r2`. Images are built with multi-arch manifests, so Docker automatically pulls the correct version for your system.

 ```bash
 # Pull the release candidate (recommended for latest features)
@@ -124,9 +124,9 @@ docker stop crawl4ai && docker rm crawl4ai
 #### Docker Hub Versioning Explained

 *   **Image Name:** `unclecode/crawl4ai`
-*   **Tag Format:** `LIBRARY_VERSION[-SUFFIX]` (e.g., `0.6.0rc1-r2`)
+*   **Tag Format:** `LIBRARY_VERSION[-SUFFIX]` (e.g., `0.6.0-r2`)
    *   `LIBRARY_VERSION`: The semantic version of the core `crawl4ai` Python library
-    *   `SUFFIX`: Optional tag for release candidates (`rc1`) and revisions (`r1`)
+    *   `SUFFIX`: Optional tag for release candidates (e.g., `rc1`) and revisions (e.g., `r1`)
 *   **`latest` Tag:** Points to the most recent stable version
 *   **Multi-Architecture Support:** All images support both `linux/amd64` and `linux/arm64` architectures through a single tag

--- a/tests/profiler/test_crteate_profile.py
+++ b/tests/profiler/test_crteate_profile.py
@@ -0,0 +1,32 @@
+import asyncio
+from pathlib import Path
+
+from crawl4ai import BrowserProfiler
+
+
+if __name__ == "__main__":
+    # Example usage
+    profiler = BrowserProfiler()
+
+    # Create a new profile under ~/.crawl4ai/profiles
+    profile_dir = Path.home() / ".crawl4ai" / "profiles" / "test-profile"
+    profile_path = asyncio.run(profiler.create_profile(str(profile_dir)))
+
+    print(f"Profile created at: {profile_path}")
+
+    # # Launch a standalone browser
+    # asyncio.run(profiler.launch_standalone_browser())
+    
+    # # List profiles
+    # profiles = profiler.list_profiles()
+    # for profile in profiles:
+    #     print(f"Profile: {profile['name']}, Path: {profile['path']}")
+    
+    # # Delete a profile
+    # success = profiler.delete_profile("my-profile")
+    # if success:
+    #     print("Profile deleted successfully")
+    # else:
+    #     print("Failed to delete profile")
Author	SHA1	Message	Date
UncleCode	0e5d672763	Merge branch 'pr-971' into merge-pr971	2025-05-01 18:57:28 +08:00
wakaka6	cd2b490b40	refactor(logger): Apply the Enumeration for color	2025-05-01 17:04:44 +08:00
UncleCode	50f0b83fcd	feat(linkedin): add prospect-wizard app with scraping and visualization Add new LinkedIn prospect discovery tool with three main components: - c4ai_discover.py for company and people scraping - c4ai_insights.py for org chart and decision maker analysis - Interactive graph visualization with company/people exploration Features include: - Configurable LinkedIn search and scraping - Org chart generation with decision maker scoring - Interactive network graph visualization - Company similarity analysis - Chat interface for data exploration Requires: crawl4ai, openai, sentence-transformers, networkx	2025-04-30 19:38:25 +08:00
UncleCode	9499164d3c	feat(browser): improve browser profile management and cleanup Enhance browser profile handling with better process cleanup and documentation: - Add process cleanup for existing Chromium instances on Windows/Unix - Fix profile creation by passing complete browser config - Add comprehensive documentation for browser and CLI components - Add initial profile creation test - Bump version to 0.6.3 This change improves reliability when managing browser profiles and provides better documentation for developers.	2025-04-29 23:04:32 +08:00
UncleCode	2140d9aca4	fix(browser): correct headless mode default behavior Modify BrowserConfig to respect explicit headless parameter setting instead of forcing True. Update version to 0.6.2 and clean up code formatting in examples. BREAKING CHANGE: BrowserConfig no longer defaults to headless=True when explicitly set to False	2025-04-26 21:09:50 +08:00
UncleCode	ccec40ed17	feat(models): add dedicated tables field to CrawlResult - Add tables field to CrawlResult model while maintaining backward compatibility - Update async_webcrawler.py to extract tables from media and pass to tables field - Update crypto_analysis_example.py to use the new tables field - Add /config/dump examples to demo_docker_api.py - Bump version to 0.6.1	2025-04-24 18:36:25 +08:00
UncleCode	ad4dfb21e1	Remoce "rc1"	2025-04-23 21:00:00 +08:00
UncleCode	7784b2468e	feat(docs): enhance Ask AI button UX and add v0.6.0 release notes Improve Ask AI button with better mobile support, animations, and positioning: - Add button animations and hover effects - Improve mobile responsiveness - Add icon to button - Fix positioning logic for different viewport sizes - Add keyboard (Escape) support Add comprehensive v0.6.0 release documentation: - Create detailed release notes - Update blog index with latest release - Document all major features and breaking changes BREAKING CHANGE: Documentation structure updated with new v0.6.0 section	2025-04-23 20:07:03 +08:00
wakaka6	b2f3cb0dfa	WIP: logger migriate to rich	2025-04-11 00:44:43 +08:00