fix(docs): standardize C4A-Script tutorial, add CLI identity-based crawling, and add sponsorship CTA

- Switch installs to pip install -r requirements.txt (tutorial and app docs) - Update local run steps to python server.py and http://localhost:8000 - Set default PORT to 8000; update port-in-use commands and alt port 8001 - Replace unsupported :contains() example with accessible attribute selector - Update example URLs in tutorial servers to 127.0.0.1:8000 - Add “Identity-based crawling” section with crwl profiles CLI workflow and code usage - Replace legacy-docs note with sponsorship message in docs/md_v2/index.md - Minor copy and consistency fixes across pages
2025-10-03 22:00:46 +05:30
parent 80aa6c11d9
commit 7dfe528d43
7 changed files with 55 additions and 19 deletions
--- a/docs/md_v2/advanced/identity-based-crawling.md
+++ b/docs/md_v2/advanced/identity-based-crawling.md
@@ -82,6 +82,42 @@ If you installed Crawl4AI (which installs Playwright under the hood), you alread

 ---

+### Creating a Profile Using the Crawl4AI CLI (Easiest)
+
+If you prefer a guided, interactive setup, use the built-in CLI to create and manage persistent browser profiles.
+
+1.⠀Launch the profile manager:
+   ```bash
+   crwl profiles
+   ```
+
+2.⠀Choose "Create new profile" and enter a profile name. A Chromium window opens so you can log in to sites and configure settings. When finished, return to the terminal and press `q` to save the profile.
+
+3.⠀Profiles are saved under `~/.crawl4ai/profiles/<profile_name>` (for example: `/home/<you>/.crawl4ai/profiles/test_profile_1`) along with a `storage_state.json` for cookies and session data.
+
+4.⠀Optionally, choose "List profiles" in the CLI to view available profiles and their paths.
+
+5.⠀Use the saved path with `BrowserConfig.user_data_dir`:
+   ```python
+   from crawl4ai import AsyncWebCrawler, BrowserConfig
+
+   profile_path = "/home/<you>/.crawl4ai/profiles/test_profile_1"
+
+   browser_config = BrowserConfig(
+       headless=True,
+       use_managed_browser=True,
+       user_data_dir=profile_path,
+       browser_type="chromium",
+   )
+
+   async with AsyncWebCrawler(config=browser_config) as crawler:
+       result = await crawler.arun(url="https://example.com/private")
+   ```
+
+The CLI also supports listing and deleting profiles, and even testing a crawl directly from the menu.
+
+---
+
 ## 3. Using Managed Browsers in Crawl4AI

 Once you have a data directory with your session data, pass it to **`BrowserConfig`**: