Fix temperature typo and enhance LinkedIn extraction with Colab support

- Fixed widespread typo: `temprature` → `temperature` across LLMConfig and related files
- Enhanced CSS/XPath selector guidance for more reliable LinkedIn data extraction
- Added Google Colab display server support for running Crawl4AI in notebook environments
- Improved browser debugging with verbose startup args logging
- Updated LinkedIn schemas and HTML snippets for better parsing accuracy

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
UncleCode
2025-05-25 16:47:12 +08:00
parent 9c2cc7f73c
commit 1fc45ffac8
15 changed files with 355 additions and 136 deletions


@@ -107,7 +107,14 @@ _COMPANY_SCHEMA_QUERY = dedent(
IMPORTANT: Do not use the base64 kind of classes to target element. It's not reliable.
The main div parent contains these li element is "div.search-results-container" you can use this.
-The <ul> parent has "role" equal to "list". Using these two should be enough to target the <li> elements."
+The <ul> parent has "role" equal to "list". Using these two should be enough to target the <li> elements.
+IMPORTANT: Remember there might be multiple <a> tags that start with https://www.linkedin.com/company/[NAME],
+so in case you refer to them for different fields, make sure to be more specific. One has the image, and one
+has the person's name.
+IMPORTANT: Be very smart in selecting the correct and unique way to address the element. You should ensure
+your selector points to a single element and is unique to the place that contains the information.
"""
)
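
The selector guidance added in this hunk (several `<a>` tags share the same company URL prefix, so each field needs a selector that matches exactly one element) can be illustrated with a small sketch. The HTML below is an invented, simplified stand-in rather than LinkedIn's real markup, and the standard-library `xml.etree.ElementTree` is used here only for illustration; it is not necessarily the selector engine the project relies on.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: two links in one <li> share the same href.
SNIPPET = """
<div class="search-results-container">
  <ul role="list">
    <li>
      <a href="https://www.linkedin.com/company/example/"><img src="logo.png" alt="Example logo"/></a>
      <a href="https://www.linkedin.com/company/example/"><span>Example Inc.</span></a>
    </li>
  </ul>
</div>
""".strip()

root = ET.fromstring(SNIPPET)  # root is the container <div>
item = root.find("./ul[@role='list']/li")

# Ambiguous: matching on the tag alone returns both links.
all_links = item.findall("./a")

# More specific: distinguish the links by what they contain.
logo_links = item.findall("./a[img]")   # the link wrapping the image
name_links = item.findall("./a[span]")  # the link wrapping the text label

print(len(all_links), len(logo_links), len(name_links))
print(name_links[0].find("span").text)
```

Checking that a selector yields exactly one match per list item, as done here with `len(...)`, is one simple way to confirm it is specific enough before using it in a schema.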
@@ -421,7 +428,7 @@ def main():
cli_opts = parser.parse_args()
# decide on debug defaults
-if cli_opts.debug or True:
+if cli_opts.debug:
opts = detect_debug_defaults(force=True)
cli_opts = opts
else: