Compare commits

..

6 Commits

Author SHA1 Message Date
unclecode
9ffa34b697 Update README 2024-10-14 22:58:27 +08:00
unclecode
740802c491 Merge branch '0.3.6' 2024-10-14 22:55:24 +08:00
unclecode
b9ac96c332 Merge branch 'main' of https://github.com/unclecode/crawl4ai 2024-10-14 22:54:23 +08:00
unclecode
d06535388a Update gitignore 2024-10-14 22:53:56 +08:00
UncleCode
ccbe72cfc1 Merge pull request #135 from hitesh22rana/fix/docs-example
docs: fixed css_selector for example
2024-10-13 14:39:07 +08:00
hitesh22rana
768b93140f docs: fixed css_selector for example 2024-10-05 00:25:41 +09:00
2 changed files with 10 additions and 4 deletions

4
.gitignore vendored
View File

@@ -203,6 +203,4 @@ git_changes.py
git_changes.md
pypi_build.sh
.tests/
git_changes.py
git_changes.md
.tests/

View File

@@ -10,6 +10,14 @@ Crawl4AI simplifies asynchronous web crawling and data extraction, making it acc
> Looking for the synchronous version? Check out [README.sync.md](./README.sync.md). You can also access the previous version in the branch [V0.2.76](https://github.com/unclecode/crawl4ai/blob/v0.2.76).
## New update 0.3.6
- 🌐 Multi-browser support (Chromium, Firefox, WebKit)
- 🖼️ Improved image processing with lazy-loading detection
- 🔧 Custom page timeout parameter for better control over crawling behavior
- 🕰️ Enhanced handling of delayed content loading
- 🔑 Custom headers support for LLM interactions
- 🖼️ iframe content extraction for comprehensive page analysis
- ⏱️ Flexible timeout and delayed content retrieval options
## Try it Now!
@@ -124,7 +132,7 @@ async def main():
result = await crawler.arun(
url="https://www.nbcnews.com/business",
js_code=js_code,
css_selector="article.tease-card",
css_selector=".wide-tease-item__description",
bypass_cache=True
)
print(result.extracted_content)