Compare commits
6 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 9ffa34b697 | | |
| | 740802c491 | | |
| | b9ac96c332 | | |
| | d06535388a | | |
| | ccbe72cfc1 | | |
| | 768b93140f | | |
4
.gitignore
vendored
@@ -203,6 +203,4 @@ git_changes.py
 git_changes.md
 pypi_build.sh
 
 .tests/
-git_changes.py
-git_changes.md
10
README.md
@@ -10,6 +10,14 @@ Crawl4AI simplifies asynchronous web crawling and data extraction, making it acc
 
 > Looking for the synchronous version? Check out [README.sync.md](./README.sync.md). You can also access the previous version in the branch [V0.2.76](https://github.com/unclecode/crawl4ai/blob/v0.2.76).
 
+## New update 0.3.6
+- 🌐 Multi-browser support (Chromium, Firefox, WebKit)
+- 🖼️ Improved image processing with lazy-loading detection
+- 🔧 Custom page timeout parameter for better control over crawling behavior
+- 🕰️ Enhanced handling of delayed content loading
+- 🔑 Custom headers support for LLM interactions
+- 🖼️ iframe content extraction for comprehensive page analysis
+- ⏱️ Flexible timeout and delayed content retrieval options
 
 ## Try it Now!
 
@@ -124,7 +132,7 @@ async def main():
         result = await crawler.arun(
             url="https://www.nbcnews.com/business",
             js_code=js_code,
-            css_selector="article.tease-card",
+            css_selector=".wide-tease-item__description",
             bypass_cache=True
         )
         print(result.extracted_content)
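
The last README hunk narrows `css_selector` from the whole card (`article.tease-card`) to the description element (`.wide-tease-item__description`). As a rough illustration of what that narrowing means, here is a minimal standard-library sketch that collects the text inside elements carrying a given class. It does not use Crawl4AI and is not how the library resolves selectors; `ClassTextExtractor`, `extract_by_class`, and the sample HTML are hypothetical, and the matching is by class name only (the `article` tag part of the old selector is ignored).

```python
from html.parser import HTMLParser

# Elements that never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text inside every element that carries a given class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0      # > 0 while inside a matching element
        self.chunks = []    # text pieces of the match being read
        self.matches = []   # whitespace-normalized text of finished matches

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        # Start tracking on a match, or go one level deeper inside a match.
        if self.depth or self.class_name in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.matches.append(" ".join("".join(self.chunks).split()))
            self.chunks = []

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_by_class(html, class_name):
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    parser.close()
    return parser.matches


# Hypothetical markup, loosely shaped like a news card.
SAMPLE_HTML = """
<article class="tease-card">
  <h2 class="tease-card__headline">Markets rally</h2>
  <div class="wide-tease-item__description">Stocks rose on Tuesday.</div>
</article>
"""

card_text = extract_by_class(SAMPLE_HTML, "tease-card")
description_text = extract_by_class(SAMPLE_HTML, "wide-tease-item__description")
print(card_text)
print(description_text)
```

With the card-level class the headline and description come back together; with the description class only the summary text is returned, which is the effect the selector change in the hunk is after.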