refactor(browser): improve browser strategy architecture and lifecycle management

Major refactoring of browser strategy implementations to improve code organization and reliability:
- Move CrawlResultContainer and RunManyReturn types from async_webcrawler to models.py
- Simplify browser lifecycle management in AsyncWebCrawler
- Standardize browser strategy interface with _generate_page method
- Improve headless mode handling and browser args construction
- Clean up Docker and Playwright strategy implementations
- Fix session management and context handling across strategies

BREAKING CHANGE: Browser strategy interface has changed with new _generate_page method requirement
This commit is contained in:
UncleCode
2025-03-30 20:58:39 +08:00
parent 3ff7eec8f3
commit bb02398086
11 changed files with 271 additions and 459 deletions

View File

@@ -530,7 +530,7 @@ async def test_docker_registry_reuse():
logger.info("First browser started successfully", tag="TEST")
# Get container ID from the strategy
docker_strategy1 = manager1._strategy
docker_strategy1 = manager1.strategy
container_id1 = docker_strategy1.container_id
logger.info(f"First browser container ID: {container_id1[:12]}", tag="TEST")
@@ -560,7 +560,7 @@ async def test_docker_registry_reuse():
logger.info("Second browser started successfully", tag="TEST")
# Get container ID from the second strategy
docker_strategy2 = manager2._strategy
docker_strategy2 = manager2.strategy
container_id2 = docker_strategy2.container_id
logger.info(f"Second browser container ID: {container_id2[:12]}", tag="TEST")