refactor(core): reorganize project structure and remove legacy code
Major reorganization of the project structure:
- Moved legacy synchronous crawler code to legacy folder
- Removed deprecated CLI and docs manager
- Consolidated version manager into utils.py
- Added CrawlerHub to __init__.py exports
- Fixed type hints in async_webcrawler.py
- Fixed minor bugs in chunking and crawler strategies

BREAKING CHANGE: Removed synchronous WebCrawler, CLI, and docs management functionality. Users should migrate to AsyncWebCrawler.
@@ -4,7 +4,6 @@ from collections import Counter
 import string
-from .model_loader import load_nltk_punkt
 
 
 # Define the abstract base class for chunking strategies
 class ChunkingStrategy(ABC):
     """
@@ -72,6 +71,7 @@ class NlpSentenceChunking(ChunkingStrategy):
         """
         Initialize the NlpSentenceChunking object.
         """
+        from crawl4ai.le.legacy.model_loader import load_nltk_punkt
         load_nltk_punkt()
 
     def chunk(self, text: str) -> list:
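The second hunk appears to move the `load_nltk_punkt` import from module level into the constructor, so the dependency is only loaded when a sentence chunker is actually created. A minimal sketch of that deferred-import pattern follows. It is not crawl4ai code: the class name `LazySentenceChunker` is hypothetical, and the standard-library `re` module stands in for the NLTK-based loader as an assumption to keep the example self-contained.

```python
class LazySentenceChunker:
    """Hypothetical stand-in for a sentence chunker; not the crawl4ai class."""

    def __init__(self):
        # Deferred import: resolved when an instance is created, not when
        # this module is imported. `re` is a stand-in (assumption) for a
        # heavier optional dependency such as an NLTK loader.
        import re

        # Split on whitespace that follows sentence-ending punctuation.
        self._splitter = re.compile(r"(?<=[.!?])\s+")

    def chunk(self, text: str) -> list:
        # Return the non-empty sentences found in the text.
        return [s for s in self._splitter.split(text.strip()) if s]


if __name__ == "__main__":
    chunker = LazySentenceChunker()
    print(chunker.chunk("One. Two! Three?"))
```

The trade-off of this design is that a missing or misnamed dependency surfaces as an `ImportError` at construction time instead of at package import time, so code paths that never build the chunker are unaffected.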