fix: Clarify description of 'use_stemming' parameter in markdown generation documentation ref #1086

ntohidi
2025-07-08 12:24:33 +02:00
parent fee4c5c783
commit a3d41c7951

@@ -200,7 +200,7 @@ config = CrawlerRunConfig(markdown_generator=md_generator)
- **`user_query`**: The term you want to focus on. BM25 tries to keep only content blocks relevant to that query.
- **`bm25_threshold`**: Raise it to keep fewer blocks; lower it to keep more.
-- **`use_stemming`** *(default `True`)*: If enabled, variations of words match (e.g., “learn,” “learning,” “learnt”).
+- **`use_stemming`** *(default `True`)*: Whether to apply stemming to the query and content.
- **`language (str)`**: Language for stemming (default: 'english').
**No query provided?** BM25 tries to glean context from page metadata. Alternatively, you can treat it as a scorched-earth approach that discards text with a low generic score. Realistically, you want to supply a query for best results.
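To illustrate what applying stemming to both the query and the content means in practice, here is a minimal, self-contained sketch. It is not the library's implementation: `toy_stem`, `tokenize`, and `matches` are hypothetical helpers, and the suffix rules are a deliberately crude stand-in for a real English stemmer.

```python
import re

# Toy suffix rules for illustration only; a real stemmer is far more careful.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Strip one known suffix, keeping at least three characters of the word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text: str, use_stemming: bool = True) -> list[str]:
    """Lowercase, split into alphabetic tokens, and optionally stem each one."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [toy_stem(t) for t in tokens] if use_stemming else tokens


def matches(query: str, block: str, use_stemming: bool = True) -> bool:
    """Return True if the query and the block share at least one token."""
    query_terms = set(tokenize(query, use_stemming))
    block_terms = set(tokenize(block, use_stemming))
    return bool(query_terms & block_terms)


block = "She is learning Python and has learned a lot."
print(matches("learn", block, use_stemming=True))   # True
print(matches("learn", block, use_stemming=False))  # False
```

With stemming on, "learning" and "learned" both reduce to "learn", so the block overlaps with the query. With stemming off, only exact token matches count, so the same block shares no term with the query.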