From a3d41c795132a8858535e1ce60406e2f36bdd40f Mon Sep 17 00:00:00 2001
From: ntohidi
Date: Tue, 8 Jul 2025 12:24:33 +0200
Subject: [PATCH] fix: Clarify description of 'use_stemming' parameter in markdown generation documentation ref #1086

---
 docs/md_v2/core/markdown-generation.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/md_v2/core/markdown-generation.md b/docs/md_v2/core/markdown-generation.md
index 1b95b965..af9b35b5 100644
--- a/docs/md_v2/core/markdown-generation.md
+++ b/docs/md_v2/core/markdown-generation.md
@@ -200,7 +200,7 @@ config = CrawlerRunConfig(markdown_generator=md_generator)
 
 - **`user_query`**: The term you want to focus on. BM25 tries to keep only content blocks relevant to that query.
 - **`bm25_threshold`**: Raise it to keep fewer blocks; lower it to keep more.
-- **`use_stemming`** *(default `True`)*: If enabled, variations of words match (e.g., “learn,” “learning,” “learnt”).
+- **`use_stemming`** *(default `True`)*: Whether to apply stemming to the query and content.
 - **`language (str)`**: Language for stemming (default: 'english').
 
 **No query provided?** BM25 tries to glean a context from page metadata, or you can simply treat it as a scorched-earth approach that discards text with low generic score. Realistically, you want to supply a query for best results.