Add token usage tracking to generate_schema / agenerate_schema
generate_schema can make up to 5 internal LLM calls (field inference, schema generation, validation retries) with no way to track token consumption. Add an optional `usage: TokenUsage = None` parameter that accumulates prompt/completion/total tokens across all calls in-place.

- _infer_target_json: accept and populate usage accumulator
- agenerate_schema: track usage after every aperform_completion call in the retry loop, forward usage to _infer_target_json
- generate_schema (sync): forward usage to agenerate_schema

Fully backward-compatible: omitting usage changes nothing.
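The accumulation pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the `TokenUsage` dataclass here is a local stand-in with assumed field names, and `fake_completion`, `_add_usage`, `agenerate_schema_sketch`, and `generate_schema_sketch` are hypothetical helpers that only mimic the shape of the real calls.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenUsage:
    """Stand-in accumulator; the field names are assumptions for this sketch."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0


async def fake_completion(prompt: str) -> dict:
    """Hypothetical placeholder for one LLM call; returns made-up usage counts."""
    prompt_tokens = len(prompt.split())
    completion_tokens = 3
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }


def _add_usage(usage: Optional[TokenUsage], response: dict) -> None:
    """Add one call's counts to the caller's accumulator in place (no-op if None)."""
    if usage is None:
        return
    usage.prompt_tokens += response["prompt_tokens"]
    usage.completion_tokens += response["completion_tokens"]
    usage.total_tokens += response["total_tokens"]


async def agenerate_schema_sketch(
    html: str, usage: Optional[TokenUsage] = None, max_calls: int = 3
) -> dict:
    """Simulate a retry loop, recording usage after every completion call."""
    for attempt in range(max_calls):
        response = await fake_completion(f"attempt {attempt}: {html}")
        _add_usage(usage, response)
    return {"fields": []}


def generate_schema_sketch(html: str, usage: Optional[TokenUsage] = None) -> dict:
    """Sync wrapper that forwards the same accumulator to the async version."""
    return asyncio.run(agenerate_schema_sketch(html, usage=usage))


if __name__ == "__main__":
    usage = TokenUsage()
    generate_schema_sketch("<div>x</div>", usage=usage)
    print(usage)
```

Because the caller owns the accumulator object, the same instance can be passed to several calls to keep a running total, and callers that omit `usage` take the unchanged code path.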
@@ -204,7 +204,9 @@ llm_strategy.show_usage()
 # e.g. “Total usage: 1241 tokens across 2 chunk calls”
 ```

-If your model provider doesn’t return usage info, these fields might be partial or empty.
+If your model provider doesn't return usage info, these fields might be partial or empty.
+
+> **Tip:** `JsonCssExtractionStrategy.generate_schema()` also supports token usage tracking via an optional `usage` parameter. See [Token Usage Tracking in Schema Generation](./no-llm-strategies.md#token-usage-tracking) for details.

 ---
