Add token usage tracking to generate_schema / agenerate_schema

generate_schema can make up to 5 internal LLM calls (field inference,
schema generation, validation retries) with no way to track token
consumption. Add an optional `usage: TokenUsage = None` parameter that
accumulates prompt/completion/total tokens across all calls in-place.

- _infer_target_json: accept and populate usage accumulator
- agenerate_schema: track usage after every aperform_completion call
  in the retry loop, forward usage to _infer_target_json
- generate_schema (sync): forward usage to agenerate_schema

Fully backward-compatible — omitting usage changes nothing.
This commit is contained in:
unclecode
2026-02-18 06:44:17 +00:00
parent 8576331d4e
commit c9cb0160cf
5 changed files with 726 additions and 3 deletions

View File

@@ -204,7 +204,9 @@ llm_strategy.show_usage()
# e.g. “Total usage: 1241 tokens across 2 chunk calls”
```
If your model provider doesnt return usage info, these fields might be partial or empty.
If your model provider doesn't return usage info, these fields might be partial or empty.
> **Tip:** `JsonCssExtractionStrategy.generate_schema()` also supports token usage tracking via an optional `usage` parameter. See [Token Usage Tracking in Schema Generation](./no-llm-strategies.md#token-usage-tracking) for details.
---