Source: docs/manual/models-cost.md
This page is generated by site/scripts/sync-manual-docs.mjs.
Models, Pricing, and Cost
Cruvero tracks LLM usage and cost at the per-step and per-run levels. It supports multiple LLM providers with automatic failover, per-namespace model preferences, and cost-aware execution that can downgrade to a cheaper model when a run approaches its quota limits.
Source: internal/llm/*, internal/agent/cost.go
Provider
- Providers: openrouter (default), azure, openai, and google.
- Model IDs are provider-scoped:
  - OpenRouter: x-ai/grok-4.1-fast
  - Azure: azure/<deployment>
  - OpenAI: openai/<model>
  - Google Gemini: google/<model>
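The provider scoping above can be sketched as a small lookup. This is an illustrative Python sketch, not Cruvero's actual implementation: the helper name `provider_for` is hypothetical, and the rule that any ID without an azure/, openai/, or google/ prefix belongs to the default openrouter provider is an assumption inferred from the example IDs in the list.

```python
# Hypothetical helper: infer the provider from a provider-scoped model ID.
# Assumption: an ID without a known provider prefix belongs to the default
# provider (openrouter), as the example "x-ai/grok-4.1-fast" suggests.
KNOWN_PREFIXES = ("azure", "openai", "google")
DEFAULT_PROVIDER = "openrouter"


def provider_for(model_id: str) -> str:
    """Return the provider name implied by a model ID."""
    prefix, sep, rest = model_id.partition("/")
    if sep and rest and prefix in KNOWN_PREFIXES:
        return prefix
    return DEFAULT_PROVIDER
```

Under these assumptions, `provider_for("azure/deploy-gpt4o-prod")` yields `azure`, while `provider_for("x-ai/grok-4.1-fast")` falls through to `openrouter`.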
Model Catalog
- Stored in model_catalog (Postgres).
- Ingested via cmd/models-refresh.
- Queried via cmd/models-list.

For Azure, cmd/models-refresh --source azure uses:
- CRUVERO_AZURE_PRICING_JSON (optional, per deployment)
- CRUVERO_AZURE_CONTEXT_JSON (optional, per deployment)
Azure Override Format
Azure deployment listing does not include price/context metadata, so provide overrides with JSON maps.
CRUVERO_AZURE_PRICING_JSON accepts deployment keys (with or without azure/ prefix):
{
  "deploy-gpt4o-prod": { "prompt": "0.000005", "completion": "0.000015", "request": "0" },
  "azure/deploy-gpt4o-mini": { "prompt": "0.00000015", "completion": "0.00000060", "request": "0" },
  "gpt-4o": { "prompt": "0.000005", "completion": "0.000015", "request": "0" }
}
- prompt and completion are per-token prices (same format used across model catalog pricing fields).
- request is an optional fixed per-request surcharge.
- Deployment-specific keys win over model-name fallback keys when both are present.
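The lookup rules above (keys accepted with or without the azure/ prefix, deployment keys taking precedence over model-name keys) can be sketched as follows. This is a minimal Python illustration, not Cruvero's own code: `strip_prefix` and `resolve_pricing` are hypothetical names, and treating a missing request field as 0 is an assumption based on request being described as optional.

```python
import json
from decimal import Decimal
from typing import Optional

# Same shape as the CRUVERO_AZURE_PRICING_JSON example above.
PRICING_JSON = """
{
  "deploy-gpt4o-prod": {"prompt": "0.000005", "completion": "0.000015", "request": "0"},
  "azure/deploy-gpt4o-mini": {"prompt": "0.00000015", "completion": "0.00000060", "request": "0"},
  "gpt-4o": {"prompt": "0.000005", "completion": "0.000015", "request": "0"}
}
"""


def strip_prefix(key: str) -> str:
    """Drop a leading "azure/" so both key spellings compare equal."""
    return key[len("azure/"):] if key.startswith("azure/") else key


def resolve_pricing(overrides: dict, deployment: str, model: str) -> Optional[dict]:
    """Look up pricing by deployment first, then fall back to the model name."""
    table = {strip_prefix(k): v for k, v in overrides.items()}
    entry = table.get(strip_prefix(deployment))
    if entry is None:
        entry = table.get(strip_prefix(model))  # model-name fallback key
    if entry is None:
        return None
    return {
        "prompt": Decimal(entry["prompt"]),
        "completion": Decimal(entry["completion"]),
        # Assumption: an omitted "request" surcharge means zero.
        "request": Decimal(entry.get("request", "0")),
    }
```

Prices are parsed with `Decimal` rather than `float` because the override values are decimal strings, and very small per-token prices lose precision as binary floats.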
CRUVERO_AZURE_CONTEXT_JSON maps deployment/model keys to context length:
{
  "deploy-gpt4o-prod": 128000,
  "azure/deploy-gpt4o-mini": 128000,
  "gpt-4o": 128000
}
Cost Accounting
- Each decision stores estimated and actual cost.
- If the provider returns a usage cost, it supersedes the estimate.
- Per-tool cost hints can be set in tool registry definitions.
- For Azure, when provider usage cost is 0, Cruvero computes cost as:
  - request + (prompt_tokens * prompt_price) + (completion_tokens * completion_price)
  - using CRUVERO_AZURE_PRICING_JSON deployment overrides.
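As a worked example of the Azure fallback formula above, here is a minimal Python sketch. The function name `azure_cost` and the shape of the `pricing` dictionary are assumptions made for illustration. Only the formula, and the rule that a provider-reported cost takes precedence, come from this page.

```python
from decimal import Decimal


def azure_cost(provider_cost: Decimal, prompt_tokens: int,
               completion_tokens: int, pricing: dict) -> Decimal:
    """Return the provider-reported cost, or the override-based fallback."""
    if provider_cost != 0:
        # A cost returned by the provider supersedes any computed value.
        return provider_cost
    # request + (prompt_tokens * prompt_price) + (completion_tokens * completion_price)
    return (pricing["request"]
            + prompt_tokens * pricing["prompt"]
            + completion_tokens * pricing["completion"])


# Prices taken from the "deploy-gpt4o-prod" override example.
pricing = {
    "prompt": Decimal("0.000005"),
    "completion": Decimal("0.000015"),
    "request": Decimal("0"),
}
```

With 1000 prompt tokens and 500 completion tokens, the fallback evaluates to 0 + 1000 × 0.000005 + 500 × 0.000015 = 0.0125.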
Cost Policy
- MaxCostPerRun, MaxCostPerStep
- PreferCheaper with CheapModel
- ToolCostDefault
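This page names the policy fields but does not spell out their semantics, so the following Python sketch is only one plausible reading, not Cruvero's actual behavior. It assumes that MaxCostPerRun and MaxCostPerStep are hard ceilings, that PreferCheaper switches to CheapModel when the preferred model's estimated step cost would not fit, and that ToolCostDefault applies to tools without a cost hint. The names `CostPolicy`, `choose_model`, and `tool_cost` are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class CostPolicy:
    # Field meanings are assumptions; see the lead-in.
    max_cost_per_run: Decimal
    max_cost_per_step: Decimal
    prefer_cheaper: bool
    cheap_model: str
    tool_cost_default: Decimal


def choose_model(policy: CostPolicy, preferred_model: str,
                 spent_so_far: Decimal, estimated_step_cost: Decimal) -> Optional[str]:
    """Pick the model for the next step, or None if the run budget is spent."""
    remaining = policy.max_cost_per_run - spent_so_far
    if remaining <= 0:
        return None
    fits = (estimated_step_cost <= policy.max_cost_per_step
            and estimated_step_cost <= remaining)
    if fits:
        return preferred_model
    if policy.prefer_cheaper and policy.cheap_model:
        return policy.cheap_model  # assumed downgrade path
    return None


def tool_cost(hint: Optional[Decimal], policy: CostPolicy) -> Decimal:
    """Use the tool's own cost hint when present, else the policy default."""
    return hint if hint is not None else policy.tool_cost_default
```

A real policy engine would also need to re-estimate the step cost for the cheaper model before committing to it. The sketch omits that step to keep the decision order visible.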
Preferred Models
- Stored in kv_store under model_prefs:<namespace>.
- Managed via cmd/model-prefs or tool model_prefs.
Speculative Execution
- Run multiple models and select by lowest cost or highest similarity.
- cmd/run --spec-models ... --spec-select ...
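The selection step can be illustrated with a short Python sketch. It is not Cruvero's implementation: the `Candidate` type, the strategy names `lowest_cost` and `highest_similarity`, and the idea of a single numeric similarity score per candidate are all assumptions. In particular, the strategy names are not the values accepted by --spec-select, which this page does not list.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List


@dataclass
class Candidate:
    model: str
    cost: Decimal
    similarity: float  # hypothetical score; higher means closer agreement


def select(candidates: List[Candidate], strategy: str) -> Candidate:
    """Pick one speculative result by the given (hypothetical) strategy name."""
    if not candidates:
        raise ValueError("no candidates to select from")
    if strategy == "lowest_cost":
        return min(candidates, key=lambda c: c.cost)
    if strategy == "highest_similarity":
        return max(candidates, key=lambda c: c.similarity)
    raise ValueError(f"unknown strategy: {strategy}")
```

Selecting by cost favors the cheapest acceptable answer, while selecting by similarity favors the answer that agrees most with the others, at the price of paying for every model that ran.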