Model Rotation Runbook
When to use: A new LLM needs to be deployed, an existing model is deprecated, cost optimization requires switching providers, or a quality regression is reported on the current model.
Prerequisites:
- Access to LLM provider dashboards for the target model
- Ability to update CRUVERO_LLM_* environment variables on workers
- Diff-test infrastructure available (cmd/diff-test)
Trigger Conditions
Run model rotation when:
- a new model becomes available,
- cost optimization is required,
- an existing model is deprecated or unstable,
- a quality regression is reported on the current model.
Step-by-Step Actions
- Define candidate and rollback model IDs.
- Update test configuration:
  - CRUVERO_OPENROUTER_MODEL=<candidate> for OpenRouter, or
  - tenant/provider overrides via model preferences.
- Run differential validation before promotion:
go run ./cmd/diff-test --prompt "<representative prompt>" --model-a <current> --model-b <candidate> --registry-id default --registry-version latest
- Compare:
  - answer quality,
  - tool-call behavior,
  - latency,
  - estimated cost.
- Promote candidate in staged rollout:
  - dev/staging first,
  - then limited production tenants,
  - then global/default rollout.
- Monitor error, latency, and cost metrics for at least one full traffic cycle.
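The staged promotion in the steps above can be sketched as a small model resolver. This is a minimal illustration, not the project's actual mechanism: the Rollout type, the stage names, the "dev"/"staging"/"prod" environment labels, and the placeholder model IDs are all assumptions made for this example.

```go
package main

import "fmt"

// Stage is a rollout stage, ordered from narrowest to widest exposure.
type Stage int

const (
	StageDev     Stage = iota // dev/staging only
	StageLimited              // limited production tenants
	StageGlobal               // global/default rollout
)

// Rollout describes a hypothetical rotation in progress.
type Rollout struct {
	Current        string          // rollback model ID
	Candidate      string          // model being promoted
	Stage          Stage           // how far the candidate has been promoted
	LimitedTenants map[string]bool // tenants opted in during StageLimited
}

// ModelFor returns the model a request should use.
// env is an assumed label: "dev", "staging", or "prod".
func (r Rollout) ModelFor(env, tenant string) string {
	if env != "prod" {
		return r.Candidate // non-production always exercises the candidate
	}
	switch r.Stage {
	case StageLimited:
		if r.LimitedTenants[tenant] {
			return r.Candidate
		}
	case StageGlobal:
		return r.Candidate
	}
	return r.Current // production default until the stage widens
}

func main() {
	r := Rollout{
		Current:        "model-current",
		Candidate:      "model-candidate",
		Stage:          StageLimited,
		LimitedTenants: map[string]bool{"tenant-a": true},
	}
	fmt.Println(r.ModelFor("staging", "tenant-b"))
	fmt.Println(r.ModelFor("prod", "tenant-a"))
	fmt.Println(r.ModelFor("prod", "tenant-b"))
}
```

Keeping the rollback model ID in the same structure as the candidate means that reverting is a matter of lowering the stage rather than re-deriving the previous configuration.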
Verification
- No material regression in acceptance prompts.
- Tool-call validity and retry rates remain acceptable.
- Per-run cost and latency stay within budget.
- Health checks for the active provider keep passing.
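The verification criteria above could be encoded as an explicit gate. The sketch below assumes that metrics for the current and candidate models have already been collected by some other means. The Metrics and Budget types, their field names, and every threshold value are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// Metrics is a hypothetical summary of one model over the acceptance prompts.
type Metrics struct {
	AcceptancePassRate float64 // fraction of acceptance prompts passing, 0..1
	ToolCallValidRate  float64 // fraction of tool calls that validated, 0..1
	RetryRate          float64 // fraction of runs that needed a retry, 0..1
	P95LatencyMs       float64
	CostPerRunUSD      float64
	ProviderHealthy    bool
}

// Budget holds the thresholds; the values used in main are illustrative only.
type Budget struct {
	MaxPassRateDrop  float64
	MinToolCallValid float64
	MaxRetryRate     float64
	MaxP95LatencyMs  float64
	MaxCostPerRunUSD float64
}

// Verify returns the names of failed checks; an empty result means the
// candidate passed every check.
func Verify(baseline, candidate Metrics, b Budget) []string {
	var failures []string
	if baseline.AcceptancePassRate-candidate.AcceptancePassRate > b.MaxPassRateDrop {
		failures = append(failures, "acceptance regression")
	}
	if candidate.ToolCallValidRate < b.MinToolCallValid {
		failures = append(failures, "tool-call validity")
	}
	if candidate.RetryRate > b.MaxRetryRate {
		failures = append(failures, "retry rate")
	}
	if candidate.P95LatencyMs > b.MaxP95LatencyMs {
		failures = append(failures, "latency")
	}
	if candidate.CostPerRunUSD > b.MaxCostPerRunUSD {
		failures = append(failures, "cost")
	}
	if !candidate.ProviderHealthy {
		failures = append(failures, "provider health")
	}
	return failures
}

func main() {
	budget := Budget{MaxPassRateDrop: 0.02, MinToolCallValid: 0.97,
		MaxRetryRate: 0.05, MaxP95LatencyMs: 2500, MaxCostPerRunUSD: 0.05}
	current := Metrics{0.95, 0.99, 0.02, 1800, 0.04, true}
	candidate := Metrics{0.90, 0.99, 0.02, 1600, 0.03, true}

	failures := Verify(current, candidate, budget)
	if len(failures) == 0 {
		fmt.Println("ok to promote")
		return
	}
	for _, f := range failures {
		fmt.Println("failed:", f)
	}
}
```

Returning the list of failed checks, rather than a single boolean, makes it easier to route a failure to the right owner.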
Rollback
- Revert model setting (CRUVERO_OPENROUTER_MODEL or tenant model prefs).
- Restart/redeploy worker if env-based config requires reload.
- Confirm traffic returns to previous model and error rate normalizes.
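The final rollback check could be automated along these lines, assuming a sample of recent requests is available from logs or metrics. The RequestSample type, the RollbackConfirmed helper, and the error-rate threshold are hypothetical and not part of the existing tooling.

```go
package main

import "fmt"

// RequestSample is a hypothetical record of one recent request.
type RequestSample struct {
	Model  string // model ID that served the request
	Failed bool   // whether the request ended in an error
}

// RollbackConfirmed reports whether every sampled request used the previous
// model and the observed error rate is at or below maxErrorRate.
func RollbackConfirmed(samples []RequestSample, previous string, maxErrorRate float64) bool {
	if len(samples) == 0 {
		return false // no traffic observed yet, so nothing is confirmed
	}
	failed := 0
	for _, s := range samples {
		if s.Model != previous {
			return false // some traffic is still reaching another model
		}
		if s.Failed {
			failed++
		}
	}
	return float64(failed)/float64(len(samples)) <= maxErrorRate
}

func main() {
	samples := []RequestSample{
		{Model: "model-current"},
		{Model: "model-current", Failed: true},
		{Model: "model-current"},
		{Model: "model-current"},
	}
	fmt.Println(RollbackConfirmed(samples, "model-current", 0.5))
}
```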
Escalation Path
- Escalate to ML/Application owners when quality drops or tool behaviors change.
- Escalate to Platform/SRE when provider outages or severe latency prevent stable rollout.
- Escalate to Finance/Operations when the cost envelope is exceeded.