Model Rotation Runbook

When to use: A new LLM needs to be deployed, an existing model is deprecated, cost optimization requires switching providers, or a quality regression is reported on the current model.

Prerequisites:

  • Access to LLM provider dashboards for the target model
  • Ability to update CRUVERO_LLM_* environment variables on workers
  • Diff-test infrastructure available (cmd/diff-test)

Trigger Conditions

Run model rotation when:

  • a new model becomes available,
  • cost optimization is required,
  • an existing model is deprecated or unstable,
  • a quality regression is reported on the current model.

Step-by-Step Actions

  1. Define candidate and rollback model IDs.
  2. Update test configuration:
    • set CRUVERO_OPENROUTER_MODEL=<candidate> for OpenRouter, or
    • apply tenant/provider overrides via model preferences.
  3. Run differential validation before promotion:
    • go run ./cmd/diff-test --prompt "<representative prompt>" --model-a <current> --model-b <candidate> --registry-id default --registry-version latest
  4. Compare:
    • answer quality,
    • tool-call behavior,
    • latency,
    • estimated cost.
  5. Promote candidate in staged rollout:
    • dev/staging first,
    • then limited production tenants,
    • then global/default rollout.
  6. Monitor error, latency, and cost metrics for at least one full traffic cycle.
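
The staged rollout in step 5 can be sketched as a small resolver that decides which model ID a request gets at each stage. This is a minimal illustration only: the Stage constants, the Rollout struct, the environment names, and the tenant allowlist are all hypothetical and are not part of the Cruvero configuration. In practice the switch is made through CRUVERO_OPENROUTER_MODEL or tenant model preferences.

```go
package main

import "fmt"

// Stage is a hypothetical marker for how far the candidate has been promoted.
type Stage int

const (
	StageNone    Stage = iota // candidate not promoted anywhere
	StageStaging              // dev/staging only
	StageLimited              // dev/staging plus an allowlist of production tenants
	StageGlobal               // default for every environment and tenant
)

// Rollout holds the candidate and rollback model IDs from step 1 and the
// current stage. Field names are illustrative, not an existing config schema.
type Rollout struct {
	Current        string          // rollback model ID
	Candidate      string          // candidate model ID
	Stage          Stage           // how far the candidate has been promoted
	LimitedTenants map[string]bool // production tenants included at StageLimited
}

// ModelFor returns the model ID to use for a request, given the environment
// ("dev", "staging", or "production") and the tenant ID.
func (r Rollout) ModelFor(env, tenant string) string {
	switch {
	case r.Stage >= StageGlobal:
		return r.Candidate
	case r.Stage >= StageStaging && env != "production":
		return r.Candidate
	case r.Stage >= StageLimited && r.LimitedTenants[tenant]:
		return r.Candidate
	default:
		return r.Current
	}
}

func main() {
	r := Rollout{
		Current:        "current-model",
		Candidate:      "candidate-model",
		Stage:          StageLimited,
		LimitedTenants: map[string]bool{"tenant-a": true},
	}
	fmt.Println(r.ModelFor("staging", "tenant-z"))    // non-production traffic
	fmt.Println(r.ModelFor("production", "tenant-a")) // allowlisted tenant
	fmt.Println(r.ModelFor("production", "tenant-b")) // everyone else
}
```

Keeping the rollback model ID next to the candidate means that setting the stage back to StageNone routes every request to the previous model again, which mirrors the Rollback section.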

Verification

  • No material regression on acceptance prompts.
  • Tool-call validity and retry rates remain acceptable.
  • Per-run cost and latency are within budget.
  • Health checks for the active provider remain healthy.
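
The checklist above could be expressed as a simple promotion gate. The sketch below is an assumption-laden example: the Metrics and Budget types, their field names, and the thresholds used in main are hypothetical, and the numbers are placeholders rather than recommended limits. It does not reflect the output format of cmd/diff-test.

```go
package main

import "fmt"

// Metrics is a hypothetical per-model summary gathered over the acceptance prompts.
type Metrics struct {
	AcceptancePassRate float64 // fraction of acceptance prompts judged acceptable
	ToolCallValidRate  float64 // fraction of tool calls that were valid
	RetryRate          float64 // fraction of runs that needed a retry
	CostPerRunUSD      float64
	P95LatencyMS       float64
	ProviderHealthy    bool // result of the active provider's health check
}

// Budget holds illustrative limits; real values depend on the team's targets.
type Budget struct {
	MaxPassRateDrop      float64 // allowed absolute drop versus the baseline
	MinToolCallValidRate float64
	MaxRetryRate         float64
	MaxCostPerRunUSD     float64
	MaxP95LatencyMS      float64
}

// Verify returns one message per failed check; an empty result means the
// candidate passes every item on the verification checklist.
func Verify(baseline, candidate Metrics, b Budget) []string {
	var failures []string
	if baseline.AcceptancePassRate-candidate.AcceptancePassRate > b.MaxPassRateDrop {
		failures = append(failures, "acceptance prompts regressed")
	}
	if candidate.ToolCallValidRate < b.MinToolCallValidRate {
		failures = append(failures, "tool-call validity below minimum")
	}
	if candidate.RetryRate > b.MaxRetryRate {
		failures = append(failures, "retry rate above limit")
	}
	if candidate.CostPerRunUSD > b.MaxCostPerRunUSD {
		failures = append(failures, "per-run cost over budget")
	}
	if candidate.P95LatencyMS > b.MaxP95LatencyMS {
		failures = append(failures, "latency over budget")
	}
	if !candidate.ProviderHealthy {
		failures = append(failures, "provider health check failing")
	}
	return failures
}

func main() {
	// Placeholder numbers for illustration only.
	budget := Budget{MaxPassRateDrop: 0.02, MinToolCallValidRate: 0.97, MaxRetryRate: 0.05, MaxCostPerRunUSD: 0.05, MaxP95LatencyMS: 4000}
	baseline := Metrics{AcceptancePassRate: 0.95, ToolCallValidRate: 0.99, RetryRate: 0.01, CostPerRunUSD: 0.03, P95LatencyMS: 2500, ProviderHealthy: true}
	candidate := Metrics{AcceptancePassRate: 0.94, ToolCallValidRate: 0.90, RetryRate: 0.02, CostPerRunUSD: 0.02, P95LatencyMS: 2200, ProviderHealthy: true}

	for _, f := range Verify(baseline, candidate, budget) {
		fmt.Println("FAIL:", f)
	}
}
```

Returning every failed check, rather than stopping at the first, makes it easier to route the result along the escalation path, since quality, platform, and cost problems go to different owners.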

Rollback

  1. Revert the model setting (CRUVERO_OPENROUTER_MODEL or tenant model prefs).
  2. Restart or redeploy the worker if env-based config requires a reload.
  3. Confirm that traffic returns to the previous model and the error rate normalizes.
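
Step 3 can be made concrete with a check over a recent traffic window. This is a rough sketch under stated assumptions: the Window type, the RollbackConfirmed function, and the idea of comparing against a pre-rotation baseline error rate plus a tolerance are all hypothetical. Where the request counts come from depends on whatever metrics backend is in use.

```go
package main

import "fmt"

// Window is a hypothetical snapshot of traffic observed after the rollback.
type Window struct {
	RequestsByModel map[string]int // request count per model ID
	Errors          int            // failed requests in the same window
}

// RollbackConfirmed reports whether all observed traffic is on the rollback
// model and the error rate is no higher than the baseline plus a tolerance.
// The string explains the outcome.
func RollbackConfirmed(w Window, rollbackModel string, baselineErrRate, tolerance float64) (bool, string) {
	total := 0
	for _, n := range w.RequestsByModel {
		total += n
	}
	if total == 0 {
		return false, "no traffic observed yet"
	}
	if w.RequestsByModel[rollbackModel] != total {
		return false, "some traffic is still served by another model"
	}
	errRate := float64(w.Errors) / float64(total)
	if errRate > baselineErrRate+tolerance {
		return false, fmt.Sprintf("error rate %.3f is still above baseline %.3f", errRate, baselineErrRate)
	}
	return true, "traffic is on the rollback model and the error rate has normalized"
}

func main() {
	w := Window{
		RequestsByModel: map[string]int{"previous-model": 1000},
		Errors:          5,
	}
	ok, reason := RollbackConfirmed(w, "previous-model", 0.01, 0.005)
	fmt.Println(ok, reason)
}
```

A worker that was not restarted after an env-based change would show up here as traffic still attributed to the candidate model, which is the case step 2 is meant to prevent.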

Escalation Path

  • Escalate to ML/Application owners when quality drops or tool behavior changes.
  • Escalate to Platform/SRE when provider outages or severe latency prevent a stable rollout.
  • Escalate to Finance/Operations when the cost envelope is exceeded.