Tool Registry Restructure

The tool registry restructure (Phase 19) extends the tool registry with usage metrics, quality tracking, semantic search, and automated quarantine. It adds a metrics store, weighted scoring, and a quality tracker that can disable tools exhibiting degraded behavior.

Source: internal/registry/* (extended)

Overview

The tool registry restructure upgrades tool discovery from keyword-based matching to semantic vector search with quality tracking. It introduces a 3-stage search pipeline (vector retrieval, quality re-ranking, result assembly), per-tool quality metrics with LLM auto-rating, degradation detection with quarantine escalation, and user-submitted feedback. All functionality extends the existing internal/registry/ package.

When enabled (CRUVERO_TOOL_SEARCH_SEMANTIC=true), the agent's filterRegistryForPrompt uses semantic vector search instead of keyword scoring. When disabled or when the vector store is unavailable, it falls back to existing keyword scoring transparently.
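The fallback behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual filterRegistryForPrompt code: the Searcher interface, selectTools, and stubSearcher are hypothetical names introduced only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Searcher is a hypothetical interface; the real registry types may differ.
type Searcher interface {
	Search(query string, limit int) ([]string, error)
}

// errUnavailable simulates a vector store outage in this sketch.
var errUnavailable = errors.New("vector store unavailable")

// selectTools tries semantic search when the flag is on and falls back to
// keyword scoring when the flag is off or the semantic search fails.
// It also reports which mode produced the result.
func selectTools(semanticEnabled bool, semantic, keyword Searcher, query string, limit int) ([]string, string, error) {
	if semanticEnabled && semantic != nil {
		tools, err := semantic.Search(query, limit)
		if err == nil {
			return tools, "semantic", nil
		}
		// Semantic search failed: fall through to keyword scoring.
	}
	tools, err := keyword.Search(query, limit)
	return tools, "keyword", err
}

// stubSearcher returns canned results so the sketch runs on its own.
type stubSearcher struct {
	tools []string
	err   error
}

func (s stubSearcher) Search(string, int) ([]string, error) { return s.tools, s.err }

func main() {
	enabled := os.Getenv("CRUVERO_TOOL_SEARCH_SEMANTIC") == "true"
	semantic := stubSearcher{err: errUnavailable}
	keyword := stubSearcher{tools: []string{"email_dispatch"}}

	tools, mode, err := selectTools(enabled, semantic, keyword, "send an email", 20)
	fmt.Println(tools, mode, err)
}
```

Returning the mode alongside the results is a choice made for this sketch; it makes the otherwise transparent fallback observable in logs and tests.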

Semantic Search

Three-stage pipeline for tool discovery by semantic similarity:

Stage 1: Vector Retrieval

Embed the query text using the configured embedding provider
Search the tool_registry vector store collection for top-K candidates (default K=30)
Apply tenant isolation filter
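
As a rough illustration of Stage 1, the sketch below performs an in-memory cosine top-K lookup with a tenant filter. It assumes the query has already been embedded; a real deployment would delegate the search to the vector store. IndexedTool, Hit, and TopK are hypothetical names, not the registry's actual API.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// IndexedTool is a hypothetical stand-in for a vector store entry.
type IndexedTool struct {
	Name      string
	Tenant    string
	Embedding []float64
}

// Hit is one Stage 1 candidate with its similarity to the query.
type Hit struct {
	Name       string
	Similarity float64
}

// cosine returns the cosine similarity of a and b, or 0 when the vectors
// differ in length or either has zero magnitude.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// TopK keeps only the tenant's tools, ranks them by similarity to the
// query embedding, and returns at most k candidates.
func TopK(query []float64, tenant string, index []IndexedTool, k int) []Hit {
	hits := make([]Hit, 0, len(index))
	for _, t := range index {
		if t.Tenant != tenant {
			continue // tenant isolation filter
		}
		hits = append(hits, Hit{Name: t.Name, Similarity: cosine(query, t.Embedding)})
	}
	sort.SliceStable(hits, func(i, j int) bool { return hits[i].Similarity > hits[j].Similarity })
	if k >= 0 && len(hits) > k {
		hits = hits[:k]
	}
	return hits
}

func main() {
	index := []IndexedTool{
		{Name: "email_dispatch", Tenant: "my_tenant", Embedding: []float64{1, 0}},
		{Name: "calendar_lookup", Tenant: "my_tenant", Embedding: []float64{0, 1}},
		{Name: "other_tenant_tool", Tenant: "other", Embedding: []float64{1, 0}},
	}
	fmt.Println(TopK([]float64{1, 0}, "my_tenant", index, 30))
}
```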

Stage 2: Quality Re-Ranking

Score each candidate using a weighted formula:

score = W_sim * similarity + W_qual * quality + W_rec * recency

Weight	Default	Source
Similarity	0.5	Vector cosine similarity from Stage 1
Quality	0.35	`success_rate * avg_llm_rating` from tool metrics
Recency	0.15	Recency decay from last successful call

Tools with active quarantine entries are excluded from results.
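
A minimal sketch of the Stage 2 scoring, assuming the recency term arrives as a precomputed decay value in [0, 1] (the decay function itself is not specified here). Weights, Candidate, Score, and ExcludeQuarantined are illustrative names.

```go
package main

import "fmt"

// Weights holds the three ranking weights; defaults follow the table above.
type Weights struct {
	Similarity, Quality, Recency float64
}

var DefaultWeights = Weights{Similarity: 0.5, Quality: 0.35, Recency: 0.15}

// Candidate is a hypothetical Stage 1 result joined with its tool metrics.
type Candidate struct {
	Name         string
	Similarity   float64 // cosine similarity from Stage 1
	SuccessRate  float64
	AvgLLMRating float64
	Recency      float64 // assumed: precomputed decay in [0, 1]
	Quarantined  bool
}

// Score applies score = W_sim*similarity + W_qual*quality + W_rec*recency,
// where quality = success_rate * avg_llm_rating.
func Score(w Weights, c Candidate) float64 {
	quality := c.SuccessRate * c.AvgLLMRating
	return w.Similarity*c.Similarity + w.Quality*quality + w.Recency*c.Recency
}

// ExcludeQuarantined drops candidates with an active quarantine entry.
func ExcludeQuarantined(cands []Candidate) []Candidate {
	kept := make([]Candidate, 0, len(cands))
	for _, c := range cands {
		if !c.Quarantined {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	cands := []Candidate{
		{Name: "email_dispatch", Similarity: 0.8, SuccessRate: 0.9, AvgLLMRating: 0.8, Recency: 0.5},
		{Name: "flaky_tool", Similarity: 0.9, SuccessRate: 0.2, AvgLLMRating: 0.3, Recency: 0.9, Quarantined: true},
	}
	for _, c := range ExcludeQuarantined(cands) {
		// 0.5*0.8 + 0.35*(0.9*0.8) + 0.15*0.5 = 0.4 + 0.252 + 0.075
		fmt.Printf("%s %.3f\n", c.Name, Score(DefaultWeights, c))
	}
}
```

With the default weights, a tool with similarity 0.8, success rate 0.9, average rating 0.8, and recency 0.5 scores approximately 0.727.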

Stage 3: Result Assembly

Sort by composite score, truncate to requested limit (default 20)
Return scored results with component breakdowns for transparency
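
Stage 3 could look roughly like the sketch below: sort by composite score, truncate, and keep the per-component values on each result. ScoredTool and Assemble are hypothetical names, and the field layout of the real result type may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// ScoredTool carries the composite score plus the component breakdown.
type ScoredTool struct {
	Name       string
	Score      float64
	Similarity float64
	Quality    float64
	Recency    float64
}

// Assemble returns a copy of results sorted by descending score and
// truncated to limit. A limit of zero or less means no truncation.
func Assemble(results []ScoredTool, limit int) []ScoredTool {
	out := make([]ScoredTool, len(results))
	copy(out, results)
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	if limit > 0 && len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	results := []ScoredTool{
		{Name: "calendar_lookup", Score: 0.41, Similarity: 0.5, Quality: 0.3, Recency: 0.4},
		{Name: "email_dispatch", Score: 0.73, Similarity: 0.8, Quality: 0.72, Recency: 0.5},
		{Name: "sms_send", Score: 0.58, Similarity: 0.7, Quality: 0.5, Recency: 0.4},
	}
	for _, r := range Assemble(results, 2) {
		fmt.Printf("%s score=%.2f sim=%.2f qual=%.2f rec=%.2f\n",
			r.Name, r.Score, r.Similarity, r.Quality, r.Recency)
	}
}
```

The sort is stable so that candidates with equal scores keep their Stage 1 order; that tie-breaking rule is an assumption of this sketch.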

Quality Tracking

LLM Auto-Rating

After each tool execution in ToolExecuteActivity, a non-blocking Temporal activity records an ExecutionOutcome including:

Binary success/failure (existing)
Execution latency
LLM quality rating (0.0-1.0) from a post-execution assessment prompt

The LLM rating uses a lightweight prompt that asks the model to rate the relevance and correctness of the tool output. It runs as a child activity with a short timeout (5s by default, set by CRUVERO_TOOL_QUALITY_RATING_TIMEOUT) and fire-and-forget semantics, so tool execution is never blocked.
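
The timeout and fire-and-forget behavior can be illustrated with plain Go concurrency. This sketch deliberately does not use the Temporal SDK, so it only approximates how the real activity is scheduled; Rater, rateAsync, awaitRating, fastRater, and slowRater are all hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Rater is a hypothetical function that asks an LLM to rate tool output.
type Rater func(ctx context.Context, toolOutput string) (float64, error)

// RatingResult is delivered once the rating finishes, fails, or times out.
type RatingResult struct {
	Rating float64
	OK     bool
}

// rateAsync starts the rating in a goroutine bounded by timeout and returns
// immediately, so the caller is never blocked. The channel is buffered so
// the goroutine can finish even if nobody reads the result.
func rateAsync(parent context.Context, rate Rater, output string, timeout time.Duration) <-chan RatingResult {
	done := make(chan RatingResult, 1)
	go func() {
		ctx, cancel := context.WithTimeout(parent, timeout)
		defer cancel()
		r, err := rate(ctx, output)
		done <- RatingResult{Rating: r, OK: err == nil}
	}()
	return done
}

// awaitRating exists only for demonstration and testing; production code
// would not wait for the rating.
func awaitRating(rate Rater, timeout time.Duration) RatingResult {
	return <-rateAsync(context.Background(), rate, "tool output", timeout)
}

// fastRater returns a canned rating immediately.
func fastRater(context.Context, string) (float64, error) { return 0.9, nil }

// slowRater takes one second unless the context is cancelled first.
func slowRater(ctx context.Context, _ string) (float64, error) {
	select {
	case <-time.After(time.Second):
		return 0.5, nil
	case <-ctx.Done():
		return 0, ctx.Err()
	}
}

func main() {
	pending := rateAsync(context.Background(), fastRater, "tool output", 5*time.Second)
	fmt.Println("tool result returned without waiting for the rating")
	fmt.Println(<-pending)
	fmt.Println(awaitRating(slowRater, 10*time.Millisecond))
}
```

The timeout only takes effect if the rater honors context cancellation, which is why slowRater selects on ctx.Done().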

Composite Quality Score

Each tool maintains a running quality score computed as success_rate * avg_llm_rating. This score is stored in the extended tool_retry_stats table and used during search re-ranking.
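
A running score of this shape can be maintained from simple counters. The sketch below assumes every recorded outcome carries an LLM rating; how outcomes without a rating (for example, after a rating timeout) are handled is not specified here. ToolMetrics and its methods are hypothetical and do not mirror the tool_retry_stats schema.

```go
package main

import "fmt"

// ToolMetrics accumulates per-tool counters for the quality score.
type ToolMetrics struct {
	Calls     int
	Successes int
	RatingSum float64
}

// Record adds one execution outcome. Assumption: every outcome has a rating.
func (m *ToolMetrics) Record(success bool, rating float64) {
	m.Calls++
	if success {
		m.Successes++
	}
	m.RatingSum += rating
}

// SuccessRate is the fraction of successful calls, or 0 with no calls.
func (m *ToolMetrics) SuccessRate() float64 {
	if m.Calls == 0 {
		return 0
	}
	return float64(m.Successes) / float64(m.Calls)
}

// AvgLLMRating is the mean rating across all calls, or 0 with no calls.
func (m *ToolMetrics) AvgLLMRating() float64 {
	if m.Calls == 0 {
		return 0
	}
	return m.RatingSum / float64(m.Calls)
}

// Quality is success_rate * avg_llm_rating.
func (m *ToolMetrics) Quality() float64 {
	return m.SuccessRate() * m.AvgLLMRating()
}

func main() {
	var m ToolMetrics
	m.Record(true, 0.8)
	m.Record(true, 0.6)
	m.Record(false, 0.2)
	m.Record(true, 0.8)
	fmt.Printf("success=%.2f rating=%.2f quality=%.2f\n",
		m.SuccessRate(), m.AvgLLMRating(), m.Quality())
}
```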

Degradation Detection

A rolling quality score is computed per tool. When the score drops below a configurable threshold (CRUVERO_TOOL_QUALITY_DEGRADE_THRESHOLD, default 0.3), the tracker escalates in three steps:

Warning: Emit a structured log warning and a NATS event (if Phase 12 is active), or fall back to a memory episode
Alert: Set the degraded_since timestamp in the tool metrics
Quarantine escalation: If quality stays below the threshold for N consecutive calls (CRUVERO_TOOL_QUALITY_QUARANTINE_AFTER, default 5), insert the tool into the existing tool_quarantine table with a reason referencing quality degradation
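
The escalation can be modeled as a small state machine over consecutive observations. This is a sketch under stated assumptions: a score at or above the threshold resets the streak and clears degraded_since, which the text above does not confirm, and logging, events, and the tool_quarantine insert are left to the caller. DegradationTracker, Stage, and Observe are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// Stage is the outcome of one observation.
type Stage int

const (
	Healthy Stage = iota
	Degraded
	Quarantine
)

// DegradationTracker counts consecutive below-threshold scores for one tool.
type DegradationTracker struct {
	Threshold       float64   // e.g. CRUVERO_TOOL_QUALITY_DEGRADE_THRESHOLD
	QuarantineAfter int       // e.g. CRUVERO_TOOL_QUALITY_QUARANTINE_AFTER
	DegradedSince   time.Time // zero while the tool is healthy
	consecutive     int
}

// Observe records one rolling quality score and returns the resulting stage.
func (d *DegradationTracker) Observe(score float64) Stage {
	if score >= d.Threshold {
		// Assumption: recovery resets the streak and clears the timestamp.
		d.consecutive = 0
		d.DegradedSince = time.Time{}
		return Healthy
	}
	d.consecutive++
	if d.consecutive == 1 {
		d.DegradedSince = time.Now()
	}
	if d.consecutive >= d.QuarantineAfter {
		return Quarantine
	}
	return Degraded
}

func main() {
	d := DegradationTracker{Threshold: 0.3, QuarantineAfter: 5}
	for i, score := range []float64{0.6, 0.2, 0.25, 0.1, 0.2, 0.15} {
		fmt.Printf("call %d score=%.2f stage=%d\n", i+1, score, d.Observe(score))
	}
}
```

A caller would log the warning on the first Degraded result and write the quarantine entry when Observe returns Quarantine.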

Tool Feedback

Users can submit quality ratings through the tool-feedback CLI or the API. Each submission records a rating (0.0-1.0) and an optional comment in the tool_feedback table. Feedback contributes to the tool's running quality metrics without modifying immutable tool definitions.

Configuration

Variable	Default	Description
`CRUVERO_TOOL_SEARCH_SEMANTIC`	`false`	Enable semantic vector search for tool discovery
`CRUVERO_TOOL_SEARCH_COLLECTION`	`tool_registry`	Vector store collection name
`CRUVERO_TOOL_SEARCH_K`	`30`	Vector retrieval candidates (Stage 1)
`CRUVERO_TOOL_SEARCH_RESULT_LIMIT`	`20`	Max tools returned to agent
`CRUVERO_TOOL_SEARCH_W_SIMILARITY`	`0.5`	Ranking weight: vector similarity
`CRUVERO_TOOL_SEARCH_W_QUALITY`	`0.35`	Ranking weight: quality score
`CRUVERO_TOOL_SEARCH_W_RECENCY`	`0.15`	Ranking weight: recency decay
`CRUVERO_TOOL_QUALITY_ENABLED`	`true`	Enable quality tracking and LLM auto-rating
`CRUVERO_TOOL_QUALITY_RATING_TIMEOUT`	`5s`	Timeout for LLM auto-rating activity
`CRUVERO_TOOL_QUALITY_DEGRADE_THRESHOLD`	`0.3`	Quality score below which a tool is considered degraded
`CRUVERO_TOOL_QUALITY_QUARANTINE_AFTER`	`5`	Consecutive degraded calls before quarantine escalation
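
One way to load these variables with their documented defaults is sketched below. The Config struct, LoadConfig, and envOr are hypothetical, and silently falling back to the default on a malformed value is an assumption of this sketch rather than documented behavior.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// Config mirrors the variables in the table above (illustrative only).
type Config struct {
	Semantic         bool
	Collection       string
	K                int
	ResultLimit      int
	WSimilarity      float64
	WQuality         float64
	WRecency         float64
	QualityEnabled   bool
	RatingTimeout    time.Duration
	DegradeThreshold float64
	QuarantineAfter  int
}

// Lookup matches os.LookupEnv so tests can inject a fake environment.
type Lookup func(key string) (string, bool)

// envOr returns the parsed value of key, or def when the variable is unset
// or cannot be parsed.
func envOr[T any](lookup Lookup, key string, def T, parse func(string) (T, error)) T {
	raw, ok := lookup(key)
	if !ok {
		return def
	}
	v, err := parse(raw)
	if err != nil {
		return def
	}
	return v
}

func parseFloat(s string) (float64, error) { return strconv.ParseFloat(s, 64) }
func parseString(s string) (string, error) { return s, nil }

// LoadConfig reads every variable, applying the documented defaults.
func LoadConfig(lookup Lookup) Config {
	return Config{
		Semantic:         envOr(lookup, "CRUVERO_TOOL_SEARCH_SEMANTIC", false, strconv.ParseBool),
		Collection:       envOr(lookup, "CRUVERO_TOOL_SEARCH_COLLECTION", "tool_registry", parseString),
		K:                envOr(lookup, "CRUVERO_TOOL_SEARCH_K", 30, strconv.Atoi),
		ResultLimit:      envOr(lookup, "CRUVERO_TOOL_SEARCH_RESULT_LIMIT", 20, strconv.Atoi),
		WSimilarity:      envOr(lookup, "CRUVERO_TOOL_SEARCH_W_SIMILARITY", 0.5, parseFloat),
		WQuality:         envOr(lookup, "CRUVERO_TOOL_SEARCH_W_QUALITY", 0.35, parseFloat),
		WRecency:         envOr(lookup, "CRUVERO_TOOL_SEARCH_W_RECENCY", 0.15, parseFloat),
		QualityEnabled:   envOr(lookup, "CRUVERO_TOOL_QUALITY_ENABLED", true, strconv.ParseBool),
		RatingTimeout:    envOr(lookup, "CRUVERO_TOOL_QUALITY_RATING_TIMEOUT", 5*time.Second, time.ParseDuration),
		DegradeThreshold: envOr(lookup, "CRUVERO_TOOL_QUALITY_DEGRADE_THRESHOLD", 0.3, parseFloat),
		QuarantineAfter:  envOr(lookup, "CRUVERO_TOOL_QUALITY_QUARANTINE_AFTER", 5, strconv.Atoi),
	}
}

func main() {
	cfg := LoadConfig(os.LookupEnv)
	fmt.Printf("%+v\n", cfg)
}
```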

CLI Tools

tool-feedback submit

Submit a quality rating for a tool.

tool-feedback submit --tool email_dispatch --rating 0.85 --comment "Fast and reliable"

tool-feedback list

List recent feedback for a tool.

tool-feedback list --tool email_dispatch --limit 20
tool-feedback list --tool email_dispatch --format json

tool-feedback metrics

Display current quality metrics for a tool or list degraded tools.

tool-feedback metrics --tool email_dispatch
tool-feedback metrics --degraded
tool-feedback metrics --tool email_dispatch --format json

seed-registry (indexing)

The existing seed-registry CLI is extended to index tool descriptions into the vector store after seeding.

seed-registry --file tools.json --tenant my_tenant
