Control Loop v0.1: closed-loop AI reliability with eval rigor and human-in-the-loop guardrails
April 30, 2026 · Wei Hai
The thing most AI reliability tools get right is detection. They tell you your agent broke, show you which traces failed, maybe cluster similar errors so you see patterns instead of noise.
What they do not do is close the loop. The fix still falls on you: read the alert, understand what failed, push a new prompt or swap the model, wait for the next failure cycle to know if it helped. That is a manual process wrapped around an automated detection system, and it does not scale when your AI is handling thousands of requests per hour.
Control Loop v0.1 closes the loop. Here is what shipped as of today.
What shipped
Eval-rule-triggered remediation
The control loop has always been able to fire actions (retry with a fallback model, modify a prompt, trigger a webhook) when an eval score crosses a threshold. What v0.1 hardens is the rigor of that trigger: actions now fire only when a configured eval rule crosses its configured threshold, not on a heuristic and not on a raw score floor. You define what "bad" means using any of TruLayer's 25 built-in evaluators or your own custom scorer, and the control loop acts on exactly that definition.
This matters because heuristic-based remediation produces false positives that erode trust in the automation. When the retry fires on a trace that was actually acceptable, you get unnecessary cost, latency, and potential tone drift with no upside. Eval-rule rigor makes the trigger precise.
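As a rough illustration of what a rule-based trigger means, here is a minimal Python sketch. The EvalRule class, the should_remediate function, and the convention that a score below the threshold counts as "bad" are all assumptions made for this example; they are not TruLayer's actual policy schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalRule:
    """Hypothetical rule: an evaluator name plus the score below which a trace is 'bad'."""
    evaluator: str
    threshold: float


def should_remediate(rule: EvalRule, scores: dict[str, float]) -> bool:
    """Return True only when the configured evaluator ran and scored below its threshold."""
    score = scores.get(rule.evaluator)
    if score is None:
        # The configured evaluator did not run on this trace: no trigger, no guessing.
        return False
    return score < rule.threshold


# Assumed example rule: remediate when faithfulness drops below 0.7.
rule = EvalRule(evaluator="faithfulness", threshold=0.7)
```

The point of the sketch is that the trigger depends on one named rule and one explicit threshold, so a trace that merely looks suspicious on some other signal never fires the action.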
Configurable retry budget per policy
Previously, the retry count was tracked and logged but not enforced. A cascading failure could trigger repeated retries on the same trace indefinitely, compounding cost and latency.
v0.1 adds a configurable maximum retry depth per policy — set anywhere from 1 to 10, defaulting to 3. When a trace has been retried that many times without passing the triggering eval rule, the platform automatically escalates it to the human-in-the-loop queue instead of looping again. The trace detail view shows the escalation reason and retry count. The dashboard surfaces a "Retry cap hit" metric so you can see when policies are hitting their budgets in aggregate.
This is the bounded cascade, and it is not optional: unbounded retry loops are a production liability, and v0.1 removes them for the retry action type.
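The bounded retry behavior can be sketched in a few lines of Python. The 1 to 10 range and the default of 3 come from the description above; the run_with_budget function, the attempt callable, and the shape of the returned dictionary are hypothetical names chosen for illustration, not the platform's implementation.

```python
from typing import Callable

# Range and default taken from the post; the constant names are illustrative.
MIN_RETRIES, MAX_RETRIES, DEFAULT_RETRIES = 1, 10, 3


def run_with_budget(attempt: Callable[[], bool], max_retries: int = DEFAULT_RETRIES) -> dict:
    """Retry until the triggering eval rule passes or the budget is spent.

    `attempt` is a hypothetical callable that performs one retry and returns
    True when the remediated trace passes the triggering eval rule.
    """
    if not MIN_RETRIES <= max_retries <= MAX_RETRIES:
        raise ValueError(f"max_retries must be between {MIN_RETRIES} and {MAX_RETRIES}")
    for retry_count in range(1, max_retries + 1):
        if attempt():
            return {"status": "resolved", "retry_count": retry_count}
    # Budget exhausted: hand the trace to the human queue instead of looping again.
    return {"status": "escalated", "retry_count": max_retries, "reason": "retry_cap_hit"}
```

The loop has exactly two exits, a passing eval or an escalation, which is what keeps a cascading failure from compounding cost indefinitely.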
Human-in-the-loop (HITL) approval gate
Any control-loop action can now be gated on human review before it executes. Set require_approval: true on the action payload, or configure the policy with ActionType = escalate, and the action parks in pending_approval status. Owners can approve or reject it via POST /v1/control/actions/:id/approve and POST /v1/control/actions/:id/reject.
The HITL gate serves two distinct use cases. First, genuinely high-stakes actions where you want a human signature before TruLayer modifies behavior — production agents handling payments, medical triage, legal review. Second, the automatic escalation path when the retry budget is exhausted: rather than silently giving up, TruLayer routes the trace to the queue and notifies the owner that a trace failed remediation and needs attention.
Both paths use the same HITL mechanism. The queue is visible in the dashboard. The API is in the docs at docs.trulayer.ai.
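For readers scripting the approval step, the following Python sketch builds (but does not send) an approve or reject request against the endpoint paths named above. The base URL is a placeholder, and the bearer-token header and empty request body are assumptions; check the API docs for the real host, authentication scheme, and payload.

```python
import urllib.parse
import urllib.request

# Placeholder host: the real API base URL is not given in this post.
BASE_URL = "https://api.example.invalid"


def build_decision_request(action_id: str, decision: str, api_key: str) -> urllib.request.Request:
    """Build a POST request that approves or rejects a pending control-loop action."""
    if decision not in ("approve", "reject"):
        raise ValueError("decision must be 'approve' or 'reject'")
    # Quote the id so unexpected characters cannot alter the path.
    safe_id = urllib.parse.quote(action_id, safe="")
    url = f"{BASE_URL}/v1/control/actions/{safe_id}/{decision}"
    return urllib.request.Request(
        url,
        data=b"",  # assumed: no request body is required
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )
```

Passing the returned object to urllib.request.urlopen would send it; the sketch stops short of that so it can be inspected without network access.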
Remediation regression alerts
When a retry action fires and the trace re-enters the eval pipeline, TruLayer now produces a secondary alert if the remediated output fails a different eval rule than the one that triggered the original action.
This catches the "fixed the score, broke the tone" class of failure — where a fallback model or modified prompt resolves the faithfulness issue but introduces a new toxicity or off-brand language problem. Without this surface, you would see the original alert resolve and assume the fix worked, not knowing a new issue had been introduced. With remediation regression alerts, both signals appear as separate alerts linked to the same originating trace.
This is available for retry actions today. Fallback-model and prompt-modification paths do not yet re-enter the eval pipeline automatically — that alignment work is in progress.
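The regression check described above reduces to a small comparison. This Python sketch assumes the remediated output's results arrive as a mapping from rule name to pass/fail; the find_regressions name and that data shape are illustrative, not the platform's internal representation.

```python
def find_regressions(triggering_rule: str, remediated_results: dict[str, bool]) -> list[str]:
    """Return the rules, other than the triggering one, that the remediated output fails."""
    return sorted(
        rule
        for rule, passed in remediated_results.items()
        if not passed and rule != triggering_rule
    )
```

If the triggering rule itself still fails, that is an unresolved original alert rather than a regression, which is why it is excluded from the result.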
Per-trace before/after deltas
After each control-loop action, the platform now computes a structured delta between the original span and the remediated span. The diff panel is inline in the trace detail view and covers:
- Eval rule scores (every rule that ran on both the original and remediated span)
- Output token length
- Latency
- Embedding similarity (how semantically different the outputs are)
- Citation density (for RAG pipelines)
This gives you a quantitative answer to "did the fix help?" beyond the pass/fail signal of whether the triggering eval rule cleared. A fixed faithfulness score is good; a fixed faithfulness score with stable tone, acceptable latency, and a similarity score above 0.85 is a reliable fix.
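To make the delta concrete, here is a minimal Python sketch that computes the five quantities listed above from two span snapshots. The SpanSnapshot fields, the use of cosine similarity for embedding similarity, and the definition of citation density as citations per output token are assumptions for illustration; the platform may define them differently.

```python
import math
from dataclasses import dataclass


@dataclass
class SpanSnapshot:
    """Hypothetical view of one span: only the fields this sketch needs."""
    scores: dict[str, float]
    output_tokens: int
    latency_ms: float
    embedding: list[float]
    citations: int


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def citation_density(span: SpanSnapshot) -> float:
    """Citations per output token (assumed definition)."""
    return span.citations / span.output_tokens if span.output_tokens else 0.0


def compute_delta(original: SpanSnapshot, remediated: SpanSnapshot) -> dict:
    """Compare two spans, scoring only the eval rules that ran on both."""
    shared = original.scores.keys() & remediated.scores.keys()
    return {
        "score_deltas": {r: remediated.scores[r] - original.scores[r] for r in sorted(shared)},
        "output_tokens_delta": remediated.output_tokens - original.output_tokens,
        "latency_ms_delta": remediated.latency_ms - original.latency_ms,
        "embedding_similarity": cosine_similarity(original.embedding, remediated.embedding),
        "citation_density_delta": citation_density(remediated) - citation_density(original),
    }
```

Restricting score deltas to rules present on both spans mirrors the "every rule that ran on both" wording in the list above, so a rule that only ran after remediation does not produce a misleading delta.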
What is still in progress
Being direct about the gaps matters more than a clean launch narrative.
Fallback-model and prompt-modification re-evaluation. The retry path re-enters the eval pipeline automatically. The fallback-model and prompt-modification paths do not yet — re-evaluation after those actions requires a manual comparison query. We are aligning the other action types to the same pipeline path in the next sprint.
Automatic loop-depth enforcement across all action types. The retry budget cap is enforced on the retry action type. Fallback-model and prompt-modification do not yet have a configurable budget gate. We track action counts for those paths but do not auto-escalate.
Per-trace delta API. The diff panel is in the trace detail UI. A dedicated API endpoint for programmatic access to remediation deltas is on the roadmap — today you query the eval list filtered by trace_id and compare the two evaluation rows.
Why the closed loop is the right architecture
The detection half of AI reliability is a solved problem. The remediation half is where teams are still hacking together shell scripts, GitHub Actions workflows, and manual playbooks. That duct tape works at small scale and falls apart the first time you need to remediate 200 traces in a 10-minute window while your on-call is asleep.
The control loop is the production-grade version of those playbooks. It is eval-rule-backed, bounded, human-overridable, and self-monitoring. You configure the policy once and it operates across every trace that matches the trigger condition — no manual intervention required until either the fix works or the trace hits the retry budget and needs a human to look at it.
That architecture — trigger, bounded retry, HITL gate, regression monitor — is what "closed loop" means in practice. v0.1 ships all four pieces.
Getting started
The control loop is available on the Team plan and above. If you are already on Team or Business, the new capabilities are live in your account today.
New to TruLayer:
- Sign up free — Starter tier is 1M spans/month, 1K eval runs, no credit card
- Control loop quickstart — configure your first policy in under 15 minutes
- Features overview — full list of what ships with each plan
- Pricing — Team plan includes the full control loop
What's next
The next sprint focuses on completing the remediation pipeline alignment (fallback-model and prompt-modification re-evaluation) and shipping the per-trace delta API for programmatic access. The retry budget cap will extend to all action types.
If you hit a failure mode this post does not address, I want to hear about it. Open a support ticket from the dashboard or email support@trulayer.ai.