Rollback & Kill Switches
Can you stop your agent in 30 seconds? If not, you're not production-ready.
The Failure Scenario
A SaaS company ships an agent update on Friday afternoon that changes how the agent handles subscription cancellations. Instead of processing cancellation requests, the updated agent interprets "I want to cancel" as "I want to cancel my cancellation" and reactivates accounts with a new billing cycle. By Saturday morning, 340 customers have been re-enrolled and charged. The on-call engineer gets paged but cannot find the agent's deployment configuration. There is no kill switch.
The engineer tries to roll back by redeploying the previous version, but the deployment pipeline requires a passing CI suite and two approvals. Neither is available on a Saturday at 6 AM. They try disabling the agent's API key, but the agent uses a shared service account that also powers the billing dashboard. Killing the key would take down the entire admin panel. Three hours pass before they find a way to stop the agent without collateral damage.
Three hours of an agent acting on bad logic at scale is not an inconvenience. It's a business crisis. The refund processing alone takes two weeks. Customer trust takes longer. Every minute between "we know there is a problem" and "the agent is stopped" multiplies the damage. If your shutdown path requires pipeline approvals, shared credentials, or tribal knowledge about which config to change, you do not have a kill switch. You have a hope.
Why This Matters
Agent failures are not like service outages. A broken API returns errors, and users notice, complain, and stop trying. A broken agent continues to take actions with full confidence, producing outputs that look correct but are not. The failure mode is silent corruption, not noisy downtime. This means the window between "failure starts" and "failure is detected" can be hours or days, and every action during that window may need to be reversed.
Kill switches and rollback mechanisms are the operational foundation that makes agent deployment survivable. Without them, every deploy is a one-way door. With them, you can ship with confidence because you know you can revert in seconds if something goes wrong. This is not about preventing failures. Failures are inevitable in any sufficiently complex system. It's about controlling the blast radius.
Circuit breakers add a layer of automatic protection. If your agent's error rate spikes, if tool calls start failing at an unusual rate, or if the agent's behavior deviates from established baselines, the circuit breaker should trip automatically. It should stop the agent or degrade it to a safe mode without waiting for a human to notice and react. The fastest kill switch is the one that does not need a human to pull it.
How to Implement
Build a dedicated kill switch endpoint that is independent of the agent's main deployment infrastructure. This should be a simple feature flag or configuration value that the agent checks before every action. When the flag is set, the agent stops processing new requests immediately, completes any in-flight actions that are safe to finish, and returns a graceful fallback response to users. The kill switch must not require a deploy, a pipeline run, or an approval. It should be a single API call or button press.
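A minimal sketch of that check, assuming a hypothetical external flag service reachable through an injected `fetch_flag` callable (the names `KillSwitch` and `handle_request` are illustrative, not part of any real API). The flag is cached briefly so the lookup does not add a network call to every action, and a failure to reach the flag service fails safe by stopping the agent:

```python
import time


class KillSwitch:
    """Checks an external flag before every agent action.

    The flag lookup is injected so the check does not depend on the
    agent's own infrastructure (e.g. a separate feature-flag service).
    """

    def __init__(self, fetch_flag, cache_ttl: float = 5.0):
        self._fetch_flag = fetch_flag  # callable; True means "stop the agent"
        self._cache_ttl = cache_ttl    # seconds between flag-service lookups
        self._cached = False
        self._checked_at = 0.0

    def is_active(self) -> bool:
        now = time.time()
        if now - self._checked_at > self._cache_ttl:
            try:
                self._cached = bool(self._fetch_flag())
            except Exception:
                # Fail safe: if the flag service is unreachable, stop the agent.
                self._cached = True
            self._checked_at = now
        return self._cached


def handle_request(switch: KillSwitch, agent_fn, request: str) -> str:
    """Run the agent only when the kill switch is not pulled."""
    if switch.is_active():
        # Graceful fallback instead of a 500.
        return "The assistant is temporarily unavailable. A human will follow up."
    return agent_fn(request)
```

Because `fetch_flag` is just a callable, pulling the switch is whatever the flag service makes it: one API call or button press, with no deploy in the path.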
Implement circuit breakers that monitor three signals: error rate (percentage of tool calls failing), anomaly rate (actions that deviate from baseline patterns), and volume rate (sudden spikes in agent activity that suggest a feedback loop). When any signal crosses its threshold, the circuit breaker moves the agent to a degraded mode. Read-only operations continue, but write operations are blocked pending human review.
Design your agent's state to be rollback-friendly. Every action the agent takes should record enough state to be reversed: the previous value before a database update, the original configuration before a change, the message ID of a sent communication. Store these rollback records in a dedicated table with the trace ID so you can programmatically undo an entire agent session if needed.
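One way to sketch that rollback table, using an in-memory SQLite database for illustration (the schema, the `rollback_log` table name, and the handler dispatch are assumptions, not a prescribed design). Each write records a JSON snapshot of the prior state under the session's trace ID, and undoing a session replays those records newest-first through per-action-type handlers:

```python
import json
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    # One row per reversible action, keyed by trace ID.
    conn.execute("""
        CREATE TABLE IF NOT EXISTS rollback_log (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            trace_id TEXT NOT NULL,
            action_type TEXT NOT NULL,   -- e.g. 'db_update', 'email_send'
            target TEXT NOT NULL,        -- what was touched (row key, message ID)
            previous_state TEXT          -- JSON snapshot taken *before* the action
        )
    """)


def record_action(conn, trace_id, action_type, target, previous_state):
    conn.execute(
        "INSERT INTO rollback_log (trace_id, action_type, target, previous_state) "
        "VALUES (?, ?, ?, ?)",
        (trace_id, action_type, target, json.dumps(previous_state)),
    )


def undo_session(conn, trace_id, handlers) -> int:
    """Replay rollback records newest-first through compensating handlers."""
    rows = conn.execute(
        "SELECT action_type, target, previous_state FROM rollback_log "
        "WHERE trace_id = ? ORDER BY id DESC",
        (trace_id,),
    ).fetchall()
    for action_type, target, previous_state in rows:
        handlers[action_type](target, json.loads(previous_state))
    return len(rows)
```

Reversing in descending insertion order matters: later actions may depend on earlier ones, so they must be undone first.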
import time
from enum import Enum


class CircuitState(Enum):
    CLOSED = "closed"          # normal operation
    OPEN = "open"              # all write actions blocked
    HALF_OPEN = "half_open"    # limited actions for testing


def alert_oncall(message: str, details: dict) -> None:
    """Placeholder paging hook; wire this to your alerting system."""


class AgentCircuitBreaker:
    def __init__(self, config: dict):
        self.state = CircuitState.CLOSED
        self.error_threshold = config.get("error_rate_pct", 15)
        self.anomaly_threshold = config.get("anomaly_rate_pct", 10)
        self.volume_spike_factor = config.get("volume_spike_factor", 3.0)
        self.window_seconds = config.get("window_seconds", 300)
        self.cooldown_seconds = config.get("cooldown_seconds", 600)
        self._recent_calls = []
        self._opened_at = None

    def should_allow(self, action_type: str) -> bool:
        if self.state == CircuitState.CLOSED:
            return True
        if self.state == CircuitState.OPEN:
            if action_type == "read":
                return True  # reads always allowed
            if self._cooldown_elapsed():
                self.state = CircuitState.HALF_OPEN
                return True  # allow one probe write
            return False
        # HALF_OPEN: reads pass; further writes wait for the probe's outcome
        return action_type == "read"

    def record_outcome(self, success: bool, is_anomaly: bool):
        self._recent_calls.append({
            "ts": time.time(), "ok": success, "anomaly": is_anomaly
        })
        if self.state == CircuitState.HALF_OPEN:
            # The probe's result decides: close on success, reopen on failure.
            if success and not is_anomaly:
                self.state = CircuitState.CLOSED
            else:
                self.state = CircuitState.OPEN
                self._opened_at = time.time()
            return
        self._evaluate()

    def _cooldown_elapsed(self) -> bool:
        return (self._opened_at is not None and
                time.time() - self._opened_at > self.cooldown_seconds)

    def _evaluate(self):
        window = [c for c in self._recent_calls
                  if c["ts"] > time.time() - self.window_seconds]
        if not window:
            return
        error_rate = sum(1 for c in window if not c["ok"]) / len(window)
        anomaly_rate = sum(1 for c in window if c["anomaly"]) / len(window)
        # Volume-spike detection needs a stored activity baseline; omitted here.
        if (error_rate * 100 > self.error_threshold or
                anomaly_rate * 100 > self.anomaly_threshold):
            if self.state != CircuitState.OPEN:
                self.state = CircuitState.OPEN
                self._opened_at = time.time()
                alert_oncall("Circuit breaker OPEN", {
                    "error_rate": error_rate,
                    "anomaly_rate": anomaly_rate,
                })

Production Checklist
- A kill switch exists that stops the agent within 30 seconds. No deploy, no pipeline, no approval required.
- The kill switch is independent of the agent's infrastructure. If the agent's service is unresponsive, the kill switch still works.
- Circuit breakers monitor error rate, anomaly rate, and volume spikes with configurable thresholds.
- When the circuit breaker trips, write operations stop but read operations continue (graceful degradation, not total outage).
- Every write action records rollback state (previous values, original configs, message IDs), enough to undo it programmatically.
- A rollback script can reverse all actions from a specific time window or trace ID with a single command.
- Agent deployments use canary or blue-green strategies. New versions serve 5% of traffic before full rollout.
- Fallback responses are configured for when the agent is stopped. Users see a helpful message, not a 500 error.
- Kill switch activation is logged and alerts the on-call team with the reason and the activator's identity.
- Monthly drills test the kill switch and rollback process. Verify that the team can stop the agent and reverse actions within the target time.
Common Pitfalls
The most common failure is building a kill switch that depends on the same infrastructure as the agent. If your agent runs on a Kubernetes cluster and the kill switch is a config change that requires kubectl access to the same cluster, a cluster-level incident takes out both the agent and your ability to stop it. The kill switch should be an external feature flag service, a separate API, or at minimum a DNS-level redirect that routes traffic away from the agent.
Another pitfall is treating rollback as "just redeploy the old version." Agent state is not the same as code state. Even if you roll back the code, the actions the broken version took are still in effect. Customers were charged, emails were sent, and records were modified. You need data-level rollback, not just code-level rollback. If your agent modifies external state, you need compensating transactions that reverse those modifications.
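Compensating transactions can be sketched as a registry mapping each external action type to its inverse, applied in reverse order of execution. Everything here (the `Action` shape, the inverse registry) is an illustrative assumption, not a prescribed design:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    kind: str      # e.g. "charge", "email"
    payload: dict  # identifiers needed to reverse the effect


def compensate(actions: list[Action],
               inverses: dict[str, Callable[[dict], str]]) -> list[str]:
    """Apply compensating transactions in reverse order of execution."""
    results = []
    for action in reversed(actions):
        if action.kind not in inverses:
            # A write with no registered inverse is irreversible; fail loudly.
            raise KeyError(f"no compensation registered for {action.kind!r}")
        results.append(inverses[action.kind](action.payload))
    return results
```

The `KeyError` on an unregistered action type is deliberate: it forces every new write capability the agent gains to ship with its inverse, or the rollback path breaks visibly in testing rather than silently in an incident.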
Teams also frequently set circuit breaker thresholds too high because they are afraid of false positives. A 50% error rate threshold means half of all actions are failing before the breaker trips. That's a catastrophe, not a threshold. Start aggressive: 10-15% error rate over a 5-minute window. You can always widen the threshold after observing baseline behavior. A circuit breaker that never trips is not providing protection.
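One hedged way to operationalize "start aggressive, widen from baseline": keep the threshold at the aggressive floor unless observed baseline error rates, times a headroom factor, genuinely demand more. The function and its defaults are an illustrative assumption, not a standard formula:

```python
import statistics


def suggest_error_threshold(baseline_rates_pct: list[float],
                            floor_pct: float = 10.0,
                            headroom: float = 2.0) -> float:
    """Pick an error-rate threshold (in percent) from observed baseline windows.

    Stays at the aggressive floor unless the typical baseline error rate,
    with headroom for normal variation, is already above it.
    """
    typical = statistics.median(baseline_rates_pct)
    return max(floor_pct, typical * headroom)
```

With a healthy baseline (say, 1-3% errors per window) this keeps the 10% floor; only a service whose normal operation is noisier earns a wider threshold, and that widening is justified by data rather than fear of false positives.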
Terminal Output
$ clawproof --check 05
CHECK 05 - Rollback & Kill Switches
────────────────────────────────────
[PASS] Kill switch endpoint active (independent service)
[PASS] Kill switch response time: ~4 seconds (target: <30s)
[PASS] Circuit breaker configured: error 15%, anomaly 10%
[PASS] Graceful degradation: reads continue when breaker is open
[PASS] Rollback state recorded for write actions (3 action types)
[WARN] Rollback script covers 2/3 write action types (db_update missing)
[PASS] Canary deployment configured: 5% initial traffic
[PASS] Fallback response template configured
[PASS] Kill switch activation logging and alerting active
[FAIL] No kill switch drill in the last 90 days
Result: 8 passed, 1 warning, 1 failed
Status: NEEDS ATTENTION