
Staged rollouts for agent deployments

Why "ship to 1% of traffic" doesn't map cleanly onto agents, and a four-stage rollout — shadow, sandbox, gated, general — that does.

Author
Manzia Editorial

The standard web-services rollout playbook (feature flag, 1% canary, expand by traffic share) loses its safety properties when applied to agents. A canary limits risk only if a failure stays inside the request that caused it, and agent failures aren't independent across requests. One bad tool call can persist state that poisons the next ten, including requests that never touched the new version.
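To make the independence point concrete, here is a toy model, not a measurement. It assumes a single shared record that every request reads, and a canary that handles exactly one request out of 100; the function name `affected_requests` and the `persists_state` flag are hypothetical, introduced only for this illustration.

```python
def affected_requests(n_requests, canary_ids, persists_state):
    """Return the ids of requests that end up with a wrong result."""
    store = {"record": "ok"}  # state shared by every request (assumed)
    affected = []
    for rid in range(n_requests):
        if rid in canary_ids:
            # The canary request itself goes wrong.
            affected.append(rid)
            if persists_state:
                # An agent's bad tool call also writes to shared state.
                store["record"] = "corrupt"
        elif store["record"] == "corrupt":
            # A request served by the stable version reads poisoned state.
            affected.append(rid)
    return affected


# One canary request (id 50) out of 100, i.e. a 1% canary.
stateless = affected_requests(100, {50}, persists_state=False)
stateful = affected_requests(100, {50}, persists_state=True)
print(len(stateless), len(stateful))  # 1 50
```

In the stateless case the damage equals the traffic share. In the stateful case it depends on when the bad write happens and how long it goes unnoticed, which traffic percentage does not control.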

A four-stage alternative

  • Shadow. The new agent runs alongside production but its outputs are discarded. You compare its tool-call plans against the live agent's actual calls.
  • Sandbox. Real inputs, real reasoning, but tools execute against a mocked side-effect layer. Catches "would-have-done-the-wrong-thing" failures cheaply.
  • Gated. Real tools, but every irreversible action is human-confirmed. You measure how often the human disagrees.
  • General. The new agent is the default path, and human confirmation is kept only for the irreversible actions you still don't fully trust.
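The four stages can be read as four policies for what happens when the agent asks to run a tool. The following is a minimal sketch under that reading, not a reference implementation: `Dispatcher`, `ToolCall`, the `confirm` approval hook, and the `untrusted` set are all hypothetical names, and the shadow-stage comparison against the live agent's calls is left out.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Set


class Stage(Enum):
    SHADOW = 1
    SANDBOX = 2
    GATED = 3
    GENERAL = 4


@dataclass
class ToolCall:
    name: str
    args: dict
    irreversible: bool = False


@dataclass
class Dispatcher:
    stage: Stage
    real_tools: Dict[str, Callable[[dict], str]]
    mock_tools: Dict[str, Callable[[dict], str]]  # mocked side-effect layer
    confirm: Callable[[ToolCall], bool]           # human approval hook (assumed)
    untrusted: Set[str] = field(default_factory=set)       # still gated in GENERAL
    planned: List[ToolCall] = field(default_factory=list)  # shadow-stage log
    disagreements: int = 0                        # times the human said no

    def execute(self, call: ToolCall) -> str:
        if self.stage is Stage.SHADOW:
            # Record the plan for later comparison; run nothing.
            self.planned.append(call)
            return "discarded"
        if self.stage is Stage.SANDBOX:
            # Real inputs, mocked side effects.
            return self.mock_tools[call.name](call.args)
        # GATED confirms every irreversible call; GENERAL only untrusted ones.
        needs_gate = call.irreversible and (
            self.stage is Stage.GATED or call.name in self.untrusted
        )
        if needs_gate and not self.confirm(call):
            self.disagreements += 1
            return "rejected"
        return self.real_tools[call.name](call.args)


sent = []
real = {"send_refund": lambda a: sent.append(a) or "refunded"}
mock = {"send_refund": lambda a: "mock-refunded"}
call = ToolCall("send_refund", {"amount": 20}, irreversible=True)

# Same call at every stage, with a reviewer who always declines.
results = {}
for stage in Stage:
    d = Dispatcher(stage, real, mock, confirm=lambda c: False,
                   untrusted={"send_refund"})
    results[stage.name] = d.execute(call)
```

With a reviewer who always declines, the real refund tool never runs at any stage in this setup. The `disagreements` counter is the gated-stage signal the list describes: how often the human overrules the agent.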

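The stages are sequential per release, not per user: every release passes through all four in order, rather than different users sitting at different stages of the same release. Skipping a stage to ship faster is the most common cause of agent incidents we see.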
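One way to make the "no skipping" rule hard to break is to encode it in the promotion step itself. The sketch below assumes one stage value per release and a boolean `checks_passed` that stands in for whatever exit criteria a team uses; `Release` and `promote` are hypothetical names, and the block redefines `Stage` so that it runs on its own.

```python
from enum import IntEnum


class Stage(IntEnum):
    SHADOW = 1
    SANDBOX = 2
    GATED = 3
    GENERAL = 4


class Release:
    """Tracks a single current stage for one release (not per user)."""

    def __init__(self, version: str):
        self.version = version
        self.stage = Stage.SHADOW  # every release starts in shadow

    def promote(self, target: Stage, checks_passed: bool) -> None:
        # Only the immediately following stage is a legal target.
        if target != self.stage + 1:
            raise ValueError(
                f"{self.version}: cannot go from {self.stage.name} "
                f"to {target.name}; stages must be taken in order"
            )
        if not checks_passed:
            raise ValueError(
                f"{self.version}: exit checks for {self.stage.name} not met"
            )
        self.stage = target


release = Release("agent-v2")
try:
    release.promote(Stage.GATED, checks_passed=True)  # tries to skip SANDBOX
    skipped = True
except ValueError:
    skipped = False

release.promote(Stage.SANDBOX, checks_passed=True)  # the legal next step
```

Because the stage lives on the release object, there is no per-user shortcut: a new version starts again at shadow regardless of how far the previous one got.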

Author

Manzia Editorial, Editorial team

The Manzia editorial team curates research, frameworks, and field reports on building, deploying, and benchmarking Trusted Agents.