This Week in Agentic Harnesses / Published 2026-05-12

Governance Becomes Enforcement

Edited by Michael Ruescher / revised 2026-07-02

Operator Brief

Governance moved from convention to enforcement this week, and durability followed.

Upgrade / check:
Hermes v0.13.0 redacts secrets by default. Verify log pipelines handle sanitized output.
Paperclip v2026.512.0 fixes an SSH host-env leak that forwarded API keys to remote targets. Treat as a security advisory.
Claude Code worktree.baseRef defaults to "fresh" (origin/default). Set "head" if you relied on local-HEAD branching.

Try:
Claude Code: dispatch a background session with claude --bg, monitor via claude agents, set a /goal on a multi-step task.
Hermes: lock /goal on a multi-step task and observe the Kanban hallucination gate under real multi-agent workloads.
OpenHands: enable enable_sub_agents in a multi-task session; measure whether sub-agent scoping reduces total cost or context accumulation.

Watch:
Cross-provider /goal convergence: Claude Code and Hermes shipped the same persistent-goal primitive within a week. Watch whether this becomes a stable abstraction or fragments by tool.
Default-closed governance is spreading across providers (OpenHands sub-agents, OpenClaw archive uploads, Agent Zero ODF formats). This looks like a bet for the next quarter rather than a passing release-note theme.

Uncertain:
Hermes Kanban hallucination gate: model-based, schema-based, or rule-based? False-positive rate under real multi-agent workloads not yet documented.
OpenHands critic calibration: what does a score of 0.4 mean operationally? When does agent_behavioral_issues fire versus user_followup_patterns?
Gemini RemoteSubagentProtocol: ships with tests, no observed remote target. Google-hosted or user-controlled infrastructure?

For a while, the surest way to get an AI agent to finish a job was to let it decide it had finished. It did the work, marked the ticket done, and moved on, and nothing in the system checked whether the work was real. This week two providers took that privilege away. Hermes's task board stopped letting a worker close a card until it could prove the cards it claimed to have created actually existed, and Paperclip blocked an agent from moving its own issue into review.

It is one move in a larger turn, the one that gives the week its shape. For most of the short history of these tools, governance has been a matter of convention: the agent could do the wrong thing, and operators leaned on careful prompting, documentation, and trust to keep it from happening. This week that began hardening into enforcement. Hermes made secret redaction the default instead of a setting. OpenHands shipped sub-agent delegation switched off and put a quality score on every session. Agent Zero changed its default document format and told its agents to name what they were doing instead of clicking at screen coordinates. Different teams, different architectures, one direction: the risky thing now takes a deliberate decision to switch on, and the safe thing is what happens if nobody does.

The week's other half was durability, the unglamorous work of keeping an agent on task across turns, sessions, crashes, and a compressed context window. Claude Code shipped a real supervisor surface for background agents and a /goal command that holds one to a finish line; Hermes built the same goal primitive on top of its task board; Gemini made a session portable between machines; Agent Zero gave its virtual desktop a memory that survives navigation.

The two halves need each other, which is why it matters that they landed together. Durability without enforcement is just an agent doing the wrong thing for longer. Enforcement without durability is an agent that is safe and cannot hold a goal long enough to finish anything worth governing.

Breaking Changes: Check These Before Upgrading

Hermes v0.13.0: secret redaction is now ON by default. If you have Hermes log pipelines that read raw agent output, they will receive sanitized logs after upgrade. This is the right call as a default; it is a breaking change for tooling that depends on unredacted output.

Paperclip v2026.512.0: SSH host environment was leaking. Before PR #5142, SSH remote execution forwarded the Paperclip host's environment variables (including API keys and tokens) to remote execution targets. Operators running SSH-managed agents should treat this as a security advisory and upgrade.
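
This class of leak is usually closed by building the remote environment from an explicit allowlist instead of copying the host's. The following is a minimal sketch of that pattern, not Paperclip's actual fix in PR #5142; the names REMOTE_ENV_ALLOWLIST and build_remote_env, and the variables chosen for the allowlist, are assumptions made for illustration.

```python
# Hypothetical allowlist: only variables the remote target genuinely needs.
REMOTE_ENV_ALLOWLIST = frozenset({"LANG", "LC_ALL", "TERM"})


def build_remote_env(host_env, allowlist=REMOTE_ENV_ALLOWLIST):
    """Return only allowlisted variables; everything else stays on the host."""
    return {name: value for name, value in host_env.items() if name in allowlist}
```

Called as build_remote_env(dict(os.environ)), this would drop any API key or token that is not named in the allowlist, so a newly added secret is excluded by default rather than forwarded by default.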

Hermes v0.13.0: Discord role allowlists are now guild-scoped. The prior behavior allowed a role match from any guild to authorize a cross-guild DM, a CVSS 8.1 bypass. Discord operators using role-based access control should reverify their configuration.
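
The difference between the old and new behavior can be sketched as two check shapes. This is an illustration only, not Hermes code: RoleGrant, is_authorized_any_guild, and is_authorized_guild_scoped are hypothetical names, and the real data model may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleGrant:
    """A role held by a user in one specific guild (hypothetical model)."""
    guild_id: int
    role_id: int


def is_authorized_any_guild(user_grants, allowed_role_ids):
    """Unsafe shape: a matching role in ANY guild authorizes the user."""
    return any(g.role_id in allowed_role_ids for g in user_grants)


def is_authorized_guild_scoped(user_grants, allowlist_by_guild, guild_id):
    """Scoped shape: only roles held in the guild that owns the allowlist count."""
    allowed = allowlist_by_guild.get(guild_id, frozenset())
    return any(g.guild_id == guild_id and g.role_id in allowed for g in user_grants)
```

When reverifying a configuration, the case to test is a user who holds an allowlisted role only in a different guild: the scoped check should refuse them.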

Claude Code v2.1.x: worktree.baseRef now defaults to "fresh". New worktrees branch from origin/<default> rather than the local HEAD. Operators who depended on new worktrees carrying unpushed local commits should set worktree.baseRef: "head" explicitly.

Pi v0.74.0: package scope migration underway. The npm package is moving from @mariozechner/pi-coding-agent to @earendil-works/pi-coding-agent. Global installs: run pi update --self once the new package publishes. CI, Dockerfiles, and package.json pins: update the reference manually.
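
For package.json pins, the manual update amounts to renaming one dependency key. A minimal sketch, assuming the pin lives in one of the standard npm dependency sections; migrate_manifest is a hypothetical helper, and the version range is carried over unchanged, so confirm that range is actually published under the new name before relying on it.

```python
OLD_NAME = "@mariozechner/pi-coding-agent"
NEW_NAME = "@earendil-works/pi-coding-agent"
DEP_SECTIONS = (
    "dependencies",
    "devDependencies",
    "optionalDependencies",
    "peerDependencies",
)


def migrate_manifest(manifest):
    """Return a copy of a parsed package.json with the old package name renamed.

    The input dict is not mutated; key order within each section is preserved.
    """
    updated = dict(manifest)
    for section in DEP_SECTIONS:
        deps = manifest.get(section)
        if isinstance(deps, dict) and OLD_NAME in deps:
            updated[section] = {
                (NEW_NAME if name == OLD_NAME else name): version
                for name, version in deps.items()
            }
    return updated
```

In practice this would be wrapped with json.load to read the file and json.dump(..., indent=2) to write it back. Dockerfiles and CI scripts that install the package by name still need a separate text search.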

The gates go up

Start with the completion gates, the bluntest example of the shift. Hermes's Kanban board now checks a worker's claims before it will let a card move to done: the cards it says it created have to actually exist and belong to it, which kills both phantom references and one worker closing out another's work. Workers that quit without finishing are blocked automatically, heartbeats catch the stale ones, and per-task retry budgets stop a silent cascade of retries. Paperclip made the same point with a smaller mechanism, a control-plane fix (PR #5292) that stops an agent from sliding its own issue into in_review; that transition now needs a real review precondition, not a model's say-so. One is multi-agent coordination and the other a state-machine rule, but the claim underneath is identical. An agent saying it is done has never been evidence that it is, and the system finally checks.
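
The existence-and-ownership check described above can be sketched as a plain rule. How Hermes actually implements its gate is not documented, so this rule-based version is only one possibility; Card, CompletionRejected, and verify_completion are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    card_id: str
    owner: str


class CompletionRejected(Exception):
    """Raised when a worker's completion claim cannot be verified."""


def verify_completion(board, worker, claimed_card_ids):
    """Allow a move to done only if every claimed card exists and belongs to the worker.

    `board` maps card ids to Card records.
    """
    for card_id in claimed_card_ids:
        card = board.get(card_id)
        if card is None:
            # The worker referenced a card that was never created.
            raise CompletionRejected(f"phantom reference: {card_id}")
        if card.owner != worker:
            # The worker is trying to close out someone else's work.
            raise CompletionRejected(f"{card_id} belongs to {card.owner}, not {worker}")
    return True
```

The design point is that the check reads the board's own records, never the worker's description of them, so a confident but wrong claim fails the same way a dishonest one does.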

The same instinct shows up as a quieter habit: leave the dangerous thing off. OpenHands shipped sub-agent delegation switched off. Behind the switch, the orchestrator hands work to specialists (a bash runner, a code explorer, a web researcher), each scoped to a defined tool set rather than the full surface. That is the right default, because fanning work out to sub-agents changes a session's cost, scope, and authority enough to deserve a deliberate yes. OpenClaw did the same for skills delivered as uploaded zip archives, which stay blocked until an operator turns them on, part of a house style in which anything that can execute code is opt-in and labeled as requiring trust. Agent Zero's new document default belongs in the same column: it now writes open ODT, ODS, and ODP files unless an operator asks for the Office formats by name, reversing an assumption that used to run the other way.
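
Default-off delegation combined with per-specialist tool scoping can be sketched in a few lines. Only the enable_sub_agents flag name comes from OpenHands; the tool names, the SUB_AGENT_TOOLS table, and the can_use helper are assumptions for illustration, not the project's real configuration.

```python
# Hypothetical tool sets: each specialist sees only what its job requires.
SUB_AGENT_TOOLS = {
    "bash_runner": frozenset({"bash"}),
    "code_explorer": frozenset({"read_file", "grep"}),
    "web_researcher": frozenset({"web_search", "fetch_url"}),
}


def can_use(sub_agent, tool, enable_sub_agents=False):
    """Deny by default: delegation must be switched on, and the tool must be in scope."""
    if not enable_sub_agents:
        return False
    return tool in SUB_AGENT_TOOLS.get(sub_agent, frozenset())
```

Both failure modes resolve to a refusal: an operator who never touches the flag gets no delegation, and an unknown specialist gets an empty tool set.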

Staying on task

If the first move is about stopping an agent from doing the wrong thing, the second is about keeping it doing the right thing long enough to matter. Claude Code shipped the most visible version: a claude agents supervisor that shows every session by state (working, waiting on you, done, or failed), with background sessions running under a process that outlives the terminal, each isolated in its own git worktree. You dispatch from the prompt, push a running session to the background with a keystroke, and answer a blocked one from a peek panel without attaching. A companion /goal command sets a finish condition the agent holds to across turns.

Hermes built the same idea one level down with its /goal Ralph loop, backed by the task-board reliability work above: lock an agent onto a target and it holds through context compression, turn budgets, and branching, while the board runs the multi-agent case in which workers pick up tasks and, as we just saw, cannot close them without proof. Gemini treated durability as portability, letting you export a session and import it on another machine so the state travels as a real object instead of ambient context. Agent Zero gave its virtual desktop a memory: a single Xpra session now survives navigation, modal switches, and host changes, with a deliberate shutdown told apart from a crash and unsafe controls hidden from view.
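
The shared idea behind both /goal implementations, a finish condition held outside the conversation and checked every turn against a budget, can be sketched generically. This is not either tool's code; run_until_goal, step, and goal_met are hypothetical names.

```python
def run_until_goal(step, goal_met, max_turns):
    """Keep taking turns until the finish condition holds or the turn budget runs out.

    `step` advances the work by one turn and returns the new state.
    `goal_met` is evaluated by the harness, not self-reported by the agent.
    """
    state = None
    for turn in range(1, max_turns + 1):
        state = step(state)
        if goal_met(state):
            return ("done", turn, state)
    return ("budget_exhausted", max_turns, state)
```

Because the goal lives in the loop rather than in the prompt, compressing or truncating the agent's context cannot drop it, and the budget guarantees the loop ends even when the goal is never met.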

You can't govern what you can't see

Enforcement and durability both lean on a third thing the week kept improving: being able to see what an agent is doing while it does it. Codex put permissions and approval-mode on the status line as separate, configurable readouts, turning the most common operator mistake, firing off an irreversible command without remembering which permission posture is live, into a glance. Claude Code's supervisor does the same for a fleet: one color-coded panel where five terminal windows used to be, with a live overlay on each goal tracking elapsed time, turns, and tokens.

OpenHands made the quality of a session legible rather than just its state. A new critic display scores every finished session from 0 to 1, rates it out of five stars, and color-codes three bands (agent_behavioral_issues, user_followup_patterns, and infrastructure_issues) in the GUI. It stays off unless a deployment sets OH_ENABLE_CRITIC_BY_DEFAULT, but switched on it builds a feedback loop that logs never gave, letting an operator watch a session degrade as it happens. Agent Zero pushed legibility down to the single action: its Linux Desktop skill tells the agent to prefer named, structured actions (cell_edit, app_launch, form_submit) over raw coordinate clicks like click(x=423, y=187). The point is auditability. cell_edit(B3, 42) says what happened; a click at a pixel says only where. An action you can name is one you can verify, replay, and record; a coordinate click is none of those.
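
The auditability argument is easiest to see in what each kind of action leaves behind in a log. A minimal sketch, assuming hypothetical record types; the action name cell_edit comes from the text above, while the argument names cell and value are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NamedAction:
    """A structured action: the record states intent and arguments."""
    name: str
    args: dict = field(default_factory=dict)

    def describe(self):
        rendered = ", ".join(f"{key}={value!r}" for key, value in self.args.items())
        return f"{self.name}({rendered})"


@dataclass(frozen=True)
class CoordinateClick:
    """A raw click: the record states only a screen position."""
    x: int
    y: int

    def describe(self):
        return f"click(x={self.x}, y={self.y})"


def is_auditable(action):
    """Treat an action as auditable when its record carries intent, not just position."""
    return isinstance(action, NamedAction)
```

A replay tool can re-issue NamedAction("cell_edit", {"cell": "B3", "value": 42}) against a different window layout and get the same effect; replaying a CoordinateClick depends on nothing on screen having moved.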

Provider Notes

Claude Code (v2.1.139) adds settings.autoMode.hard_deny: hard blocks that no allow rule can override. The continueOnBlock option for PostToolUse hooks feeds the rejection reason back so Claude can adapt rather than just stop. API key auth now disables Remote Control, /schedule, and claude.ai MCP connectors, so operators using API key auth should audit reliance on those surfaces.
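
What makes a hard deny different from an ordinary deny is evaluation order. A minimal sketch of that precedence, assuming glob-style patterns and an "ask" fallback; the actual rule syntax in Claude Code is not described here, and decide is a hypothetical function.

```python
from fnmatch import fnmatchcase


def decide(command, hard_deny, allow):
    """Evaluate hard-deny patterns first so that no allow rule can override them."""
    if any(fnmatchcase(command, pattern) for pattern in hard_deny):
        return "deny"
    if any(fnmatchcase(command, pattern) for pattern in allow):
        return "allow"
    # Neither list matched: fall back to asking the operator (assumed default).
    return "ask"
```

With hard_deny=["git push --force*"] and allow=["git *"], a force push is denied even though the broader allow pattern also matches it.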

OpenClaw (v2026.5.10 beta) adds per-agent message send restrictions (tools.message.crossContext, tools.message.actions.allow) that let you deploy a sandboxed agent that can only reply in the thread it was addressed in. Memory auto-promotion is now bounded: the dreaming process compacts the oldest sections when the budget is reached, while preserving user-authored notes. Transcript reads are now streaming; peak memory for a long session dropped roughly 90%.
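
The memory saving from streaming reads comes from never holding the whole transcript at once. A minimal sketch, assuming the transcript is stored as JSON Lines with a role field on each event; both are assumptions, since the source does not describe OpenClaw's format, and the roughly 90% figure is the release's claim rather than something this code measures.

```python
import json


def iter_transcript(stream):
    """Yield one parsed event at a time from a JSON Lines stream.

    Only the current line is held in memory, unlike json.load on a whole file.
    """
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


def count_events(stream, role):
    """Aggregate over the stream without materializing the full transcript."""
    return sum(1 for event in iter_transcript(stream) if event.get("role") == role)
```

Passing an open file object as stream keeps peak memory proportional to the longest single line rather than to the length of the session.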

Paperclip (v2026.512.0) adds secrets provider vault configuration with AWS Secrets Manager as the first remote-import backend. The database gains secret_access_events and company_secret_provider_configs tables. The new cursor_cloud adapter routes work to Cursor's hosted-agent platform.

Agent Zero (v1.11 to v1.13) completes what it calls the "visible computer": browser with multi-tab parallel fanout, LibreOffice desktop via Xpra/XFCE, and a persistent desktop session. The multi browser action fans out reads or mutations across tabs in a single tool call with parallel execution.

Gemini CLI (v0.41.0) adds a pluggable AgentProtocol with local and remote backends, forcing the "where does delegated work actually run" question into a surface that can be inspected and configured. Workspace trust is now enforced in headless mode; shell command validation gains a core-tools allowlist.

Pi coding agent (v0.74.0) migrates from badlogic/pi-mono to the Earendil Works organization. JSONC parsing for models.json is new (comments and trailing commas now valid).

What To Try

Hermes operators: verify your log pipeline handles sanitized output before upgrading to v0.13.0. Redaction is now on.
Paperclip operators running SSH: upgrade before deploying new remote agents. Prior versions leak the host environment silently.
Claude Code: dispatch a background session with claude --bg "<prompt>", use claude agents to monitor, and test peek/reply from the list. Set a /goal on a multi-step task and inspect the turn/token overlay.
OpenHands: enable enable_sub_agents in a multi-task session. Observe whether sub-agent scoping reduces total session cost or context accumulation.
Agent Zero: create a Writer document and confirm the output is ODT (not DOCX) in v1.13+. Verify your downstream tooling handles ODT, or explicitly configure OOXML output.
Codex: add both permissions and approval-mode to your status line if you run multiple permission profiles.

What Remains Uncertain

Hermes Kanban hallucination gate: what does verification involve? Is it model-based, schema-based, or rule-based? The gate's false-positive rate under real multi-agent workloads is not yet documented.
Paperclip in_review gate: what constitutes a "real review path"? The PR notes do not define whether a human reviewer, an automated review step, or a configured participant list is required.
OpenHands critic calibration: what does a score of 0.4 mean operationally? When does agent_behavioral_issues fire versus user_followup_patterns? The calibration methodology is not yet documented.
Gemini RemoteSubagentProtocol: ships with tests but no observed remote target. Whether the remote execution surface runs on Google-hosted or user-controlled infrastructure is not yet established.
Claude Code /ultrareview: the research preview returns verdicts to CLI/Desktop but the output schema is not documented. How should a CI pipeline ingest or route the findings?
Agent Zero desktop state: is there a session timeout, an idle cleanup, or a storage limit for persistent Xpra sessions? Or does the operator manage cleanup entirely manually?
OpenClaw skill archive trust model: skills.install.allowUploadedArchives is opt-in, but signature checking and sandbox isolation for uploaded archives are not yet documented.

Projects reviewed in this research run

Codex, Claude Code, Gemini CLI, Hermes Agent, Pi Coding Agent, OpenClaw, Paperclip, Agent Zero, OpenHands

Research artifacts and publication history are open in the repository.

Coverage window: 2026-05-07 to 2026-05-12