This Week in Agentic Harnesses · Published 2026-06-03

Operator Brief

Across the watchlist this week the dominant move was not new capability but closing the gap between policy operators thought they had configured and policy the runtime actually enforced. Claude Code, Gemini CLI, Pi, Hermes, OpenHands, and Flue each shipped fixes where a deny rule, a credential boundary, or a sandbox edge silently failed to hold -- a week of false-confidence closures, several of which are security advisories the changelogs do not flag as such. The second, quieter thread: skills and plugins became governed, auditable, sometimes agent-activated resources across Paperclip, OpenClaw, Flue, and Agent Zero.

Upgrade / check
  • Claude Code 2.1.160-2.1.162 closes three permission-bypass gaps: WebFetch rules now override preapproved domains, Windows path rules match backslash/case variants, and Read-deny hides files from Glob/Grep. Upgrade, then re-audit any allow/deny policy from older builds. Signal
  • Claude Code 2.1.160 also makes acceptEdits prompt before writing execution-granting config (.npmrc, .bazelrc, .pre-commit-config) and shell-startup files. Recognize the new prompts; do not blanket-allow them. Signal
  • Pi closes OAuth browser-launch command injection and git-package path traversal -- untrusted OAuth servers or git URLs could execute or write outside the install root on prior builds. OAuth · Path traversal
  • OpenHands patches axios (CVE-2026-44492) and dompurify (CVE-2026-41238) in the frontend: two commits, one action -- rebuild and redeploy the bundle. dulwich (CVE-2026-42305) is a backend git lib needing a lockfile re-resolve and image rebuild. Frontend · Backend
  • Gemini CLI v0.45.0 stable closes an MCP blacklist bypass where a blacklisted tool or server could still be invoked. Upgrade before trusting MCP deny-lists for containment. Signal
  • Hermes v0.15.1 makes the Docker dashboard insecure binding an explicit HERMES_DASHBOARD_INSECURE=1 opt-in -- the heuristic that silently dropped auth is gone. Update env config before upgrade. Signal
  • Flue v0.9.0 is a hard breaking migration: routing/provider imports move, provider model values need provider-id/model-id format, persisted beta sessions are rejected, and Cloudflare DO migrations are now operator-owned. Sessions will not restore until updated. Signal
Try
  • Claude Code: wire claude agents --json waitingFor and the done/total progress counter into supervision tooling so stuck-agent triage stops requiring a human to open each session. Signal
  • Paperclip: use the skills CLI (install/reset/audit/export/assign) to audit which agents hold which company skills and export the catalog for provenance review. Signal
  • Codex: enable the optional Face ID / passcode lock on iOS 1.2026.146 before treating mobile as a trusted access surface. Signal
  • Hermes v0.15.0: queue a decomposable task on the new multi-agent Kanban and confirm the orchestrator spawns sub-agents in isolated worktrees before trusting it with real work. Signal
  • Agent Zero v1.19: disable the Office/Desktop/Editor plugins you do not need via the protected plugin-toggle endpoint to bound the agent's standing capability surface. Signal
Watch
  • Watch whether agent self-proposal and self-activation of skills (OpenClaw skill_workshop tool, Flue activate_skill) outpaces the review gate -- the governance surface now spanning Paperclip, OpenClaw, Flue, and Agent Zero only holds if self-approval stays bounded. OpenClaw · Flue
  • Standing credentials and approved-host registries are replacing per-session prompts: Codex remote-exec API-key host registration, Hermes Bitwarden Secrets Manager, Codex Bedrock under AWS IAM. A leaked key or over-broad approval becomes ambient authority. Codex · Hermes
  • Cloud-provider deployment paths keep expanding: Claude Code Auto Mode now on Bedrock/Vertex/Foundry, Codex models via Amazon Bedrock. On those paths, managed-settings deny rules must carry the governance weight per-action prompts used to. Claude Code · Codex
  • Agent Zero reversed its ephemeral-by-default posture: computer-use screenshots now persist to durable chat-scoped storage with no automatic redaction. Review retention and access controls before capturing sensitive screens. Signal
Uncertain
  • Codex remote-exec API-key registration: the changelog does not document key scope, rotation, or revocation. Whether a leaked key grants persistent remote exec is unverified. Signal
  • Codex iOS SSH-to-Windows: host-key verification, key storage, and scoping of the iOS SSH client are undocumented.
  • Gemini CLI model routing: Gemini 3.5 Flash GA is gated server-side by experiment flag, so the same binary routes different users to different models -- client version no longer determines the model in use. Signal
  • OpenClaw enhanced plugin isolation: the release note asserts a tighter sandbox boundary but does not describe its depth, so operators cannot verify it from the receipt. Signal
  • Flue v0.9.0 persisted-session rejection has no automated migration path; a self-scripted migration could reintroduce stale, unredacted state. Signal
  • OpenHands DELETE /api/organizations now cascade-deletes the sole-org requester's user identity -- the highest-consequence caveat this cycle; enforce backups before any org delete. Signal

The Policy You Wrote Wasn't the Policy You Had

Seven days, ten providers, one uncomfortable theme: the headline this week is not new capability. It is the gap between the policy an operator configured and the policy the runtime actually enforced -- and how many providers spent the window quietly closing it.

A Claude Code operator who wrote a Read-deny rule to hide a secret file was still leaking it through Glob and Grep. A Pi user authenticating against an OAuth server could be handed a verification URI that ran shell commands. A Hermes Docker dashboard could drop its auth because a heuristic misread the bind host. A Gemini CLI MCP blacklist could be bypassed. None of these were the operator's misconfiguration. The rules were written; the enforcement silently wasn't there. This week, across Claude Code, Gemini CLI, Pi, OpenHands, Hermes, and Flue, the same class of fix landed: restore the enforcement the operator already believed was in place.

The quieter, more forward-looking thread is the inverse of a gap-close: skills and plugins became governed, auditable, sometimes agent-activated resources across four providers in parallel -- Paperclip, OpenClaw, Flue, and Agent Zero. Capability that used to be an ambient default is becoming reviewable operating state.

Security Advisories: Check These Before Upgrading

Claude Code 2.1.160--2.1.162: three permission-bypass gaps closed at once. Custom WebFetch permission rules now override the built-in preapproved-domain whitelist; Windows permission rules with backslashes or case-variant paths now match; and Read-deny rules now hide files from Glob and Grep results. The sharpest of the three is the last one: a file an operator denied for Read was still discoverable through search tools, with both path and contents surfaceable, which defeated the access-control intent. The attacker model is prompt injection or compromised task content steering the agent toward a denied domain or walled-off path. The fix requires only the upgrade, so the operator action is: upgrade, then re-audit whether any policy was silently bypassed in the prior window, especially on Windows and in any setup relying on Read-deny to hide secrets from search. The changelog ships this as an ordinary entry; treat it as the advisory it is.
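To make the Windows variant concrete, here is a minimal sketch of why a literal rule comparison misses separator and case variants. It is not Claude Code's matcher: the function names and the example deny rule are hypothetical, and the sketch only illustrates the bug class using Python's standard ntpath module.

```python
import ntpath

DENY_RULE = r"C:\Users\dev\secrets"  # hypothetical deny rule, for illustration only

def naive_match(rule: str, path: str) -> bool:
    # Literal prefix comparison: misses case and separator variants.
    return path.startswith(rule)

def normalized_match(rule: str, path: str) -> bool:
    # normpath converts "/" to "\" and collapses "." and ".." segments;
    # normcase lowercases, matching Windows case-insensitive semantics.
    norm_rule = ntpath.normcase(ntpath.normpath(rule))
    norm_path = ntpath.normcase(ntpath.normpath(path))
    # Require an exact match or a separator boundary, so that
    # "secrets-public" is not treated as living under "secrets".
    return norm_path == norm_rule or norm_path.startswith(norm_rule + "\\")
```

A re-audit can run known variants of each protected path (forward slashes, mixed case) through whatever matcher is in use and confirm every one is denied.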

Claude Code 2.1.160: execution-granting config writes now prompt even in acceptEdits mode. Two guardrails land together. acceptEdits mode now prompts before writing build-tool config that grants code execution (.npmrc, .yarnrc*, bunfig.toml, .bazelrc, .pre-commit-config.yaml, .devcontainer/), and the agent now prompts before writing shell startup files (.zshenv, .zlogin, .bash_login) and ~/.config/git/. Operators running acceptEdits or auto-leaning modes previously had a silent write path into files that execute on the next shell login, install, or commit -- the classic agent-persistence and supply-chain escalation vector. The prompt is the guardrail here; blanket-allowing it puts you back where you started.

Pi: OAuth command injection and git-package path traversal closed. Commit ba6e529 validates OAuth verification URIs (rejecting non-HTTP(S) schemes) and launches the browser via spawn() instead of shell exec(), closing a path where a malicious OAuth server could inject $(id>/tmp/pwned)-style commands. Commit a98e087 rejects git URLs with .., null bytes, backslashes, or leading slashes at both parse and resolution time, blocking writes outside the package install root. The attacker is whoever controls the OAuth server or authors the git package; both fixes need no config change, only the upgrade.
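The two Pi fixes follow well-known patterns, sketched below in Python under stated assumptions. This is not Pi's code: the function names, the xdg-open opener, and the choice to check ".." per path segment are all hypothetical. The sketch shows scheme validation, launching a process with an argument list so no shell ever parses the URI, and rejecting the path shapes the commit note lists.

```python
import subprocess
from urllib.parse import urlsplit

def validate_verification_uri(uri: str) -> str:
    # Accept only http(s) URIs that carry a host; reject everything else.
    parts = urlsplit(uri)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"refusing non-HTTP(S) verification URI: {uri!r}")
    return uri

def launch_browser(uri: str, opener: str = "xdg-open") -> subprocess.Popen:
    # Argument-list form: the URI is a single argv entry, so shell
    # metacharacters such as $(...) are never interpreted.
    # "xdg-open" is an assumed opener; it differs per platform.
    return subprocess.Popen([opener, validate_verification_uri(uri)])

def is_safe_package_path(path: str) -> bool:
    # Reject null bytes, backslashes, leading slashes, and ".." segments.
    # Segment-wise ".." matching is an assumption of this sketch.
    if "\x00" in path or "\\" in path or path.startswith("/"):
        return False
    return ".." not in path.split("/")
```

Note that a URI containing $(id>/tmp/pwned) in its query can still pass scheme validation; it is the argument-list launch that renders it inert, which is why the two measures are paired.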

OpenHands main: three named CVEs. No tagged release fell in the window, but main closed CVE-2026-44492 (axios 1.16.0), CVE-2026-41238 (dompurify 3.4.0), and CVE-2026-42305 (dulwich 1.2.5). The first two are browser-facing (HTTP client and HTML/DOM sanitizer) and need a frontend rebuild and redeploy; the third is a backend git library and needs a poetry.lock re-resolve and image rebuild. Self-hosters pinning older lockfiles must bump manually.

Gemini CLI v0.45.0: MCP blacklist bypass fixed. The stable release bundles Termux relaunch/resize fixes, session-context filtering on history resume, and -- the security-bearing item -- a fix for a path where a blacklisted MCP tool or server could still be reached. Operators relying on MCP deny-lists for containment should upgrade before trusting the blacklist, and test that blacklisted tools are actually unreachable rather than assume full coverage.

Hermes v0.15.1: Docker insecure binding is now an explicit opt-in. The dashboard no longer infers insecure mode from the bind host; it requires HERMES_DASHBOARD_INSECURE=1 explicitly. This removes a silent path where a misread bind host dropped auth and exposed the dashboard to a network-adjacent attacker. Existing Docker and hosted setups must update env config before upgrading. The same patch fixes a v0.15.0 loopback-mode dashboard reload loop and restores MCP bare-command resolution (npx, npm, node) in Docker.
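The explicit opt-in pattern can be sketched as a small preflight check for deployment tooling. This is an illustration rather than Hermes's implementation: the function name is hypothetical, and treating only the exact value "1" as opting in is an assumption based on the variable as written in the release note.

```python
from typing import Mapping

def dashboard_auth_required(env: Mapping[str, str]) -> bool:
    # Auth stays on unless the operator explicitly opts out.
    # Nothing is inferred from the bind host.
    # Assumption: only the literal value "1" counts as opting in.
    return env.get("HERMES_DASHBOARD_INSECURE") != "1"
```

In a deploy script this might be called as dashboard_auth_required(os.environ), failing the rollout if auth is off on a host where insecure binding was never intended.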

Paperclip v2026.529.0: first-admin claim is now the bootstrap gate. Unclaimed self-hosted deployments get a one-time browser claim to create the first admin. The flip side is a race: whoever completes the claim first becomes admin, so an attacker with network reach to a freshly stood-up instance could seize control before the legitimate operator. Claim promptly and restrict network exposure during the unclaimed window.

Hermes v0.15.0: Promptware defense, and a migration. The Velocity Release adds a built-in defense against Brainworm-class prompt-injection and closes 19 security-tagged issues. Operators running against untrusted content (web, repos, MCP output) should validate the defense against their own injection corpus rather than assume blanket coverage; novel vectors outside the known class may still pass.

Flue v0.9.0: a hard breaking migration. Routing imports move from @flue/runtime/app to @flue/runtime/routing, provider model values now require provider-id/model-id format, SDK mount paths derive from baseUrl, and persisted beta session state is rejected -- clear or migrate the store before upgrading or sessions fail to restore. Cloudflare Durable Object migrations are no longer auto-appended; the operator now owns them in the Wrangler config, and interrupted workflows no longer auto-retry.
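Before upgrading, it may help to scan existing configuration for model values that lack the provider-id/model-id shape. The sketch below is a hypothetical pre-migration check, not part of Flue: the function names and sample values are invented for illustration, and it assumes the provider id is everything before the first slash.

```python
def needs_model_migration(model: str) -> bool:
    # A value is treated as migrated when it has a non-empty provider id,
    # a "/" separator, and a non-empty model id after it.
    provider, sep, model_id = model.partition("/")
    return not (sep and provider and model_id)

def unmigrated_models(models: dict[str, str]) -> list[str]:
    # Return the config keys whose model value still needs the new format.
    return sorted(key for key, value in models.items() if needs_model_migration(value))
```

Running unmigrated_models over the model values in a deployment's config gives a worklist to clear before the upgrade.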

The Enforcement Gap, Six Ways

The thread that cuts across the watchlist is consistent enough to name plainly. In each case, a control the operator had reason to believe was active was not -- and the fix is the same shape: make the enforcement match the configuration.

The Claude Code cluster is the clearest statement of it. A Read-deny rule that didn't hide files from Glob/Grep, a WebFetch rule that didn't override the preapproved-domain list, and Windows path rules that silently didn't match on case or separator variance are three independent ways the same promise -- "the policy I wrote is enforced" -- was broken. The same release line also converts a silent config-write into a confirmation checkpoint for files that grant code execution, and corrects an over-broad managed-settings policy that was wrongly blocking legitimate third-party provider sessions.

Gemini CLI's MCP blacklist bypass is the same bug class at the tool layer: a deny-list that didn't deny. Its companion policy-file resilience fix closes a fail-open gap where a policy file that failed to persist (on cross-device container mounts) or failed to parse (corrupt TOML) could leave the agent running without the operator's intended policy in effect. Recovery now writes a .bak and rebuilds to defaults, so a corrupted policy is silently reset -- re-verify intended policy after a .bak appears.
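Because a corrupted policy is now reset silently, one way to re-verify is to pin a digest of the intended policy and compare it after each upgrade or run. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the assumption that backups appear as *.bak files in the same directory as the policy file is not confirmed by the release notes.

```python
import hashlib
from pathlib import Path

def policy_digest(path: Path) -> str:
    # SHA-256 of the policy file's exact bytes.
    return hashlib.sha256(path.read_bytes()).hexdigest()

def policy_intact(path: Path, expected_digest: str) -> bool:
    # A missing file counts as not intact, so the check fails closed.
    if not path.is_file():
        return False
    return policy_digest(path) == expected_digest

def backups_present(policy_dir: Path) -> list[Path]:
    # Assumed location and naming: *.bak files next to the policy file.
    return sorted(policy_dir.glob("*.bak"))
```

A non-empty backups_present result or a failed policy_intact check is the cue to restore the intended policy before running the agent again.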

Pi's quartet of hardening commits -- OAuth injection, git path traversal, auth files created at 0o600 instead of briefly world-readable, and extension cache moved out of world-accessible /tmp -- is the multi-user-host version of the same theme: close the windows where a control was assumed but a co-tenant could slip through. Flue's v0.9.1 WebSocket credential hardening strips query strings and fragments before persisting Cloudflare attachments so URL-carried handshake credentials are not retained, and OpenHands moved ACP provider credentials off the plaintext acp_env channel onto an encrypted secrets channel. And Hermes closed the same shape at the deployment edge: a Docker dashboard that silently dropped auth when a heuristic misread the bind host now demands an explicit HERMES_DASHBOARD_INSECURE=1.
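The URL-stripping step Flue describes is simple to express with the standard library. This is a minimal sketch of the technique, not Flue's implementation; the function name and example URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

def strip_query_and_fragment(url: str) -> str:
    # Keep scheme, host, and path; drop the query string and fragment,
    # where handshake tokens are commonly carried.
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```

For example, strip_query_and_fragment("wss://example.com/socket?token=abc#frag") returns "wss://example.com/socket". Credentials embedded in the host portion (user:password@host) would survive this step and need separate handling.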

The operator takeaway is uncomfortable but actionable: an upgrade is not just a feature bump this week. For every provider above, the safe assumption is that some control you configured on the prior build was not holding, and the post-upgrade action is a re-audit, not a victory lap.

Skills and Plugins Become Governed State

The second thread runs the other way. Four providers, four surfaces, shipped the same move: agent capability stops being an ambient default and becomes reviewable, sometimes approvable, operating state.

Paperclip made company skills first-class resources with an install / reset / audit / export / assign CLI. The load-bearing verbs are audit and export -- which skills an agent holds becomes a queryable, exportable fact rather than implicit config -- and assign, which makes a capability grant a distinct, reviewable authority action.

OpenClaw's Skill Workshop inserts a human-in-the-loop gate: new skills enter a pending-proposal queue reviewed via CLI or Gateway before taking effect. A new skill_workshop agent tool lets agents file proposals themselves, which widens the surface proposals originate from -- so the operator decision is who may review and who may self-approve. Lax review re-opens the unreviewed-skill path.

Flue v0.9.2 went the other direction on activation authority: an activate_skill tool lets agents load full skill instructions on demand before matching work. The operator's visible control narrows to which skills are configured; the choice to activate moves to the agent. Workspace skills are reread on activation, so mid-session edits take effect.

Agent Zero v1.19 made Office, Desktop, and Editor plugins toggleable behind a protected plugin-state API -- a real authority lever that lets an operator disable powerful capabilities (Desktop computer-use especially) on deployments that should not have them. The release note describes a "protected" toggle endpoint but no auth model or role-based capability management, so treat it as a disable lever, not yet an audited capability register.

The shapes differ -- catalog audit, proposal approval, agent self-activation, capability toggle -- but the direction is one: the question "what can this agent do?" is becoming answerable by inspecting state rather than reading code or trusting defaults.

The accessibility read is mixed. Claude Code's waitingFor field and fan-out progress counter make agent state legible to operators who previously had to open each session; Flue v0.9.0, by contrast, forces a hard migration with no automated path, raising rather than lowering the cost of staying current. The week made harnesses more governable, not more reachable.

Control Plane

Control plane saw the most movement, in two directions. The governance-of-capability cluster above (Paperclip skills, OpenClaw Skill Workshop, Agent Zero plugin toggles, Flue agent-activated skills) sits here, as does a steady relocation of authority onto standing credentials and cloud paths: Codex remote-exec API-key host registration, Hermes Bitwarden Secrets Manager replacing per-provider keys, Claude Code Auto Mode reaching Bedrock/Vertex/Foundry, and Codex models running under AWS IAM via Bedrock. Claude Code also made agent supervision more legible: claude agents --json now exposes a waitingFor field naming what a blocked session waits on (e.g. a permission prompt), plus a done/total fan-out progress counter.
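A supervision hook over that output might look like the sketch below. Only the waitingFor field name comes from the release notes. The top-level array shape and the id, done, and total keys are assumptions made for illustration and should be checked against real output before use.

```python
import json

def blocked_sessions(agents_json: str) -> list[dict]:
    # Parse the JSON text and keep only sessions that report a waitingFor value.
    # Assumed shape: a JSON array of session objects.
    report = []
    for session in json.loads(agents_json):
        waiting = session.get("waitingFor")
        if waiting:
            report.append({
                "id": session.get("id"),          # assumed key
                "waitingFor": waiting,            # field named in the release notes
                "progress": f'{session.get("done", "?")}/{session.get("total", "?")}',  # assumed keys
            })
    return report
```

The input could come from capturing the stdout of claude agents --json with subprocess.run, with the resulting report forwarded to whatever alerting channel the team already uses.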

Runtime

Runtime carried most of the enforcement-gap closures -- the Claude Code config-write prompts, Pi's OAuth and path-traversal fixes, Flue's WebSocket credential stripping, Hermes's Promptware defense, OpenHands's dulwich CVE -- plus one notable posture reversal: Agent Zero reverted computer-use screenshots to durable chat-scoped storage, undoing its prior ephemeral-by-default stance. That improves audit trails but persists potentially sensitive on-screen content (credentials, PII, internal UIs) with no automatic redaction -- a data-at-rest exposure operators must scope and prune.

Platform

Platform was mostly steady-state plumbing: the OpenHands frontend CVE cluster, Gemini CLI's v0.45.0 stable bundle and an editor-spam-loop fix, OpenClaw's MiniMax M3 model support, and Flue's OpenTelemetry tracing package. Codex's Sites plugin -- in-app website/web-app creation and deployment, included by default in Business workspaces -- is the one platform item with a governance edge: a deploy capability may already be active without an explicit enablement step.

Provider Notes

Codex (CLI 0.135.0--0.136.0, iOS 1.2026.146) shipped named permission profiles with custom-config display and codex doctor diagnostics (0.135.0), a non-interactive installer for CI, plus remote-exec API-key host registration and thread archiving (0.136.0). The iOS app added an optional Face ID / passcode lock for Codex and SSH-to-Windows. Two integrations landed: the Sites plugin and Amazon Bedrock under AWS-managed auth and billing.

Claude Code (2.1.158--2.1.162) is the enforcement-gap headliner: the permission/deny-rule cluster, execution-granting config-write prompts, the managed-settings third-party-session unblock, agent-status observability, and Auto Mode reaching Bedrock/Vertex/Foundry for Opus 4.7/4.8.

Gemini CLI (v0.44.1--v0.46.0-preview) shipped the v0.45.0 stable bundle with the MCP blacklist fix and Termux hardening, policy-file resilience, and a server-flag-gated Gemini 3.5 Flash GA rollout that decouples model-in-use from client version. A CI change to pull_request_target on the PR-size labeler is low-risk as written (it only reads line counts) but removes the structural safety of pull_request -- any future edit adding fork-code checkout becomes immediately dangerous.

Hermes Agent (v0.15.0--v0.15.1 + post-release commits) is the Velocity Release: a 76% run_agent.py refactor, Kanban evolving into a multi-agent orchestration platform with auto-decomposition, swarm topology, and worktree-per-task, Promptware defense, and Bitwarden Secrets Manager. The v0.15.1 patch makes Docker insecure binding an explicit opt-in and fixes a dashboard reload loop; June 3 commit waves hardened installer self-update, Windows/WSL2 PTY and schtasks handling, and desktop session management.

Pi coding agent (commits to main) shipped a security-hardening cluster: OAuth launch hardening, git path-traversal rejection, auth-file mode-on-create (0o600 instead of briefly world-readable), extension-cache isolation out of world-accessible /tmp, and HTML-export XSS sanitization. Alongside, model-catalog maintenance removed stale Codex entries, added Mistral Devstral 2 and Open Mistral Nemo, and refreshed Claude model pricing and token caps (128k output). No tagged release could be reliably placed inside the window; the security work shipped as commits to main.
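The mode-on-create fix addresses a familiar race: a file created with default permissions and tightened afterwards is briefly readable by other users. A minimal POSIX sketch of the safer pattern follows. It is not Pi's code, the function name is hypothetical, and failing when the file already exists is a design choice of this sketch.

```python
import os

def write_secret_file(path: str, data: bytes) -> None:
    # Pass the mode to open() so the file is never visible with wider
    # permissions. The umask can only remove bits, never add them.
    # O_EXCL makes creation fail if the path already exists.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
```

The contrast is with writing the file first and calling chmod afterwards, which leaves a window in which a co-tenant on the same host can read it.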

OpenClaw (2026.5.31-beta.3 through 2026.6.1 stable) shipped the Skill Workshop proposal workflow, interrupted-tool-call recovery, bounded request timers (re-evaluate SLOs), enhanced plugin isolation, MiniMax M3, and Tailscale Serve service-name binding with SQLite-backed state migration for iMessage and plugin-install tracking.

Paperclip (v2026.529.0) shipped the skills CLI/catalog, the first-admin claim flow, inline document annotations, per-user sidebar controls, and live Claude model discovery from the UI.

Agent Zero (v1.19) renamed Remote Link to Remote Control with selectable tunnel providers, made Office/Desktop/Editor plugins toggleable behind a protected API, reverted screenshots to durable chat-scoped storage, unified OAuth account management, and hardened Xpra desktop control.

OpenHands (main, no tagged release) shipped the three-CVE remediation cluster (axios, dompurify, dulwich), the move of ACP credentials onto the encrypted secrets channel, a change to DELETE /api/organizations that cascade-deletes the requesting user when the deleted org is their only one, a git-proxy capability, and a LiteLLM 1.84.1 upgrade.

Flue (Tier 2; v0.8.1--v0.9.2) shipped OpenTelemetry tracing, the v0.9.0 breaking migration, WebSocket credential hardening, operator-owned workflow-run retention (the implicit 50-run prune is gone), and autonomous activate_skill.

What To Try

  • Claude Code operators: upgrade to 2.1.162 and re-audit any allow/deny or Read-deny policy that ran on older builds. Then wire waitingFor and the fan-out progress counter into supervision tooling so stuck-agent triage stops requiring a human to open each session.
  • Paperclip operators: use the skills CLI to audit and export which agents hold which skills, and claim any freshly stood-up self-hosted instance immediately.
  • Codex operators on iOS: enable the Face ID / passcode lock before treating mobile as a trusted access surface.
  • Hermes operators: queue a decomposable task on the new multi-agent Kanban and confirm the orchestrator spawns the expected sub-agents in isolated worktrees before trusting it with real work. Set HERMES_DASHBOARD_INSECURE=1 only where insecure binding is genuinely intended.
  • Agent Zero operators: disable the Office/Desktop/Editor plugins you do not need, and review retention/access controls for the now-durable computer-use screenshots before capturing sensitive screens.
  • Gemini CLI maintainers: review the pull_request_target labeler workflow to confirm it only reads PR metadata and never checks out fork code under the elevated token.

What Remains Uncertain

  • Codex remote-exec key lifecycle: scope, rotation, and revocation for the approved-host API-key registration are undocumented. Whether a leaked key grants persistent remote exec is unverified.
  • Codex iOS SSH trust handling: host-key verification, key storage, and scoping of the iOS SSH-to-Windows client are not described.
  • Gemini CLI model routing: with Flash GA gated server-side, the model in use is no longer determined by client version alone -- backend flag state is now part of the audit surface.
  • OpenClaw plugin-isolation depth: the release note asserts tighter isolation but does not describe the boundary's depth, so operators cannot verify it from the receipt.
  • Hermes Promptware coverage: the defense targets a known attack class; novel injection vectors outside Brainworm patterns may still pass. Validate against your own corpus.
  • Flue persisted-session migration: v0.9.0 rejects pre-upgrade session state with no automated migration path; a self-scripted migration could reintroduce stale, unredacted state.
  • OpenHands org-deletion blast radius: operators on the cascade-delete change should enforce backups before any DELETE /api/organizations, since a sole-org delete now removes the user identity too.

This digest was produced by the Bitter autonomous research loop.
