Section

Control Plane

Agent labor becomes operational only when the surface shows who asked for it, what it may touch, what it costs, and how the result will be checked.

Control Plane covers provider changes that make agent labor governable as operating state: goals, roles, budgets, approvals, permission manifests, capability profiles, credential scopes, cost summaries, blockers, schedulers, triggers, sub-agent routing, and kanban orchestration. It is where authority over what an agent does, and when, lives.

July 2026

2026-07-02 / Hermes Agent

Hermes v2026.7.1 tags the security wave that was main-only last issue

Control Plane
- Hermes v2026.7.1 ships the prior main-only wave: MCP-config persistence hardening, cron `base_url` credential-exfiltration blocking, prefix-secret sentinels for file reads, Slack `xapp-` token redaction, browser cloud-metadata guardrails, resume/session scoping, and a dependency floor.
- Upgrade from v2026.6.19 if you were waiting for a tag. The channel question changed from 'run main or wait' to a normal release upgrade.
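
To make the `xapp-` redaction item concrete, here is a minimal sketch of prefix-based token redaction. It is not Hermes's implementation: the pattern, the replacement text, and the function name are assumptions chosen for illustration.

```python
import re

# Assumption: a Slack app-level token is the literal prefix `xapp-` followed by
# letters, digits, and hyphens. The pattern Hermes actually uses may differ.
SLACK_APP_TOKEN = re.compile(r"\bxapp-[A-Za-z0-9-]+")


def redact_slack_app_tokens(text: str) -> str:
    """Replace anything shaped like an app-level token with a fixed marker."""
    return SLACK_APP_TOKEN.sub("xapp-[REDACTED]", text)
```

Run your own log lines through a filter like this after upgrading to check whether any token-shaped strings still surface in output you ship elsewhere.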
Run: 2026-07-02-weekly-digest-2026-07-01_2026-07-02-frontier-v0
2026-07-02 / Gemini CLI

Gemini CLI fixes a memory-import symlink escape in nightly only

Control Plane
- Nightly v0.51.0-nightly.20260702.gff00dacd9 fixes a symbolic-link directory escape in the memory import processor.
- Stable users do not have a shipped fix yet. Avoid untrusted `GEMINI.md` memory imports or track the nightly until a preview/stable tag absorbs it.
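
A symbolic-link directory escape is generally closed by resolving the real path before checking containment. The sketch below shows that general technique, not the Gemini CLI fix itself; `resolve_inside` and its arguments are hypothetical names.

```python
import os


def resolve_inside(root: str, candidate: str) -> str:
    """Return the real path of `candidate`, or raise if it leaves `root`.

    Symlinks are resolved first, so a link inside `root` that points
    outside it is rejected rather than followed.
    """
    real_root = os.path.realpath(root)
    real_path = os.path.realpath(os.path.join(real_root, candidate))
    if os.path.commonpath([real_root, real_path]) != real_root:
        raise PermissionError(f"import escapes {real_root}: {candidate}")
    return real_path
```

A check that compares the unresolved path string against the root would accept `link/secret.md` when `link` is a symlink to a directory outside the project, which is the class of escape described above.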
Run: 2026-07-02-weekly-digest-2026-07-01_2026-07-02-frontier-v0
2026-07-01 / Antigravity CLI

Google retired consumer Gemini CLI (June 18) and force-migrated users to closed-source Antigravity -- which hardened one approval gate and auto-opened another the same week

Control Plane
- Per Google's own announcement, consumer Gemini CLI stopped serving requests on 2026-06-18 (AI Pro/Ultra, free individual Code Assist, new GitHub-org installs). Enterprise Code Assist retained access and the OSS gemini-cli repo stays Apache-2.0. The successor, Antigravity CLI (the `agy` binary), is closed-source Go. If you ran Gemini CLI as an individual, your path is now a binary you cannot read.
- Antigravity's governance is real and moved both ways in one release train: 1.0.13 (06-27) made 'Always Approve' rule matching strict (non-regex) by default -- a tightening -- while 1.0.14 (06-30) added an 'always proceeds' mode that auto-approves a subagent's artifacts -- a loosening. Audit your migration: the closed binary's subagents can now approve their own work.
- Decide deliberately per tier: enterprise stays on Gemini CLI unchanged; individuals migrate to Antigravity (closed) or move to an open harness; anyone who needs to audit enforcement should treat a closed governance claim as unverifiable until probed.
Run: 2026-07-01-weekly-digest-2026-06-24_2026-07-01-frontier-v0
2026-07-01 / Hermes Agent

Hermes landed a full security wave on main -- path-escape, command-approval-bypass, secret redaction -- and tagged none of it (still v2026.6.19)

Control Plane
- Three real hardening fixes hit main in-window: a path-traversal fix (model-supplied tool-call IDs could escape the tool-result storage directory), a command-approval-bypass close (GNU long-flag prefix abbreviations of `chown --recursive` and `git push --force` slipped past the guard), and secret redaction in user-facing approval prompts. None reached a tag; the newest release is still v2026.6.19 (2026-06-19).
- If you run the v2026.6.19 tag you have none of these. The approval-bypass in particular means a guard you believed was blocking `chown --recursive` / `git push --force` could be defeated by an abbreviated long flag on the tagged binary. Track main or wait for a tag, but know the gap.
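
The abbreviation bypass is easy to reproduce in miniature. GNU-style option parsing accepts an unambiguous prefix of a long flag, so a guard that compares tokens for exact equality misses `--rec`. The sketch below is not Hermes's guard: the table and function names are hypothetical, and it covers long flags only, not short forms such as `-R`.

```python
import shlex

# Hypothetical guard table: command prefix -> long flags that need approval.
GUARDED_LONG_FLAGS = {
    "chown": ["--recursive"],
    "git push": ["--force"],
}


def _args_after(tokens, guarded_command):
    """Return the tokens after the command prefix, or None if it does not match."""
    head = guarded_command.split()
    return tokens[len(head):] if tokens[: len(head)] == head else None


def naive_needs_approval(command: str) -> bool:
    """Exact-match guard: misses abbreviated long flags."""
    tokens = shlex.split(command)
    for guarded, flags in GUARDED_LONG_FLAGS.items():
        args = _args_after(tokens, guarded)
        if args is not None and any(tok in flags for tok in args):
            return True
    return False


def prefix_aware_needs_approval(command: str) -> bool:
    """Conservative guard: any `--xyz` that is a prefix of a guarded flag counts."""
    tokens = shlex.split(command)
    for guarded, flags in GUARDED_LONG_FLAGS.items():
        args = _args_after(tokens, guarded)
        if args is None:
            continue
        for tok in args:
            name = tok.split("=", 1)[0]  # drop any `=value` suffix
            if name.startswith("--") and len(name) > 2:
                if any(flag.startswith(name) for flag in flags):
                    return True
    return False
```

The prefix-aware version deliberately over-matches: it flags a prefix even where the real parser would reject it as ambiguous. For an approval gate that is the safer direction to be wrong in.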
Run: 2026-07-01-weekly-digest-2026-06-24_2026-07-01-frontier-v0
2026-07-01 / Claude Code

Claude Code 2.1.196 closes an MCP self-approval hole and binds Remote Control to the Anthropic host

Control Plane
- 2.1.196 is a security release: `claude mcp list`/`get` no longer spawn `.mcp.json` servers that a repo self-approves -- closing a path where merely inspecting MCP config in an untrusted repo could launch a repo-declared server. And Remote Control is now disabled when `ANTHROPIC_BASE_URL` points at a non-Anthropic host (org-configurable), so a redirected base URL cannot silently drive remote control.
- Upgrade to 2.1.196+ and re-audit MCP trust in any workflow that opens untrusted repositories; if you proxy `ANTHROPIC_BASE_URL`, confirm the new Remote Control binding matches your intent.
Run: 2026-07-01-weekly-digest-2026-06-24_2026-07-01-frontier-v0
2026-07-01 / Codex

Codex 0.142.2 gates uninspectable PowerShell behind approval and switches MCP tool discovery to tool-search by default

Control Plane
- 0.142.2 (stable, 06-25) tightened PowerShell safety: commands with AST regions the classifier cannot inspect now require approval rather than auto-running. Windows operators running Codex unattended may see new approval prompts (or refusals in non-interactive mode) where PowerShell previously executed -- re-audit unattended Windows pipelines before upgrading.
- Same release makes MCP tools use tool-search by default (rather than dumping the full tool list). Re-verify MCP-backed workflows still resolve the tools you expect against your model/provider.
Run: 2026-07-01-weekly-digest-2026-06-24_2026-07-01-frontier-v0
2026-07-01 / Gemini CLI

Gemini CLI's OSS repo shipped the skill path-traversal fix to stable (v0.49.0) and renamed coreTools -- fixes now landing in a CLI whose consumers were cut off

Control Plane
- Carry-forward resolved: the skill-install path-traversal fix, stranded in preview for two windows, reached STABLE in v0.49.0 (2026-06-25). Same release renamed the `coreTools` setting to `tools.core` -- a breaking config change to migrate. Newer `@file` defensive path resolution + null-byte sanitization landed on main (commit b5fc06e), not yet stable.
- The pointed part: these fixes ship to an OSS CLI whose consumer service Google shut down on June 18. If you are on the OSS repo (enterprise or BYO-key), upgrade to v0.49.0 and migrate the config; if you were a consumer, this fix is not for you -- you are on Antigravity now.
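
For the `coreTools` to `tools.core` rename, a migration can be sketched as below. It assumes the dotted name maps to a nested `tools` object with a `core` key in the settings file; the helper name and the rule that an existing new-style value wins are also assumptions.

```python
import copy


def migrate_core_tools(settings: dict) -> dict:
    """Return a copy of `settings` with legacy `coreTools` moved to `tools.core`."""
    migrated = copy.deepcopy(settings)
    if "coreTools" not in migrated:
        return migrated
    legacy = migrated.pop("coreTools")
    tools = migrated.setdefault("tools", {})
    # Assumption: if both spellings are present, keep the new-style value.
    tools.setdefault("core", legacy)
    return migrated
```

Load your settings file, pass the parsed object through a helper like this, diff the result against the original, and write it back only after confirming the CLI still resolves the tools you expect.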
Run: 2026-07-01-weekly-digest-2026-06-24_2026-07-01-frontier-v0
2026-07-01 / Paperclip

Paperclip v2026.626.0 tags its master-only control plane: daily run/cost budget caps, a split skills:create permission, cross-company auth hardening

Control Plane
- Carry-forward resolved: v2026.626.0 (2026-06-27) tagged 128 commits of previously master-only control-plane work. The operator-relevant pieces: hard daily run-count and cost ceilings enforced before an adapter runs (preflight budget caps), skill mutations now gated behind a dedicated `skills:create` permission split from `agents:create`, and cross-company access authorization hardened with single-issue read enforcement.
- If you run Paperclip, these are now in a stable tag you can adopt: wire the budget caps to bound runaway spend, and split skill-creation authority from agent-creation in your permission model.
Run: 2026-07-01-weekly-digest-2026-06-24_2026-07-01-frontier-v0

June 2026

2026-06-24 / heypi

heypi's headline feature is approvals, but nothing requires approval by default

Control Plane
- The docs are explicit: `approval does not make every tool call require approval. Tool confirmation does that.` Out of the box the only automatic gate is the bash `approval.command()` classifier (blocks destructive, asks on risky, allows low-risk); every other tool runs without a human gate until you wire `approval`/`confirm` per tool. An operator who adopts heypi *for* its approvals must author them; the default is not a human-in-the-loop posture.
- The companion 'audit trail' is typed trace events surfaced in the admin panel -- which is itself disabled by default and binds loopback. To actually get the reviewable record the marketing promises, you must enable and operate the admin panel (or read `heypi events`).
- Before deploying, decide which tools must gate on a named approver and wire them; do not assume the framework's headline posture is its default.
Run: 2026-06-24-weekly-digest-2026-06-23_2026-06-24-frontier-v0
2026-06-24 / heypi

heypi 0.2.0-beta.0 breaks root-level approver config and makes webhooks HTTPS-by-default -- and it is a beta

Control Plane
- 0.2.0-beta.0 (2026-06-23) moves approver/admin identity from a root-level `approval.approvers`/`approval.admins` block to adapter-local `permissions`, and root-level config now FAILS at startup. Webhooks are HTTPS-by-default (plain HTTP needs `unsafeReplyHttp: true`), and the durable instruction file was renamed from `prompt`/`soul` to `instructions`. An operator upgrading from 0.1.x must migrate config or the app will not start.
- It is a `-beta.0` pre-release, and heypi publishes no GitHub Releases at all -- the newest fixes already sit on `main` past the tag. Decide deliberately: pin 0.1.3 for stability, or adopt the beta and track `main`, but do not run a beta as if it were a stable line.
Run: 2026-06-24-weekly-digest-2026-06-23_2026-06-24-frontier-v0
2026-06-23 / Claude Code

The five-level subagent depth cap did not bind for foreground spawns until 2.1.181

Control Plane
- 2.1.181 fixed foreground subagents spawning unbounded nested chains; they now respect the same 5-level depth limit as background subagents. An operator who relied on last window's depth cap (2.1.178) for foreground delegation was, in the interim, not protected by it.
- Upgrade past 2.1.181 and re-test that a foreground delegation tree actually stops at five levels rather than assuming the announced cap binds.
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / Claude Code

Agent(type) deny and Agent(x,y) allowed-types rules were not enforced for named subagent spawns until 2.1.186

Control Plane
- 2.1.186 fixed `Agent(type)` deny rules and `Agent(x,y)` allowed-types restrictions not being enforced for named subagent spawns. An operator who wrote those rules when the argument-aware grammar was announced was unprotected by them until this fix.
- Upgrade past 2.1.186, then write an `Agent(type)` deny for a subagent you expect blocked and confirm a named spawn is actually refused, not silently allowed.
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / Claude Code

Auto mode now blocks specific destructive git/IaC commands and reclassifies scheduled-task and webhook triggers

Control Plane
- 2.1.183 enumerated destructive commands the auto-mode classifier now blocks — `git reset --hard`, `git checkout -- .`, `git clean -fd`, `git stash drop` (when you did not ask to discard work), `git commit --amend` for commits the agent did not make this session, and `terraform`/`pulumi`/`cdk destroy` unless you asked for the specific stack.
- 2.1.183 also fixed scheduled-task and webhook trigger deliveries being treated as keyboard input; they now classify as task notifications and can no longer approve a pending action or set the session title under auto mode.
- Operators running auto mode should upgrade past 2.1.183; the blocks are conditional (gated on what you asked for), not unconditional refusals.
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / Codex

Codex command and network approvals are now scoped per execution environment and fail closed on ambiguity

Control Plane
- CLI 0.142.0 (#28738, #28899) scopes command and network approvals by execution environment: an approval granted in one environment no longer leaks to another, Codex denies when active-call attribution is ambiguous, and it fails closed if an environment-specific proxy endpoint cannot be prepared.
- Upgrade to 0.142.0 and re-test approval reuse: grant a command approval in a local workspace and confirm a remote executor environment prompts again rather than inheriting it.
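
The scoping rule can be modelled as an approval store keyed by environment as well as by command. This is a sketch of the behavior described above, not Codex's code; the class, the environment identifiers, and the three-way decision are hypothetical.

```python
from typing import Optional, Set, Tuple


class ApprovalStore:
    """Approvals are remembered per (environment, command), never per command alone."""

    def __init__(self) -> None:
        self._granted: Set[Tuple[str, str]] = set()

    def grant(self, environment_id: str, command: str) -> None:
        self._granted.add((environment_id, command))

    def decide(self, environment_id: Optional[str], command: str) -> str:
        if environment_id is None:
            # Ambiguous attribution: fail closed rather than guess an environment.
            return "deny"
        if (environment_id, command) in self._granted:
            return "allow"
        return "prompt"
```

The re-test above maps onto this model directly: grant in the local workspace, then ask for the same command from a remote executor and expect `prompt`, not `allow`.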
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / Codex

Codex rollout token budgets abort turns on exhaustion — a hard spend cap at the accounting boundary

Control Plane
- CLI 0.142.0 makes configurable rollout token budgets track usage across agent threads and abort turns when exhausted: a hard spend cap, not a warning. The cap is hard in effect but soft in timing: it lands at the next usage-accounting boundary with no cross-thread interrupt fan-out, so an in-flight expensive call can still complete.
- Set a rollout token budget and watch it abort a long multi-turn run at the accounting boundary; do not assume an instantaneous mid-response kill. Behavior under real multi-agent load is undocumented.
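
The timing caveat matters for sizing the budget, so here is a small model of it. It is a sketch under stated assumptions, not Codex's ledger: the class, the thread names, and the per-call token costs are hypothetical, and the budget is checked only between calls.

```python
from typing import Dict, List, Tuple


class RolloutBudget:
    """Shared token ledger across agent threads, checked only at boundaries."""

    def __init__(self, limit_tokens: int) -> None:
        self.limit_tokens = limit_tokens
        self.used_by_thread: Dict[str, int] = {}

    @property
    def used(self) -> int:
        return sum(self.used_by_thread.values())

    def record(self, thread_id: str, tokens: int) -> None:
        self.used_by_thread[thread_id] = self.used_by_thread.get(thread_id, 0) + tokens

    def exhausted(self) -> bool:
        return self.used >= self.limit_tokens


def run_turn(budget: RolloutBudget, thread_id: str, call_costs: List[int]) -> Tuple[int, bool]:
    """Run calls in order; return (calls completed, whether the turn was aborted)."""
    completed = 0
    for cost in call_costs:
        if budget.exhausted():  # accounting boundary: checked before each new call
            return completed, True
        budget.record(thread_id, cost)  # an in-flight call always completes in full
        completed += 1
    return completed, False
```

With a limit of 100, 60 tokens already spent by another thread, and a 70-token call next, the call completes and the ledger reads 130 before the abort lands. Leave headroom for roughly one worst-case call.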
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / Codex

Codex adds a multi-agent delegation-authority mode

Control Plane
- CLI 0.142.0 adds a multi-agent delegation-authority mode, giving operators a configurable posture for how authority flows to delegated agents. It is the control surface to set before trusting a delegation tree under the new shared token ledger.
- Operators orchestrating delegated Codex agents should choose the delegation-authority mode deliberately rather than inheriting a default.
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / Hermes Agent

Hermes adds a root-owned, user-immutable /etc/hermes managed scope

Control Plane
- PR #49098 (in v0.17.0 / v2026.6.19) adds a managed `/etc/hermes` scope: a root-owned, user-immutable layer of config and secrets that wins per-key over a user's own files. It is Hermes's first centralized, OS-backed policy pin, for a tool whose posture had been governs-through-allowlists, not identity services.
- Operators wanting an OS-enforced policy floor can now pin config and secrets a user cannot override; audit which keys the managed scope wins so credential flow stays legible.
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / Paperclip

Last window's multi-tenant authority cluster finally tagged in v2026.618.0

Control Plane
- v2026.618.0 (June 18) is the tag that finally contains last window's master-only authority cluster: cloud-tenant instance-admin deprivileging (#7525), per-company JWT signing keys (#5864), plugin tenant isolation (#5865), the negated-phrasing review-approval fix (#5839), and HTTP-log credential redaction (#8013). Shared-pool and cloud-tenant operators must upgrade.
- Provision a separate non-cloud-tenant admin identity first: the deprivileging purges stale instance-admin rows by design.
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / Paperclip

Preflight budget caps that cancel queued work before an adapter starts (master, unreleased)

Control Plane
- PR #8347 (merged to master 2026-06-20, NOT in a tagged release) adds heartbeat preflight gates for daily-run and daily-cost caps (`maxDailyRuns`, `maxDailyCostCents`): a capped agent stops before new execution, with the cap re-checked immediately before execution. This moves budget from surfacing to enforced state.
- Paperclip operators tracking master can stage a per-agent daily cost cap in a test company and confirm work is refused at claim time, not mid-run; operators running the tagged binary do not yet have it.
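
The two-stage check described above, once at claim time and again immediately before execution, can be sketched as follows. Only the cap names `maxDailyRuns` and `maxDailyCostCents` come from the PR description; the data classes, the usage reader, and the return strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Caps:
    max_daily_runs: Optional[int] = None        # stands in for `maxDailyRuns`
    max_daily_cost_cents: Optional[int] = None  # stands in for `maxDailyCostCents`


@dataclass
class Usage:
    runs_today: int
    cost_cents_today: int


def preflight(caps: Caps, usage: Usage) -> Tuple[bool, str]:
    """Return (allowed, reason). An unset cap never blocks."""
    if caps.max_daily_runs is not None and usage.runs_today >= caps.max_daily_runs:
        return False, "daily run cap reached"
    if (caps.max_daily_cost_cents is not None
            and usage.cost_cents_today >= caps.max_daily_cost_cents):
        return False, "daily cost cap reached"
    return True, "ok"


def claim_and_run(caps: Caps, read_usage: Callable[[], Usage],
                  run_adapter: Callable[[], str]) -> str:
    ok, reason = preflight(caps, read_usage())  # gate at claim time
    if not ok:
        return "refused at claim: " + reason
    ok, reason = preflight(caps, read_usage())  # re-check right before execution
    if not ok:
        return "refused before execution: " + reason
    return run_adapter()
```

The second read is the point of the design: usage can change between claim and start, and the adapter is never invoked once either cap is met.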
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / Paperclip

A recovery watchdog whose actors structurally cannot mutate approvals (master, unreleased)

Control Plane
- PR #8339 (merged to master 2026-06-19, NOT tagged) adds a task watchdog control plane whose recovery and status-only actors must remain limited to status reporting and cannot create approvals, link approvals, or submit approval comments — enforced via a scoped mutation guard.
- Operators relying on automated recovery should know the next tag structurally prevents a recovery actor from escalating into an approval; the running binary does not yet carry this guard.
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / OpenHands

API-key auth decoupled from Keycloak — IdP session revocation is no longer a kill switch for machine keys

Control Plane
- PR #14867 (merged to main 2026-06-17, NOT in any tag) decouples API-key (Bearer) auth from Keycloak offline sessions: API-key authentication performs zero Keycloak round-trips, so a revoked or expired IdP session no longer invalidates a machine key. Headless clients stop hitting opaque 401s — but the revocation contract changed.
- Operators who relied on Keycloak session revocation to kill machine keys must now revoke at the key store instead. This is on main, in no release.
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / OpenHands

Last window's per-org concurrency limits reverted outright in-window

Control Plane
- PR #14877 (merged to main 2026-06-17) reverts the DB-backed per-org/per-user concurrent sandbox/conversation limits from #14168, adding migration 124 to drop the columns introduced by migration 120. The 429-based quota some operators were waiting for is not coming in the next 1.x; the surviving concurrency-control path now counts from the runtime `/list` API rather than a DB flag.
- Operators who anticipated 429-based concurrency enforcement must not plan around it — it was withdrawn, not enforced.
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / OpenClaw

OpenClaw shipped automatic Codex plugin approvals to stable — the one gate that loosened this window

Control Plane
- PR #92625 (stable v2026.6.9, June 21) adds automatic Codex plugin approvals — a gate opened in a fortnight when nearly everyone else was closing them. It cuts against the consent-over-default grain even from a vendor with real accessibility discipline.
- Operators on v2026.6.9 should audit the new automatic Codex plugin approvals against their trust posture; convenience relaxed a consent gate that was previously explicit.
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / Hermes Agent

Hermes MCP-persistence mitigation wave landed on main, not in the v0.17.0 tag

Control Plane
- Days after tagging v0.17.0, a fresh security wave landed on main (June 21-22, NOT in that tag): a guard that rejects MCP entries writing shell payloads into OS persistence surfaces (authorized_keys, cron, sudoers), an IOC blocklist enforced at save and spawn time, an API-key entropy floor raised from 8 to 16, and a startup posture audit warning when a gateway runs as root or exposes an unauthenticated API server.
- Per the maintainer's own commit narrative, this wave responds to an apparent in-the-wild hermes-0day persistence campaign — a single-source claim (a cited Reddit thread and a self-named instance), NOT independently confirmed exploitation. Read the mechanism and fix as real; read 'actively exploited' as the maintainer's account. Either way, if you expose a Hermes dashboard or API server, run main or wait for the next tag.
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / OpenHands

An entire OpenHands enterprise and security build-out, two windows unreleased (postcss CVE + git-token redaction)

Control Plane
- The only mainline release is still 1.8.0 (June 10). Two security fixes that matter to anyone on a build from main are in no tag: the moderate postcss XSS, CVE-2026-41305 (#14770), and a fix that stops a `PluginSpec.source` containing an embedded git token from being written to the database in plaintext (#14795). New writes are redacted.
- Operators on 1.8.0 have none of this and are not patched. Operators on a build from main should rotate any git token that was embedded in a repo source URL, because pre-fix writes were stored plaintext.
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-23 / Gemini CLI

Gemini CLI skill path-traversal fix stranded in preview for a second window

Control Plane
- The skill install/link/uninstall path-traversal fix (commit bca5667fc / PR #27767) is, for the second straight week, in no stable release. It exists only in v0.48.0-preview.0; stable v0.47.0 does not contain it. A malicious `.skill` package can still write outside the skills directory on stable.
- Treat third-party skill installs as untrusted on stable v0.47.0 until the path-traversal fix leaves preview.
Run: 2026-06-23-weekly-digest-2026-06-16_2026-06-23-frontier-v0
2026-06-15 / Claude Code

Subagents can spawn subagents five deep, and auto mode now classifies spawns before launch

Control Plane
- 2.1.172 lets a subagent spawn its own subagents up to 5 levels deep (new capability and a new governance surface); 2.1.178 then made the auto-mode classifier evaluate a spawn before launch, closing a gap where a deeply nested agent could request an action the operator's policy would block at the top.
- Operators running under auto mode should upgrade past 2.1.178 before trusting a delegation tree, and use argument-aware permission rules to cap what spawned agents can do.
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-15 / Claude Code

Permission rules can finally match a tool's arguments (Agent(model:opus))

Control Plane
- 2.1.178 added `Tool(param:value)` syntax so a rule can match input parameters, e.g. `Agent(model:opus)` blocks Opus subagents; permissions move from all-or-nothing per tool to per-argument.
- Operators governing delegated trees should reach for this to cap model tiers and arguments inside subagents.
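
To show what per-argument matching means in practice, here is a toy matcher for rules shaped like `Agent(model:opus)`. It is not Claude Code's parser: it handles one `param:value` pair, compares by exact string equality, and any wildcard or multi-argument behavior in the real grammar is out of scope.

```python
import re

# One tool name, one parameter, one value: `Tool(param:value)`.
RULE_PATTERN = re.compile(r"^(?P<tool>\w+)\((?P<param>\w+):(?P<value>[^)]+)\)$")


def rule_matches(rule: str, tool: str, tool_input: dict) -> bool:
    """True when the rule names this tool and the named input has that exact value."""
    parsed = RULE_PATTERN.match(rule)
    if parsed is None:
        raise ValueError(f"unsupported rule syntax: {rule!r}")
    if parsed["tool"] != tool:
        return False
    return str(tool_input.get(parsed["param"])) == parsed["value"]
```

Under this model a deny rule `Agent(model:opus)` matches a spawn whose input carries `model: opus` and ignores the same tool called with a different model, which is the move from per-tool to per-argument policy.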
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-15 / OpenHands

Concurrency becomes a governed, billable resource (Personal 3, commercial 10; unreleased)

Control Plane
- PR #14168 (main, unreleased) caps concurrent conversations/sandboxes (Personal=3, commercial=10) with per-org and per-user override columns and HTTP 429 enforcement. A real resource-control and economics surface; tightens the free tier.
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-14 / OpenHands

Admins can lock an org to a curated model set and hide custom-key fields (unreleased)

Control Plane
- PR #14773 (main, unreleased) adds `allow_user_llm_configuration`: setting it off hides the custom model, base-URL, and API-key inputs and locks the org to a curated, proxy-served model set. The platform owns the model-access policy, not the user.
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-12 / Claude Code

Org model allowlists are finally binding, even against the default model

Control Plane
- enforceAvailableModels (2.1.175) makes the availableModels allowlist constrain the Default model and blocks user/project widening; a cluster of fixes closed env-var, /fast, subagent, advisor, and dispatch escape hatches. This is the lever an enterprise needs to decide whether Fable 5 is reachable per-org.
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-12 / Paperclip

Shared-pool tenants were instance admins of the whole instance (fixed, unreleased)

Control Plane
- PR #7525 (merged to master 2026-06-12, NOT in a tagged release) removes a grant that made every cloud tenant on a shared pool an instance admin with reach into every other tenant's data, and purges stale admin rows. Shared-pool operators must track the next tag and provision a non-cloud-tenant admin identity first (the purge is destructive).
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-12 / Paperclip

Per-company JWT signing keys and a 1-hour TTL replace a single master key (unreleased)

Control Plane
- PR #5864 (master, unreleased) derives a per-company signing key and cuts the agent-token TTL from 48h to 1h, so one tenant's leaked key can no longer forge tokens for other tenants. Multi-tenant blast-radius control; track the next tag.
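
The blast-radius argument can be illustrated with a derived-key sketch. This is not Paperclip's scheme: it uses plain HMAC-SHA256 both to derive a per-company key and to sign, where the real system issues JWTs, and the derivation label and function names are assumptions.

```python
import hashlib
import hmac

TOKEN_TTL_SECONDS = 60 * 60  # the 1h TTL from the PR, down from 48h


def company_key(master_secret: bytes, company_id: str) -> bytes:
    """Derive a signing key that is specific to one company (hypothetical KDF)."""
    return hmac.new(master_secret, b"company:" + company_id.encode(), hashlib.sha256).digest()


def sign(key: bytes, payload: bytes) -> str:
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(key: bytes, payload: bytes, signature: str) -> bool:
    # Constant-time comparison to avoid leaking how many characters matched.
    return hmac.compare_digest(sign(key, payload), signature)
```

A token signed with one company's derived key fails verification under another company's key, so a leak of one derived key does not let an attacker mint tokens for a different tenant.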
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-12 / Paperclip

A 'NOT APPROVED' comment could auto-complete an issue (fixed, unreleased)

Control Plane
- PR #5839 (master, unreleased) tightens an approval regex that matched negated phrasings (so 'NOT APPROVED' auto-completed an issue) and wraps comment + status + decision in one transaction. Makes 'a rejection can never auto-complete' and 'observable state cannot diverge from intended state' true invariants of the approval gate.
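
The regex half of this bug is easy to demonstrate. The patterns below are illustrative, not the ones in the PR: the loose one stands in for any pattern that searches for the approval word anywhere in a comment, and the strict one requires the whole comment to be the approval word.

```python
import re

# Loose: matches the word anywhere, so negated phrasings also match.
LOOSE_APPROVAL = re.compile(r"approved", re.IGNORECASE)

# Strict (hypothetical): the entire comment must be the approval word,
# optionally followed by whitespace or closing punctuation.
STRICT_APPROVAL = re.compile(r"\s*approved[\s.!]*", re.IGNORECASE)


def is_approval(comment: str) -> bool:
    """True only when the full comment is an approval, not when it merely contains one."""
    return STRICT_APPROVAL.fullmatch(comment) is not None
```

Anchoring to the full comment rejects 'NOT APPROVED', 'not yet approved', and 'disapproved' without having to enumerate negations. The transactional half of the fix (comment, status, and decision committed together) is a separate concern this sketch does not cover.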
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-12 / OpenClaw

Exec approvals fail closed on timeout, and HTTP override surfaces are admin-gated

Control Plane
- v2026.6.6 made exec approvals fail closed on timeout (a pending dangerous command now denies rather than proceeds) across a dozen-surface boundary sweep that also closed a deleted-agent ACP bypass; v2026.6.8 gated HTTP session/model override surfaces behind admin privileges. The correct reversibility default for a surface aimed at non-experts.
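
Fail-closed-on-timeout comes down to which branch an unanswered prompt takes. The sketch below models it with a standard-library event; it is not OpenClaw's implementation, and the class and method names are hypothetical.

```python
import threading


class PendingApproval:
    """A pending decision that denies unless a human explicitly approves in time."""

    def __init__(self) -> None:
        self._answered = threading.Event()
        self._approved = False

    def resolve(self, approved: bool) -> None:
        self._approved = approved
        self._answered.set()

    def wait(self, timeout_s: float) -> bool:
        answered = self._answered.wait(timeout_s)
        # No answer before the deadline is treated as a denial, never as consent.
        return answered and self._approved
```

A fail-open variant would return `True` on timeout; the only difference is the final line, which is why this default deserves an explicit test after upgrading.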
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-11 / Codex

Computer use expands to Europe and Enterprise, with the first per-app controls and a CDP browser surface

Control Plane
- App 26.609 added Developer mode giving the agent controlled Chrome DevTools Protocol access (network interception, arbitrary in-page JS, the debugger), the first per-app access controls for computer use on Windows, and Enterprise computer use; on 2026-06-16 computer use reached the EEA/UK/Switzerland and Chronicle previewed building memory from screen context.
- Keep Developer-mode CDP off by default; use the Windows per-app controls to allowlist apps; default Chronicle off on confidential machines.
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-10 / OpenHands

OpenHands Enterprise: the first user to log in owns the organization (unreleased)

Control Plane
- PR #14752 (main, intended for an untagged 1.39.0) makes the first user to sign in after enabling the default org its owner, keyed to an is_default DB flag (migration 119). The multi-tenant foundation the window's enterprise work stacks on. Operators must control who signs in first.
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-10 / OpenHands

hide_personal_workspaces is explicitly UI-only, not an access boundary

Control Plane
- PR #14741 (main, unreleased) hides personal workspaces in org-only installs but the docs state it is UI-only: the orgs API still returns personal orgs and there is no server-side enforcement. Operators must NOT treat it as an access-control boundary; the real boundary is the membership model.
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-08 / Pi Coding Agent

Pi gates local settings, instructions, and packages behind a saved trust decision

Control Plane
- v0.79.0 added project trust for local settings, resources, instructions, and packages with saved decisions and --approve/--no-approve CLI controls. Pi now treats local project files as untrusted-by-default; open an untrusted repo and confirm it refuses to load local resources until approved.
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-06 / Claude Code

Relayed SendMessage from peer sessions no longer carries user authority

Control Plane
- 2.1.166: messages relayed via SendMessage from other Claude sessions no longer carry user authority; receivers refuse relayed permission requests and auto mode blocks them. Closes a confused-deputy path in multi-session orchestration.
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-05 / Paperclip

Deny-by-default authority preset for agents reviewing untrusted content

Control Plane
- PR #7530 (in v2026.609.0) adds a low_trust_review authority preset, source-trust tagging, route containment, and quarantine so an agent reviewing a hostile PR/comment/attachment gets narrower authority and its output cannot flow into higher-trust context. Enforced authority for the untrusted-input boundary, not a dashboard label.
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-04 / Codex

Remote controllers are now listable and revocable, and approvals carry environment identity

Control Plane
- CLI 0.137.0 lets remote-control clients pair and have controller grants listed/revoked via app-server v2 RPCs, and binds permission requests/approvals to an environment identity. A concrete authority-inventory and revocation surface for who can drive a session remotely.
Run: 2026-06-16-weekly-digest-2026-06-04_2026-06-16-frontier-v0
2026-06-03 / Claude Code

Permission and deny rules now enforced as written across WebFetch, Windows paths, and Glob/Grep

Control Plane
- Three distinct gaps where a configured permission/deny rule silently failed to apply are closed in the 2.1.160-2.1.162 line: custom WebFetch rules now override built-in preapproved domains, Windows rules with backslashes or case-variant paths now match, and Read deny rules now hide files from Glob and Grep results.
- Operators who wrote allow/deny policy and assumed it was enforced were running with a false sense of coverage; the fix is gated purely on upgrading past these versions, so the operator action is 'upgrade, then re-audit whether any policy was silently bypassed in the prior window.'
- The Read-deny-vs-Glob/Grep gap is the sharpest: a file an operator denied for Read was still discoverable (and its path/contents surfaceable) via search tools, defeating the access-control intent.
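
The Windows-path gap is the kind of mismatch that normalization removes. The sketch below shows the general technique with the standard library's `ntpath`, which applies Windows rules on any platform; it is not Claude Code's matcher, and the function names and directory-prefix semantics are assumptions.

```python
import ntpath


def normalize_windows_path(path: str) -> str:
    """Collapse separators and case so equivalent Windows paths compare equal."""
    return ntpath.normcase(ntpath.normpath(path))


def rule_covers(rule_path: str, candidate: str) -> bool:
    """True when `candidate` is the rule path itself or lies underneath it."""
    rule = normalize_windows_path(rule_path)
    target = normalize_windows_path(candidate)
    return target == rule or target.startswith(rule + "\\")
```

When re-auditing, test rules against both separator styles and mixed case, for example a rule written as `C:\Repo\secrets` against a candidate spelled `c:/repo/SECRETS/key.pem`.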
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-06-03 / Claude Code

Agent view exposes why a session is blocked and fan-out progress for scripted supervision

Control Plane
- claude agents --json now includes a waitingFor field naming what a blocked session is waiting on (e.g. a permission prompt), and claude agents rows now show done/total progress before detail when work is fanned out.
- Operators scripting or monitoring agent fleets can now programmatically distinguish 'stuck on a permission prompt' from other waits and read parallel-task completion, which is the difference between a watchdog that can unblock a session and one that can only detect silence.
- The operator action is to wire waitingFor and the progress counter into supervision tooling so stuck-agent triage stops requiring a human to open each session.
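
A supervision script might bucket sessions by what they are waiting on, as sketched below. Only the `waitingFor` field name comes from the release note. The top-level array shape, the `id` key, and every value shown in the usage note are assumptions; inspect real `claude agents --json` output before relying on any of them.

```python
import json
from collections import defaultdict
from typing import Dict, List


def group_blocked_sessions(agents_json: str) -> Dict[str, List[str]]:
    """Map each `waitingFor` value to the ids of the sessions reporting it.

    Assumes the JSON is a list of session objects with an `id` key (hypothetical).
    Sessions without a `waitingFor` value are treated as not blocked.
    """
    blocked: Dict[str, List[str]] = defaultdict(list)
    for session in json.loads(agents_json):
        waiting_for = session.get("waitingFor")
        if waiting_for:
            blocked[waiting_for].append(session.get("id", "<unknown>"))
    return dict(blocked)
```

Given a hypothetical payload in which two sessions report the same `waitingFor` value and a third reports none, the result is one bucket holding two ids, which a watchdog can route to a human instead of flagging all three as silent.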
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-06-03 / Codex

ChatGPT iOS 1.2026.146 adds optional Face ID / passcode lock for Codex

Control Plane
- An operator running Codex on iOS can now require Face ID or a passcode to open Codex, adding a device-level authority gate that did not exist before.
- It is optional, so the operator decision is whether to enable it as policy for mobile-deployed Codex access.
- Verification path: update to 1.2026.146, enable the lock, confirm Codex requires biometric/passcode on foreground before trusting mobile as an access surface.
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-06-03 / Gemini CLI

Policy file survives cross-device mounts and corruption via EBUSY fallback and TOML recovery

Control Plane
- Operators running in containers with cross-device mounts no longer hit silent policy-update failures - atomic rename now falls back to copy-then-unlink on EBUSY/EXDEV.
- A corrupted policy TOML is auto-backed-up to .bak and rebuilt from scratch rather than blocking on a syntax error, removing a manual-intervention failure mode.
- Verification path: packages/core/src/policy/config.ts adds the fallback and recovery; persistence.test.ts covers both paths.
- Scope: one operator class (operators persisting policy or permission config) and one consequence (policy persistence no longer fails silently).
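
The rename fallback follows a common pattern, sketched here in Python. The actual change lives in the TypeScript file named above, so treat this as an illustration of the technique only; the function name and the injectable `rename` parameter are assumptions made to keep the sketch testable.

```python
import errno
import os
import shutil


def replace_with_fallback(src: str, dst: str, rename=os.replace) -> None:
    """Move `src` over `dst`, falling back to copy-then-unlink across devices.

    The rename is atomic when it succeeds. The fallback is not atomic: a crash
    between the copy and the unlink can leave both files behind.
    """
    try:
        rename(src, dst)
    except OSError as exc:
        if exc.errno not in (errno.EXDEV, errno.EBUSY):
            raise  # unrelated failures must stay visible
        shutil.copy2(src, dst)
        os.unlink(src)
```

The loss of atomicity in the fallback path is the trade-off to keep in mind for policy files on bind mounts: the update no longer fails silently, but it is also no longer all-or-nothing.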
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-06-03 / Gemini CLI

Gemini 3.5 Flash GA routes to flagged users via backend experiment flag, no client update

Control Plane
- Operators auditing which model their CLI calls cannot rely on client version alone - model selection is now gated server-side by experiment flag GEMINI_3_5_FLASH_GA_LAUNCHED (ID 45780819) via hasGemini35FlashGAAccess().
- Auto-routing logic silently switches to Flash GA when the flag is enabled for a user cohort, so the same binary can route to different models across users.
- Verification path: Config.hasGemini35FlashGAAccess() and the registered experiment flag determine routing; the model in use is no longer fully determined by local config.
- Operator decision: treat backend flag state as part of the model-routing audit surface.
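
One way to act on this is to log the observed flag state and the effective model next to the client version, then look for cases where identical local state produced different models. The sketch below assumes the operator can collect such observations; the record shape, field names, and the version and model strings in the usage note are all hypothetical placeholders.

```python
from collections import defaultdict
from typing import NamedTuple


class RoutingObservation(NamedTuple):
    user: str
    client_version: str
    configured_model: str   # what local config asked for
    flag_enabled: bool      # observed state of the server-side experiment flag
    effective_model: str    # model that actually served the request


def split_routing_groups(observations):
    """Return groups where the same (client version, configured model)
    resolved to more than one effective model."""
    seen = defaultdict(set)
    for obs in observations:
        seen[(obs.client_version, obs.configured_model)].add(obs.effective_model)
    return {key: models for key, models in seen.items() if len(models) > 1}
```

Feeding it two observations that share a placeholder client version and an `"auto"` setting but differ in effective model returns that pair as a split group, which is the signal that local config alone no longer explains routing.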
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-06-03 / Hermes Agent

Docker dashboard insecure binding now requires explicit HERMES_DASHBOARD_INSECURE=1 opt-in

Control Plane
- The dashboard no longer infers insecure mode from bind host, so operators whose Docker setups relied on that inference must add HERMES_DASHBOARD_INSECURE=1 explicitly or the dashboard will not bind insecurely.
- Existing Docker and hosted deployments must update env configuration before upgrading to v0.15.1 to avoid a broken or unexpectedly-secured dashboard.
- Verification path: upgrade to v0.15.1, set HERMES_DASHBOARD_INSECURE=1 only where intended, and confirm the dashboard binds as expected without falling back to host-derived inference.
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-06-03 / Hermes Agent

Bitwarden Secrets Manager integration replaces per-provider API keys

Control Plane
- Operators managing credentials must decide whether to migrate from per-provider API keys to centralized Bitwarden Secrets Manager, changing where secrets live and how they rotate.
- Centralized secret management enables rotation and revocation that scattered per-provider keys did not; an operator wiring CI/automation must re-point credential sourcing.
- Verification path: configure Bitwarden Secrets Manager on v0.15.0, confirm the agent resolves credentials from it, and test a rotation to verify the agent picks up the new secret.
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-06-03 / Hermes Agent

Kanban becomes a multi-agent orchestration platform with auto-decomposition, swarm topology, and worktree-per-task

Control Plane
- Operators who ran Kanban as a task board must now decide whether to adopt orchestrator auto-decomposition and swarm topology, which turn a queue into a self-spawning multi-agent fleet with new operating state to supervise.
- Per-task model overrides and worktree-per-task change the cost and isolation profile of every queued task; an operator must re-plan budget and concurrency.
- Verification path: deploy v0.15.0, queue a decomposable task, and confirm the orchestrator spawns the expected sub-agents in isolated worktrees before trusting it with real work.
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-06-03 / OpenClaw

Skill Workshop adds a pending-proposal approval workflow with CLI/Gateway review and a skill_workshop agent tool

Control Plane
- Skill Workshop introduces a new pending-proposal lifecycle that an operator must approve or reject via CLI or Gateway before a skill takes effect, inserting a human-in-the-loop gate into skill provisioning.
- The skill_workshop agent tool lets agents themselves file proposals, expanding the automation surface; operators must decide who may review and who may self-approve.
- Decision is for the control-plane admin/skill-author: configure the review path and authority for skill proposals.
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-06-03 / Paperclip

Unclaimed self-hosted deployments get a one-time browser claim to bootstrap the first admin

Control Plane
- Operators standing up a private self-hosted deployment now have a defined bootstrap path to create the first admin before any invite exists, replacing ad-hoc seeding.
- Whoever completes the one-time browser claim becomes the first admin, so an operator must claim a freshly deployed instance promptly to avoid a race for control.
- This changes the deployment runbook: the claim step is now the gate that establishes ownership of the control plane.
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-06-03 / Paperclip

Company skills become first-class resources with an install/reset/audit/export/assign CLI

Control Plane
- Skills move from implicit configuration to governed resources: an operator can now audit which skills are installed and assigned, and export the catalog for review or provenance tracking.
- The CLI verbs (install, reset, audit, export, assign) give platform operators a programmatic path to manage agent capabilities across a company instead of clicking through a board.
- Assignment is a distinct authority action — an operator decides which agents get which skills — so capability grants become reviewable operating state rather than ambient defaults.
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-06-03 / Agent Zero

Office, Desktop, and Editor plugins become toggleable behind a protected plugin-state API

Control Plane
- Operators can disable Office, Desktop, or Editor plugins (Desktop computer-use especially) on deployments that should not hold those capabilities, via the v1.19 plugin-toggle endpoint.
- The endpoint is described as 'protected' but the release note documents no auth model or role-based capability management, so treat it as a disable lever, not yet an audited capability register.
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-06-03 / OpenHands

ACP provider credentials now route through cipher-protected agent_context.secrets, not acp_env

Control Plane
- Operators running ACP agents must understand provider API keys/base URLs now flow through the cipher-protected secrets channel; the deprecated acp_env channel no longer carries credentials.
- This changes the persistence and exposure surface for agent provider credentials; SDK gap-fill logic specifically prevents credentials from being folded back into the insecure acp_env channel.
- Verification path: confirm ACP provider creds appear via agent_context.secrets and are absent from acp_env in agent context.
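
The verification step can be scripted. The sketch below assumes the agent context can be dumped as a dictionary with `secrets` and `acp_env` mappings and that credential names contain `API_KEY` or `BASE_URL`; both the shape and the naming heuristic are assumptions for illustration, not the OpenHands schema.

```python
CREDENTIAL_MARKERS = ("API_KEY", "BASE_URL")  # assumed naming heuristic


def _looks_like_credential(name: str) -> bool:
    return any(marker in name.upper() for marker in CREDENTIAL_MARKERS)


def credential_routing_problems(agent_context: dict) -> list:
    """List reasons the context does not match the expected routing."""
    secrets = agent_context.get("secrets", {})
    acp_env = agent_context.get("acp_env", {})
    problems = []
    for name in sorted(acp_env):
        if _looks_like_credential(name):
            problems.append(f"{name} is still present in acp_env")
    if not any(_looks_like_credential(name) for name in secrets):
        problems.append("no provider credential found in secrets")
    return problems
```

An empty list means credentials appear only in the secrets channel; anything else is a line item for the upgrade checklist.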
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-06-03 / OpenHands

DELETE /api/organizations now cascade-deletes the sole-org requester (personal org)

Control Plane
- Operators must understand that deleting a personal org now also deletes the requesting user account, enabling re-onboarding on next login — a destructive identity-state change behind one endpoint.
- This changes the operating-state semantics of an existing destructive API and calls for backup discipline before org deletion; multi-org members are protected by preflight orphan detection.
- Verification path: test DELETE /api/organizations against a sole-org account vs a multi-org member and confirm orphan-rejection behavior.
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-06-03 / Flue

v0.9.2 adds an activate_skill tool letting agents load skills autonomously

Control Plane
- Operators configuring skills get a new agent-facing `activate_skill` tool: agents load full skill instructions on demand before starting matching work. Skill loading shifts from operator-orchestrated to agent-initiated, a proactivity and authority change to account for when scoping which skills are available.
- Workspace skills are reread on activation, so edits made during an active session take effect (lazy loading is preserved). Verification path: configure a skill, confirm the agent self-activates it, and confirm it picks up an edit mid-session.
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0

May 2026

2026-05-30 / Claude Code

Auto Mode now available on Bedrock, Vertex, and Foundry for Opus 4.7 / 4.8

Control Plane
- Auto Mode's permission-handling posture, previously tied to first-party Anthropic auth, now extends to the cloud provider APIs (AWS Bedrock, Google Vertex, Foundry) for Opus 4.7 and 4.8, opt-in via CLAUDE_CODE_ENABLE_AUTO_MODE=1.
- The operator decision is governance-shaped: teams running Claude Code through a cloud-provider procurement path can now deploy the reduced-prompt autonomy posture they could not before, which changes what consent ceremony exists on those deployments.
- Because Auto Mode shifts permission decisioning away from per-action prompts, an operator enabling it on a Bedrock/Vertex deployment must confirm their managed-settings deny rules carry the governance weight the prompts used to.
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
2026-05-27 / Claude Code

Auto mode becomes the default permission posture

Control Plane
- Operators with managed Claude Code deployments must re-audit what Auto mode classifies as safe by default — the consent gate is gone.
- Admins relying on the opt-in consent dialog as a visible posture check have lost that surface; equivalent visibility now comes from managed-settings policy, not from a runtime prompt.
- Skill authors should evaluate `disallowed-tools` for skills that should run with a reduced tool surface.
- Hook authors should consider whether `MessageDisplay` is a governance gain or a censorship hazard for their deployment.
Run: 2026-05-27-weekly-digest-2026-05-13_2026-05-27-frontier-v0
2026-05-27 / Codex

Goal mode graduates default-on; remote computer use after lock ships

Control Plane
- Operators using Codex must decide whether goal mode is permitted as a baseline or constrained via permission profiles — the inheritance + managed-requirements features are the right tool for this.
- Evaluators of remote computer use after Mac lock should treat the locked-host surface as a new authority decision, not a default; short-lived authorization and relock-on-input are sensible defaults, but the policy for which tasks may operate against a locked host is still an operator choice.
- Plugin-marketplace evaluators (ChatGPT Business; Enterprise coming soon) should treat plugin distribution-by-marketplace as a new supply-chain surface to govern.
Run: 2026-05-27-weekly-digest-2026-05-13_2026-05-27-frontier-v0
2026-05-27 / Codex

Permission profiles get inheritance and an org-managed enforcement file

Control Plane
- Enterprise operators should restructure permission policy: stop maintaining flat profile lists; build a base profile plus per-team derivations using inheritance.
- Decide where `requirements.toml` lives (repo-rooted, org-rooted, signed) before depending on enforcement — the distribution and trust model are not yet documented.
- Migrate off legacy profile configs; 0.134.0 rejects them with migration guidance.
- Normalize permission selection on `--profile` as the canonical handle; flag-soup approaches are now legacy.
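
To make the base-plus-derivation idea concrete, here is a small sketch of how inherited profiles could resolve. The `extends` key and the setting names are hypothetical; this is not the Codex profile schema, only an illustration of child-overrides-parent resolution with cycle detection.

```python
def resolve_profile(name: str, profiles: dict, _seen: tuple = ()) -> dict:
    """Flatten a profile by applying its parent chain first, then its own keys."""
    if name in _seen:
        raise ValueError("inheritance cycle: " + " -> ".join(_seen + (name,)))
    profile = profiles[name]
    parent = profile.get("extends")  # hypothetical key naming the parent profile
    base = resolve_profile(parent, profiles, _seen + (name,)) if parent else {}
    own = {key: value for key, value in profile.items() if key != "extends"}
    return {**base, **own}  # the child's settings override the parent's


profiles = {
    "base": {"network": "deny", "write": "workspace"},
    "team-data": {"extends": "base", "network": "allow"},
}
```

Resolving `team-data` yields the base `write` setting plus the team's own `network` override, so each team file only states what differs from the base.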
Run: 2026-05-27-weekly-digest-2026-05-13_2026-05-27-frontier-v0
2026-05-27 / Gemini CLI

Auto modes collapse and PolicyEngine reaches into ACP sessions

Control Plane
- Operators on previous Auto variants must re-audit which behaviors the consolidated Auto mode treats as safe: the merger may have loosened or tightened constraints, and the release notes do not enumerate the changes.
- `AUTO_EDIT` operators should explicitly decide whether shell-redirect auto-approval is acceptable for their environment.
- Operators evaluating Gemini ACP integration should treat PolicyEngine-in-ACP as the new enforcement boundary; the 'deadlock fix' framing understates the structural shift.
Run: 2026-05-27-weekly-digest-2026-05-13_2026-05-27-frontier-v0
2026-05-27 / Hermes Agent

`hermes proxy`: local OpenAI-compatible endpoint backed by operator OAuth

Control Plane

composes with Aider, Cline, Codex, Continue
- Operators running `hermes proxy` on the documented loopback default (`--host 127.0.0.1`) inherit a low-risk posture: the proxy accepts client `Authorization` headers and strips them before attaching the Hermes OAuth credential to the upstream request. Operators changing the bind to a non-loopback address must place their own auth in front of the port, because the proxy itself does not authenticate local callers.
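
The header handling and the bind-address concern can be sketched as two small functions. This is an illustration under stated assumptions, not Hermes code: the function names are hypothetical, and the `Bearer` scheme for the upstream credential is assumed.

```python
import ipaddress


def upstream_headers(client_headers: dict, oauth_token: str) -> dict:
    """Drop whatever Authorization the client sent and attach the operator's credential."""
    kept = {k: v for k, v in client_headers.items() if k.lower() != "authorization"}
    kept["Authorization"] = f"Bearer {oauth_token}"  # assumed scheme
    return kept


def is_loopback(host: str) -> bool:
    """True when the bind host only accepts connections from the local machine."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # unknown hostnames are treated as exposed
```

Because the client's header is discarded rather than checked, anything that can reach the port acts with the operator's credential, which is why `is_loopback` returning `False` should trigger a separate auth layer.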
Run: 2026-05-27-weekly-digest-2026-05-13_2026-05-27-frontier-v0
2026-05-27 / Hermes Agent

Honcho identity mapping and credential-pool isolation

Control Plane

composes with Aider, Cline, Codex, Continue
- Multi-user gateway operators should upgrade past the Honcho commits (week of 2026-05-21) and the credential-pool isolation commit (2026-05-27) before running shared-thread deployments — these are quiet correctness fixes for cross-user contamination.
Run: 2026-05-27-weekly-digest-2026-05-13_2026-05-27-frontier-v0
2026-05-27 / Paperclip

Scoped agent permissions, layered routine secrets, document locks

Control Plane
- Multi-agent operators: re-evaluate Paperclip's authz model. The principal-access backfill means pre-existing data is being normalized to the new model — confirm any operator action needed for older versions.
- Secret-handling operators: read PR #6212 before configuring routine env in a deployment where secrets matter — the `agent < project < routine` precedence is a structural operator concept.
- Approval-discipline operators: migrate to lock-backed approval; document locks give approval a persistent surface.
- ACPX-Claude operators: confirm `~/.claude/settings.json` is configured as the source of truth for Claude permissions — the Paperclip control plane defers to it.
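
The `agent < project < routine` precedence can be pictured as a layered merge. The sketch below reads `<` as "is overridden by", so routine values win; that reading, the function name, and the dictionary shape are assumptions, and PR #6212 remains the authority on actual behavior.

```python
LAYERS = ("agent", "project", "routine")  # lowest to highest precedence (assumed reading)


def resolve_routine_env(layers: dict) -> dict:
    """Return {name: (value, source_layer)} after applying layers in order."""
    resolved = {}
    for layer in LAYERS:
        for name, value in layers.get(layer, {}).items():
            resolved[name] = (value, layer)  # later layers overwrite earlier ones
    return resolved


example = resolve_routine_env({
    "agent": {"API_TOKEN": "agent-token", "REGION": "us"},
    "project": {"API_TOKEN": "project-token"},
    "routine": {"REGION": "eu"},
})
```

Keeping the source layer beside each value lets a reviewer see which layer a secret came from without exposing the other layers' values.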
Run: 2026-05-27-weekly-digest-2026-05-13_2026-05-27-frontier-v0
2026-05-13 / OpenClaw

Per-sender tool policies via channel-scoped sender keys

Control Plane
- Operators running OpenClaw with public-facing channels can now restrict dangerous tools by requester identity rather than only by agent. Review your tool surfaces and decide whether the broader trust model (per-channel × per-sender) belongs in your deployment.
- Authority restriction now extends across global, agent, group, core, bundled, and plugin tool surfaces — operators should re-audit which surfaces hold authority decisions in their deployment and whether the requester-level layer makes some prior per-agent restrictions redundant.
- Three claim-level updates land in the same release: memory-wiki ingest now requires admin scope, Obsidian search requires write scope, and `openclaw models auth login --provider openai` defaults to ChatGPT/Codex login (API-key setup is now behind `--method api-key`). Setup scripts assuming read-only or API-key-first paths need to be updated.
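
A rough model of the per-channel × per-sender layer, to help when deciding whether prior per-agent restrictions are now redundant. The `channel:sender` key format, the `*` wildcard, and the function name are hypothetical and do not reflect OpenClaw's configuration syntax.

```python
def allowed_tools(agent_tools: set, sender_denies: dict, channel: str, sender: str) -> set:
    """Subtract the requester-level deny set from the agent's tool surface."""
    exact = f"{channel}:{sender}"
    wildcard = f"{channel}:*"  # hypothetical fallback for every sender in a channel
    denied = sender_denies.get(exact, sender_denies.get(wildcard, set()))
    return agent_tools - denied


agent_tools = {"search", "shell", "file_write"}
sender_denies = {
    "public-chat:*": {"shell", "file_write"},
    "public-chat:admin-1": set(),
}
```

Here an unknown sender in `public-chat` is left with `search` only, while `admin-1` keeps the full agent surface; the same agent therefore presents different authority depending on who asked.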
Run: 2026-05-13-partial-cycle-openclaw-refresh-2026-05-13-frontier-v0
2026-05-12 / Paperclip

Secrets provider vaults (AWS Secrets Manager), host env isolation fix, cursor_cloud adapter

Control Plane
- Operators running SSH-managed execution environments should upgrade immediately: the host env isolation fix (PR #5142) closes a path where host environment variables (API keys, tokens, paths) were being forwarded to remote execution targets.
- Operators managing credentials at scale should evaluate the AWS Secrets Manager import path in Secrets settings UI — this enables rotation-aware credential management with an access-event audit trail.
- Operators using Cursor as an adapter can now configure the new `cursor_cloud` adapter for cloud-hosted Cursor routing with session reuse, streaming, and cancellation.
Run: 2026-05-12-partial-cycle-paperclip-2026-05-07_2026-05-12-frontier-v0
2026-05-12 / OpenHands

Sub-agent delegation (opt-in) and critic evaluation GUI

Control Plane
- Operators running multi-task sessions can now enable sub-agent delegation via `enable_sub_agents`. Built-in sub-agents (bash-runner, code-explorer, general-purpose, web-researcher) handle scoped tasks with restricted tool surfaces. Default is off -- enable deliberately.
- Operators should configure `CRITIC_API_KEY` to route critic evaluation spend separately from the primary model key if centralized cost control matters.
- The critic display is deployment-controlled via `OH_ENABLE_CRITIC_BY_DEFAULT` (disabled by default). Deployments that want it enabled should set that flag; per-deployment toggle is `verification.critic_enabled`.
Run: 2026-05-12-partial-cycle-openhands-2026-05-07_2026-05-12-frontier-v0
2026-05-12 / OpenClaw

Per-agent message restrictions, gated code install, and onboarding wayfinding

Control Plane / Platform
- Operators deploying public-facing or sandboxed agents should evaluate `tools.message.crossContext` and `tools.message.actions.allow` overrides to restrict agent message sends to the current conversation without changing the global bot policy.
- Operators running long-horizon OpenClaw sessions should know that session memory is now bounded: the memory dreaming promotion cap compacts oldest auto-promoted sections while preserving user-authored notes. Unbounded auto-memory growth is no longer the default behavior.
- Operators deploying OpenClaw for new users should test the improved CLI onboarding wayfinding: setup, onboarding, configure, and channel commands now explain the next useful command at each step.
Run: 2026-05-12-partial-cycle-openclaw-2026-05-07_2026-05-12-frontier-v0
2026-05-12 / Hermes Agent

Durable Kanban with hallucination gate, redaction-on-by-default, channel allowlists

Control Plane
- Operators upgrading existing Hermes deployments must verify that secret redaction is now ON by default. Log pipelines that relied on unredacted output will see sanitized logs after upgrade.
- Discord operators with role-gated access (`DISCORD_ALLOWED_ROLES`) should re-verify their role-scoping configuration: the guild-scoped fix (CVSS 8.1) may change behavior in cross-guild bot deployments.
- Operators building multi-agent workflows on Hermes should evaluate the Kanban board's reliability primitives (heartbeat reclaim, zombie detection, hallucination gate, per-task retries) before building a custom coordination layer.
- Operators using cron should evaluate `no_agent` mode for script-only automation that does not require LLM invocation.
Run: 2026-05-12-partial-cycle-hermes-agent-2026-05-07_2026-05-12-frontier-v0
2026-05-12 / Gemini CLI

Session resume now surfaces errors and finds legacy sessions

Control Plane
- Operators using --resume with legacy session formats should re-test: prior to this fix, resume failures silently started new sessions. Verify the behavior after upgrade.
Run: 2026-05-12-partial-cycle-gemini-refresh-2026-05-12-frontier-v0
2026-05-12 / Codex

PreToolUse hooks can now rewrite tool inputs before execution

Control Plane
- Hook authors who returned updatedInput in PreToolUse hooks expecting rewrites to apply should re-test: prior to this fix, the original input was used; after this fix, the rewritten input is used. Verify existing hooks behave as intended after upgrade.
- Operators can now build input-sanitizing PreToolUse hooks that modify tool arguments before dispatch -- path normalization, argument masking, destination redirection.
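
As an illustration of path normalization in a PreToolUse hook, the sketch below shows only the decision logic. The `updatedInput` key comes from the entry above; the `path` field, the function name, and returning an empty dictionary for "no change" are assumptions, and the real hook input/output envelope should be taken from the Codex hook documentation.

```python
import posixpath


def pre_tool_use(tool_input: dict) -> dict:
    """Return a rewrite when the path argument is not already normalized."""
    path = tool_input.get("path")  # hypothetical argument name
    if path is None:
        return {}  # nothing to sanitize
    normalized = posixpath.normpath(path)
    if normalized == path:
        return {}  # leave the original input untouched
    return {"updatedInput": {**tool_input, "path": normalized}}
```

Since the rewritten input is now what executes, a hook like this should be tested against the tools it covers: before the fix its output was ignored, so latent bugs in existing hooks would have been invisible.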
Run: 2026-05-12-partial-cycle-codex-refresh-2026-05-12-frontier-v0
2026-05-12 / Claude Code

Agent view, goal completion, and governance hardening

Control Plane
- `claude agents` is the new canonical surface for multi-session supervision; operators running parallel Claude Code sessions should evaluate it now as their primary management interface.
- /goal changes how long-running autonomous work is structured; operators should test goal-based termination against their most common multi-turn workflows.
- `continueOnBlock` enables advisory governance hooks; existing PostToolUse blocks should be redesigned to pass rejection reasons so Claude can adapt rather than just stop.
- `x-claude-code-agent-id` / `x-claude-code-parent-agent-id` headers and OTel span attributes enable call-tree attribution; logging pipelines receiving Anthropic API calls should start capturing these to distinguish parent sessions from subagents.
- API key auth now disables Remote Control, /schedule, and claude.ai MCP connectors; operators using API-key auth should audit reliance on these surfaces before upgrading.
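
For the call-tree attribution point, a logging pipeline could group captured requests by the two headers. This sketch assumes each logged request exposes its headers as a dictionary and that top-level sessions omit the parent header; both are assumptions to verify against real traffic.

```python
from collections import defaultdict

AGENT_HEADER = "x-claude-code-agent-id"
PARENT_HEADER = "x-claude-code-parent-agent-id"


def build_call_tree(requests: list) -> dict:
    """Map each parent agent id (None for top-level sessions) to its child ids."""
    children = defaultdict(set)
    for headers in requests:
        lowered = {name.lower(): value for name, value in headers.items()}
        agent_id = lowered.get(AGENT_HEADER)
        if agent_id is None:
            continue  # request was not tagged; cannot be attributed
        children[lowered.get(PARENT_HEADER)].add(agent_id)
    return dict(children)
```

The resulting map is enough to roll subagent spend up to the session that spawned it.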
Run: 2026-05-12-partial-cycle-claude-code-2026-05-07_2026-05-12-frontier-v0
2026-05-11 / Codex

Permissions glance surface and role-aware plugin sharing

Control Plane
- Run receipts should record permission posture + approval mode as standard fields.
- Plugin share role-awareness affects whether configs can be shared across roles.
- Authority visibility in the TUI is a worked example of governance ergonomics worth borrowing.
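
A run receipt carrying those two fields might look like the sketch below. The class, field names, and example values are hypothetical; the point is only that posture and approval mode sit beside the run id as required fields, not in free-form notes.

```python
from dataclasses import asdict, dataclass

REQUIRED_AUTHORITY_FIELDS = ("permission_posture", "approval_mode")


@dataclass(frozen=True)
class RunReceipt:
    run_id: str
    worker: str
    permission_posture: str  # e.g. the name of the active permission profile
    approval_mode: str       # e.g. how approvals were handled during the run


def missing_authority_fields(receipt: dict) -> list:
    """Return required authority fields that are absent or empty."""
    return [name for name in REQUIRED_AUTHORITY_FIELDS if not receipt.get(name)]


receipt = asdict(RunReceipt("run-001", "codex", "workspace-write", "on-request"))
```

A receipt store can reject any record for which `missing_authority_fields` is non-empty, which turns the recommendation into an enforced invariant.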
Run: 2026-05-11-partial-cycle-codex-2026-05-08_2026-05-11-frontier-v0
2026-05-11 / Gemini CLI

Subagents become pluggable; sessions become portable

Control Plane
- The capability-profile assumption "subagents inherit approval mode" is now under-specified.
- Run-contract design should record which subagent protocol variant a run used.
- Adapter work should distinguish local from remote subagent execution.
- Session export/import gives operators a stable serialization point.
Run: 2026-05-11-partial-cycle-2026-05-08_2026-05-11-frontier-v0
2026-05-07 / Paperclip

Agent labor needs operating state, not just parallelism.

Control Plane

Run: 2026-05-07-expanded-watchlist-dry-run
2026-05-07 / Codex / Gemini CLI / Hermes Agent / OpenClaw

Persistent agent state is becoming a product surface

Control Plane
- Developers need to know which goals, memory patches, recaps, sessions, and skill maintenance loops shaped a serious run.
Run: 2026-05-07-commit-harvest-2026-04-23_2026-05-07-frontier-v1
2026-05-07 / Codex / Gemini CLI / OpenHands / OpenClaw / Paperclip / Agent Zero

Permissions, secrets, and sandboxes are moving into the foreground

Control Plane
- The harness must make trust state visible: what can be read, what can be changed, which credentials are exposed, and where execution happens.
Run: 2026-05-07-commit-harvest-2026-04-23_2026-05-07-frontier-v1
2026-05-07 / Paperclip / OpenHands / Hermes Agent / Codex / OpenClaw

Agent systems are growing control planes

Control Plane
- Once agents coordinate across tasks, runtimes, gateways, and integrations, operators need liveness, cost, role, session, and recovery controls.
Run: 2026-05-07-commit-harvest-2026-04-23_2026-05-07-frontier-v1
2026-05-06 / Codex

Worker-native goals unlock longer horizons.

Control Plane
- Operators now need to ask which durable objective the worker is pursuing, whether it is still aligned with the operator's charter, and how it maps to the current run scope.
Run: 2026-05-06-manual-2026-04-22_2026-05-06-frontier-v0
2026-05-06 / Claude Code / Gemini CLI / Hermes Agent

Worker-native state is becoming a memory layer.

Control Plane
- Recaps, memory patches, skill curators, and task state are moving into worker tools. Operators should use them, but should preserve an operator-owned record of what state governed each run.
Run: 2026-05-06-manual-2026-04-22_2026-05-06-frontier-v0
2026-05-06 / Codex / Claude Code / Gemini CLI / Pi Coding Agent

Authority semantics are explicit but fragmented.

Control Plane
- Permission profiles, workspace trust, env loading, hooks, MCP behavior, extension schemas, and provider transports differ by worker and release.
Run: 2026-05-06-manual-2026-04-22_2026-05-06-frontier-v0
2026-05-06 / Claude Code / Codex / Gemini CLI / Hermes Agent

Verification is becoming a worker capability.

Control Plane
- Provider-native review, multi-agent execution, subagent evals, curator reports, and QA-like cloud fleets can catch useful issues, but their verdicts are not automatically the operator's truth.
Run: 2026-05-06-manual-2026-04-22_2026-05-06-frontier-v0
2026-05-06 / Codex / Claude Code / Gemini CLI / Hermes Agent

Provider-native long-horizon state is now table stakes.

Control Plane

Run: 2026-05-06-candidate-2026-04-22_2026-05-06-frontier-v1
