Field Notes For Agent Operators
Coding agents are changing faster than operating policy.
Weekly, source-backed field notes for teams running coding agents in production: what changed, what broke assumptions, what got easier, and what to test before the next rollout. Covers 10 providers.
Private Launch
Founding member access
Support the weekly operator brief and record membership through BitterHub as private launch access comes online.
Latest Issue
Who's Allowed to Say Yes
The most consequential pattern this fortnight was not a new capability but the authority work racing to catch up with one. Agents got deeper this window (subagents spawning subagents, shared-tenant orgs, agents reviewing untrusted code, real desktop control reaching Europe), and nine of ten providers spent it deciding, and structurally enforcing, who is allowed to do what, repeatedly closing the gap between the control they had documented and the one their runtime enforced. A Hermes maintainer named it in a commit: an unpaired write-deny rule is 'theater.' The catch for operators is the release channel: several of the sharpest fixes are merged to a default branch but not yet in a tagged release. The quieter second story is a market reshuffle: Google began steering Gemini CLI users toward a separate successor CLI, Codex added import of Claude Code setup, and Anthropic's new Fable 5 model was picked up by other harnesses within days.
- Upgrade / check
- Claude Code: upgrade past 2.1.172. Two fixes land by then: untrusted project settings can no longer set OTEL client-certificate paths without a trust prompt (2.1.169), and pre-warmed background workers no longer read another directory's .mcp.json approvals and trust (2.1.172). Re-audit background-agent and untrusted-repo workflows. https://code.claude.com/docs/en/changelog
- Try
- Claude Code operators: subagents can now spawn subagents five levels deep (2.1.172), and 2.1.178 made the auto-mode classifier evaluate a spawn before it launches. Upgrade past 2.1.178, then use the new Tool(param:value) permission syntax (Agent(model:opus)) to cap model tiers inside delegated trees. https://code.claude.com/docs/en/changelog
- Watch
- The authority build-out is structural, not cosmetic: Claude Code argument-aware permissions and a classifier that gates subagent spawns, Paperclip deny-by-default review containment for untrusted content and per-tenant identity isolation, OpenHands first-signer-owns-it org bootstrap and model-access gating, Codex listable and revocable remote-control grants, Pi project trust. Watch whether per-action consent is being replaced by versioned, enforced policy faster than operators can audit it. https://github.com/paperclipai/paperclip/pull/7530
Provider Updates
Recent Signals
-
Skills were poisoning every memory store and a skill delete could wipe the working tree (unreleased)
June 16 commits (main) stop a /skill invocation from poisoning every connected memory provider with its raw body, and add tree-escape validation so an agent-triggered skill delete cannot rmtree outside the skills root (a fix ported from an incident that wiped another tool user's working directory). This is the self-improving-agent risk class made concrete.
-
Subagents can spawn subagents five deep, and auto mode now classifies spawns before launch
2.1.172 lets a subagent spawn its own subagents up to 5 levels deep (new capability and a new governance surface); 2.1.178 then made the auto-mode classifier evaluate a spawn before launch, closing a gap where a deeply nested agent could request an action the operator's policy would block at the top.
-
Permission rules can finally match a tool's arguments (Agent(model:opus))
2.1.178 added Tool(param:value) syntax so a rule can match input parameters, e.g. Agent(model:opus) blocks Opus subagents; permissions move from all-or-nothing per tool to per-argument.
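To make the shift from per-tool to per-argument rules concrete, here is a minimal Python sketch of how a rule written as Tool(param:value) could be matched against a tool call. It is an illustration only, not Claude Code's implementation: parse_rule, rule_matches, and is_denied are hypothetical names, and exact string equality is an assumption, since the changelog summary above does not describe wildcard or case handling.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical grammar: "Tool" or "Tool(param:value)".
RULE_RE = re.compile(r"^(?P<tool>\w+)(?:\((?P<param>\w+):(?P<value>[^)]+)\))?$")


@dataclass(frozen=True)
class Rule:
    tool: str
    param: Optional[str] = None
    value: Optional[str] = None


def parse_rule(text: str) -> Rule:
    """Parse a rule string into its tool name and optional param/value pair."""
    m = RULE_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognized rule: {text!r}")
    return Rule(m["tool"], m["param"], m["value"])


def rule_matches(rule: Rule, tool: str, args: dict) -> bool:
    """A bare rule matches every call to the tool; a param rule needs an exact value match."""
    if rule.tool != tool:
        return False
    if rule.param is None:
        return True
    # Assumption: exact string equality on the named argument.
    return rule.param in args and str(args[rule.param]) == rule.value


def is_denied(deny_rules: list, tool: str, args: dict) -> bool:
    """Return True if any deny rule matches the call."""
    return any(rule_matches(parse_rule(r), tool, args) for r in deny_rules)
```

With a deny list of ["Agent(model:opus)"], a call to Agent with model "opus" is denied while the same tool with a different model passes, which is the per-argument behavior the entry describes.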
-
Concurrency becomes a governed, billable resource (Personal 3, commercial 10; unreleased)
PR #14168 (main, unreleased) caps concurrent conversations/sandboxes (Personal=3, commercial=10) with per-org and per-user override columns and HTTP 429 enforcement. A real resource-control and economics surface; tightens the free tier.
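The cap-plus-override logic can be sketched in a few lines of Python. The plan defaults (3 and 10) and the 429 status come from the PR summary above; the function names and the precedence order (user override, then org override, then plan default) are assumptions made for illustration, not details confirmed by the PR.

```python
from typing import Optional

# Plan defaults as reported in the PR summary.
PLAN_DEFAULTS = {"personal": 3, "commercial": 10}


def effective_limit(
    plan: str,
    org_override: Optional[int] = None,
    user_override: Optional[int] = None,
) -> int:
    """Resolve the concurrency cap for one user.

    Assumption: a user-level override beats an org-level override,
    which beats the plan default.
    """
    if user_override is not None:
        return user_override
    if org_override is not None:
        return org_override
    return PLAN_DEFAULTS[plan]


def admit(active: int, limit: int) -> int:
    """Return the HTTP status for a request to start one more conversation."""
    return 429 if active >= limit else 200
```

Operators testing against this kind of cap would want to confirm which override actually wins in the shipped version, since that decides whether an org admin can raise a limit that a per-user setting has lowered.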
-
Fire-and-forget background subagents that re-inject results as a new turn (unreleased)
delegate_task(background=true) (main) dispatches an async subagent and re-injects its result as a new turn, with /stop and /agents as the control surface and a max_async_children cap. The same week removed the default 600s subagent timeout, so runaway detection now rests on heartbeat staleness alone. Changes the unit of work and the receipt boundary.
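With the fixed timeout gone, runaway detection depends on two things named above: a cap on concurrent background children and a heartbeat staleness check. The Python sketch below shows one way those two controls could fit together. ChildRegistry, stale_after_s, and the method names are hypothetical; only max_async_children is taken from the entry, and nothing here reflects the project's actual code.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChildRegistry:
    max_async_children: int   # cap named in the entry above
    stale_after_s: float      # hypothetical staleness threshold, in seconds
    last_beat: dict = field(default_factory=dict)

    def _now(self, now: Optional[float]) -> float:
        # Callers may inject a clock value, which keeps the logic testable.
        return time.monotonic() if now is None else now

    def spawn(self, child_id: str, now: Optional[float] = None) -> None:
        """Register a background child, refusing once the cap is reached."""
        if len(self.last_beat) >= self.max_async_children:
            raise RuntimeError("max_async_children reached")
        self.last_beat[child_id] = self._now(now)

    def heartbeat(self, child_id: str, now: Optional[float] = None) -> None:
        """Record a liveness signal from a registered child."""
        if child_id not in self.last_beat:
            raise KeyError(child_id)
        self.last_beat[child_id] = self._now(now)

    def stale(self, now: Optional[float] = None) -> list:
        """Return children whose last heartbeat is older than the threshold."""
        t = self._now(now)
        return sorted(
            cid for cid, beat in self.last_beat.items()
            if t - beat > self.stale_after_s
        )
```

The design point for operators: a child that keeps sending heartbeats while doing the wrong thing is never flagged by a check like this, so staleness alone bounds hangs, not runaway work.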
-
Three path-traversal holes in agent skill install/link/uninstall (fixed on main only)
Commit bca5667fc / PR #27767 (main, ahead of every stable, preview, and nightly tag as of 2026-06-16) fixes three path-traversal vulnerabilities so a malicious skill package cannot write outside .gemini/skills or delete sibling directories. The clearest confirmation that agent skill packages are an untrusted-input boundary; treat third-party skill installs as untrusted until the carrying release ships.
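The class of check involved, confining a skill name to a fixed root before writing or deleting, can be sketched in Python as follows. This is a generic containment check and not the code from the commit above; resolve_inside and the example root are hypothetical names chosen for illustration.

```python
from pathlib import Path


def resolve_inside(root: Path, name: str) -> Path:
    """Resolve name under root and reject anything that escapes it.

    Resolving first collapses ".." segments and follows symlinks that
    already exist on disk, so the comparison runs on the real target path.
    """
    root = root.resolve()
    candidate = (root / name).resolve()
    # Reject the root itself and any path that is not strictly below it.
    if candidate == root or root not in candidate.parents:
        raise ValueError(f"path escapes skills root: {name!r}")
    return candidate
```

Calling resolve_inside(Path(".gemini/skills"), "my-skill") returns a path under the root, while "../sibling" or an absolute path such as "/etc/passwd" raises ValueError. The same check applies to install, link, and uninstall, and a delete should run it immediately before removal rather than reuse an earlier result.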