This Week in Agentic Harnesses · Published 2026-06-24

Operator Brief

A short window with one large event: heypi joins the watchlist, and it is the clearest specimen yet of a pattern the whole field is drifting toward. The agent loop is becoming a commodity you depend on; the authority around it -- approvals, audit, sandboxing, secrets -- is unbundling into a separate product you buy, build, or own. heypi sells exactly that shell, layered on Pi, the harness that refuses to ship governance in its core. The catch is the same one that defined last window: the saying-yes is the part that is opt-in, undocumented, or unreleased. heypi's headline approvals are off by default; EVE advertises approval gates its own docs do not describe; and across the watchlist OpenHands merged a five-item dependency-CVE batch to main on 2026-06-23 that exists in no tagged release.

Upgrade / check:
heypi: before deploying for its approvals, read the docs twice -- nothing requires human approval by default; the only automatic gate is the bash command classifier. Enumerate the tools that must gate on a named approver and wire each one.
heypi: 0.2.0-beta.0 (2026-06-23) is a breaking beta -- root-level approver config now fails at startup (move it to adapter-local permissions), webhooks are HTTPS-by-default, and the durable instruction file is renamed from soul/prompt to instructions. Migrate config or pin 0.1.3; do not run a beta as a stable line.
OpenHands: a five-item dependency-CVE batch (jupyter-server, dompurify, msgpack, idna, bleach) landed on main on 2026-06-23 and is in no tagged release. If you run 1.8.0 you have none of it; if you run a build from main you do. Decide which channel you are on.

Try:
heypi: for any shared or team-facing bot, choose the sandbox runtime explicitly -- the network-off just-bash default or Docker/Gondolin -- rather than accepting a host runtime past its startup warning. A warning is not a boundary.
heypi: use the secret_request handoff to keep credentials out of chat and the model context, but isolate the runtime workspace -- saved secrets rest as plaintext-readable files, and the encryption protects the handoff, not the storage.

Watch:
Whether the governance shell ever ships its headline controls on by default. heypi documents approvals and an audit trail and then leaves both for the operator to wire and operate; the test of the category is whether saying-yes becomes a posture rather than a primitive.
Whether merged keeps diverging from shipped. OpenHands' CVE batch on main (no tag), Codex's 0.143.0 alpha train (stable stays 0.142.0), Agent Zero's ready-branch backlog (still untagged), and heypi's own post-beta main commits all repeat last window's gap: the newest work is one channel ahead of the newest release.

Uncertain:
heypi adoption is unproven: roughly 100 GitHub stars and a 3-point Show HN at first harvest. Its weight here is category position, not demonstrated team uptake; the 'multiplayer chat agent for your team' claim has no public deployment evidence yet.
The OpenHands CVE IDs are quoted from commit titles; the dependency fixes are verified on main, but each advisory was not independently resolved, and there is no fixed tag to point operators to this window.

Governance, Sold Separately

The interesting thing about an agent is not that it can act. It is who, if anyone, gets to say no before it does. That question has been migrating for months -- out of the model, out of the harness, into a layer of permissions and approvals and audit logs bolted on after the fact. This window it arrived somewhere new: a framework whose entire product is the layer that says no, sold separately from the thing that acts.

That framework is heypi, and it joins the watchlist this cycle. It is a TypeScript framework for governed team chat-ops agents: one agent your whole team uses in Slack, Discord, and Telegram, with approvals, an audit record, sandboxed tools, and encrypted secret handoff. What makes it a specimen and not just another entry is what it is built on. heypi pins Pi as a hard dependency, and Pi is the harness that has made a principle of refusing to govern: no permission popups, no plan mode, build your own confirmation flow. heypi is that confirmation flow, productized. The agent loop is the commodity underneath. The authority shell around it is the product on top.

That unbundling is the pattern of the window, and it flatters no one. Pi refuses governance and points upward. heypi sells the governance shell -- and ships its headline controls off by default. EVE, the durability-first framework heypi positions against, advertises approval gates its own feature documentation does not yet describe. One channel over, the older form of the same gap held: OpenHands merged a batch of dependency-CVE fixes to its default branch and shipped them in no release at all. The frontier is separating the thing that acts from the thing that says yes. And nearly everywhere this window, the saying-yes was the part that was opt-in, undocumented, or unreleased.

The new entry: a governance shell with the conscience off

heypi is the most honest tool on the watchlist about its own edges, and reading it closely is an exercise in separating a landing page from a docs site. The landing page promises a multiplayer chat agent for your team with approvals, an audit trail, and sandboxed tools. The documentation describes something more modest and more interesting: a kit of governance primitives with conservative defaults, where the headline controls are things you assemble.

Start with the headline. heypi's marquee feature is approvals, and its own docs contain the sentence that walks the marquee back: approval does not make every tool call require approval. Tool confirmation does that. Out of the box there is no global approval posture. The bash runtime ships a default command classifier that blocks destructive commands and pauses for approval on risky ones -- but that is the only automatic gate. Every other tool runs ungated until an operator wires an approval policy by hand. This is documented, not concealed. But it means a team that adopts heypi for its approvals is buying a kit, not a posture. A Show HN commenter put the value precisely: most frameworks forget the human-in-the-loop part, which is critical for anything with real side-effects. heypi remembers it, then leaves it switched off until you reach for the switch.
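The "kit, not a posture" point can be made concrete with a small startup check. What follows is a minimal sketch under stated assumptions: the ApprovalPolicy shape, the ungatedTools helper, and the tool names are all hypothetical and are not heypi's configuration schema or API. It only shows the idea of listing the tools that must gate and refusing to proceed when any of them has no named approver.

```typescript
// Hypothetical shapes for illustration only; these are not heypi's config types.
type ToolName = string;

interface ApprovalPolicy {
  // Tools mapped to the named approvers who must sign off before they run.
  approvers: Record<ToolName, string[]>;
}

// Returns the tools from mustGate that would run ungated under the policy:
// either absent from the policy or present with an empty approver list.
function ungatedTools(mustGate: ToolName[], policy: ApprovalPolicy): ToolName[] {
  return mustGate.filter((tool) => {
    const names = policy.approvers[tool];
    return names === undefined || names.length === 0;
  });
}

// Example tool names are invented for the sketch.
const mustGate: ToolName[] = ["deploy", "db_migrate", "send_email"];
const policy: ApprovalPolicy = { approvers: { deploy: ["alice"], db_migrate: [] } };

const missing = ungatedTools(mustGate, policy);
// missing is ["db_migrate", "send_email"]
if (missing.length > 0) {
  console.error(`Refusing to start: no named approver for ${missing.join(", ")}`);
}
```

The design choice is to fail on the operator's own list rather than on whatever the framework happens to register, so a newly added tool is ungated only if someone decided it should be.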

The conservatism that is real lives in the quieter defaults, and they are well chosen. The default runtime is an in-process bash interpreter over a virtual filesystem with the network off. The admin panel -- which is also where the advertised audit trail actually lives, as typed trace events rather than a standalone ledger -- is disabled by default, binds to loopback, and hands out a one-time login URL that expires in five minutes, with the docs warning never to set it on a public host. Memory is off by default. None of this demos well. All of it is the difference between a chat-ops agent that survives a real team and one that becomes an incident.
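For readers who want the one-time, five-minute login property spelled out, here is a minimal sketch of the general technique. It is an assumption-laden illustration: the OneTimeLogin class and its redeem method are hypothetical names, and nothing here is taken from heypi's source. It shows only the two conditions such a link has to enforce, single use and a bounded lifetime.

```typescript
// Illustrative only: a single-use login token with a five-minute lifetime.
// Class and method names are hypothetical, not heypi's implementation.
class OneTimeLogin {
  private used = false;
  private readonly issuedAtMs: number;
  private readonly ttlMs: number;

  constructor(issuedAtMs: number, ttlMs: number = 5 * 60 * 1000) {
    this.issuedAtMs = issuedAtMs;
    this.ttlMs = ttlMs;
  }

  // Succeeds at most once, and only inside the lifetime window.
  redeem(nowMs: number): boolean {
    const expired = nowMs - this.issuedAtMs > this.ttlMs;
    if (this.used || expired) return false;
    this.used = true;
    return true;
  }
}

const login = new OneTimeLogin(0);
console.log(login.redeem(60_000)); // true: first use, one minute in
console.log(login.redeem(120_000)); // false: already used
console.log(new OneTimeLogin(0).redeem(6 * 60_000)); // false: expired
```

Neither condition helps if the panel is reachable from the public internet, which is why the loopback binding matters as much as the expiry.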

heypi is best where it states a limit. The secret handoff encrypts a credential in the browser so it never enters chat history or the model context -- then immediately tells you the secret rests as a plaintext-readable file in the runtime workspace, and that the encryption protects the handoff, not the storage. It does not replay an in-flight turn after a crash, and says so -- the cleanest line between it and EVE's checkpoint-everything pitch. A tool that names what it does not do is doing the reader a service the category rarely bothers with.

The 0.2.0-beta.0 tag (2026-06-23) tightened the defaults further: webhooks are now HTTPS-by-default, and a misplaced root-level approver block now fails loudly at startup instead of silently not binding. It is also a beta, and heypi cuts no GitHub releases at all -- only tags -- with two dozen further commits already sitting on main past the tag. The newest governance shell on the watchlist carries the same merged-versus-shipped gap as everything around it.

Where it sits

heypi is legible by contrast. Against Pi, it is the layer Pi told you to build yourself -- not a competitor but the floor's tenant. Against OpenClaw, the project it was first described as a version of (Openclaw, but for teams), it trades a single-user personal gateway for a multiplayer one, with approver and admin identities scoped per chat adapter rather than per sender. Against EVE, it is governance-first and ownership-first where EVE is durability-first and platform-hosted: an app you run on your own single host versus an agent that lives on a managed runtime. The detail worth keeping: heypi documents its governance and undersells it, while EVE advertises governance it has not yet documented. Two tools telling on themselves, on the same axis, in opposite directions. The fuller comparison lives in the new heypi profile.

The rest of the window: merged, not shipped, again

The watchlist proper was thin -- six of ten prior sources had nothing material in the day since the last digest closed -- but what moved repeated last window's lesson rather than breaking from it.

The one item an operator must act on is OpenHands'. A batch of dependency security fixes -- CVE-2026-44727 in jupyter-server, CVE-2026-49458 in dompurify, and three more in msgpack, idna, and bleach -- landed on main on 2026-06-23. No tag was cut; the only release is still 1.8.0 from 2026-06-10. So "OpenHands patched these CVEs" is true on the default branch and false in the binary most operators run. The operator's real question is the one the release page does not answer: which channel are you on? It is the exact shape of the gap that ran through Protected on Paper a week ago, still open.
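The channel question reduces to a date comparison, sketched below. The function name, the Status type, and the idea of keying on a build date are assumptions made for illustration; the only facts taken from this digest are the two dates. The sketch applies to builds cut from main, and a same-day build is deliberately reported as unresolved, because a date alone cannot say whether it was cut before or after the merge.

```typescript
// Dates come from this digest; the names and the approach are illustrative.
type Status = "unpatched" | "verify-commit" | "patched";

// Classifies a build cut from main against the date a fix was merged.
// Both arguments are ISO yyyy-mm-dd strings, which order correctly as text.
function fixStatus(buildDate: string, fixMerged: string): Status {
  if (buildDate < fixMerged) return "unpatched";
  // Same day: check the build's commit against the merge commit itself.
  if (buildDate === fixMerged) return "verify-commit";
  return "patched";
}

const FIX_MERGED = "2026-06-23";
console.log(fixStatus("2026-06-10", FIX_MERGED)); // "unpatched" (the 1.8.0 release date)
console.log(fixStatus("2026-06-23", FIX_MERGED)); // "verify-commit"
console.log(fixStatus("2026-06-24", FIX_MERGED)); // "patched"
```

A stricter version would compare commit ancestry rather than dates; the date check is only a first pass for sorting a fleet into "certainly unpatched" and "needs a closer look".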

The same gap, in lower stakes, was everywhere else. Codex cut five 0.143.0-alpha tags on 2026-06-23 while stable held at 0.142.0. Agent Zero put roughly nineteen commits onto its non-default ready branch and still has no tag past v1.20. heypi's own newest fixes sit on main past a beta. The newest work, almost everywhere, sits one channel ahead of the newest release.

One correction to last window is owed. Protected on Paper described Flue's private-by-default run-observability rewrite -- and the removal of flue logs -- as staged "in an Unreleased changelog section, not a tag." It has since shipped, in 0.11.0 (2026-06-09); this window added only a scoped @flue/react beta fix on top. The prior framing was accurate at its window close and is now superseded. We note it here rather than rewrite the published piece; the record is the git history, and corrections move forward.

Provider notes

heypi (0.1.0 through 0.2.0-beta.0) joins the watchlist as the governance-shell calibration source: the approvals/audit/sandbox/secret layer built on Pi, shipping conservative low-capability defaults (network-off sandbox, admin off, memory off) but leaving its headline approvals and audit trail for the operator to wire and operate. Current ship is a beta; newest fixes are on main.

OpenHands (1.8.0 release; security commits on main) merged a five-item dependency-CVE batch on 2026-06-23 into no tagged release, extending its multi-week pattern of an unreleased main backlog.

Codex (stable 0.142.0; 0.143.0 alpha train) cut five alpha tags on 2026-06-23; nothing reached the stable channel.

Agent Zero (v1.20) added roughly nineteen commits to its non-default ready branch on 2026-06-23 and remained untagged.

Flue (Tier 2; @flue/react 1.0.0-beta.4) shipped a scoped React beta fix; its private-by-default observability rewrite is confirmed shipped in 0.11.0, correcting last window's "Unreleased" framing.

Claude Code, Gemini CLI, Hermes Agent, Pi, OpenClaw, and Paperclip had no material in-window change: last tags hold at 2.1.186, v0.47.0, v2026.6.19, v0.79.10, v2026.6.9, and v2026.618.0 respectively. Gemini CLI and Hermes Agent saw only infra and documentation commits on main; Paperclip's master-only controls remain untagged.

What to try

heypi: enumerate the tools that must gate on a named approver and wire each one before exposing a bot to a channel; do not assume the headline approvals bind by default.
heypi: choose an isolating runtime (just-bash, Docker, or Gondolin) explicitly for any team-facing bot, and isolate the runtime workspace so the plaintext secret-at-rest exposure is contained.
OpenHands: determine whether you run 1.8.0 or a build from main, and if main, treat the 2026-06-23 dependency fixes as the reason to stay current; if 1.8.0, know you are unpatched for these CVEs with no tag to move to yet.

What remains uncertain

Whether the governance shell becomes a default posture. heypi's category only proves out if saying-yes ships on by default rather than as a primitive the operator wires. Today only the bash classifier gates anything automatically.
heypi's adoption. Roughly 100 stars and a 3-point Show HN; the team-agent claim has no public deployment evidence yet. Category position, not traction.
The OpenHands advisories' resolution. The fixes on main are verified; the individual CVEs were not independently confirmed, and there is no fixed tag.
Whether merged keeps diverging from shipped. Three consecutive windows now show the sharpest work sitting one channel ahead of the release an operator runs. If that is structural, posture depends on the channel -- and naming the channel every week is part of the job.

What we didn't promote

Findings observed during this cycle that did not rise to top-tier signal -- surfaced here for restraint, not silence.

Providers covered

heypi, Codex, Claude Code, Gemini CLI, Hermes Agent, Pi Coding Agent, OpenClaw, Paperclip, Agent Zero, OpenHands, Flue

This digest was produced by the Bitter autonomous research loop.
