
Backstage: Protected on Paper (2026-06-16 .. 2026-06-23)

Internal product-intake companion to the public digest. Not for publication. It records what Bitter and Factory should learn from this window's research.

The load-bearing intake: channel is a capability-profile property

Two consecutive windows now show the same thing: a large share of the sharpest authority and security work merges to a default branch (or a preview tag, or a later version) without reaching the binary an operator runs. This is not noise; it changes what a capability profile means.

Bitter implication. A capability profile keyed on "the project added X" silently over-credits the deployed surface. Bitter's frontier-signal schema should carry an explicit channel field (tagged-release | main-unreleased | preview-or-beta) on every capability claim, and capability-profile assumptions should default to the tagged state unless the operator is known to run main. Examples from this window: OpenHands' entire enterprise/security cluster (unreleased for two windows), Gemini's skill path-traversal fix (preview-only for a second window), Hermes' MCP-persistence wave (on main, past v0.17.0), Paperclip's newest controls (on master, past v2026.618.0), and Flue's private-by-default observability (staged in an Unreleased changelog section). An adapter that assumes a main-merged fix is in force will mis-model the deployment.
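A minimal sketch of what a channel-carrying claim could look like, assuming a Python-side schema. The three channel values come from the text above; the class names, the `effective_claims` helper, and the `example-project` row are hypothetical and only illustrate the "default to tagged" rule.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class Channel(Enum):
    # The three channel values proposed in the note.
    TAGGED_RELEASE = "tagged-release"
    MAIN_UNRELEASED = "main-unreleased"
    PREVIEW_OR_BETA = "preview-or-beta"


@dataclass(frozen=True)
class CapabilityClaim:
    # Hypothetical record shape: one row per capability claim.
    project: str
    capability: str
    channel: Channel


def effective_claims(
    claims: Iterable[CapabilityClaim],
    runs_main: bool = False,
    runs_preview: bool = False,
) -> List[CapabilityClaim]:
    """Return only the claims assumed to be in force for an operator.

    Defaults to the tagged state; main or preview claims count only
    when the operator is known to run that channel.
    """
    accepted = {Channel.TAGGED_RELEASE}
    if runs_main:
        accepted.add(Channel.MAIN_UNRELEASED)
    if runs_preview:
        accepted.add(Channel.PREVIEW_OR_BETA)
    return [c for c in claims if c.channel in accepted]


claims = [
    CapabilityClaim("Hermes", "MCP-persistence wave", Channel.MAIN_UNRELEASED),
    CapabilityClaim("Gemini", "skill path-traversal fix", Channel.PREVIEW_OR_BETA),
    # Hypothetical placeholder row, not a finding from this window.
    CapabilityClaim("example-project", "placeholder tagged feature", Channel.TAGGED_RELEASE),
]
```

With the defaults, `effective_claims(claims)` keeps only the tagged row; passing `runs_main=True` also admits the Hermes row. The point of the design is that over-crediting requires an explicit opt-in rather than being the silent default.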

Declared vs enforced: test the boundary, do not trust the note

Claude Code disclosed that two announced authority features, the 5-level subagent depth cap and the Agent(type)/Agent(x,y) permission rules, did not actually bind until fixes landed this window. The lesson generalizes: a permission feature is not a permission boundary until something refuses the disallowed action.

Bitter implication. BitterBench / capability-probe work should test authority boundaries rather than ingesting changelog claims: write an Agent(type) deny and confirm a named spawn is refused; grant a Codex approval in one environment and confirm it does not leak to another. An "enforcement verified" bit on authority claims is worth more than the presence of the feature. This is a candidate eval pattern, not just a profile note.
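The probe pattern can be sketched as follows, assuming each harness is wrapped in a callable that attempts the disallowed action and reports whether it was refused. `AuthorityClaim`, `probe_boundary`, and the callable contract are hypothetical names for illustration, not an existing BitterBench API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AuthorityClaim:
    feature: str
    declared: bool              # the changelog or docs say the feature exists
    enforcement_verified: bool  # a probe observed the disallowed action refused


def probe_boundary(
    feature: str,
    attempt_disallowed: Callable[[], bool],
) -> AuthorityClaim:
    """Run one boundary probe.

    Assumption: `attempt_disallowed` tries the forbidden action and
    returns True only if the harness refused it. A probe that crashes
    proves nothing, so it is recorded as unverified.
    """
    try:
        refused = bool(attempt_disallowed())
    except Exception:
        refused = False
    return AuthorityClaim(feature, declared=True, enforcement_verified=refused)
```

A claim with `declared=True` and `enforcement_verified=False` is exactly the state this section warns about: the feature is on paper but nothing has been seen to refuse the action.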

Identity planes are splitting: a new credential membrane

OpenHands decoupled API-key auth from Keycloak sessions (machine identity no longer dies with the human SSO session) and generalized a per-user secret enricher that injects linked OAuth tokens into sandboxes from any conversation start path. Hermes added a root-owned, user-immutable /etc/hermes managed scope.

Bitter implication (BitterPass / BitterGrid). The "which credential follows which principal into the sandbox" question is now a first-class membrane concern. If Bitter wraps these harnesses, the secret-injection paths (web/Slack/API start paths carrying different linked tokens) are exactly where a wake-packet's credential scope must be explicit. Watch the machine-identity / human-SSO split as a pattern Bitter likely needs to mirror rather than inherit.

Runaway-cost ceiling: a gap Bitter should own, not borrow

Hermes shipped background fire-and-forget fan-out subagents with the default wall-clock timeout removed. A heartbeat/inactivity backstop remains, but it only catches idle workers: a busy runaway worker has no wall-clock or cost bound.

Bitter implication (run contract / BitterGrid). Do not rely on the harness for a spend ceiling. A Bitter run contract should impose a wall-clock and cost bound it owns and can enforce/replay, independent of whether the wrapped harness has one this release.
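A minimal sketch of a contract-owned ceiling, assuming the wrapper can observe per-step cost and call a check between steps. `RunContract`, `ContractBreached`, and the USD unit are hypothetical; the clock is injectable so that a recorded run can be replayed deterministically.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


class ContractBreached(RuntimeError):
    """Raised when a run exceeds a bound the contract owns."""


@dataclass
class RunContract:
    max_wall_seconds: float
    max_cost_usd: float
    # Injectable clock: a replay can feed recorded timestamps instead.
    clock: Callable[[], float] = time.monotonic
    spent_usd: float = field(default=0.0, init=False)
    started_at: float = field(init=False)

    def __post_init__(self) -> None:
        self.started_at = self.clock()

    def charge(self, cost_usd: float) -> None:
        """Record spend for one step, then enforce both bounds."""
        self.spent_usd += cost_usd
        self.check()

    def check(self) -> None:
        """Raise if either the wall-clock or the cost bound is exceeded."""
        elapsed = self.clock() - self.started_at
        if elapsed > self.max_wall_seconds:
            raise ContractBreached(f"wall-clock bound exceeded: {elapsed:.1f}s")
        if self.spent_usd > self.max_cost_usd:
            raise ContractBreached(f"cost bound exceeded: {self.spent_usd:.2f} USD")
```

Because both bounds live in the contract rather than in the harness, they hold regardless of which timeouts the wrapped release happens to ship. The wrapper still needs its own way to stop the worker once `ContractBreached` is raised; that part is out of scope for this sketch.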

Factory relevance

Paperclip budget enforcement (#8347, master-unreleased) moves budget from surfacing to preflight enforcement (cancel queued work before an adapter starts). This is the closest external mirror of Factory allocation discipline seen on the watchlist; track whether it reaches a tagged release and how the caps are scoped.
Paperclip task watchdog (#8339), in which recovery/status actors structurally cannot mutate approvals, is the "review actor narrower than work actor" primitive Factory accountability wants. factory_relevance: medium.
The Hermes exposed-control-plane → root agent → MCP-persistence failure mode is workcell-doctrine intake (BitterGrid): a startup posture audit + IOC blocklist is a replayable, auditable control Bitter can own and compare across harnesses. factory_relevance: low but workcell_relevance: high.
Most of this window's signals are factory_relevance: none. Do not force the channel-gap thesis into an allocation story; it is a research-quality and capability-modeling lesson first.

Council / doctrine follow-up

This window strongly motivates amendment 007 (security_advisory deployment-class scope): every advisory here is sharply scoped — "shared-pool operators," "if you expose a dashboard," "builds from main," "stable users installing third-party skills." The flat boolean would over-claim each one. Recommend prioritizing 007 for the next ratification pass.
The channel-as-evidence finding is drafted as amendment 010 (proposed) this cycle — see charter/proposed/. It is the standout doctrine signal of two consecutive windows.
The Hermes single-source campaign is a good template for an extraordinary-claim-attribution rule (attribute, do not assert, when the only source is project-controlled). Surfaced in the audit; not yet drafted.
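To make the amendment 007 point concrete, here is a rough sketch of a scoped advisory next to the flat boolean it would replace. The field names and class labels are hypothetical renderings of the scopes quoted above, not the amendment's actual schema.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Advisory:
    summary: str
    # Deployment classes the advisory actually applies to (hypothetical labels).
    affected_classes: FrozenSet[str]


def applies(advisory: Advisory, deployment_classes: FrozenSet[str]) -> bool:
    """True only if the deployment belongs to at least one affected class."""
    return bool(advisory.affected_classes & deployment_classes)


# Placeholder advisory using one of the scopes quoted in the note.
dashboard_advisory = Advisory(
    summary="placeholder: issue reachable only via an exposed dashboard",
    affected_classes=frozenset({"exposes-dashboard"}),
)

# A stable user who installs third-party skills but exposes no dashboard.
stable_user = frozenset({"stable-with-third-party-skills"})
```

Under a flat `security_advisory: true`, `stable_user` would be flagged; with scoped classes, `applies(dashboard_advisory, stable_user)` is false, which is the over-claim the amendment is meant to remove.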

Run-quality notes

Harness: 10 Opus harvesters (sub-spawn authorized) → 5 Opus adversarial verifiers (one dedicated to the lede) → coordinator synthesis. The verify stage caught the Hermes single-source overclaim, the Flue staged-vs-shipped misframe, an OpenHands SHA transcription error, and two Claude Code version/framing precision fixes before publication. Recommend making this harness configuration standing.
Receipts and channel were resolved by git ancestry, not date inference.
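For reference, ancestry-based channel resolution can be sketched as below. It relies on `git merge-base --is-ancestor`, which exits 0 when the first commit is an ancestor of the second and 1 when it is not. The function names, the precedence order (tagged, then preview, then default branch), and the `None` result for unresolved commits are assumptions of this sketch, not a description of the pipeline actually used.

```python
import subprocess
from typing import Callable, Optional


def git_is_ancestor(repo: str, commit: str, ref: str) -> bool:
    """True if `commit` is reachable from `ref` in the repository at `repo`."""
    result = subprocess.run(
        ["git", "-C", repo, "merge-base", "--is-ancestor", commit, ref],
        capture_output=True,
    )
    # Exit 0: ancestor. Exit 1: not an ancestor. Anything else is a git error.
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr.decode(errors="replace").strip())
    return result.returncode == 0


def resolve_channel(
    commit: str,
    latest_tag: str,
    default_branch: str,
    is_ancestor: Callable[[str, str], bool],
    preview_ref: Optional[str] = None,
) -> Optional[str]:
    """Classify a commit by which ref contains it, never by its date."""
    if is_ancestor(commit, latest_tag):
        return "tagged-release"
    if preview_ref is not None and is_ancestor(commit, preview_ref):
        return "preview-or-beta"
    if is_ancestor(commit, default_branch):
        return "main-unreleased"
    return None  # unresolved: the commit is on none of the refs checked
```

In use, the ancestry check would be bound to a local clone, for example `resolve_channel(sha, "v0.17.0", "main", lambda c, r: git_is_ancestor(repo_path, c, r))`, where `sha` and `repo_path` are placeholders. Keeping the check injectable lets the classification logic be tested without a repository.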