Gemini CLI
Every signal accepted for Gemini CLI. Each links to the run that produced it. The Gemini CLI profile carries the current evergreen state.
June 2026
-
v0.45.0 stable bundles terminal hardening, session-context cleanup, and an MCP blacklist-bypass fix
- Operators on preview or older stable builds get a single upgrade decision: move to v0.45.0 to pick up Termux relaunch/resize fixes, session-context filtering on history resume, sequential tool execution for update_topic, Vim keybinding fixes, and an MCP blacklist-bypass prevention fix.
- The MCP blacklist-bypass prevention is the security-bearing item: it closes a path where a blacklisted MCP tool/server could still be reached, so operators relying on MCP allow/deny controls should upgrade before trusting the blacklist.
- Verification path: release tag v0.45.0 notes (published 2026-06-03T01:05:14Z) enumerate the bundled fixes.
- Single composite upgrade decision: the bundled small fixes are all gated on 'upgrade to v0.45.0', so they stay one signal.
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
-
Policy file survives cross-device mounts and corruption via EBUSY fallback and TOML recovery
- Operators running in containers with cross-device mounts no longer hit silent policy-update failures - atomic rename now falls back to copy-then-unlink on EBUSY/EXDEV.
- A corrupted policy TOML is auto-backed-up to .bak and rebuilt from scratch rather than blocking on a syntax error, removing a manual-intervention failure mode.
- Verification path: packages/core/src/policy/config.ts adds the fallback and recovery; persistence.test.ts covers both paths.
- Single operator class (operator persisting policy/permission config), single consequence (policy persistence no longer fails silently).
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
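The two recovery paths described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in packages/core/src/policy/config.ts: the helper names `replaceFile` and `loadOrRecover`, the injectable `renameFn`, and the `parse` callback standing in for a TOML parser are all hypothetical.

```typescript
import * as fs from "node:fs";

// Hypothetical helper: move tmpPath over destPath, falling back to
// copy-then-unlink when rename fails with EBUSY or EXDEV.
// renameFn is injectable only so the fallback can be exercised in a demo.
function replaceFile(
  tmpPath: string,
  destPath: string,
  renameFn: (from: string, to: string) => void = fs.renameSync,
): "renamed" | "copied" {
  try {
    renameFn(tmpPath, destPath);
    return "renamed";
  } catch (err) {
    const code = (err as { code?: string }).code;
    if (code !== "EBUSY" && code !== "EXDEV") throw err;
    fs.copyFileSync(tmpPath, destPath); // not atomic, but works across devices
    fs.unlinkSync(tmpPath);
    return "copied";
  }
}

// Hypothetical helper: if the policy file fails to parse, keep a .bak copy
// and rebuild from emptyPolicy instead of blocking on the syntax error.
function loadOrRecover(
  policyPath: string,
  parse: (text: string) => unknown, // stand-in for a real TOML parser
  emptyPolicy: string,
): { recovered: boolean; text: string } {
  const text = fs.readFileSync(policyPath, "utf8");
  try {
    parse(text);
    return { recovered: false, text };
  } catch {
    fs.copyFileSync(policyPath, policyPath + ".bak");
    fs.writeFileSync(policyPath, emptyPolicy);
    return { recovered: true, text: emptyPolicy };
  }
}
```

The copy-then-unlink path gives up atomicity, so a crash between the copy and the unlink can leave both files on disk. That trade-off is worth checking for when re-testing policy persistence in a container.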
-
CI labeler switched to pull_request_target, granting write context to fork PR runs
- Contributors and maintainers should note the PR-size labeler now runs under pull_request_target, which executes in the base-repo context with write-capable token access on fork PRs.
- This is the classic pwn-request surface: under pull_request_target, any checkout or execution of fork-controlled content can leak the elevated token. Operators forking or auditing the repo's CI should confirm the workflow does not check out and run untrusted PR code.
- Verification path: .github/workflows/pr-size-labeler.yml line 4 trigger change from pull_request to pull_request_target.
- Single decision for the repo-security auditor: review this workflow's token scope and whether it touches fork-controlled inputs.
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
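For auditors who want a first-pass filter before reading the workflow by hand, a heuristic like the one below may help. It is a sketch under stated assumptions: `auditWorkflow` and `WorkflowFinding` are hypothetical names, and the check is regex-based rather than a YAML parse. A match means "review manually"; no match does not prove the workflow is safe.

```typescript
// Hypothetical result shape for a single workflow file.
interface WorkflowFinding {
  usesPullRequestTarget: boolean;
  referencesPrHead: boolean;
  needsManualReview: boolean;
}

// Flags workflows that run under pull_request_target AND mention the PR head
// ref or SHA, the usual sign that fork-controlled content is being checked out.
// Text-based: a commented-out line would also match, so treat hits as leads.
function auditWorkflow(yamlText: string): WorkflowFinding {
  const usesPullRequestTarget = /\bpull_request_target\b/.test(yamlText);
  const referencesPrHead =
    /github\.event\.pull_request\.head\.(sha|ref)\b/.test(yamlText) ||
    /github\.head_ref\b/.test(yamlText);
  return {
    usesPullRequestTarget,
    referencesPrHead,
    needsManualReview: usesPullRequestTarget && referencesPrHead,
  };
}
```

A labeler that only reads PR metadata through the API would typically come back with `usesPullRequestTarget: true` and `needsManualReview: false`. Its token permissions still deserve a look.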
-
Gemini 3.5 Flash GA routes to flagged users via backend experiment flag, no client update
- Operators auditing which model their CLI calls cannot rely on client version alone - model selection is now gated server-side by experiment flag GEMINI_3_5_FLASH_GA_LAUNCHED (ID 45780819) via hasGemini35FlashGAAccess().
- Auto-routing logic silently switches to Flash GA when the flag is enabled for a user cohort, so the same binary can route to different models across users.
- Verification path: Config.hasGemini35FlashGAAccess() and the registered experiment flag determine routing; the model in use is no longer fully determined by local config.
- Single decision: operators must treat backend flag state as part of the model-routing audit surface.
Run: 2026-06-03-weekly-digest-2026-05-28_2026-06-03-frontier-v0
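One way to make backend flag state part of the audit surface is to record it next to the local configuration for every run. The sketch below assumes a hypothetical `buildRoutingAuditRecord` helper, a configured value of `"auto"` for auto-routing, and placeholder model ids. None of these come from the Gemini CLI source; the only grounded detail is that the flag, not the client version, decides whether Auto routes to Flash GA.

```typescript
// Hypothetical audit record: local config plus the server-side flag state.
interface RoutingAuditRecord {
  cliVersion: string;
  configuredModel: string;
  flashGaFlagEnabled: boolean;
  effectiveModel: string;
}

// flashGaFlagEnabled would come from whatever the operator can observe about
// the flag (for example the result of hasGemini35FlashGAAccess()).
// flashGaModelId and defaultAutoModelId are placeholder parameters.
function buildRoutingAuditRecord(
  cliVersion: string,
  configuredModel: string,
  flashGaFlagEnabled: boolean,
  flashGaModelId: string,
  defaultAutoModelId: string,
): RoutingAuditRecord {
  let effectiveModel = configuredModel;
  if (configuredModel === "auto") {
    // Assumption: only auto-routing is affected by the flag.
    effectiveModel = flashGaFlagEnabled ? flashGaModelId : defaultAutoModelId;
  }
  return { cliVersion, configuredModel, flashGaFlagEnabled, effectiveModel };
}
```

Two records with the same `cliVersion` and `configuredModel` but different `effectiveModel` values show in data why the client version alone cannot answer "which model ran".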
May 2026
-
Local and remote session invocation protocols land stable
- Operators building delegated workflows on Gemini CLI should re-test against v0.44.0 stable; the remote invocation protocol is no longer preview.
- Multi-scope deployments must audit agent name overlaps before upgrading — the new `first-wins prioritize project` resolution changes which definition wins.
- Until Google documents where remote invocations actually run, treat the remote path as infrastructure-to-be-defined; do not depend on it for production.
Run: 2026-05-27-weekly-digest-2026-05-13_2026-05-27-frontier-v0
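The overlap audit mentioned above can be approximated by listing every agent name that is defined in more than one scope. This is a minimal sketch: `findAgentNameOverlaps` is a hypothetical helper, the scope labels `user` and `project` are assumptions, and the function only reports collisions. It does not model which definition the CLI picks.

```typescript
// Input: scope label -> agent names defined in that scope.
// Output: agent name -> scopes that define it, for names seen in 2+ scopes.
function findAgentNameOverlaps(
  agentsByScope: { [scope: string]: string[] },
): { [agentName: string]: string[] } {
  // Object.create(null) avoids collisions with inherited keys like "constructor".
  const scopesByName: { [agentName: string]: string[] } = Object.create(null);
  Object.keys(agentsByScope).forEach((scope) => {
    agentsByScope[scope].forEach((name) => {
      const scopes = scopesByName[name] || (scopesByName[name] = []);
      if (scopes.indexOf(scope) === -1) scopes.push(scope);
    });
  });
  const overlaps: { [agentName: string]: string[] } = Object.create(null);
  Object.keys(scopesByName).forEach((name) => {
    if (scopesByName[name].length > 1) overlaps[name] = scopesByName[name];
  });
  return overlaps;
}
```

Running this against the agent definitions before and after the upgrade gives a short list of names whose winning definition is worth checking by hand.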
-
Auto modes collapse and PolicyEngine reaches into ACP sessions
- Operators on previous Auto variants must re-audit which behaviors the consolidated Auto mode treats as safe. The merger may have loosened or tightened constraints, and the release notes do not enumerate the changes.
- `AUTO_EDIT` operators should explicitly decide whether shell-redirect auto-approval is acceptable for their environment.
- Operators evaluating Gemini ACP integration should treat PolicyEngine-in-ACP as the new enforcement boundary; the 'deadlock fix' framing understates the structural shift.
Run: 2026-05-27-weekly-digest-2026-05-13_2026-05-27-frontier-v0
-
Session resume now surfaces errors and finds legacy sessions
- Operators using --resume with legacy session formats should re-test: prior to this fix, resume failures silently started new sessions. Verify the behavior after upgrade.
Run: 2026-05-12-partial-cycle-gemini-refresh-2026-05-12-frontier-v0
-
Subagents become pluggable; sessions become portable
- The capability-profile assumption "subagents inherit approval mode" is now under-specified.
- Run-contract design should record which subagent protocol variant a run used.
- Adapter work should distinguish local from remote subagent execution.
- Session export/import gives operators and Bitter a stable serialization point.
Run: 2026-05-11-partial-cycle-2026-05-08_2026-05-11-frontier-v0
-
Persistent agent state is becoming a product surface
- Developers need to know which goals, memory patches, recaps, sessions, and skill maintenance loops shaped a serious run.
Run: 2026-05-07-commit-harvest-2026-04-23_2026-05-07-frontier-v1
-
Permissions, secrets, and sandboxes are moving into the foreground
- The harness must make trust state visible: what can be read, what can be changed, which credentials are exposed, and where execution happens.
Run: 2026-05-07-commit-harvest-2026-04-23_2026-05-07-frontier-v1
-
Accessibility is a frontier capability, not marketing polish
- Everyday adoption depends on setup recovery, visible progress, voice/chat surfaces, readable UI, OAuth clarity, and fewer dead ends.
Run: 2026-05-07-commit-harvest-2026-04-23_2026-05-07-frontier-v1
-
Integrations are volatile; the operating loop has to be durable
- Provider lists, plugin systems, transports, and model profiles will keep changing.
Run: 2026-05-07-commit-harvest-2026-04-23_2026-05-07-frontier-v1
-
Worker-native state is becoming a memory layer.
- Recaps, memory patches, skill curators, and task state are moving into worker tools. Operators should use them, but should preserve an operator-owned record of what state governed each run.
- Add worker-native state fields to adapter receipts: recap handles, memory patch ids, curator reports, skill reports, and resume state.
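The receipt fields listed above could take a shape like the following. This is only a sketch: the type names, the surrounding `AdapterReceipt` fields (`runId`, `worker`, `workerVersion`), and the `missingStateFields` helper are assumptions made for illustration, not an existing Bitter schema.

```typescript
// Worker-native state that governed a run, as named in the bullet above.
interface WorkerNativeState {
  recapHandles: string[];
  memoryPatchIds: string[];
  curatorReports: string[];
  skillReports: string[];
  resumeState: string | null; // e.g. a session id or checkpoint reference
}

// Hypothetical receipt envelope; only workerNativeState comes from the text.
interface AdapterReceipt {
  runId: string;
  worker: string;
  workerVersion: string;
  workerNativeState: WorkerNativeState;
}

// Lists the state fields an adapter left empty, so gaps in the
// operator-owned record are visible instead of silently absent.
function missingStateFields(receipt: AdapterReceipt): string[] {
  const state = receipt.workerNativeState;
  const missing: string[] = [];
  if (state.recapHandles.length === 0) missing.push("recapHandles");
  if (state.memoryPatchIds.length === 0) missing.push("memoryPatchIds");
  if (state.curatorReports.length === 0) missing.push("curatorReports");
  if (state.skillReports.length === 0) missing.push("skillReports");
  if (state.resumeState === null) missing.push("resumeState");
  return missing;
}
```

An empty field is not necessarily an error, since a worker may simply not offer that surface. Reporting it explicitly keeps "not captured" distinct from "not applicable" during review.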
-
Authority semantics are explicit but fragmented.
- Permission profiles, workspace trust, env loading, hooks, MCP behavior, extension schemas, and provider transports differ by worker and release.
- Bitter capability profiles should record worker-native permission and trust semantics instead of assuming a uniform authorization model.
-
Verification is becoming a worker capability.
- Provider-native review, multi-agent execution, subagent evals, curator reports, and QA-like cloud fleets can catch useful issues, but their verdicts are not automatically the operator's truth.
- Treat worker verification as evidence inputs. BitterQA or the run contract should still own the final evidence standard and settlement.
-
Plugin, extension, and skill ecosystems are becoming the integration surface.
- The practical power of worker CLIs increasingly depends on plugins, hooks, extensions, skills, and transport modules, not just the base model.
- Adapter receipts should include enabled plugin/extension/skill surfaces and should distinguish worker-local skills from Bitter-owned memory.
-
Worker integrations are not durable doctrine.
- Pi removed built-in Gemini CLI and Antigravity support while adding many providers; Gemini preview/nightly channels differ materially; Codex alpha releases and app-server surfaces move quickly.
- Keep worker adapters thin, versioned, source-contracted, and replaceable. The stable Bitter asset is the run contract and receipt chain.
-
Provider-native long-horizon state is now table stakes.