- Replaced direct `strings.ToLower` usage with `util.OpenAICompatibleProviderKey` for generating provider keys.
- Updated auth and executor workflows to use namespaced keys for OpenAI-compatible providers.
- Adjusted tests to validate namespaced key handling, including new test cases for provider registration and execution logic.
- Added `OpenAICompatibleProviderKey` helper in `util` for consistent key transformations.
Closes: #3600
- Updated JSON schema handling to remove `$comment` and `enumDescriptions` fields during schema transformations.
- Adjusted test cases to validate the removal of these fields both at root and nested levels.
- Expanded unsupported schema keywords to include `$comment` and `enumDescriptions` for Gemini compatibility.
Closes: #3512
- Updated `ConvertCodexResponseToClaude` to delay emitting `function_call` start events until the `name` field is resolved.
- Introduced `pendingCodexFunctionCall` for buffering incomplete function calls.
- Added tests to ensure proper behavior for deferred starts, including argument buffering and finalization.
Closes: #3471
- Updated `ConvertOpenAIRequestToGemini` and `ConvertOpenAIRequestToCodex` to handle `input_audio`, retaining `data` and `format` fields.
- Added helper `openAIInputAudioMimeType` for determining MIME types from audio formats.
- Introduced unit tests to validate correct preservation of `input_audio` data and format.
Closes: #3447
- Enhanced `ConvertOpenAIResponsesRequestToOpenAIChatCompletions` to include `reasoning_content` in assistant and tool call messages.
- Introduced `collectOpenAIResponsesReasoningContent` for aggregating reasoning summaries.
- Added tests to validate reasoning attachment in various scenarios, including empty reasoning, tool calls, and reasoning followed by user messages.
Closes: #3397
- Updated handling in `ConvertOpenAIResponsesRequestToOpenAIChatCompletions` to retain `input_image` detail fields such as `image_url` and `detail`.
- Added `TestConvertOpenAIResponsesRequestToOpenAIChatCompletions_PreservesInputImageDetail` to verify preservation of image details during transformation.
Closes: #3385
- Updated `ConvertOpenAIResponsesRequestToOpenAIChatCompletions` to retain `tool_choice` with raw byte handling.
- Added `TestConvertOpenAIResponsesRequestToOpenAIChatCompletions_PreservesStructuredToolChoice` to ensure function and type fields are preserved in transformations.
Closes: #3384
- Introduced `CooldownStateStore` interface for managing independent cooldown state persistence.
- Implemented `FileCooldownStateStore` for storing cooldown states as per-auth `.cds` files with atomic writes and stale file cleanup.
- Enhanced `Manager` to support restoring state from `CooldownStateStore` and persisting state changes during auth updates.
- Updated tests to validate cooldown state saving, loading, concurrency handling, and error scenarios.
Closes: #3368
- Introduced `SetTransientErrorCooldownSeconds` to enable configurable cooldowns for transient errors (e.g., 408/500/502/503/504).
- Updated retry scheduling logic to use the new `nextTransientErrorRetryAfter` function.
- Modified config parsing to include `transient-error-cooldown-seconds` with support for disabling or defaulting to legacy behavior.
- Expanded tests to validate transient cooldown logic with various configurations and edge cases.
Closes: #3315
- Added `convertResponsesToolToOpenAIChatTools` and helper methods to handle namespace tools during request conversions.
- Enhanced response handling to restore namespace context for function calls using `applyResponsesFunctionCallNamespaceFields` and related utilities.
- Updated tests to validate namespace flattening, function call restoration, and non-stream response handling.
Closes: #3298
- Added logic in `ConvertOpenAIResponsesRequestToClaude` to exclude `apply_patch` custom tools.
- Introduced `isOpenAIResponsesApplyPatchCustomTool` helper function to identify and filter the tool.
- Added `TestConvertOpenAIResponsesRequestToClaude_DropsApplyPatchCustomTool` to validate the behavior.
Closes: #3243
- Refactored `ConfigReloadHook` to use `reloadConfigFromWatcher` for consistency.
- Added async `reloadConfigAfterManagementSaveAsync` to handle post-save operations.
- Introduced `ReloadConfigIfChanged` in watcher for manual trigger support.
- Enhanced config reload paths to separate auth synthesis from standard updates.
- Updated `applyConfigUpdate` logic to allow more granular reload behaviors.
Closes: #3235
- Introduced the `gpt-image-2` model in Codex built-ins and updated visibility logic in the registry.
- Added direct proxy support for OpenAI image generation and editing endpoints.
- Implemented new execution paths for `/images/generations` and `/images/edit`, ensuring seamless handling for both JSON and multipart payloads.
- Expanded test coverage to validate the new model and direct proxy features, including streaming scenarios and error handling.
- Introduced `TestRequestCodexTokenCompletionKeepsConcurrentSessionPending` to validate proper handling of concurrent OAuth sessions.
- Refactored Codex OAuth logic to use `newCodexOAuthService` for improved testability.
Closes: #3171
- Added `TestConvertClaudeRequestToOpenAI_ToolSchemaAddsMissingObjectProperties` to validate automatic addition of missing `properties` in `object` schemas.
- Introduced `normalizeObjectSchemaProperties` to recursively ensure schemas of type `object` include an empty `properties` field if absent.
- Updated `ConvertClaudeRequestToOpenAI` to apply schema normalization for improved compatibility with OpenAI schema expectations.
Closes: #3165
- Updated Gemini, Gemini CLI, and Antigravity logic to delete `thinkingConfig` when `ModeNone` is set, `Budget=0`, and `Level` is empty.
- Adjusted tests to validate this behavior across multiple scenarios and models with zero-allowed configurations.
- Extended test cases for additional coverage of mixed-model behavior.
Closes: #3138
Add executor-scoped replay cache aligned with Codex HOME replay:
Scope, observe SSE/non-stream responses, store normalized thought_signature
and function_call_part items, apply on the next streamGenerateContent
request, and invalidate on invalid signature responses.
Gemini/flash/agent models use HOME replay; native per-part signature
replay is not wired on upstream/dev. Wire non-stream and stream paths
in antigravity_executor and purge expired entries from signature_cache.
Includes unit tests and HOME-provider-replay documentation.
- Deleted `geminicli` provider and related `Apply` logic.
- Removed all translator packages specific to Gemini CLI (Claude, Codex integrations).
- Purged associated test files for Gemini CLI translation.
- Removed `GeminiAuthenticator` and all associated authentication logic (OAuth flows, token handling, refresh logic).
- Deleted internal/executor Gemini OAuth support, including bearer token handling and runtime API logic.
- Purged all tests, configs, and command-line flags specific to Gemini OAuth flows.
- Updated documentation and aliases to reflect Gemini removal.
- Renamed `parseRetryDelay` to `ParseRetryDelay` and `deleteJSONField` to `DeleteJSONField`.
- Updated references in `antigravity_executor` and tests to use the new `helps` package.
- Adjusted import paths and test cases to ensure compatibility with the new location.
- Updated README files to reflect changes in the retry logic references.
- Updated `.github/ISSUE_TEMPLATE/bug_report.md` to remove deprecated Gemini CLI mention.
- Introduced tests for `ConvertOpenAIRequestToGemini`, `ConvertOpenAIResponsesRequestToGemini`, and related Claude functions to ensure trailing model-prefill turns are removed.
- Enhanced tool call ID handling with `util.SanitizeClaudeToolID` to standardize IDs in Claude-related conversions and tests.
- Updated logic in Gemini and Claude translators to handle edge cases for trailing assistant prefill and tool ID sanitization, ensuring compatibility across input variants.
Closes: #3113
- Added `TestConvertOpenAIChatCompletionsResponseToOpenAIResponses_CompletedOmitsTopLevelOutputText` to ensure `output_text` is excluded in streamed responses.
- Added `TestConvertOpenAIChatCompletionsResponseToOpenAIResponses_ToolCallCompletedOmitsTopLevelOutputText` to validate behavior during tool call completions.
- Introduced `TestConvertOpenAIChatCompletionsResponseToOpenAIResponsesNonStream_OmitsTopLevelOutputText` to confirm the omission of `output_text` in non-streamed responses.
- Expanded test coverage to ensure consistency with native OpenAI responses.
- Implemented `TestUploadAuthFile_PreservesPriorityAttributes` to ensure priority attributes and metadata are preserved during auth file uploads.
- Updated `UploadAuthFile` logic to utilize `SynthesizeAuthFile` for better handling of generated auth attributes and metadata.
Closes: #2924
- Refactored `ConvertOpenAIResponsesRequestToClaude` logic to align tool use with corresponding tool results.
- Introduced helper functions for appending and flushing pending reasoning and tool use messages.
- Expanded tests to validate message order and content consistency when processing tool calls and results.
- Introduced `xaiNormalizeReasoningSummaryData` and related functions to normalize `reasoning_text` events into `reasoning_summary` shapes for standardization.
- Updated WebSocket and streaming logic to process normalized reasoning summary events correctly.
- Enhanced tests to validate normalization, order of events, and output structure in both stream and non-stream scenarios.
Use the agy CLI User-Agent family (antigravity/cli/{version} darwin/arm64)
on CPA macOS/arm64 hosts instead of the legacy hub-style antigravity/{version}
string. Resolve the cached version from the CLI auto-updater manifest
(darwin_arm64.json), then the GCS latest pointer, then antigravity-cli GCS
prefix listing, with fallback 1.0.8 when all sources fail.
Update AntigravityUserAgent helpers and executor default UA comment to match.
- Added `isCodexUsageLimitError` to detect and handle `usage_limit_reached` errors from Codex responses.
- Updated `newCodexStatusErr` to treat usage limit errors as HTTP 429 with proper `RetryAfter` handling.
- Enhanced test coverage to validate usage limit error handling, including reset time parsing and retry behavior.
Closes: #2886
- Changed default plugin `Enabled` state from `true` to `false` across configurations, runtime logic, and YAML defaults.
- Added helper function `enabledPluginConfigs` for generating plugin configs with `Enabled` set explicitly.
- Expanded unit tests in `pluginhost`, `config`, and `management` to validate behavior changes for disabled plugins, default settings, and skipped load scenarios.
- Refactored content block start/stop logic into `startCodexTextBlock` and `stopCodexTextBlock` for better readability and reusability.
- Updated logic to ensure proper handling of "output_text" block events to avoid ghost stop emissions.
- Added `TestConvertCodexResponseToClaude_StreamTextBeforeToolCallsDoesNotEmitGhostStop` to validate content block start/stop behavior in streamed responses.
- Introduced `applyResponsesFunctionCallNamespaceFields` to manage name and namespace settings in response items.
- Added `splitResponsesQualifiedFunctionCallFromRequest` for handling qualified names and matching them with namespaces.
- Updated response generation logic to preserve namespace and function call structure in multiple response pathways.
- Expanded unit tests to validate namespace and function call restoration in both stream and non-stream scenarios.
* fix(translator): emit Claude server tool blocks for Codex web_search_call streams
Map Codex Responses streaming web_search_call events to Claude SSE
server_tool_use and web_search_tool_result blocks, with deduplication
and a focused stream regression test.
* fix(translator): stabilize Codex web_search fallback tool_use IDs
Reuse the active fallback web_search tool_use ID across later stream
events so tool_result blocks stay paired when upstream omits item IDs.
This is defensive hardening; live Codex streams already provide ws_* IDs.
* fix(translator): emit Codex web_search blocks from populated items
Wait for output_item.done before emitting Claude web_search tool_use
and tool_result blocks, and avoid deduping early added/completed events
that arrive before action.query is available. Matches live Responses
stream ordering seen in local tmux verification.
* fix(translator): map Codex web_search_call items in non-stream Claude responses
Emit server_tool_use and web_search_tool_result blocks from completed
response.output web_search_call items, matching the streaming translator.
* fix(translator): keep non-stream web_search on end_turn and dedupe output items
Do not treat server web_search_call items as client tool_use for stop_reason.
Skip duplicate or query-less open_page web_search output items in non-stream
translation, matching spark live behavior.
* feat(plugin): add ModelRouter before auth with single-slot routing targets
## Motivation
Plugins that need to change execution based on the **original inbound request**
(protocol format, raw body, headers, query, stream flag, metadata, etc.) often
resorted to virtual/trampoline models or routing inside interceptors. This
commit adds **ModelRouter**: a pluggable layer **before** model-to-provider
resolution and AuthManager credential selection, so plugins can declare who
executes a request without spoofing the client model name.
This is a **new capability**, not a bugfix on the existing chain. With no
ModelRouter plugins loaded, behavior matches upstream.
## Pipeline placement
- `execute`, `stream`, and `count` (and image paths via AuthManager) call
`applyModelRouter()` before building `coreexecutor.Request`.
- Routing runs **before** the request interceptor (before auth), so routers see
the client’s original context. After a plugin executor is chosen, the existing
**after-auth interceptor → response/stream interceptor** chain still applies.
- Internal `ExecuteModel` / `ExecuteModelStream` (host callbacks) support
`SkipRouterPluginID` so nested calls do not re-enter the same router.
## Routing API (single slot, mutually exclusive)
`ModelRouteResponse` uses **one target slot** to avoid ambiguity when both
`TargetExecutorPluginID` and `TargetProvider` were set and the host ignored one:
| Field | Meaning |
|-------|---------|
| `Handled` | `false`: this router declines; try the next router or default path |
| `TargetKind` | `self` \| `executor` \| `provider` (pick one) |
| `Target` | `self`/`executor`: plugin ID; `provider`: built-in provider key |
| `TargetModel` | Optional on `provider` only; empty keeps client `RequestedModel` |
| `Reason` | Optional diagnostic text |
- **self**: the router plugin’s own executor (`Target` normalized to the router’s plugin ID).
- **executor**: another plugin’s executor; host pre-checks with `executorPluginReady()`
(executor declared and provider identifier resolvable) to avoid handled routes that 500 at execution.
- **provider**: skip registry model resolution; fixed built-in AuthManager path; optional
`TargetModel` for execution model only—**does not** change outward requested-model metadata.
Routers run in **descending plugin priority** (tie-break: ascending plugin ID). Panic, error,
invalid target, or unavailable executor/provider → log and **fall through to the next router**;
if none handle, use the original provider+auth flow.
## Context exposed to routers
`ModelRouteRequest` includes:
- `SourceFormat`, `RequestedModel`, `Stream`
- `Headers`, `Query`, `Body` (defensive copies)
- `Metadata` (best-effort read-only context snapshot)
- `AvailableProviders`: built-in provider keys with at least one **non-disabled** auth
(`AuthManager.AvailableProviders()`). **Does not** reflect per-model cooldown or transient
unavailability—treat as an optimistic snapshot.
Adds `AuthManager.HasProviderAuth()` and `AvailableProviders()`, excluding `Disabled` and
`StatusDisabled` auths consistently with credential selection.
## Host and RPC
- Go plugins: `pluginapi.ModelRouter` + `RouteModel()`.
- RPC plugins: `pluginabi.MethodModelRoute` (`model.route`), capability flag `model_router`.
- `pluginhost.Host` implements `RouteModel` / `RouteModelExcept`; handlers use
`SetModelRouterHost` or a `PluginHost` type assertion; **direct executor** paths use
`ExecutePluginExecutor*` / `CountPluginExecutor`.
- No bundled example ModelRouter plugin; capability is active only when a third-party plugin
declares `model_router` and loads.
## Plugin RPC schema (policy A, upstream-aligned)
- `pluginabi.SchemaVersion` stays **1**: capability additions (`model_router`, `model.route`)
do not bump the number; increment only on breaking RPC JSON changes.
- Host sends `schema_version` at register; reject only if the plugin declares a **higher**
version than the host.
- No unpublished “ModelRouter requires schema ≥ 3” gate (v3 single-slot API was never public).
- Existing plugins and examples without `model_router` (`schema_version: 1`) need no changes.
- RPC ModelRouter: `schema_version: 1` + `model_router: true` + implement `model.route`.
## Path consistency within this commit
- Provider routes reuse image-only model checks (e.g. `gpt-image-2`) on the normalized model,
same as the default AuthManager path.
- `count` aligned with execute/stream: `SkipRouterPluginID`, query/headers injection,
interceptor skip semantics.
- Handlers: `modelRoutersEnabled` treats hosts without `HasModelRouters` as disabled
(same as before ModelRouter existed); `pluginhost.Host` implements the detector.
- API docs: `ModelRouter` explicitly includes built-in **provider** targets (in addition to
plugin executors and the router’s own executor).
## Testing
go test ./internal/pluginhost ./sdk/api/handlers ./sdk/pluginapi ./sdk/pluginabi ./sdk/cliproxy/auth
go build -o test-output ./cmd/server && rm test-output
go test ./...
* fix(handlers): address ModelRouter review feedback
- Use modelExecutionQuery for plugin executor and AuthManager paths so
inbound URL query matches router/header behavior
- Guard queryFromContext when gin Request.URL is nil
- Read plugin executor stream chunks via nextStreamChunk to exit on cancel
- Drop redundant clonePluginMetadata on capability record meta
Tests cover query propagation, stream cancel, and nil URL safety.
* feat(plugin): add Claude web search router example
Add a Claude Code web_search ModelRouter example that can route matching Claude requests through Antigravity, Codex, xAI, or Tavily.
The plugin includes executor orchestration, backend fallback/penalty handling, Tavily API key support, Claude-compatible response assembly, stream forwarding, and focused unit coverage for detection, fallback routing, model resolution, penalties, stream forwarding, and Tavily behavior.
Verification: go test -count=1 ./... in examples/plugin/claude-web-search-router/go; go build -buildmode=c-shared for the plugin; go build ./cmd/server; live local CPA curl coverage for plugin load, four explicit routes, fallback, and Codex spark routing.
* fix(pluginhost): validate executor routes before fallback
* fix(pluginhost): skip oauth-only executor routes
Keep RPC streaming executor callback scopes alive until async streams close, detach nested host.model.execute_stream contexts from request cancellation, and clean up the stream bridge on stream completion.
- Introduced `disable-claude-cloak-mode` configuration to globally disable Claude cloak mode with credential-level overrides.
- Enhanced `getCloakConfigFromAuth` to support fallback to metadata for cloak settings.
- Updated cloak configuration precedence logic, integrating global, credential, and default modes.
- Updated config and watcher diff handling to include `disable-claude-cloak-mode`.
Closes: #2789
- Added `ConvertClaudeToolResultContent` to standardize Claude tool_result content, preserving JSON structure and splitting out base64-encoded images.
- Updated Gemini and Gemini-CLI translators to use the new utility for generating deterministic function responses and inline image parts.
- Added comprehensive test cases for content types and edge cases, ensuring correct handling of string, JSON, and image blocks.
Closes: #2781
- Added `sanitizeClaudeWebSearchDomains` to remove empty `allowed_domains` and `blocked_domains` fields for built-in web_search tools, addressing ambiguity errors from Anthropic.
- Integrated domain sanitization into the Claude message preparation pipeline.
- Added test cases to validate correct handling of empty and non-empty domain fields across various tool types.
Closes: #2681