- Added `normalizeCodexParallelToolCallsForTools` to conditionally remove `parallel_tool_calls` when `tools` are missing or empty.
- Integrated normalization into Codex executor workflows for improved request handling.
- Introduced unit tests to validate behavior across different tool scenarios.
Closes: #3903
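
The normalization described above can be sketched as follows. This is a minimal illustration using only `encoding/json`; the function name `dropParallelToolCallsWithoutTools` is a hypothetical stand-in for `normalizeCodexParallelToolCallsForTools`, whose real signature and JSON library are not shown in these notes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dropParallelToolCallsWithoutTools is a hypothetical stand-in for
// normalizeCodexParallelToolCallsForTools. It removes "parallel_tool_calls"
// when "tools" is absent, null, not an array, or an empty array.
func dropParallelToolCallsWithoutTools(body []byte) ([]byte, error) {
	var req map[string]json.RawMessage
	if err := json.Unmarshal(body, &req); err != nil {
		return nil, err
	}
	var tools []json.RawMessage
	if raw, ok := req["tools"]; ok {
		// A non-array "tools" value fails to decode and is treated as empty.
		_ = json.Unmarshal(raw, &tools)
	}
	if len(tools) > 0 {
		return body, nil // tools present: leave the payload untouched
	}
	if _, ok := req["parallel_tool_calls"]; !ok {
		return body, nil // nothing to remove, so skip the re-encode
	}
	delete(req, "parallel_tool_calls")
	return json.Marshal(req)
}

func main() {
	out, err := dropParallelToolCallsWithoutTools(
		[]byte(`{"model":"m","tools":[],"parallel_tool_calls":true}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"model":"m","tools":[]}
}
```

Re-encoding through a map sorts the keys, so a production version would likely delete the key in place to preserve field order.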
- Updated JSON schema handling to remove `$comment` and `enumDescriptions` fields during schema transformations.
- Adjusted test cases to validate the removal of these fields both at root and nested levels.
- Expanded unsupported schema keywords to include `$comment` and `enumDescriptions` for Gemini compatibility.
Closes: #3512
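
A rough sketch of removing the two keywords at root and nested levels is shown below. The helper names are hypothetical and the walk is deliberately naive: it would also delete a property that happens to be named `$comment`, which a real schema transformer would need to guard against.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unsupportedKeywords lists the two fields named in the notes above.
var unsupportedKeywords = map[string]bool{"$comment": true, "enumDescriptions": true}

// stripUnsupportedKeywords is a hypothetical helper that walks a decoded JSON
// value and deletes the unsupported keys from every object it finds.
func stripUnsupportedKeywords(node any) {
	switch v := node.(type) {
	case map[string]any:
		for key, child := range v {
			if unsupportedKeywords[key] {
				delete(v, key) // deleting during range is safe in Go
				continue
			}
			stripUnsupportedKeywords(child)
		}
	case []any:
		for _, child := range v {
			stripUnsupportedKeywords(child)
		}
	}
}

// cleanSchema decodes a schema, strips the keywords, and re-encodes it.
func cleanSchema(raw []byte) ([]byte, error) {
	var schema any
	if err := json.Unmarshal(raw, &schema); err != nil {
		return nil, err
	}
	stripUnsupportedKeywords(schema)
	return json.Marshal(schema)
}

func main() {
	out, err := cleanSchema([]byte(`{"$comment":"root","type":"object",
		"properties":{"color":{"type":"string","enum":["r"],"enumDescriptions":["red"]}}}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```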
- Introduced the `gpt-image-2` model in Codex built-ins and updated visibility logic in the registry.
- Added direct proxy support for OpenAI image generation and editing endpoints.
- Implemented new execution paths for `/images/generations` and `/images/edit`, ensuring seamless handling for both JSON and multipart payloads.
- Expanded test coverage to validate the new model and direct proxy features, including streaming scenarios and error handling.
Add executor-scoped replay cache aligned with Codex HOME replay:
Scope, observe SSE/non-stream responses, store normalized thought_signature
and function_call_part items, apply on the next streamGenerateContent
request, and invalidate on invalid signature responses.
Gemini/flash/agent models use HOME replay; native per-part signature
replay is not wired on upstream/dev. Wire non-stream and stream paths
in antigravity_executor and purge expired entries from signature_cache.
Includes unit tests and HOME-provider-replay documentation.
- Deleted `geminicli` provider and related `Apply` logic.
- Removed all translator packages specific to Gemini CLI (Claude, Codex integrations).
- Purged associated test files for Gemini CLI translation.
- Removed `GeminiAuthenticator` and all associated authentication logic (OAuth flows, token handling, refresh logic).
- Deleted internal/executor Gemini OAuth support, including bearer token handling and runtime API logic.
- Purged all tests, configs, and command-line flags specific to Gemini OAuth flows.
- Updated documentation and aliases to reflect Gemini removal.
- Renamed `parseRetryDelay` to `ParseRetryDelay` and `deleteJSONField` to `DeleteJSONField`.
- Updated references in `antigravity_executor` and tests to use the new `helps` package.
- Adjusted import paths and test cases to ensure compatibility with the new location.
- Updated README files to reflect changes in the retry logic references.
- Updated `.github/ISSUE_TEMPLATE/bug_report.md` to remove deprecated Gemini CLI mention.
- Introduced `xaiNormalizeReasoningSummaryData` and related functions to normalize `reasoning_text` events into `reasoning_summary` shapes for standardization.
- Updated WebSocket and streaming logic to process normalized reasoning summary events correctly.
- Enhanced tests to validate normalization, order of events, and output structure in both stream and non-stream scenarios.
Use the agy CLI User-Agent family (antigravity/cli/{version} darwin/arm64)
on CPA macOS/arm64 hosts instead of the legacy hub-style antigravity/{version}
string. Resolve the cached version from the CLI auto-updater manifest
(darwin_arm64.json), then the GCS latest pointer, then antigravity-cli GCS
prefix listing, with fallback 1.0.8 when all sources fail.
Update AntigravityUserAgent helpers and executor default UA comment to match.
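
The lookup order above amounts to a first-success chain with a fixed fallback. A minimal sketch, assuming each source can be modelled as a function returning a version string; `versionSource`, `resolveVersion`, and the stub sources are hypothetical names, and no real manifest or GCS access is shown.

```go
package main

import (
	"errors"
	"fmt"
)

// fallbackVersion is the value the notes above say is used when all sources fail.
const fallbackVersion = "1.0.8"

// versionSource is a hypothetical abstraction over the manifest, the GCS
// latest pointer, and the GCS prefix listing.
type versionSource func() (string, error)

// resolveVersion returns the first non-empty version, trying sources in order.
func resolveVersion(sources ...versionSource) string {
	for _, src := range sources {
		if v, err := src(); err == nil && v != "" {
			return v
		}
	}
	return fallbackVersion
}

// cliUserAgent formats the CLI-style User-Agent for macOS/arm64 hosts.
func cliUserAgent(version string) string {
	return fmt.Sprintf("antigravity/cli/%s darwin/arm64", version)
}

// failingSource and fixedSource are stubs standing in for real lookups.
func failingSource() (string, error) { return "", errors.New("source unavailable") }

func fixedSource(v string) versionSource {
	return func() (string, error) { return v, nil }
}

func main() {
	// Manifest fails, latest pointer succeeds, prefix listing is never consulted.
	fmt.Println(cliUserAgent(resolveVersion(failingSource, fixedSource("1.2.3"), failingSource)))
	// Every source fails, so the fallback applies.
	fmt.Println(cliUserAgent(resolveVersion(failingSource, failingSource, failingSource)))
}
```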
- Added `isCodexUsageLimitError` to detect and handle `usage_limit_reached` errors from Codex responses.
- Updated `newCodexStatusErr` to treat usage limit errors as HTTP 429 with proper `RetryAfter` handling.
- Enhanced test coverage to validate usage limit error handling, including reset time parsing and retry behavior.
Closes: #2886
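
One way to picture the usage-limit mapping is sketched below. Only the `usage_limit_reached` type string and the 429 / `RetryAfter` behaviour come from the notes above; the `resets_in_seconds` field, the `statusError` type, and `classifyCodexError` are assumptions made for illustration and may not match the real Codex error body or `newCodexStatusErr`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// statusError is a hypothetical stand-in for the value built by newCodexStatusErr.
type statusError struct {
	Code       int
	RetryAfter *time.Duration
}

// usageLimitBody models an ASSUMED error shape. "resets_in_seconds" is
// illustrative and not confirmed by the notes above.
type usageLimitBody struct {
	Error struct {
		Type            string `json:"type"`
		ResetsInSeconds int64  `json:"resets_in_seconds"`
	} `json:"error"`
}

// classifyCodexError keeps the upstream status unless the body is a usage
// limit error, in which case it reports 429 with an optional retry delay.
func classifyCodexError(status int, body []byte) statusError {
	var parsed usageLimitBody
	if err := json.Unmarshal(body, &parsed); err != nil || parsed.Error.Type != "usage_limit_reached" {
		return statusError{Code: status}
	}
	out := statusError{Code: http.StatusTooManyRequests}
	if parsed.Error.ResetsInSeconds > 0 {
		d := time.Duration(parsed.Error.ResetsInSeconds) * time.Second
		out.RetryAfter = &d
	}
	return out
}

func main() {
	got := classifyCodexError(400,
		[]byte(`{"error":{"type":"usage_limit_reached","resets_in_seconds":90}}`))
	fmt.Println(got.Code, *got.RetryAfter)
}
```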
- Introduced `disable-claude-cloak-mode` configuration to globally disable Claude cloak mode with credential-level overrides.
- Enhanced `getCloakConfigFromAuth` to support fallback to metadata for cloak settings.
- Updated cloak configuration precedence logic, integrating global, credential, and default modes.
- Updated config and watcher diff handling to include `disable-claude-cloak-mode`.
Closes: #2789
- Added `sanitizeClaudeWebSearchDomains` to remove empty `allowed_domains` and `blocked_domains` fields for built-in web_search tools, addressing ambiguity errors from Anthropic.
- Integrated domain sanitization into the Claude message preparation pipeline.
- Added test cases to validate correct handling of empty and non-empty domain fields across various tool types.
Closes: #2681
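
The sanitization step might look roughly like this on already-decoded tools. The function name is a hypothetical stand-in for `sanitizeClaudeWebSearchDomains`, and matching built-in tools by a `web_search` type prefix is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeWebSearchDomains is a hypothetical stand-in for
// sanitizeClaudeWebSearchDomains. It deletes empty allowed_domains and
// blocked_domains lists from tools whose type starts with "web_search"
// (the prefix match is an assumption) and leaves other tools alone.
func sanitizeWebSearchDomains(tools []map[string]any) {
	for _, tool := range tools {
		typ, _ := tool["type"].(string)
		if !strings.HasPrefix(typ, "web_search") {
			continue
		}
		for _, key := range []string{"allowed_domains", "blocked_domains"} {
			if list, ok := tool[key].([]any); ok && len(list) == 0 {
				delete(tool, key)
			}
		}
	}
}

func main() {
	tools := []map[string]any{
		{"type": "web_search_20250305", "allowed_domains": []any{}, "blocked_domains": []any{"spam.example"}},
		{"type": "custom", "allowed_domains": []any{}},
	}
	sanitizeWebSearchDomains(tools)
	fmt.Println(tools)
}
```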
- Added methods for managing and tracking WebSocket transcript state, including recording, prepending, and replacing transcript inputs.
- Implemented `executeCompactionTriggerFromWebsocketContext` to support compaction triggers using recorded transcript context.
- Enhanced upstream-downstream ID mapping with additional utilities and state synchronization.
- Expanded test coverage to validate transcript state management, compaction payload generation, and WebSocket response handling.
- Developed `XAIWebsocketsExecutor` for handling xAI Responses via WebSocket transport.
- Introduced session and state management with `codexWebsocketSessionStore` and `xaiWebsocketIDStateStore`.
- Added robust ID mapping for upstream and downstream request/response sequences.
- Enhanced error propagation and handling of WebSocket terminal events.
- Included utility methods for WebSocket request preparation, connection management, and state tracking.
- Added foundational support for compact and streamed responses via enhanced session tracking.
- Enhanced Codex Websockets Executor to capture `response.done` as a terminal event, alongside `response.completed` and `error`.
- Improved error propagation for upstream websocket errors with comprehensive message handling.
- Introduced utility functions for recognizing terminal events and extracting error messages.
- Expanded tests to validate new websocket event logic, including terminal event handling and upstream error propagation.
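
A small sketch of the terminal-event check follows. The three event names come from the notes above; the `upstreamEvent` shape (an `error.message` field) and the helper names are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isTerminalEvent reports whether an event type ends the response.
func isTerminalEvent(eventType string) bool {
	switch eventType {
	case "response.completed", "response.done", "error":
		return true
	}
	return false
}

// upstreamEvent is an ASSUMED minimal shape for a websocket event.
type upstreamEvent struct {
	Type  string `json:"type"`
	Error struct {
		Message string `json:"message"`
	} `json:"error"`
}

// inspectEvent returns whether the event is terminal and, for error events,
// a message to propagate (with a generic default when none is provided).
func inspectEvent(raw []byte) (terminal bool, errMsg string) {
	var ev upstreamEvent
	if err := json.Unmarshal(raw, &ev); err != nil {
		return false, ""
	}
	if ev.Type == "error" {
		errMsg = ev.Error.Message
		if errMsg == "" {
			errMsg = "upstream websocket error"
		}
	}
	return isTerminalEvent(ev.Type), errMsg
}

func main() {
	for _, raw := range []string{
		`{"type":"response.output_text.delta"}`,
		`{"type":"response.done"}`,
		`{"type":"error","error":{"message":"rate limited"}}`,
	} {
		terminal, msg := inspectEvent([]byte(raw))
		fmt.Println(terminal, msg)
	}
}
```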
- Introduced `executeCompact` to handle non-streaming compact responses via the `/responses/compact` endpoint.
- Added `executeCompactionTriggerStream` for streaming responses triggered by `compaction_trigger`.
- Enhanced request preparation with `prepareResponsesRequestTo` for dynamic response formats.
- Updated logic to bypass streaming for `/responses/compact` and added fallback behaviors.
- Added comprehensive tests for compact response handling and event streaming validations.
- Updated Antigravity Credits fallback to handle KV store unavailability as a service error.
- Enhanced signature caching mechanisms with request-time KV access and sliding expiration.
- Added and improved tests for KV client interactions, including error handling and expiration behaviors.
- Introduced `CacheSignatureBestEffort` for non-critical signature caching and clarified function flows with required context.
- Ensured consistent error reporting for missing or unavailable KV stores in various scenarios.
- Replaced direct `homekv` calls with injectable KV client interfaces for `antigravity` and `codex_reasoning_replay` modules.
- Improved error reporting and handling for KV operations, including `KVGet`, `KVSet`, `KVDel`, and `KVExpire`.
- Introduced dedicated fake KV clients for expanded and granular test coverage.
- Added new unit tests to validate KV client behaviors and error scenarios, ensuring robustness and sliding expiration functionality.
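
The injectable-client idea with sliding expiration can be illustrated as below. The operation names `KVGet` and `KVExpire` appear in the notes above, but their signatures here, the `kvClient` interface, `getSignature`, and `fakeKV` are all assumptions; the real `homekv` API is not shown.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

var errKVUnavailable = errors.New("kv store unavailable")

// kvClient is a hypothetical injectable interface; signatures are assumed.
type kvClient interface {
	KVGet(ctx context.Context, key string) (value string, found bool, err error)
	KVExpire(ctx context.Context, key string, ttl time.Duration) error
}

// getSignature reads a cached value and, on a hit, refreshes its TTL so the
// expiration slides forward. A nil client is reported as a service error.
func getSignature(ctx context.Context, kv kvClient, key string, ttl time.Duration) (string, bool, error) {
	if kv == nil {
		return "", false, errKVUnavailable
	}
	value, found, err := kv.KVGet(ctx, key)
	if err != nil || !found {
		return "", false, err
	}
	if err := kv.KVExpire(ctx, key, ttl); err != nil {
		return "", false, err
	}
	return value, true, nil
}

// fakeKV is an in-memory test double that records TTL refreshes.
type fakeKV struct {
	values  map[string]string
	lastTTL map[string]time.Duration
}

func (f *fakeKV) KVGet(_ context.Context, key string) (string, bool, error) {
	v, ok := f.values[key]
	return v, ok, nil
}

func (f *fakeKV) KVExpire(_ context.Context, key string, ttl time.Duration) error {
	f.lastTTL[key] = ttl
	return nil
}

// runDemo returns the cached value and how many keys had their TTL refreshed.
func runDemo() (string, int, error) {
	kv := &fakeKV{values: map[string]string{"sig:1": "abc"}, lastTTL: map[string]time.Duration{}}
	value, _, err := getSignature(context.Background(), kv, "sig:1", time.Hour)
	return value, len(kv.lastTTL), err
}

// runUnavailableDemo shows the error returned when no client is injected.
func runUnavailableDemo() error {
	_, _, err := getSignature(context.Background(), nil, "sig:1", time.Hour)
	return err
}

func main() {
	fmt.Println(runDemo())
	fmt.Println(runUnavailableDemo())
}
```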
Adds a fourth value for the disable-image-generation setting:
- false: inject image_generation (unchanged)
- true: strip everywhere + 404 on /v1/images/* (unchanged)
- chat: strip on non-images endpoints, keep /v1/images/* (unchanged)
- passthrough: never inject and never strip on non-images endpoints
  (the client payload is forwarded unchanged); behaves like
  "chat" on /v1/images/* endpoints.
image_generation injection (codex executors) is already gated on the Off
mode, and the /v1/images/* 404 gate is already gated on the All mode, so
passthrough only required a change to the payload strip logic in
payload_helpers.go, now expressed via shouldStripImageGeneration().
Closes: #3831
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
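
The four modes can be summarized as three small predicates. This sketch reads "keep /v1/images/*" as "do not strip on images endpoints"; the constant and function names are hypothetical and are not the actual `shouldStripImageGeneration()` in payload_helpers.go.

```go
package main

import "fmt"

// imageGenMode mirrors the four values of disable-image-generation.
type imageGenMode int

const (
	modeOff         imageGenMode = iota // "false"
	modeAll                             // "true"
	modeChat                            // "chat"
	modePassthrough                     // "passthrough"
)

// shouldInject: injection is gated on the Off mode only.
func shouldInject(mode imageGenMode) bool { return mode == modeOff }

// shouldReject404: the /v1/images/* 404 gate applies in the All mode only.
func shouldReject404(mode imageGenMode, isImagesEndpoint bool) bool {
	return mode == modeAll && isImagesEndpoint
}

// shouldStrip: All strips everywhere, chat strips only off the images
// endpoints, and Off and passthrough never strip.
func shouldStrip(mode imageGenMode, isImagesEndpoint bool) bool {
	switch mode {
	case modeAll:
		return true
	case modeChat:
		return !isImagesEndpoint
	default:
		return false
	}
}

func main() {
	names := []string{"false", "true", "chat", "passthrough"}
	for i, name := range names {
		mode := imageGenMode(i)
		fmt.Printf("%-12s inject=%-5v strip(chat endpoint)=%-5v strip(images)=%-5v 404(images)=%v\n",
			name, shouldInject(mode), shouldStrip(mode, false), shouldStrip(mode, true), shouldReject404(mode, true))
	}
}
```

Under this reading, passthrough differs from `false` only in that it does not inject, and differs from `chat` only in that it does not strip on non-images endpoints.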
Add a native Antigravity WebSearch path for Claude typed WebSearch requests.
Detect Claude Messages requests whose tools are only typed WebSearch tools
(web_search_20250305 / web_search_20260209), and convert them into an
Antigravity requestType=web_search payload instead of sending the request
through the normal tool-calling path.
Preserve the user's requested model. The native path is enabled only when that
Antigravity model is known to support Google Search. Capability data fetched
from Antigravity model info is used only as an enhancement to the local model
registry, not as a replacement for the existing registry fallback behavior.
Unsupported models keep the existing Antigravity request behavior and are not
silently rerouted to another web-search-capable model.
Translate Claude WebSearch request options to the verified Antigravity
googleSearch shape:
- max_uses -> googleSearch.enhancedContent.imageSearch.maxResultCount
- allowed_domains -> googleSearch.includedDomains
Leave blocked_domains and user_location unmapped because the Antigravity
googleSearch request shape has no verified equivalent for them. This avoids
sending speculative fields or pretending unsupported Claude WebSearch options
are enforced upstream.
Translate Antigravity web-search responses back into Claude-compatible output:
server_tool_use blocks, web_search_tool_result blocks, cited text blocks,
grounding URLs, and usage-compatible stream/non-stream responses.
Cover the behavior with tests for request conversion, response conversion,
grounding URL resolution, domain filter mapping, fetched capability hints,
excluded-model handling, and unsupported-model behavior.
- Added an example plugin `host-model-callback` in Go to summarize host model callbacks.
- Implemented `cliproxy_plugin_init`, `cliproxyPluginCall`, and other plugin functions for callback handling.
- Introduced API handlers for `ModelExecution` and `ModelExecutionStream` with support for both streaming and non-streaming requests.
- Included unit tests (`model_execution_test.go`) to validate execution logic and streaming responses.
Apply review feedback on codexExtractImageResults: preallocate the results
slice to its known maximum capacity to avoid growth reallocations, and guard
the itemsByIndex index-build/sort with a length check so no empty slice is
allocated or sorted when only the fallback items are present.
The OpenAI images path (/v1/images/*) previously called patchCodexCompletedOutput
to concatenate collected output_item.done items back into the completed event and
then re-parsed that rebuilt JSON to pull out the image results. For multi-megabyte
base64 image payloads this produced two extra full-size copies per request (the
concatenated output array plus the rebuilt completed event), inflating peak memory
under concurrent image generation.
Add codexExtractImageResults, which extracts image_generation_call results directly
from either the completed event's response.output or the collected items, without
the concatenate-and-reparse step. Semantics are preserved: completed output is
preferred and collected items are used only when it is empty, matching the original
patchCodexCompletedOutput behaviour. patchCodexCompletedOutput remains in use by the
text/responses path, which still forwards the patched event downstream.
Adds unit tests covering the completed-output path, the ordered fallback to
collected items, output preference, fallback list, and the wrong-event-type guard.
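
The preference and fallback rules can be sketched as follows. This is not the real `codexExtractImageResults`: the types, the `result` field name, and the index-keyed map of collected items are assumptions, and the sketch decodes into structs for brevity rather than minimizing copies of the base64 payloads.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// outputItem is an ASSUMED minimal shape for an output item.
type outputItem struct {
	Type   string `json:"type"`
	Result string `json:"result"`
}

type completedEvent struct {
	Type     string `json:"type"`
	Response struct {
		Output []outputItem `json:"output"`
	} `json:"response"`
}

// extractImageResults prefers the completed event's output and falls back to
// the collected items, ordered by index, only when that output is empty.
func extractImageResults(completed []byte, collected map[int]outputItem) ([]string, error) {
	var ev completedEvent
	if err := json.Unmarshal(completed, &ev); err != nil {
		return nil, err
	}
	if ev.Type != "response.completed" {
		return nil, fmt.Errorf("unexpected event type %q", ev.Type)
	}
	items := ev.Response.Output
	if len(items) == 0 && len(collected) > 0 { // length guard: skip the sort when empty
		indexes := make([]int, 0, len(collected))
		for i := range collected {
			indexes = append(indexes, i)
		}
		sort.Ints(indexes)
		items = make([]outputItem, 0, len(indexes))
		for _, i := range indexes {
			items = append(items, collected[i])
		}
	}
	results := make([]string, 0, len(items)) // preallocate to the known maximum
	for _, it := range items {
		if it.Type == "image_generation_call" && it.Result != "" {
			results = append(results, it.Result)
		}
	}
	return results, nil
}

func main() {
	collected := map[int]outputItem{
		1: {Type: "image_generation_call", Result: "second"},
		0: {Type: "image_generation_call", Result: "first"},
	}
	got, err := extractImageResults([]byte(`{"type":"response.completed","response":{"output":[]}}`), collected)
	fmt.Println(got, err)
}
```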
- Introduced support for file-backed logging of API requests and responses to handle large payloads efficiently.
- Refactored `attachWebsocketLogSources` to `attachRequestLogSources` for broader request and response handling.
- Added new methods for appending request/response data to file-backed sources and updated existing logging workflows for compatibility.
- Improved cleanup and merge logic for file-backed sources during request processing.
- Updated tests to cover newly introduced file-backed logging functionality.
- Updated `NewUtlsHTTPClient` to support context-aware RoundTrippers for protected hosts (e.g., Cloudflare bypass).
- Replaced `anthropicHosts` with `utlsProtectedHosts` to generalize host handling logic.
- Added unit test to validate context-based RoundTripper behavior.
- Replaced `NewProxyAwareHTTPClient` with `NewUtlsHTTPClient` in relevant executors for improved TLS fingerprinting.
Closes: #3680
Parse Home refresh auth envelopes so that refreshed access tokens are used instead of failing with a missing access token error.
Stop retrying when Home dispatch returns an auth that already failed within the same request.
- Replaced `NewUsageReporter` with `NewExecutorUsageReporter` to include executor type in usage records.
- Updated all executors to use the new reporter implementation.
- Extended `UsageReporter` to track and publish executor type.
- Added tests to validate proper executor type recording and handling.
- Enhanced RedisQueue plugin and payload schema with executor type support.
- Updated session handling to replace `Session_id` and `Conversation_id` headers with new logic ensuring consistent use of `Cache.ID` and prompt keys.
- Restored `Session_id` as a priority extraction source for `ExtractSessionID`.
- Added tests to validate case-sensitive and case-insensitive headers, canonical account header usage, and session key preservation.
- Removed legacy support for deprecated `Conversation_id` header to clean up API.
- Modified `applyCodexIdentityConfuse*` functions to include `turn_id` and `window_id` in metadata transformations.
- Updated test cases to validate the inclusion and restoration of these fields.
- Removed deprecated `Conversation_id` header support and related logic for cleaner implementation.
When Claude Code sends a stop-hook evaluator request (or any request
without tools), the payload includes "tools": [] (empty array). The
claude->codex translator unconditionally emits tools: [] + tool_choice:
"auto" + parallel_tool_calls: true into the Codex Responses shape.
When that payload is routed to xAI, the upstream rejects with HTTP 400:
"A tool_choice was set on the request but no tools were specified."
Fix entirely in the xAI executor (translator package is policy-locked):
add normalizeXAIToolChoiceForTools() after normalizeXAITools() to drop
tool_choice and parallel_tool_calls whenever tools end up absent or
empty (covering both the empty-from-source case and the
all-filtered-out case where every tool was an unsupported type such as
tool_search or image_generation).
Per code-review feedback: always remove parallel_tool_calls when tools
are missing (not gated on tool_choice presence) and existence-check
each key before sjson delete to avoid unnecessary JSON parse/copy.
Verification:
- go build -o test-output ./cmd/server
- go test ./internal/runtime/executor/... -count=1
- 5 new regression tests cover empty / missing / present / orphaned
  parallel_tool_calls / no-op-when-both-absent.
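
The two cases named above (tools empty at the source, and tools that become empty after filtering) can be sketched on a decoded request. `filterUnsupportedTools` and `dropOrphanedToolSettings` are hypothetical stand-ins for `normalizeXAITools()` and `normalizeXAIToolChoiceForTools()`; the real code works on raw JSON with sjson, which is not reproduced here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unsupportedToolTypes lists the example types named in the notes above.
var unsupportedToolTypes = map[string]bool{"tool_search": true, "image_generation": true}

// filterUnsupportedTools removes tools whose type the upstream does not accept.
func filterUnsupportedTools(req map[string]any) {
	tools, ok := req["tools"].([]any)
	if !ok {
		return
	}
	kept := make([]any, 0, len(tools))
	for _, t := range tools {
		tool, _ := t.(map[string]any)
		typ, _ := tool["type"].(string) // reading a nil map is safe in Go
		if tool != nil && !unsupportedToolTypes[typ] {
			kept = append(kept, t)
		}
	}
	req["tools"] = kept
}

// dropOrphanedToolSettings removes tool_choice and parallel_tool_calls
// whenever tools is absent or empty, regardless of which keys are present.
func dropOrphanedToolSettings(req map[string]any) {
	if tools, ok := req["tools"].([]any); ok && len(tools) > 0 {
		return
	}
	delete(req, "tool_choice")
	delete(req, "parallel_tool_calls")
}

func main() {
	var req map[string]any
	raw := `{"model":"m","tools":[{"type":"image_generation"}],"tool_choice":"auto","parallel_tool_calls":true}`
	if err := json.Unmarshal([]byte(raw), &req); err != nil {
		panic(err)
	}
	filterUnsupportedTools(req)
	dropOrphanedToolSettings(req)
	out, _ := json.Marshal(req)
	fmt.Println(string(out)) // {"model":"m","tools":[]}
}
```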
- Added `applyCodexIdentityConfuse*` functions for remapping request and response payloads and headers to enhance security.
- Updated WebSocket and HTTP logic to handle identity state transformations seamlessly.
- Introduced unit tests to verify remapping and restoration of identity-related fields.
- Introduced `translateCodexRequestPair` to simplify and reuse translation logic for handling original and modified payloads.
- Updated relevant methods to use the new function.
- Added unit tests to cover payload reuse and differentiation scenarios.
- Introduced `service_tier` metadata key to capture client-requested service tiers.
- Updated usage records, context propagation, and plugins to include service tier data.
- Added default handling logic for cases where `service_tier` is absent.
- Implemented tests for `service_tier` extraction, defaults, and updates across components.
- Updated `TotalTokens` calculation to account for `CacheReadTokens` and `CacheCreationTokens`.
- Added tests to validate accurate token aggregation and fallback behavior for `CachedTokens`.
- Introduced Time-To-First-Token (TTFT) measurement and reporting across major executors.
- Added TTFT calculation to `UsageReporter`, including support for HTTP clients and WebSocket communication.
- Updated tests to validate TTFT tracking in streamed and non-streamed scenarios.
- Ensured integration with `usage` plugin and augmented usage records with TTFT data.
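
A minimal sketch of TTFT tracking is shown below, assuming the first streamed chunk marks the first token. `ttftTracker` and `fakeClock` are hypothetical; how `UsageReporter` actually records TTFT is not shown in these notes.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// ttftTracker records the delay between request start and the first token.
// Later calls to MarkFirstToken are ignored.
type ttftTracker struct {
	now   func() time.Time
	start time.Time
	once  sync.Once
	ttft  time.Duration
}

func newTTFTTracker(now func() time.Time) *ttftTracker {
	return &ttftTracker{now: now, start: now()}
}

// MarkFirstToken should be called for every chunk; only the first call counts.
func (t *ttftTracker) MarkFirstToken() {
	t.once.Do(func() { t.ttft = t.now().Sub(t.start) })
}

// TTFT returns the recorded value, or zero if no token has arrived yet.
// Call it after the stream has finished to avoid racing with MarkFirstToken.
func (t *ttftTracker) TTFT() time.Duration { return t.ttft }

// fakeClock is a controllable clock for deterministic demos and tests.
type fakeClock struct{ current time.Time }

func (c *fakeClock) Now() time.Time { return c.current }

func (c *fakeClock) AdvanceMillis(ms int64) {
	c.current = c.current.Add(time.Duration(ms) * time.Millisecond)
}

func main() {
	clock := &fakeClock{}
	tracker := newTTFTTracker(clock.Now)
	clock.AdvanceMillis(250)
	tracker.MarkFirstToken() // first chunk arrives
	clock.AdvanceMillis(500)
	tracker.MarkFirstToken() // later chunks do not change the value
	fmt.Println(tracker.TTFT()) // 250ms
}
```

In production the clock would simply be `time.Now`; injecting it keeps the measurement testable without sleeping.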
- Introduced `SetTranslatedReasoningEffort` method in `UsageReporter` to capture and log reasoning efforts from translated payloads.
- Updated executors to incorporate the new reporting functionality for handling reasoning efforts across various providers.
- Enhanced logging for thinking level extraction with new helper function `ExtractTranslatedReasoningEffort`.
- Introduced `GPTImage2BaseModel` configuration for hosted image generation tools with validation for "gpt-" prefix.
- Added logic to dynamically resolve and apply the base model in Codex executor workflows.
- Enhanced server-sent events (SSE) implementation with keep-alive tickers and error events for stream reliability.
- Updated configuration file examples and internal documentation.
- Introduced `FileBodySource` to support large request log sections stored in temp files.
- Added file-backed support for WebSocket timeline and API WebSocket timeline logging.
- Updated `LogRequest` and middleware to integrate optional file-backed sources.
- Implemented cleanup of temporary log files after processing.