The loadCodeAssist polling call hardcoded the User-Agent to
google-api-nodejs-client/9.15.1. Google Cloud Code returns the
paidTier object WITHOUT the availableCredits array for that UA,
so updateAntigravityCreditsBalance always saw "no credits", set the
hint to Available=false for every Google One AI Ultra account, and
the conductor-level credits fallback could never find a candidate.
Switching to resolveUserAgent(auth) (the same UA used for
streamGenerateContent / generateContent) makes the response include
availableCredits, so the credits hint is populated correctly and the
fallback can actually inject enabledCreditTypes:["GOOGLE_ONE_AI"]
when free tier is exhausted.
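
The availability check described above can be sketched as follows. This is a minimal illustration, not the repository's code: `creditsAvailable` and the struct are hypothetical, and only the `paidTier` and `availableCredits` field names come from this message. The shape of each credits entry is not described here, so entries are kept as raw JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// loadCodeAssistResponse models only the two fields named in the commit
// message; everything else in the real response is ignored.
type loadCodeAssistResponse struct {
	PaidTier *struct {
		AvailableCredits []json.RawMessage `json:"availableCredits"`
	} `json:"paidTier"`
}

// creditsAvailable (hypothetical helper) reports whether the response
// carries at least one availableCredits entry under paidTier.
func creditsAvailable(body []byte) (bool, error) {
	var resp loadCodeAssistResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return false, err
	}
	return resp.PaidTier != nil && len(resp.PaidTier.AvailableCredits) > 0, nil
}

func main() {
	// Response shape when the array is present vs. omitted.
	withCredits := []byte(`{"paidTier":{"availableCredits":[{}]}}`)
	withoutCredits := []byte(`{"paidTier":{}}`)

	a, _ := creditsAvailable(withCredits)
	b, _ := creditsAvailable(withoutCredits)
	fmt.Println(a, b)
}
```

With the old User-Agent every response looked like the second payload, which is why the hint was always Available=false.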
- Added `buildAdditionalModelRecord` to filter out zero-token usage details.
- Introduced `hasNonZeroTokenUsage` helper function for token usage validation.
- Updated tests to cover scenarios for zero and non-zero token usage.
- Introduced a `Disabled` flag to OpenAI compatibility configurations.
- Updated routing, auth selection, and API handling logic to respect the `Disabled` state.
- Extended relevant APIs, YAML configurations, and data structures to include the `Disabled` field.
- Adjusted all relevant loops and filters to skip disabled providers.
Closes: #3060, #3059, #2977
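
The "skip disabled providers" rule for the `Disabled` flag can be pictured roughly like this. The `compatProvider` type and `enabledProviders` helper are hypothetical stand-ins; only the `Disabled` field name comes from this message.

```go
package main

import "fmt"

// compatProvider is a hypothetical stand-in for one OpenAI compatibility
// configuration entry.
type compatProvider struct {
	Name     string
	Disabled bool
}

// enabledProviders returns the entries that routing and auth selection
// should still consider, preserving their original order.
func enabledProviders(all []compatProvider) []compatProvider {
	out := make([]compatProvider, 0, len(all))
	for _, p := range all {
		if p.Disabled {
			continue // disabled entries are skipped everywhere
		}
		out = append(out, p)
	}
	return out
}

func main() {
	providers := []compatProvider{{Name: "alpha"}, {Name: "beta", Disabled: true}}
	for _, p := range enabledProviders(providers) {
		fmt.Println(p.Name)
	}
}
```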
- Added `protocol_multiplexer.go`, enabling support for both HTTP and Redis protocols on a single listener.
- Introduced `redis_queue_protocol.go` to handle Redis-compatible RESP commands for queue management.
- Integrated `redisqueue` package, supporting in-memory queuing with expiration pruning.
- Updated server initialization to manage a shared listener and multiplex connections.
- Adjusted `Handler` to adopt `AuthenticateManagementKey` for modular key validation, supporting both HTTP and Redis flows.
Remove deferred body optimization and maxErrorLog constants that were
unrelated to credits fallback. Keep only MarkCreditsUsed/CreditsUsed
helpers for flagging requests that consumed AI credits.
- findAllAntigravityCreditsCandidateAuths now filters by PinnedAuthMetadataKey
  to prevent credential isolation violations during credits fallback
- Release deferredBody reference on success path to avoid holding large
  payloads in memory for the lifetime of the gin context
CountTokens upstream API does not support enabledCreditTypes, so
remove the dead credits fallback path from ExecuteCount and delete
the unused tryAntigravityCreditsExecuteCount method. Fix gofmt on
credits test file.
Move credits handling from executor-level retry to conductor-level
orchestration. When all free-tier auths are exhausted (429/503), the
conductor discovers auths with available Google One AI credits and
retries with enabledCreditTypes injected via context flag.
Key changes:
- Add AntigravityCreditsHint system for tracking per-auth credits state
- Conductor tries credits fallback after all auths fail (Execute/Stream/Count)
- Executor injects enabledCreditTypes only when conductor sets context flag
- Credits fallback respects provider scope (requires antigravity in providers)
- Add context cancellation check in credits fallback to avoid wasted requests
- Remove executor-level attemptCreditsFallback and preferCredits machinery
- Restructure 429 decision logic (parse details first, keyword fallback)
- Expand shouldAbort to cover INVALID_ARGUMENT/FAILED_PRECONDITION/500+UNKNOWN
- Support human-readable retry delay parsing (e.g. "1h43m56s")
Codex CLI gates the built-in image_generation tool behind
AuthMode::Chatgpt (OAuth only). When clients connect via API key
auth through CPA, the tool is absent from requests, making image
generation unavailable through the reverse proxy.
Changes:
1. Inject image_generation tool (codex_executor.go):
   Add ensureImageGenerationTool() that appends
   {"type":"image_generation","output_format":"png"} to the tools
   array if not already present. Applied to all three execution
   paths: Execute, executeCompact, and ExecuteStream.
2. Route aliases for Codex CLI direct access (server.go):
   Add /backend-api/codex/responses routes that map to the same
   OpenAI Responses API handlers as /v1/responses. This allows
   Codex CLI to connect via chatgpt_base_url config while keeping
   AuthMode::Chatgpt, which enables the built-in image_generation
   tool on the client side.
3. Unit tests (codex_executor_imagegen_test.go):
   Cover no-tools, existing tools, already-present, empty array,
   and mixed built-in tool scenarios.
Anthropic has moved the 1M-context-window feature to General Availability,
so the context-1m-2025-08-07 beta flag is no longer accepted and now causes
400 Bad Request errors when forwarded upstream.
Remove the X-CPA-CLAUDE-1M detection and the corresponding injection of the
now-invalid beta header. Also drop the unused net/textproto import that was
only needed for the header-key lookup.
- Deleted `iflow` provider implementation, including thinking configuration (`apply.go`) and authentication modules.
- Removed iFlow-specific tests, executors, and helpers across SDK and internal components.
- Updated all references to exclude iFlow functionality.
- Introduced `WithProxyURL` variants for `CodexAuth`, `ClaudeAuth`, `IFlowAuth`, and `DeviceFlowClient`.
- Updated executors to use proxy-aware constructors for improved configurability.
- Added unit tests to validate proxy override precedence and functionality.
Closes: #2823
- Deleted `QwenAuthenticator`, internal `qwen_auth`, and `qwen_executor` implementations.
- Removed all Qwen-related OAuth flows, token handling, and execution logic.
- Cleaned up dependencies and references to Qwen across the codebase.
Claude models on antigravity have a 64000 token output limit but
max_tokens from downstream requests was passed through uncapped,
causing 400 INVALID_ARGUMENT from Google when clients sent 128000.
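
One way to avoid the 400 is to clamp the value before forwarding. The helper below is a hypothetical sketch; the 64000 limit is the figure stated above, and values at or below the limit are passed through unchanged.

```go
package main

import "fmt"

// claudeMaxOutputTokens is the output limit stated in the commit message
// for Claude models on antigravity.
const claudeMaxOutputTokens = 64000

// capMaxTokens (hypothetical helper) clamps a downstream max_tokens value
// to the upstream limit.
func capMaxTokens(requested int) int {
	if requested > claudeMaxOutputTokens {
		return claudeMaxOutputTokens
	}
	return requested
}

func main() {
	fmt.Println(capMaxTokens(128000), capMaxTokens(4096))
}
```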
The strict bypass test used testGeminiSignaturePayload() which produces
a base64 string starting with 'C'. Since StripInvalidSignatureThinkingBlocks
now strips all non-E/R signatures unconditionally, the test payload was
stripped before reaching ValidateClaudeBypassSignatures, causing the test
to pass the request through instead of rejecting it with 400.
Replace with testFakeClaudeSignature() which produces a base64 string
starting with 'E' (valid at the lightweight check) but with invalid
protobuf content (no valid field 2), so strict mode correctly rejects
it at the deep validation layer.
Thinking blocks with empty signatures come from proxy-generated
responses (Antigravity/Gemini routed as Claude). These should be
silently dropped from the request payload before forwarding, not
rejected with 400. Fixes 10 "missing thinking signature" errors.
Avoid whole-payload schema sanitization when translated Antigravity requests have no actual tool schemas, including missing and empty tools arrays. Add regression coverage so image-heavy no-tool requests keep bypassing the old memory amplification path.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Added proper parsing of `Retry-After` headers for 429 responses.
- Set default retry duration when "disable cooling" is active on quota exhaustion.
- Updated tests to verify `Retry-After` handling and default behavior.
- Use buildTextBlock for billing header to avoid raw JSON string interpolation
- Fix empty array edge case in prependToFirstUserMessage
- Allow remapOAuthToolNames to process messages even without tools array
- Move claude_system_prompt.go to helps/ per repo convention
- Export prompt constants (ClaudeCode* prefix) for cross-package access
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
A/B testing confirmed that Anthropic uses tool name fingerprinting to detect
third-party clients on OAuth traffic. OpenCode-style lowercase names like
'bash', 'read', 'todowrite' trigger extra-usage billing, while Claude Code
TitleCase names like 'Bash', 'Read', 'TodoWrite' pass through normally.
Changes:
- Add oauthToolRenameMap: maps lowercase tool names to Claude Code equivalents
- Add oauthToolsToRemove: removes 'question' and 'skill' (no Claude Code counterpart)
- remapOAuthToolNames: renames tools, removes blacklisted ones, updates tool_choice and messages
- reverseRemapOAuthToolNames/reverseRemapOAuthToolNamesFromStreamLine: reverse map for responses
- Apply in Execute(), ExecuteStream(), and CountTokens() for OAuth token requests
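
The rename and removal rules above reduce to two lookup tables. The sketch below works on bare tool names only and lists just the names mentioned in this message; the real maps are likely larger, and the real functions operate on full request and response payloads. `remapToolNames` and `reverseToolName` are hypothetical simplifications.

```go
package main

import "fmt"

// Subset of the mapping, limited to names quoted in the commit message.
var oauthToolRenameMap = map[string]string{
	"bash":      "Bash",
	"read":      "Read",
	"todowrite": "TodoWrite",
}

// Tools with no Claude Code counterpart are removed outright.
var oauthToolsToRemove = map[string]bool{"question": true, "skill": true}

// remapToolNames drops blacklisted tools and renames known lowercase ones.
func remapToolNames(names []string) []string {
	out := make([]string, 0, len(names))
	for _, n := range names {
		if oauthToolsToRemove[n] {
			continue
		}
		if renamed, ok := oauthToolRenameMap[n]; ok {
			n = renamed
		}
		out = append(out, n)
	}
	return out
}

// reverseToolName maps a Claude Code name in a response back to the
// client's lowercase name; unknown names pass through unchanged.
func reverseToolName(name string) string {
	for lower, title := range oauthToolRenameMap {
		if title == name {
			return lower
		}
	}
	return name
}

func main() {
	fmt.Println(remapToolNames([]string{"bash", "question", "read", "custom"}))
	fmt.Println(reverseToolName("TodoWrite"))
}
```

This unconditional reverse lookup would also lowercase a tool the client originally sent as "Bash"; tracking which names were actually renamed per request would avoid that.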
Only for Claude OAuth requests, sanitize forwarded system-prompt context before
it is prepended into the first user message. This preserves neutral task/tool
instructions while removing OpenCode branding, docs links, environment banners,
and product-specific workflow sections that still triggered Anthropic extra-usage
classification after top-level system[] cloaking.