Commit Graph

521 Commits

Author SHA1 Message Date
sususu98
0e1235122e fix antigravity client agent headers 2026-04-28 19:04:40 +08:00
sususu98
e78d45acc9 fix antigravity user agent handling 2026-04-28 19:04:40 +08:00
xbang
a992dee4e8 fix(antigravity): use real antigravity UA when polling credits balance
The loadCodeAssist polling call hardcoded the User-Agent to
google-api-nodejs-client/9.15.1. Google Cloud Code returns the
paidTier object WITHOUT the availableCredits array for that UA,
so updateAntigravityCreditsBalance always saw "no credits", set the
hint to Available=false for every Google One AI Ultra account, and
the conductor-level credits fallback could never find a candidate.

Switching to resolveUserAgent(auth) (the same UA used for
streamGenerateContent / generateContent) makes the response include
availableCredits, so the credits hint is populated correctly and the
fallback can actually inject enabledCreditTypes:["GOOGLE_ONE_AI"]
when free tier is exhausted.
2026-04-28 16:21:15 +08:00
Luis Pater
04a336f7df fix(usage_helpers): skip zero-token usage in additional model records
- Added `buildAdditionalModelRecord` to filter out zero-token usage details.
- Introduced `hasNonZeroTokenUsage` helper function for token usage validation.
- Updated tests to cover scenarios for zero and non-zero token usage.
2026-04-27 10:56:22 +08:00
Luis Pater
c5bea6f6f8 Merge pull request #3020 from Matthias319/fix/codex-error-classification
fix(codex): classify context, thinking-signature, previous-response, and auth failures
2026-04-26 22:26:40 +08:00
Luis Pater
c7b28ba058 feat(executor): add support for Codex image generation tool usage tracking
- Introduced `publishCodexImageToolUsage` to report image generation tool metrics.
- Updated executor logic to handle image generation tool events and defaults.
- Added parsing logic for `image_gen` tool usage details in `helps/usage_helpers.go`.
- Updated `UsageReporter` for additional model-specific usage publishing.
- Refactored usage detail normalizations.

Closes: #3063
2026-04-26 22:19:03 +08:00
Luis Pater
38573050aa feat(config): add support for disabling OpenAI compatibility providers
- Introduced a `Disabled` flag to OpenAI compatibility configurations.
- Updated routing, auth selection, and API handling logic to respect the `Disabled` state.
- Extended relevant APIs, YAML configurations, and data structures to include the `Disabled` field.
- Adjusted all relevant loops and filters to skip disabled providers.

Closes: #3060 #3059 #2977
2026-04-26 21:49:36 +08:00
Luis Pater
28d78273e4 feat(api): implement protocol multiplexer and Redis queue for usage integration
- Added `protocol_multiplexer.go`, enabling support for both HTTP and Redis protocols on a single listener.
- Introduced `redis_queue_protocol.go` to handle Redis-compatible RESP commands for queue management.
- Integrated `redisqueue` package, supporting in-memory queuing with expiration pruning.
- Updated server initialization to manage a shared listener and multiplex connections.
- Adjusted `Handler` to adopt `AuthenticateManagementKey` for modular key validation, supporting both HTTP and Redis flows.
2026-04-25 18:52:24 +08:00
Luis Pater
a7e92e2639 feat(auth): disallow free-tier Codex auth during selection process
- Introduced `disallowFreeAuthFromMetadata` and `isFreeCodexAuth` to enforce skipping free-tier credentials.
- Modified scheduler logic to honor `DisallowFreeAuthMetadataKey` during auth selection.
- Updated `ensureImageGenerationTool` to skip tool injection for free-tier Codex auth.
- Added context utility `WithDisallowFreeAuth` and integrated with image handlers.
- Augmented relevant tests to cover free-tier exclusion scenarios.
2026-04-24 23:18:56 +08:00
Matthias319
4056c2590b fix(codex): classify known upstream failures
Normalize Codex context, thinking-signature, previous-response, and auth failures to explicit error codes: context_too_large, thinking_signature_invalid, previous_response_not_found, auth_unavailable.

Refs #2596.
2026-04-24 17:13:23 +02:00
Luis Pater
f1ba6151a9 feat(codex): pass base model to enable conditional image_generation tool injection
- Modified `ensureImageGenerationTool` to accept `baseModel` for conditional logic.
- Ensured `gpt-5.3-codex-spark` models bypass image_generation tool injection.
- Updated relevant tests and executor logic to reflect changes.
2026-04-24 07:21:03 +08:00
sususu98
12195a276e Merge pull request #2971 from sususu98/feat/antigravity-credits-fallback
feat(antigravity): conductor-level credits fallback for Claude models
2026-04-24 00:15:23 +08:00
sususu98
7ad1900041 perf(antigravity): async credits hint refresh for warm tokens 2026-04-23 23:58:10 +08:00
sususu98
920b6efffa refactor(logging): strip unrelated deferred body changes, keep credits-only logging
Remove deferred body optimization and maxErrorLog constants that were
unrelated to credits fallback. Keep only MarkCreditsUsed/CreditsUsed
helpers for flagging requests that consumed AI credits.
2026-04-23 17:41:54 +08:00
sususu98
e75daa299b fix(antigravity): respect pinned auth in credits fallback, release deferred body on success
- findAllAntigravityCreditsCandidateAuths now filters by PinnedAuthMetadataKey
  to prevent credential isolation violations during credits fallback
- Release deferredBody reference on success path to avoid holding large
  payloads in memory for the lifetime of the gin context
2026-04-23 17:38:02 +08:00
sususu98
4de5c29f86 fix(antigravity): remove credits fallback from CountTokens, fix gofmt
CountTokens upstream API does not support enabledCreditTypes, so
remove the dead credits fallback path from ExecuteCount and delete
the unused tryAntigravityCreditsExecuteCount method. Fix gofmt on
credits test file.
2026-04-23 15:17:00 +08:00
sususu98
14d46a0a5d feat(antigravity): conductor-level credits fallback for Claude models
Move credits handling from executor-level retry to conductor-level
orchestration. When all free-tier auths are exhausted (429/503), the
conductor discovers auths with available Google One AI credits and
retries with enabledCreditTypes injected via context flag.

Key changes:
- Add AntigravityCreditsHint system for tracking per-auth credits state
- Conductor tries credits fallback after all auths fail (Execute/Stream/Count)
- Executor injects enabledCreditTypes only when conductor sets context flag
- Credits fallback respects provider scope (requires antigravity in providers)
- Add context cancellation check in credits fallback to avoid wasted requests
- Remove executor-level attemptCreditsFallback and preferCredits machinery
- Restructure 429 decision logic (parse details first, keyword fallback)
- Expand shouldAbort to cover INVALID_ARGUMENT/FAILED_PRECONDITION/500+UNKNOWN
- Support human-readable retry delay parsing (e.g. "1h43m56s")
2026-04-23 13:44:20 +08:00
MoYeRanQianZhi
31934ae04c feat(codex): enable image generation for all Codex upstream requests
Codex CLI gates the built-in image_generation tool behind
AuthMode::Chatgpt (OAuth only). When clients connect via API key
auth through CPA, the tool is absent from requests, making image
generation unavailable through the reverse proxy.

Changes:

1. Inject image_generation tool (codex_executor.go):
   Add ensureImageGenerationTool() that appends
   {"type":"image_generation","output_format":"png"} to the tools
   array if not already present. Applied to all three execution
   paths: Execute, executeCompact, and ExecuteStream.

2. Route aliases for Codex CLI direct access (server.go):
   Add /backend-api/codex/responses routes that map to the same
   OpenAI Responses API handlers as /v1/responses. This allows
   Codex CLI to connect via chatgpt_base_url config while keeping
   AuthMode::Chatgpt, which enables the built-in image_generation
   tool on the client side.

3. Unit tests (codex_executor_imagegen_test.go):
   Cover no-tools, existing tools, already-present, empty array,
   and mixed built-in tool scenarios.
2026-04-23 01:24:40 +08:00
stringer07
b6781d69be perf(codex): avoid repeated output patch writes 2026-04-21 16:29:54 +08:00
stringer07
bb8408cef5 fix(codex): backfill streaming response output 2026-04-21 16:03:56 +08:00
octo-patch
f4eb16102b fix(executor): drop obsolete context-1m-2025-08-07 beta header (fixes #2866)
Anthropic has moved the 1M-context-window feature to General Availability,
so the context-1m-2025-08-07 beta flag is no longer accepted and now causes
400 Bad Request errors when forwarded upstream.

Remove the X-CPA-CLAUDE-1M detection and the corresponding injection of the
now-invalid beta header.  Also drop the unused net/textproto import that was
only needed for the header-key lookup.
2026-04-19 10:38:16 +08:00
hkfires
d9a3b3e5f3 fix(tests): update model lookup references and enhance Claude executor tests 2026-04-17 08:32:07 +08:00
Luis Pater
f5dc6483d5 chore: remove iFlow-related modules and dependencies
- Deleted `iflow` provider implementation, including thinking configuration (`apply.go`) and authentication modules.
- Removed iFlow-specific tests, executors, and helpers across SDK and internal components.
- Updated all references to exclude iFlow functionality.
2026-04-17 01:07:12 +08:00
Luis Pater
d949921143 feat(auth): add proxy URL override support to auth constructors and executors
- Introduced `WithProxyURL` variants for `CodexAuth`, `ClaudeAuth`, `IFlowAuth`, and `DeviceFlowClient`.
- Updated executors to use proxy-aware constructors for improved configurability.
- Added unit tests to validate proxy override precedence and functionality.

Closes: #2823
2026-04-16 22:11:39 +08:00
Luis Pater
f56cf42461 Merge pull request #2800 from sususu98/fix/antigravity-max-output-tokens-cap
fix(antigravity): cap maxOutputTokens using registry max_completion_tokens
2026-04-15 20:35:11 +08:00
Luis Pater
3dea1da249 Merge pull request #2782 from sususu98/fix/strip-invalid-signature-thinking-blocks
fix(antigravity): use E-prefixed fake signature in strict bypass test
2026-04-15 20:34:32 +08:00
Luis Pater
8fac29631d chore: remove Qwen support from SDK and internal components
- Deleted `QwenAuthenticator`, internal `qwen_auth`, and `qwen_executor` implementations.
- Removed all Qwen-related OAuth flows, token handling, and execution logic.
- Cleaned up dependencies and references to Qwen across the codebase.
2026-04-15 12:16:08 +08:00
sususu98
8fecd625d2 fix(antigravity): cap maxOutputTokens using registry max_completion_tokens
Claude models on antigravity have a 64000 token output limit but
max_tokens from downstream requests was passed through uncapped,
causing 400 INVALID_ARGUMENT from Google when clients sent 128000.
2026-04-15 11:57:55 +08:00
sususu98
10b55b5ddd fix(antigravity): use E-prefixed fake signature in strict bypass test
The strict bypass test used testGeminiSignaturePayload() which produces
a base64 string starting with 'C'. Since StripInvalidSignatureThinkingBlocks
now strips all non-E/R signatures unconditionally, the test payload was
stripped before reaching ValidateClaudeBypassSignatures, causing the test
to pass the request through instead of rejecting it with 400.

Replace with testFakeClaudeSignature() which produces a base64 string
starting with 'E' (valid at the lightweight check) but with invalid
protobuf content (no valid field 2), so strict mode correctly rejects
it at the deep validation layer.
2026-04-14 15:46:02 +08:00
sususu98
278a89824c fix(antigravity): strip thinking blocks with empty signatures instead of rejecting
Thinking blocks with empty signatures come from proxy-generated
responses (Antigravity/Gemini routed as Claude). These should be
silently dropped from the request payload before forwarding, not
rejected with 400. Fixes 10 "missing thinking signature" errors.
2026-04-14 15:14:48 +08:00
sususu98
f5ed5c7453 fix(antigravity): skip full schema cleanup for empty tool requests
Avoid whole-payload schema sanitization when translated Antigravity requests have no actual tool schemas, including missing and empty tools arrays. Add regression coverage so image-heavy no-tool requests keep bypassing the old memory amplification path.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-12 12:51:42 +08:00
Luis Pater
0ab1f5412f fix(executor): handle 429 Retry-After header and default retry logic for quota exhaustion
- Added proper parsing of `Retry-After` headers for 429 responses.
- Set default retry duration when "disable cooling" is active on quota exhaustion.
- Updated tests to verify `Retry-After` handling and default behavior.
2026-04-11 21:04:55 +08:00
Luis Pater
828df80088 refactor(executor): remove immediate retry with token refresh on 429 for Qwen and update tests accordingly 2026-04-11 16:35:18 +08:00
Luis Pater
5ab9afac83 fix(executor): handle OAuth tool name remapping with rename detection and add tests
Closes: #2656
2026-04-10 21:54:59 +08:00
Luis Pater
65ce86338b fix(executor): implement immediate retry with token refresh on 429 for Qwen and add associated tests
Closes: #2661
2026-04-10 21:12:03 +08:00
sususu98
d801393841 feat(antigravity): prefer prod URL as first priority
Promote cloudcode-pa.googleapis.com to the first position in the
fallback order, with daily and sandbox URLs as fallbacks.
2026-04-10 19:37:56 +08:00
Luis Pater
b2c0cdfc88 Merge pull request #2621 from wykk-12138/fix/oauth-extra-usage-detection
fix(claude): prevent OAuth extra-usage billing via tool name fingerprinting and system prompt cloaking
2026-04-10 10:29:27 +08:00
wykk-12138
0f45d89255 fix(claude): address PR review feedback for OAuth cloaking
- Use buildTextBlock for billing header to avoid raw JSON string interpolation
- Fix empty array edge case in prependToFirstUserMessage
- Allow remapOAuthToolNames to process messages even without tools array
- Move claude_system_prompt.go to helps/ per repo convention
- Export prompt constants (ClaudeCode* prefix) for cross-package access

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 00:07:11 +08:00
wykk-12138
96056d0137 Merge remote-tracking branch 'upstream/main' into fix/oauth-extra-usage-detection 2026-04-09 22:59:31 +08:00
wykk-12138
f780c289e8 fix(claude): map question/skill to TitleCase instead of removing them
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-09 22:28:00 +08:00
wykk-12138
ac36119a02 fix(claude): preserve OAuth tool renames when filtering tools
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-09 22:20:15 +08:00
Luis Pater
39dc4557c1 Merge pull request #2412 from sususu98/feat/signature-cache-toggle
feat: configurable signature cache toggle for Antigravity/Claude thinking blocks
2026-04-09 21:54:47 +08:00
ZTXBOSS666
30e94b6792 fix(antigravity): refine 429 handling and credits fallback
Includes: restore SDK docs under docs/; update antigravity executor credits tests; gofmt.
2026-04-09 21:48:32 +08:00
sususu98
cf249586a9 feat(antigravity): configurable signature cache with bypass-mode validation
Antigravity 的 Claude thinking signature 处理新增 cache/bypass 双模式,
并为 bypass 模式实现按 SIGNATURE-CHANNEL-SPEC.md 的签名校验。

新增 antigravity-signature-cache-enabled 配置项(默认 true):
- cache mode(true):使用服务端缓存的签名,行为与原有逻辑完全一致
- bypass mode(false):直接使用客户端提供的签名,经过校验和归一化

支持配置热重载,运行时可切换模式。

校验流程:
1. 剥离历史 cache-mode 的 'modelGroup#' 前缀(如 claude#Exxxx → Exxxx)
2. 首字符必须为 'E'(单层编码)或 'R'(双层编码),否则拒绝
3. R 开头:base64 解码 → 内层必须以 'E' 开头 → 继续单层校验
4. E 开头:base64 解码 → 首字节必须为 0x12(Claude protobuf 标识)
5. 所有合法签名归一化为 R 形式(双层 base64)发往 Antigravity 后端

非法签名处理策略:
- 非严格模式(默认):translator 静默丢弃无签名的 thinking block
- 严格模式(antigravity-signature-bypass-strict: true):
  executor 层在请求发往上游前直接返回 HTTP 400

按 SIGNATURE-CHANNEL-SPEC.md 解析 Claude 签名的完整 protobuf 结构:
- Top-level Field 2(容器)→ Field 1(渠道块)
- 渠道块提取:channel_id (Field 1)、infrastructure (Field 2)、
  model_text (Field 6)、field7 (Field 7)
- 计算 routing_class、infrastructure_class、schema_features
- 使用 google.golang.org/protobuf/encoding/protowire 解析

- resolveThinkingSignature 拆分为 resolveCacheModeSignature / resolveBypassModeSignature
- hasResolvedThinkingSignature:mode-aware 签名有效性判断
  (cache: len>=50 via HasValidSignature,bypass: non-empty)
- validateAntigravityRequestSignatures:executor 预检,
  仅在 bypass + strict 模式下拦截非法签名返回 400
- 响应侧签名缓存逻辑与 cache mode 集成
- Cache mode 行为完全保留:无 '#' 前缀的原生签名静默丢弃
2026-04-09 21:12:40 +08:00
wykk-12138
e8d1b79cb3 fix(claude): remap OAuth tool names to Claude Code style to avoid third-party fingerprint detection
A/B testing confirmed that Anthropic uses tool name fingerprinting to detect
third-party clients on OAuth traffic. OpenCode-style lowercase names like
'bash', 'read', 'todowrite' trigger extra-usage billing, while Claude Code
TitleCase names like 'Bash', 'Read', 'TodoWrite' pass through normally.

Changes:
- Add oauthToolRenameMap: maps lowercase tool names to Claude Code equivalents
- Add oauthToolsToRemove: removes 'question' and 'skill' (no Claude Code counterpart)
- remapOAuthToolNames: renames tools, removes blacklisted ones, updates tool_choice and messages
- reverseRemapOAuthToolNames/reverseRemapOAuthToolNamesFromStreamLine: reverse map for responses
- Apply in Execute(), ExecuteStream(), and CountTokens() for OAuth token requests
2026-04-09 20:15:16 +08:00
Luis Pater
5e81b65f2f fix(auth, executor): normalize Qwen base URL, adjust RefreshLead duration, and add tests 2026-04-09 18:07:07 +08:00
wykk-12138
7e8e2226a6 fix(claude): reduce forwarded OAuth prompt to minimal tool reminder
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-09 17:12:07 +08:00
wykk-12138
f0c20e852f fix(claude): remove invalid cache_control scope from static system block
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-09 17:00:04 +08:00
wykk-12138
7cdf8e9872 fix(claude): sanitize forwarded third-party prompts for OAuth cloaking
Only for Claude OAuth requests, sanitize forwarded system-prompt context before
it is prepended into the first user message. This preserves neutral task/tool
instructions while removing OpenCode branding, docs links, environment banners,
and product-specific workflow sections that still triggered Anthropic extra-usage
classification after top-level system[] cloaking.
2026-04-09 16:45:29 +08:00
wykk-12138
e2e3c7dde0 fix: remove invalid org scope and match Claude Code block layout 2026-04-09 14:09:52 +08:00