CLIProxyAPI

mirror of https://github.com/router-for-me/CLIProxyAPI.git synced 2026-05-08 06:42:41 +08:00

Author	SHA1	Message	Date
Luis Pater	fb08b92402	feat(executor): add upstream disconnect handling for Codex WebSocket sessions - Introduced `UpstreamDisconnectChan` for Codex WebSocket sessions to notify downstream connections of upstream disconnections. - Implemented `notifyUpstreamDisconnect` to signal errors and close channels on disconnect events. - Added integration tests to validate WebSocket session behavior on upstream disconnect. - Updated OpenAI WebSocket response handlers to properly close connections upon upstream disconnect notifications.	2026-05-06 22:09:33 +08:00
Luis Pater	ba5d8ca733	feat(usage): add support for requested model alias handling - Introduced methods for setting and retrieving model aliases in execution and usage contexts. - Enhanced `UsageReporter` and related structures to include client-requested aliases. - Updated tests to validate alias propagation and ensure correct usage reporting. - Adjusted metadata handling in CLIProxyAPI executors to address alias integration.	2026-05-05 01:47:53 +08:00
Luis Pater	28b4b19e7e	Merge pull request #3208 from kdcokenny/codex-websocket-protocol-parity Align Codex websocket protocol semantics	2026-05-05 01:29:19 +08:00
Luis Pater	bdc424007e	Merge pull request #2896 from edlsh/fix/oauth-tool-rename-per-request-map fix(amp): smart-mode tool name fixes + deep-mode response repair	2026-05-05 00:58:39 +08:00
Luis Pater	e4a93c02c5	fix(executor): enhance parsing of OpenAI stream data lines - Added trimming for stream input lines to prevent processing of unnecessary whitespace. - Improved handling of unsupported prefixes and malformed JSON responses, ensuring errors are recorded and propagated appropriately. Fixed: #2690	2026-05-04 23:42:26 +08:00
Luis Pater	8262a03f29	Merge PR #2568 : fix Claude refresh backoff	2026-05-04 21:44:11 +08:00
Luis Pater	ecf1c2590c	fix: preserve Antigravity cancellation errors	2026-05-04 21:18:18 +08:00
Luis Pater	162897e02a	Merge remote-tracking branch 'origin/pr/3205' into dev	2026-05-04 21:17:01 +08:00
Luis Pater	bf6fa402e2	fix(executor): strip Vertex OpenAI response tool call IDs for consistency - Integrated `StripVertexOpenAIResponsesToolCallIDs` to remove tool call ID data from request bodies and translated requests. - Ensures uniformity and avoids unnecessary payload data propagation. Fixed: #2549	2026-05-04 17:54:16 +08:00
Luis Pater	89d80bfff4	fix(executor): adjust ApplyThinking order and add payload override test - Moved `ApplyThinking` logic earlier in `openai_compat_executor` to align with configuration application sequence. - Added test to verify payload override precedence over Thinking suffix configuration.	2026-05-04 16:45:25 +08:00
Kenny	6b4bc0a9a8	Align Codex default identity and docs	2026-05-03 21:13:37 -07:00
Kenny	08b0fe6816	Fix Codex websocket retry metadata	2026-05-03 19:01:44 -07:00
Kenny	c19ae1d5be	Align Codex websocket protocol semantics	2026-05-03 15:56:39 -07:00
Luis Pater	2753d9fb71	feat: add validation for Claude streaming responses - Implemented `validateClaudeStreamingResponse` to ensure upstream streaming data integrity. - Added new tests to verify response validation, including empty streams, error events, incomplete streams, and valid streams. - Integrated validation logic into the Claude executor's streaming handler, returning detailed errors for malformed upstream data. Fixed: #2193	2026-05-04 03:37:31 +08:00
1137043480	bf0e5c23f7	fix: prevent goroutine leaks in streaming executors via context-aware channel sends All streaming executors use bare channel sends (out <- chunk) inside goroutines that process upstream SSE responses. When the downstream consumer disconnects (client timeout, network drop, etc.), these sends block indefinitely, causing the goroutine and all associated resources (HTTP response body, scanner buffers, translation state) to leak permanently. Over time, leaked goroutines accumulate monotonically, leading to RSS growth from ~30MB to 3.7GB+ and eventual OOM kills on resource-constrained VPS hosts. Fix: Replace all bare 'out <- ...' sends with: select { case out <- ...: case <-ctx.Done(): return } This ensures goroutines terminate promptly when the request context is canceled, allowing GC to reclaim all associated resources. Affected executors (9 files, 36+ send sites): - antigravity_executor.go (5 sites) - gemini_cli_executor.go (6 sites) - gemini_vertex_executor.go (6 sites) - aistudio_executor.go (4 sites) - gemini_executor.go (3 sites) - openai_compat_executor.go (3 sites) - claude_executor.go (4 sites) - codex_executor.go (2 sites) - kimi_executor.go (3 sites)	2026-05-03 11:25:04 -04:00
Luis Pater	672fdd14ed	feat: filter and drop empty assistant messages in Kimi executor - Added `filterKimiEmptyAssistantMessages` to identify and remove empty assistant messages with no content, tool links, or reasoning. - Integrated logging to track the number of dropped messages. - Updated tests to validate the filtering logic for both empty and valid assistant messages. Fixed: #1730	2026-05-03 22:40:42 +08:00
Luis Pater	6ba7c810a7	feat: apply image_generation filtering before payload rules - Updated `ApplyPayloadConfigWithRoot` to prioritize `disable-image-generation` filtering before applying payload rules. - Ensured payload overrides can explicitly re-enable `image_generation` when required. - Added unit tests to validate `image_generation` restoration through overrides.	2026-04-30 12:42:08 +08:00
Luis Pater	f56a19e5b8	feat: add tri-state support for `disable-image-generation` configuration - Introduced `DisableImageGenerationMode` with support for `false`, `true`, and `chat` values. - Updated payload handling to preserve `image_generation` on images endpoints when `chat` mode is enabled. - Modified OpenAI image handlers (`ImagesGenerations`, `ImagesEdits`) to respect tri-state logic. - Added unit tests for `DisableImageGenerationMode` behavior and endpoint-specific handling. - Enhanced configuration diff logging to support `DisableImageGenerationMode`.	2026-04-30 12:10:27 +08:00
Luis Pater	46018417ad	feat: remove `tool_choice` for `image_generation` when disabled - Added logic to remove `tool_choice` entries of type `image_generation` from payloads when `disable-image-generation` is enabled. - Updated `ApplyPayloadConfigWithRoot` to handle new removal logic. - Added unit tests to verify `tool_choice` removal behavior.	2026-04-30 08:24:14 +08:00
Luis Pater	e3e60f914b	feat: support disabling image generation globally - Added `disable-image-generation` configuration flag to disable the `image_generation` tool globally. - Updated payload handling to remove `image_generation` tools from request payload arrays when the flag is enabled. - Modified OpenAI image handlers (`ImagesGenerations`, `ImagesEdits`) to return 404 when the feature is disabled. - Enhanced configuration diff logging to track changes for the `disable-image-generation` flag. - Added accompanying unit tests for the new feature in payload helpers and image handler logic.	2026-04-30 03:42:27 +08:00
Luis Pater	a1f0ed9575	Merge pull request #3071 from sususu98/fix/antigravity-credits-log Mark Antigravity credits requests in access logs	2026-04-29 22:56:41 +08:00
sususu98	4982512da2	fix: parse gemini cli usage metadata variants	2026-04-29 13:10:53 +08:00
sususu98	0e1235122e	fix antigravity client agent headers	2026-04-28 19:04:40 +08:00
sususu98	e78d45acc9	fix antigravity user agent handling	2026-04-28 19:04:40 +08:00
xbang	a992dee4e8	fix(antigravity): use real antigravity UA when polling credits balance The loadCodeAssist polling call hardcoded the User-Agent to google-api-nodejs-client/9.15.1. Google Cloud Code returns the paidTier object WITHOUT the availableCredits array for that UA, so updateAntigravityCreditsBalance always saw "no credits", set the hint to Available=false for every Google One AI Ultra account, and the conductor-level credits fallback could never find a candidate. Switching to resolveUserAgent(auth) (the same UA used for streamGenerateContent / generateContent) makes the response include availableCredits, so the credits hint is populated correctly and the fallback can actually inject enabledCreditTypes:["GOOGLE_ONE_AI"] when free tier is exhausted.	2026-04-28 16:21:15 +08:00
Luis Pater	04a336f7df	fix(usage_helpers): skip zero-token usage in additional model records - Added `buildAdditionalModelRecord` to filter out zero-token usage details. - Introduced `hasNonZeroTokenUsage` helper function for token usage validation. - Updated tests to cover scenarios for zero and non-zero token usage.	2026-04-27 10:56:22 +08:00
sususu98	6fc23568df	logging: mark antigravity credits requests	2026-04-26 23:04:27 +08:00
Luis Pater	c5bea6f6f8	Merge pull request #3020 from Matthias319/fix/codex-error-classification fix(codex): classify context, thinking-signature, previous-response, and auth failures	2026-04-26 22:26:40 +08:00
Luis Pater	c7b28ba058	feat(executor): add support for Codex image generation tool usage tracking - Introduced `publishCodexImageToolUsage` to report image generation tool metrics. - Updated executor logic to handle image generation tool events and defaults. - Added parsing logic for `image_gen` tool usage details in `helps/usage_helpers.go`. - Updated `UsageReporter` for additional model-specific usage publishing. - Refactored usage detail normalizations. Closes: #3063	2026-04-26 22:19:03 +08:00
Luis Pater	38573050aa	feat(config): add support for disabling OpenAI compatibility providers - Introduced a `Disabled` flag to OpenAI compatibility configurations. - Updated routing, auth selection, and API handling logic to respect the `Disabled` state. - Extended relevant APIs, YAML configurations, and data structures to include the `Disabled` field. - Adjusted all relevant loops and filters to skip disabled providers. Closes: #3060 #3059 #2977	2026-04-26 21:49:36 +08:00
Enzo Lucchesi	fc1ddf365f	fix(claude): centralize oauth tool-name transform flow	2026-04-25 17:45:03 -04:00
edlsh	03ea4e569f	perf(claude): pre-allocate reverseMap capacity Address Gemini code review suggestion: the reverseMap can contain at most len(oauthToolRenameMap) entries, so pre-allocating avoids reallocations as entries are added.	2026-04-25 17:45:03 -04:00
Enzo Lucchesi	e707cf7d46	fix(claude): only reverse-remap OAuth tool names that were forward-renamed remapOAuthToolNames renames lowercase client-sent tools (e.g. `glob` → `Glob`) to Claude Code equivalents on OAuth requests to avoid tool-name fingerprinting. The reverse pass previously ran against a global reverse map and rewrote every tool_use block whose name matched any value in oauthToolRenameMap — regardless of what the client actually sent. For clients that send mixed casing (notably Amp CLI — `Bash`, `Read`, `Grep`, `Task` alongside `glob`, `skill`, etc.) this corrupted the response. Any forward rename in the request set the "renamed" flag, which then unconditionally lowercased every `Bash` in the response to `bash`. Amp's tool registry has `Bash`, not `bash`, so it rejected the tool_use with `tool "bash" is not allowed for smart mode` and tool execution failed. Fix: `remapOAuthToolNames` now returns a per-request map keyed on the upstream (TitleCase) name valued with the original client-sent name. The reverse functions take this map and only touch entries in it. Names the client sent in TitleCase pass through untouched in both directions. - Change remapOAuthToolNames signature from `([]byte, bool)` to `([]byte, map[string]string)`; populate at every rename site (tools[], tool_choice.name, message tool_use, tool_reference, nested tool_reference inside tool_result). - Change reverseRemapOAuthToolNames and reverseRemapOAuthToolNamesFromStreamLine to accept and consume the per-request map; remove the global oauthToolRenameReverseMap. - Update all three executor call sites (Execute, ExecuteStream direct passthrough, ExecuteStream translated) + count_tokens. - Add regression tests for the mixed-case scenario in both the non-streaming and SSE code paths.	2026-04-25 17:45:03 -04:00
Luis Pater	28d78273e4	feat(api): implement protocol multiplexer and Redis queue for usage integration - Added `protocol_multiplexer.go`, enabling support for both HTTP and Redis protocols on a single listener. - Introduced `redis_queue_protocol.go` to handle Redis-compatible RESP commands for queue management. - Integrated `redisqueue` package, supporting in-memory queuing with expiration pruning. - Updated server initialization to manage a shared listener and multiplex connections. - Adjusted `Handler` to adopt `AuthenticateManagementKey` for modular key validation, supporting both HTTP and Redis flows.	2026-04-25 18:52:24 +08:00
Luis Pater	a7e92e2639	feat(auth): disallow free-tier Codex auth during selection process - Introduced `disallowFreeAuthFromMetadata` and `isFreeCodexAuth` to enforce skipping free-tier credentials. - Modified scheduler logic to honor `DisallowFreeAuthMetadataKey` during auth selection. - Updated `ensureImageGenerationTool` to skip tool injection for free-tier Codex auth. - Added context utility `WithDisallowFreeAuth` and integrated with image handlers. - Augmented relevant tests to cover free-tier exclusion scenarios.	2026-04-24 23:18:56 +08:00
Matthias319	4056c2590b	fix(codex): classify known upstream failures Normalize Codex context, thinking-signature, previous-response, and auth failures to explicit error codes: context_too_large, thinking_signature_invalid, previous_response_not_found, auth_unavailable. Refs #2596.	2026-04-24 17:13:23 +02:00
Luis Pater	f1ba6151a9	feat(codex): pass base model to enable conditional image_generation tool injection - Modified `ensureImageGenerationTool` to accept `baseModel` for conditional logic. - Ensured `gpt-5.3-codex-spark` models bypass image_generation tool injection. - Updated relevant tests and executor logic to reflect changes.	2026-04-24 07:21:03 +08:00
sususu98	12195a276e	Merge pull request #2971 from sususu98/feat/antigravity-credits-fallback feat(antigravity): conductor-level credits fallback for Claude models	2026-04-24 00:15:23 +08:00
sususu98	7ad1900041	perf(antigravity): async credits hint refresh for warm tokens	2026-04-23 23:58:10 +08:00
sususu98	920b6efffa	refactor(logging): strip unrelated deferred body changes, keep credits-only logging Remove deferred body optimization and maxErrorLog constants that were unrelated to credits fallback. Keep only MarkCreditsUsed/CreditsUsed helpers for flagging requests that consumed AI credits.	2026-04-23 17:41:54 +08:00
sususu98	e75daa299b	fix(antigravity): respect pinned auth in credits fallback, release deferred body on success - findAllAntigravityCreditsCandidateAuths now filters by PinnedAuthMetadataKey to prevent credential isolation violations during credits fallback - Release deferredBody reference on success path to avoid holding large payloads in memory for the lifetime of the gin context	2026-04-23 17:38:02 +08:00
sususu98	4de5c29f86	fix(antigravity): remove credits fallback from CountTokens, fix gofmt CountTokens upstream API does not support enabledCreditTypes, so remove the dead credits fallback path from ExecuteCount and delete the unused tryAntigravityCreditsExecuteCount method. Fix gofmt on credits test file.	2026-04-23 15:17:00 +08:00
sususu98	14d46a0a5d	feat(antigravity): conductor-level credits fallback for Claude models Move credits handling from executor-level retry to conductor-level orchestration. When all free-tier auths are exhausted (429/503), the conductor discovers auths with available Google One AI credits and retries with enabledCreditTypes injected via context flag. Key changes: - Add AntigravityCreditsHint system for tracking per-auth credits state - Conductor tries credits fallback after all auths fail (Execute/Stream/Count) - Executor injects enabledCreditTypes only when conductor sets context flag - Credits fallback respects provider scope (requires antigravity in providers) - Add context cancellation check in credits fallback to avoid wasted requests - Remove executor-level attemptCreditsFallback and preferCredits machinery - Restructure 429 decision logic (parse details first, keyword fallback) - Expand shouldAbort to cover INVALID_ARGUMENT/FAILED_PRECONDITION/500+UNKNOWN - Support human-readable retry delay parsing (e.g. "1h43m56s")	2026-04-23 13:44:20 +08:00
MoYeRanQianZhi	31934ae04c	feat(codex): enable image generation for all Codex upstream requests Codex CLI gates the built-in image_generation tool behind AuthMode::Chatgpt (OAuth only). When clients connect via API key auth through CPA, the tool is absent from requests, making image generation unavailable through the reverse proxy. Changes: 1. Inject image_generation tool (codex_executor.go): Add ensureImageGenerationTool() that appends {"type":"image_generation","output_format":"png"} to the tools array if not already present. Applied to all three execution paths: Execute, executeCompact, and ExecuteStream. 2. Route aliases for Codex CLI direct access (server.go): Add /backend-api/codex/responses routes that map to the same OpenAI Responses API handlers as /v1/responses. This allows Codex CLI to connect via chatgpt_base_url config while keeping AuthMode::Chatgpt, which enables the built-in image_generation tool on the client side. 3. Unit tests (codex_executor_imagegen_test.go): Cover no-tools, existing tools, already-present, empty array, and mixed built-in tool scenarios.	2026-04-23 01:24:40 +08:00
stringer07	b6781d69be	perf(codex): avoid repeated output patch writes	2026-04-21 16:29:54 +08:00
stringer07	bb8408cef5	fix(codex): backfill streaming response output	2026-04-21 16:03:56 +08:00
octo-patch	f4eb16102b	fix(executor): drop obsolete context-1m-2025-08-07 beta header (fixes #2866) Anthropic has moved the 1M-context-window feature to General Availability, so the context-1m-2025-08-07 beta flag is no longer accepted and now causes 400 Bad Request errors when forwarded upstream. Remove the X-CPA-CLAUDE-1M detection and the corresponding injection of the now-invalid beta header. Also drop the unused net/textproto import that was only needed for the header-key lookup.	2026-04-19 10:38:16 +08:00
hkfires	d9a3b3e5f3	fix(tests): update model lookup references and enhance Claude executor tests	2026-04-17 08:32:07 +08:00
Luis Pater	f5dc6483d5	chore: remove iFlow-related modules and dependencies - Deleted `iflow` provider implementation, including thinking configuration (`apply.go`) and authentication modules. - Removed iFlow-specific tests, executors, and helpers across SDK and internal components. - Updated all references to exclude iFlow functionality.	2026-04-17 01:07:12 +08:00
Luis Pater	d949921143	feat(auth): add proxy URL override support to auth constructors and executors - Introduced `WithProxyURL` variants for `CodexAuth`, `ClaudeAuth`, `IFlowAuth`, and `DeviceFlowClient`. - Updated executors to use proxy-aware constructors for improved configurability. - Added unit tests to validate proxy override precedence and functionality. Closes: #2823	2026-04-16 22:11:39 +08:00

548 Commits