CLIProxyAPIPlus

mirror of https://github.com/router-for-me/CLIProxyAPIPlus.git synced 2026-07-02 17:14:32 +08:00

Author	SHA1	Message	Date
sususu98	7c24d54ca8	feat(session-affinity): add session-sticky routing for multi-account load balancing When multiple auth credentials are configured, requests from the same session are now routed to the same credential, improving upstream prompt cache hit rates and maintaining context continuity. Core components: - SessionAffinitySelector: wraps RoundRobin/FillFirst selectors with session-to-auth binding; automatic failover when bound auth is unavailable, re-binding via the fallback selector for even distribution - SessionCache: TTL-based in-memory cache with background cleanup goroutine, supporting per-session and per-auth invalidation - StoppableSelector interface: lifecycle hook for selectors holding resources, called during Manager.StopAutoRefresh() Session ID extraction priority (extractSessionIDs): 1. metadata.user_id with Claude Code session format (old user_{hash}_session_{uuid} and new JSON {session_id} format) 2. X-Session-ID header (generic client support) 3. metadata.user_id (non-Claude format, used as-is) 4. conversation_id field 5. Stable FNV hash from system prompt + first user/assistant messages (fallback for clients with no explicit session ID); returns both a full hash (primaryID) and a short hash without assistant content (fallbackID) to inherit bindings from the first turn Multi-format message hash covers OpenAI messages, Claude system array, Gemini contents/systemInstruction, and OpenAI Responses API input items (including inline messages with role but no type field). Configuration (config.yaml routing section): - session-affinity: bool (default false) - session-affinity-ttl: duration string (default "1h") - claude-code-session-affinity: bool (deprecated, alias for above) All three fields trigger selector rebuild on config hot reload. Side effect: Idempotency-Key header is no longer auto-generated with a random UUID when absent — only forwarded when explicitly provided by the client, to avoid polluting session hash extraction.	2026-04-16 00:18:47 +08:00
Luis Pater	8fac29631d	chore: remove Qwen support from SDK and internal components - Deleted `QwenAuthenticator`, internal `qwen_auth`, and `qwen_executor` implementations. - Removed all Qwen-related OAuth flows, token handling, and execution logic. - Cleaned up dependencies and references to Qwen across the codebase.	2026-04-15 12:16:08 +08:00
zilianpn	0ea768011b	fix(auth): honor disable-cooling and enrich no-auth errors	2026-04-07 01:12:13 +08:00
Luis Pater	f389667ec3	Merge pull request #2513 from lonr-6/codex/fix-ws-custom-tool-repair-v2 fix: repair responses websocket custom tool call pairing	2026-04-03 23:45:38 +08:00
Luis Pater	adb580b344	feat(security): add configuration to toggle Gemini CLI endpoint access Closes: #2445	2026-04-03 21:46:49 +08:00
Luis Pater	06405f2129	fix(security): enforce stricter localhost validation for GeminiCLIAPIHandler Closes: #2445	2026-04-03 21:22:03 +08:00
Kai Wang	d1fd2c4ad4	fix: repair websocket custom tool calls	2026-04-03 17:11:44 +08:00
Kai Wang	b6c6379bfa	fix: repair websocket custom tool calls	2026-04-03 17:11:42 +08:00
Kai Wang	8f0e66b72e	fix: repair websocket custom tool calls	2026-04-03 17:11:41 +08:00
Luis Pater	3e78a8d500	Merge branch 'main' into dev	2026-04-02 21:21:26 +08:00
Luis Pater	e3eb048c7a	Merge pull request #2489 from Soein/upstream-pr fix: 增强 Claude 反代检测对抗能力	2026-04-02 21:16:58 +08:00
davidwushi1145	108895fc04	Harden Responses SSE framing against partial chunk boundaries Follow-up review found two real framing hazards in the handler-layer framer: it could flush a partial `data:` payload before the JSON was complete, and it could inject an extra newline before chunks that already began with `\n`/`\r\n`. This commit tightens the framer so it only emits undelimited events when the buffered `data:` payload is already valid JSON (or `[DONE]`), skips newline injection for chunks that already start with a line break, and avoids the heavier `bytes.Split` path while scanning SSE fields. The regression suite now covers split `data:` payload chunks, newline-prefixed chunks, and dropping incomplete trailing data on flush, so the original Responses fix remains intact while the review concerns are explicitly locked down. Constraint: Keep the follow-up limited to handler-layer framing and tests Rejected: Ignore the review and rely on current executor chunk shapes \| leaves partial data payload corruption possible Rejected: Build a fully generic SSE parser \| wider change than needed for the identified risks Confidence: high Scope-risk: narrow Reversibility: clean Directive: Do not emit undelimited Responses SSE events unless buffered `data:` content is already complete and valid Tested: /tmp/go1.26.1/go/bin/go test ./sdk/api/handlers/openai -count=1 Tested: /tmp/go1.26.1/go/bin/go test ./sdk/api/handlers -count=1 Tested: /tmp/go1.26.1/go/bin/go vet ./sdk/api/handlers/... Not-tested: Full repository test suite outside sdk/api/handlers packages	2026-04-02 20:39:49 +08:00
davidwushi1145	abc293c642	Prevent malformed Responses SSE frames from breaking stream clients Line-oriented upstream executors can emit `event:` and `data:` as separate chunks, but the Responses handler had started terminating each incoming chunk as a full SSE event. That split `response.created` into an empty event plus a later data block, which broke downstream clients like OpenClaw. This keeps the fix in the handler layer: a small stateful framer now buffers standalone `event:` lines until the matching `data:` arrives, preserves already-framed events, and ignores delimiter-only leftovers. The regression suite now covers split event/data framing, full-event passthrough, terminal errors, and the bootstrap path that forwards line-oriented openai-response streams from non-Codex executors too. Constraint: Keep the fix localized to Responses handler framing instead of patching every executor Rejected: Revert to v6.9.7 chunk writing \| would reintroduce data-only framing regressions Rejected: Patch each line-oriented executor separately \| duplicates fragile SSE assembly logic Confidence: high Scope-risk: narrow Reversibility: clean Directive: Do not assume incoming Responses stream chunks are already complete SSE events; preserve handler-layer reassembly for split `event:`/`data:` inputs Tested: /tmp/go1.26.1/go/bin/go test ./sdk/api/handlers/openai -count=1 Tested: /tmp/go1.26.1/go/bin/go test ./sdk/api/handlers -count=1 Tested: /tmp/go1.26.1/go test ./sdk/api/handlers/... -count=1 Tested: /tmp/go1.26.1/go/bin/go vet ./sdk/api/handlers/... Tested: Temporary patched server on 127.0.0.1:18317 -> /v1/models 200, /v1/responses non-stream 200, /v1/responses stream emitted combined `event:` + `data:` frames Not-tested: Full repository test suite outside sdk/api/handlers packages	2026-04-02 20:26:42 +08:00
hkfires	34339f61ee	Refactor websocket logging and error handling - Introduced new logging functions for websocket requests, handshakes, errors, and responses in `logging_helpers.go`. - Updated `CodexWebsocketsExecutor` to utilize the new logging functions for improved clarity and consistency in websocket operations. - Modified the handling of websocket upgrade rejections to log relevant metadata. - Changed the request body key to a timeline body key in `openai_responses_websocket.go` to better reflect its purpose. - Enhanced tests to verify the correct logging of websocket events and responses, including disconnect events and error handling scenarios.	2026-04-02 17:30:51 +08:00
pzy	4045378cb4	fix: 增强 Claude 反代检测对抗能力基于 Claude Code v2.1.88 源码分析，修复多个可被 Anthropic 检测的差距： - 实现消息指纹算法（SHA256 盐值 + 字符索引），替代随机 buildHash - billing header cc_version 从设备 profile 动态取版本号，不再硬编码 - billing header cc_entrypoint 从客户端 UA 解析，支持 cli/vscode/local-agent - billing header 新增 cc_workload 支持（通过 X-CPA-Claude-Workload 头传入） - 新增 X-Claude-Code-Session-Id 头（每 apiKey 缓存 UUID，TTL=1h） - 新增 x-client-request-id 头（仅 api.anthropic.com，每请求 UUID） - 补全 4 个缺失的 beta flags（structured-outputs/fast-mode/redact-thinking/token-efficient-tools） - OAuth scope 对齐 Claude Code 2.1.88（移除 org:create_api_key，添加 sessions/mcp/file_upload） - Anthropic-Dangerous-Direct-Browser-Access 仅在 API key 模式发送 - 响应头网关指纹清洗（剥离 litellm/helicone/portkey/cloudflare/kong/braintrust 前缀头）	2026-04-02 15:55:22 +08:00
Luis Pater	c422d16beb	Merge pull request #2398 from 7RPH/fix/responses-sse-framing fix: preserve SSE event boundaries for Responses streams	2026-04-02 00:46:51 +08:00
hkfires	caa529c282	fix(openai): improve client IP retrieval in websocket handler	2026-04-01 20:16:01 +08:00
hkfires	51a4379bf4	refactor(openai): remove websocket body log truncation limit	2026-04-01 18:11:43 +08:00
Luis Pater	acf98ed10e	fix(openai): add session reference counter and cache lifecycle management for websocket tools	2026-04-01 17:28:50 +08:00
Luis Pater	d1c07a091e	fix(openai): add websocket tool call repair with caching and tests to improve transcript consistency	2026-04-01 17:16:49 +08:00
Luis Pater	ca11b236a7	refactor(runtime, openai): simplify header management and remove redundant websocket logging logic	2026-04-01 11:57:31 +08:00
apparition	a3e21df814	fix(openai): avoid developer transcript resets - Narrow websocket transcript replacement detection to assistant outputs and function calls - Preserve existing merge behavior for follow-up developer messages without previous_response_id - Add a regression test covering mid-session developer message updates	2026-03-30 23:33:16 +08:00
apparition	c1d7599829	fix(openai): handle transcript replacement after websocket compaction - Add shouldReplaceWebsocketTranscript() to detect historical model output in input - Add normalizeResponseTranscriptReplacement() for full transcript reset handling - Prevent duplicate stale turn-state when clients replace local history post-compaction - Avoid orphaned function_call items from incremental append on compact transcripts - Add unit tests for transcript replacement detection and state reset behavior	2026-03-30 22:44:58 +08:00
trph	f73d55ddaa	fix: simplify responses SSE suffix handling	2026-03-29 22:19:25 +08:00
trph	0fcc02fbea	fix: tighten responses SSE review follow-up	2026-03-29 22:10:28 +08:00
trph	c03883ccf0	fix: address responses SSE review feedback	2026-03-29 22:00:46 +08:00
trph	134a9eac9d	fix: preserve SSE event boundaries for Responses streams	2026-03-29 17:23:16 +08:00
Luis Pater	7b0453074e	Merge pull request #2219 from beck-8/fix/context-done-race fix: avoid data race when watching request cancellation	2026-03-23 22:57:21 +08:00
Luis Pater	2bd646ad70	refactor: replace `sjson.Set` usage with `sjson.SetBytes` to optimize mutable JSON transformations	2026-03-19 17:58:54 +08:00
beck-8	b2921518ac	fix: avoid data race when watching request cancellation	2026-03-19 00:15:52 +08:00
Luis Pater	dc7187ca5b	fix(websocket): pin only websocket-capable auth IDs and add corresponding test	2026-03-16 09:57:38 +08:00
hkfires	d1e3195e6f	feat(codex): register models by plan tier	2026-03-10 11:20:37 +08:00
Supra4E8C	fc2f0b6983	fix: cap websocket body log growth	2026-03-09 17:48:30 +08:00
Luis Pater	ddcf1f279d	Fixed: #1901 test(websocket): add tests for incremental input and prewarm handling logic - Added test cases for incremental input support based on upstream capabilities. - Introduced validation for prewarm handling of `response.create` messages locally. - Enhanced test coverage for websocket executor behavior, including payload forwarding checks. - Updated websocket implementation with prewarm and incremental input logic for better testability.	2026-03-07 13:11:28 +08:00
Luis Pater	5ebc58fab4	refactor(executor): remove legacy `connCreateSent` logic and standardize `response.create` usage for all websocket events - Simplified connection logic by removing `connCreateSent` and related state handling. - Updated `buildCodexWebsocketRequestBody` to always use `response.create`. - Added unit tests to validate `response.create` behavior and beta header preservation. - Dropped unsupported `response.append` and outdated `response.done` event types.	2026-03-07 09:07:23 +08:00
canxin121	acf483c9e6	fix(responses): reject invalid SSE data JSON Guard the openai-response streaming path against truncated/invalid SSE data payloads by validating data: JSON before forwarding; surface a 502 terminal error instead of letting clients crash with JSON parse errors.	2026-02-24 01:42:54 +08:00
canxin121	49c8ec69d0	fix(openai): emit valid responses stream error chunks When /v1/responses streaming fails after headers are sent, we now emit a type=error chunk instead of an HTTP-style {error:{...}} payload, preventing AI SDK chunk validation errors.	2026-02-23 12:59:50 +08:00
Luis Pater	4445a165e9	test(handlers): add tests for passthrough headers behavior in WriteErrorResponse	2026-02-19 21:49:44 +08:00
Luis Pater	a6bdd9a652	feat: add passthrough headers configuration - Introduced `passthrough-headers` option in configuration to control forwarding of upstream response headers. - Updated handlers to respect the passthrough headers setting. - Added tests to verify behavior when passthrough is enabled or disabled.	2026-02-19 21:31:29 +08:00
Luis Pater	2789396435	fix: ensure connection-scoped headers are filtered in upstream requests - Added `connectionScopedHeaders` utility to respect "Connection" header directives. - Updated `FilterUpstreamHeaders` to remove connection-scoped headers dynamically. - Refactored and tested upstream header filtering with additional validations. - Adjusted upstream header handling during retries to replace headers safely.	2026-02-19 13:19:10 +08:00
Luis Pater	61da7bd981	Merge PR #1626 into codex/pr-1626	2026-02-19 04:49:14 +08:00
Luis Pater	55f938164b	Merge pull request #1618 from alexey-yanchenko/fix/completions-usage Fix empty usage in /v1/completions	2026-02-19 03:57:11 +08:00
Luis Pater	bb86a0c0c4	feat(logging, executor): add request logging tests and WebSocket-based Codex executor - Introduced unit tests for request logging middleware to enhance coverage. - Added WebSocket-based Codex executor to support Responses API upgrade. - Updated middleware logic to selectively capture request bodies for memory efficiency. - Enhanced Codex configuration handling with new WebSocket attributes.	2026-02-19 01:57:02 +08:00
Kirill Turanskiy	1f8f198c45	feat: passthrough upstream response headers to clients CPA previously stripped ALL response headers from upstream AI provider APIs, preventing clients from seeing rate-limit info, request IDs, server-timing and other useful headers. Changes: - Add Headers field to Response and StreamResult structs - Add FilterUpstreamHeaders helper (hop-by-hop + security denylist) - Add WriteUpstreamHeaders helper (respects CPA-set headers) - ExecuteWithAuthManager/ExecuteCountWithAuthManager now return headers - ExecuteStreamWithAuthManager returns headers from initial connection - All 11 provider executors populate Response.Headers - All handler call sites write filtered upstream headers before response Filtered headers (not forwarded): - RFC 7230 hop-by-hop: Connection, Transfer-Encoding, Keep-Alive, etc. - Security: Set-Cookie - CPA-managed: Content-Length, Content-Encoding	2026-02-18 00:16:22 +03:00
Alexey Yanchenko	709d999f9f	Add usage to /v1/completions	2026-02-17 17:21:03 +07:00
Luis Pater	46a6782065	refactor(all): replace manual pointer assignments with `new` to enhance code readability and maintainability	2026-02-15 14:10:10 +08:00
Luis Pater	68cb81a258	feat: add Kimi authentication support and streamline device ID handling - Introduced `RequestKimiToken` API for Kimi authentication flow. - Integrated device ID management throughout Kimi-related components. - Enhanced header management for Kimi API requests with device ID context.	2026-02-06 20:43:30 +08:00
Luis Pater	a5a25dec57	refactor(translator, executor): remove redundant `bytes.Clone` calls for improved performance - Replaced all instances of `bytes.Clone` with direct references to enhance efficiency. - Simplified payload handling across executors and translators by eliminating unnecessary data duplication.	2026-02-06 03:26:29 +08:00
Luis Pater	09ecfbcaed	refactor(executor): optimize payload cloning and streamline SDK translator usage - Replaced unnecessary `bytes.Clone` calls for `opts.OriginalRequest` throughout executors. - Introduced intermediate variable `originalPayloadSource` to simplify payload processing. - Ensured better clarity and structure in request translation logic.	2026-02-06 01:44:20 +08:00
Luis Pater	25c6b479c7	refactor(util, executor): optimize payload handling and schema processing - Replaced repetitive string operations with a centralized `escapeGJSONPathKey` function. - Streamlined handling of JSON schema cleaning for Gemini and Antigravity requests. - Improved payload management by transitioning from byte slices to strings for processing. - Removed unnecessary cloning of byte slices in several places.	2026-02-05 19:00:30 +08:00

1 2 3

109 Commits