The Copilot API enforces per-account prompt token limits (128K individual,
168K business) that differ from the static 200K context length advertised
by the proxy. This mismatch caused Claude Code to accumulate context
beyond the actual limit, triggering "prompt token count exceeds the limit
of 128000" errors.
Changes:
- Extract max_prompt_tokens and max_output_tokens from the Copilot
/models API response (capabilities.limits) and use them as the
authoritative ContextLength and MaxCompletionTokens values
- Add a CopilotModelLimits struct and Limits() helper to parse limits
  from the existing Capabilities map (sketched after this list)
- Fix GitLab Duo context-1m beta header not being set when routing
through the Anthropic gateway (gitlab_duo_force_context_1m attr
was set but only gin headers were checked)
- Fix flaky parallel tests that shared global model registry state
- Strip the SSE `data:` prefix before normalizing reasoning_text→reasoning_content
  in streaming mode, then re-wrap for the translator (also sketched after this list)
- Iterate all choices in normalizeGitHubCopilotReasoningField (not just
choices[0]) to support n>1 requests
- Remove over-broad tool-role fallback in isAgentInitiated that scanned
all messages for role:"tool", aligning with opencode's approach of only
detecting active tool loops — genuine user follow-ups after tool use are
no longer mis-classified as agent-initiated
- Add 5 reasoning normalization tests; update 2 X-Initiator tests to match
refined semantics
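A minimal sketch of the limits parsing, assuming the model record exposes the already-decoded Capabilities map. The CopilotModelLimits and Limits() names come from the change above; the CopilotModel shape and the JSON round-trip are illustrative:

```go
package copilot

import "encoding/json"

// CopilotModel is a trimmed stand-in for the real model record; only
// the Capabilities map decoded from the /models response matters here.
type CopilotModel struct {
	Capabilities map[string]any `json:"capabilities"`
}

// CopilotModelLimits mirrors capabilities.limits in the /models payload.
type CopilotModelLimits struct {
	MaxPromptTokens int `json:"max_prompt_tokens"`
	MaxOutputTokens int `json:"max_output_tokens"`
}

// Limits parses the limits object out of the existing Capabilities map,
// reporting false when the payload omits it or it fails to decode.
func (m *CopilotModel) Limits() (CopilotModelLimits, bool) {
	raw, ok := m.Capabilities["limits"]
	if !ok {
		return CopilotModelLimits{}, false
	}
	// Round-trip through JSON to avoid hand-written type assertions
	// on the nested map.
	data, err := json.Marshal(raw)
	if err != nil {
		return CopilotModelLimits{}, false
	}
	var limits CopilotModelLimits
	if err := json.Unmarshal(data, &limits); err != nil {
		return CopilotModelLimits{}, false
	}
	return limits, true
}
```

The parsed MaxPromptTokens then overrides the static ContextLength, and MaxOutputTokens feeds MaxCompletionTokens.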
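A companion sketch of the streaming normalization, assuming each SSE frame arrives as one raw line. Only the `data:` prefix handling and the reasoning_text→reasoning_content rename across all choices come from the changes above; the function name and JSON plumbing are illustrative (the real normalizeGitHubCopilotReasoningField operates on the unwrapped payload):

```go
package copilot

import (
	"bytes"
	"encoding/json"
)

var ssePrefix = []byte("data:")

// normalizeReasoningChunk strips the SSE "data:" prefix, renames
// Copilot's non-standard reasoning_text delta field to the
// reasoning_content field the translator expects (for every choice,
// so n>1 requests work), then re-wraps the payload for the stream.
func normalizeReasoningChunk(line []byte) []byte {
	payload, found := bytes.CutPrefix(line, ssePrefix)
	if !found {
		return line // comment or keep-alive line, pass through untouched
	}
	payload = bytes.TrimSpace(payload)
	if bytes.Equal(payload, []byte("[DONE]")) {
		return line // terminal sentinel, nothing to normalize
	}
	var chunk map[string]any
	if err := json.Unmarshal(payload, &chunk); err != nil {
		return line
	}
	choices, _ := chunk["choices"].([]any)
	changed := false
	for _, c := range choices { // all choices, not just choices[0]
		choice, _ := c.(map[string]any)
		delta, _ := choice["delta"].(map[string]any)
		if text, ok := delta["reasoning_text"]; ok {
			delta["reasoning_content"] = text
			delete(delta, "reasoning_text")
			changed = true
		}
	}
	if !changed {
		return line
	}
	out, err := json.Marshal(chunk)
	if err != nil {
		return line
	}
	return append([]byte("data: "), out...) // re-wrap for the translator
}
```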
This commit addresses three issues with Claude Code through GitHub
Copilot:
1. **Premium request inflation**: Responses API requests were missing
   Openai-Intent headers and proper defaults, causing Copilot to bill
   each tool-loop continuation as a new premium request. Fixed by adding
   an isAgentInitiated() heuristic (checks for tool_result content or a
   preceding assistant tool_use; sketched after this list), applying
   Responses API defaults (store, include, reasoning.summary), and using
   local tiktoken-based token counting to avoid extra API calls.
2. **Context overflow**: Claude Code's modelSupports1M() hardcodes
   opus-4-6 as 1M-capable, but Copilot only supports ~128K-200K.
   Fixed by stripping the context-1m-2025-08-07 beta from translated
   request bodies (second sketch after this list). Also forwards
   response headers in non-streaming Execute() and registers the
   GET /copilot-quota management API route.
3. **Thinking not working**: Add ThinkingSupport with level-based
reasoning to Claude models in the static definitions. Normalize
Copilot's non-standard 'reasoning_text' response field to
'reasoning_content' before passing to the SDK translator. Use
caller-provided context in CountTokens instead of Background().
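A sketch of the heuristic from item 1, assuming Anthropic-style message shapes; the Message and ContentBlock types are stand-ins for the translated request structures, and the heuristic presumably feeds the X-Initiator decision:

```go
package copilot

// Message is a simplified stand-in for the translated Anthropic-style
// request messages; the real shapes in the proxy may differ.
type Message struct {
	Role    string
	Content []ContentBlock
}

type ContentBlock struct {
	Type string // e.g. "text", "tool_use", "tool_result"
}

// isAgentInitiated reports whether a request continues an active tool
// loop (agent work) rather than starting a fresh user turn. It inspects
// only the tail of the conversation: scanning the whole history for
// tool roles mis-classified genuine user follow-ups after tool use.
func isAgentInitiated(msgs []Message) bool {
	if len(msgs) == 0 {
		return false
	}
	last := msgs[len(msgs)-1]
	for _, block := range last.Content {
		if block.Type == "tool_result" {
			return true // this turn just returns tool output
		}
	}
	if len(msgs) >= 2 && msgs[len(msgs)-2].Role == "assistant" {
		for _, block := range msgs[len(msgs)-2].Content {
			if block.Type == "tool_use" {
				return true // the assistant asked for a tool call
			}
		}
	}
	return false
}
```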
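And a sketch of the beta stripping from item 2. The beta identifier is the one named above; the body shape (a top-level "betas" array) and the helper name are assumptions:

```go
package copilot

import "encoding/json"

// stripContext1MBeta removes the context-1m-2025-08-07 entry from a
// translated request body's "betas" array, since Copilot's Claude
// deployment tops out well below 1M tokens.
func stripContext1MBeta(body []byte) []byte {
	var req map[string]any
	if err := json.Unmarshal(body, &req); err != nil {
		return body
	}
	betas, ok := req["betas"].([]any)
	if !ok {
		return body
	}
	kept := betas[:0] // filter in place over the same backing array
	for _, b := range betas {
		if s, _ := b.(string); s != "context-1m-2025-08-07" {
			kept = append(kept, b)
		}
	}
	if len(kept) == len(betas) {
		return body // nothing stripped, keep the original bytes
	}
	if len(kept) == 0 {
		delete(req, "betas")
	} else {
		req["betas"] = kept
	}
	out, err := json.Marshal(req)
	if err != nil {
		return body
	}
	return out
}
```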
- Introduced new logging functions for websocket requests, handshakes, errors, and responses in `logging_helpers.go` (a sketch of their shape follows this list).
- Updated `CodexWebsocketsExecutor` to utilize the new logging functions for improved clarity and consistency in websocket operations.
- Modified the handling of websocket upgrade rejections to log relevant metadata.
- Changed the request body key to a timeline body key in `openai_responses_websocket.go` to better reflect its purpose.
- Enhanced tests to verify the correct logging of websocket events and responses, including disconnect events and error handling scenarios.
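A sketch of the shape of two such helpers; the function names, field set, and use of log/slog are all assumptions for illustration (the real helpers live in `logging_helpers.go`):

```go
package executor

import (
	"log/slog"
	"net/http"
)

// logWebsocketHandshake records the metadata available at upgrade time
// so every websocket event carries a consistent field set.
func logWebsocketHandshake(logger *slog.Logger, r *http.Request, status int) {
	logger.Info("websocket handshake",
		slog.String("path", r.URL.Path),
		slog.String("remote", r.RemoteAddr),
		slog.Int("status", status),
	)
}

// logWebsocketUpgradeRejected mirrors the rejection path so refused
// upgrades log the same metadata as successful ones.
func logWebsocketUpgradeRejected(logger *slog.Logger, r *http.Request, err error) {
	logger.Warn("websocket upgrade rejected",
		slog.String("path", r.URL.Path),
		slog.String("remote", r.RemoteAddr),
		slog.String("error", err.Error()),
	)
}
```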
Drop the last affinity-related executor artifacts so the PR stays focused on the minimal Codex continuity fix set: stable prompt cache identity, stable session_id, and the executor-only behavior that was validated to restore cache reads.
Restore Claude continuity after the continuity refactor, keep auth-affinity keys out of upstream Codex session identifiers, and only persist affinity after successful execution so retries can still rotate to healthy credentials when the first auth fails.
Align websocket continuity resolution with the HTTP Codex path, make auth-affinity principal keys use a stable string representation, and extract small helpers that remove duplicated continuity and affinity logic without changing the validated cache-hit behavior.
Prompt caching on Codex was not reliably reusable through the proxy because repeated chat-completions requests could reach the upstream without the same continuity envelope. In practice this showed up most clearly with OpenCode, where cache reads worked in the reference client but not through CLIProxyAPI, although the root cause is broader than OpenCode itself.
The proxy was breaking continuity in several ways: executor-layer Codex request preparation stripped prompt_cache_retention, chat-completions translation did not preserve that field, continuity headers used a different shape than the working client behavior, and OpenAI-style Codex requests could be sent without a stable prompt_cache_key. When that happened, session_id fell back to a fresh random value per request, so upstream Codex treated repeated requests as unrelated turns instead of as part of the same cacheable context.
This change fixes that by preserving caller-provided prompt_cache_retention both on Codex execution paths and in the OpenAI chat-completions-to-Codex translation, aligning Codex continuity headers to session_id, and introducing an explicit Codex continuity policy that derives a stable continuity key from the best available signal. The resolution order prefers an explicit prompt_cache_key, then execution session metadata, then an explicit idempotency key, then stable request-affinity metadata, then a stable client-principal hash, and finally a stable auth-ID hash when no better continuity signal exists.
The same continuity key is applied to both prompt_cache_key in the request body and session_id in the request headers so repeated requests reuse the same upstream cache/session identity. The auth manager also keeps auth selection sticky for repeated request sequences, preventing otherwise-equivalent Codex requests from drifting across different upstream auth contexts and accidentally breaking cache reuse.
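In code, the precedence chain is a straightforward fall-through. A sketch, assuming a request-scoped signals struct; every name here is illustrative except the prompt_cache_key/session_id semantics described above:

```go
package codex

import (
	"crypto/sha256"
	"encoding/hex"
)

// continuitySignals carries the candidate identity signals for one
// request. The field names are illustrative; the real proxy pulls
// these from the request body, execution metadata, and auth context.
type continuitySignals struct {
	PromptCacheKey  string // explicit prompt_cache_key from the body
	SessionMetadata string // execution session metadata
	IdempotencyKey  string // explicit idempotency key
	RequestAffinity string // stable request-affinity metadata
	ClientPrincipal string // client principal identity
	AuthID          string // selected upstream auth identity
}

// resolveContinuityKey walks the precedence order described above and
// returns the first stable signal, hashing identity-bearing fallbacks
// so raw principals and auth IDs never reach the upstream.
func resolveContinuityKey(s continuitySignals) string {
	switch {
	case s.PromptCacheKey != "":
		return s.PromptCacheKey
	case s.SessionMetadata != "":
		return s.SessionMetadata
	case s.IdempotencyKey != "":
		return s.IdempotencyKey
	case s.RequestAffinity != "":
		return s.RequestAffinity
	case s.ClientPrincipal != "":
		return stableHash(s.ClientPrincipal)
	default:
		return stableHash(s.AuthID)
	}
}

func stableHash(v string) string {
	sum := sha256.Sum256([]byte(v))
	return hex.EncodeToString(sum[:16])
}
```

Whichever key wins is then written to both the prompt_cache_key body field and the session_id header before dispatch, as described above.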
To keep the implementation maintainable, the continuity resolution and diagnostics are centralized in a dedicated Codex continuity helper instead of being scattered across executor flow code. Regression coverage now verifies retention preservation, continuity-key precedence, stable auth-ID fallback, websocket parity, translator preservation, and auth-affinity behavior. Manual validation confirmed prompt cache reads now occur through CLIProxyAPI when using Codex via OpenCode, and the fix should also benefit other clients that rely on stable repeated Codex request continuity.
Cursor executor errors were plain fmt.Errorf values; the conductor couldn't
extract HTTP status codes from them, so exhausted accounts never entered cooldown.
Changes:
- Add ConnectError struct to proto/connect.go: ParseConnectEndStream now
returns *ConnectError with Code/Message fields for precise matching
- Add cursorStatusErr implementing StatusError + RetryAfter interfaces
- Add classifyCursorError() with two-layer classification (sketched below):
Layer 1: exact match on ConnectError.Code (gRPC standard codes)
resource_exhausted → 429, unauthenticated → 401,
permission_denied → 403, unavailable → 503, internal → 500
Layer 2: fuzzy string match for H2 errors (RST_STREAM → 502)
- Log all ConnectError code/message pairs so real server error codes
  can be observed (we have no samples yet)
- Wrap Execute and ExecuteStream error returns with classifyCursorError
Now the conductor properly marks Cursor auths as cooldown on quota errors,
enabling exponential backoff and round-robin failover.
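A sketch of the classification layer. ConnectError, cursorStatusErr, classifyCursorError, and the code-to-status mapping come from the changes above; the interface method names, the retry default, and the exact matching logic are assumptions:

```go
package cursor

import (
	"errors"
	"fmt"
	"net/http"
	"strings"
	"time"
)

// ConnectError is the parsed Connect end-of-stream error frame.
type ConnectError struct {
	Code    string // gRPC-style code, e.g. "resource_exhausted"
	Message string
}

func (e *ConnectError) Error() string {
	return fmt.Sprintf("connect error %s: %s", e.Code, e.Message)
}

// cursorStatusErr carries an HTTP status and optional retry delay so
// the conductor can apply cooldown and backoff policies.
type cursorStatusErr struct {
	status int
	retry  time.Duration
	cause  error
}

func (e *cursorStatusErr) Error() string             { return e.cause.Error() }
func (e *cursorStatusErr) Unwrap() error             { return e.cause }
func (e *cursorStatusErr) StatusCode() int           { return e.status }
func (e *cursorStatusErr) RetryAfter() time.Duration { return e.retry }

// classifyCursorError maps executor errors to HTTP statuses: exact
// ConnectError codes first, then fuzzy string matching for transport
// (H2) failures that never produced a Connect frame.
func classifyCursorError(err error) error {
	if err == nil {
		return nil
	}
	var ce *ConnectError
	if errors.As(err, &ce) {
		switch ce.Code { // layer 1: exact gRPC-standard codes
		case "resource_exhausted":
			// retry value is an illustrative cooldown hint
			return &cursorStatusErr{status: http.StatusTooManyRequests, retry: time.Minute, cause: err}
		case "unauthenticated":
			return &cursorStatusErr{status: http.StatusUnauthorized, cause: err}
		case "permission_denied":
			return &cursorStatusErr{status: http.StatusForbidden, cause: err}
		case "unavailable":
			return &cursorStatusErr{status: http.StatusServiceUnavailable, cause: err}
		case "internal":
			return &cursorStatusErr{status: http.StatusInternalServerError, cause: err}
		}
	}
	if strings.Contains(err.Error(), "RST_STREAM") { // layer 2: fuzzy H2 match
		return &cursorStatusErr{status: http.StatusBadGateway, cause: err}
	}
	return err
}
```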
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>