2940 Commits

Author SHA1 Message Date
trph
134a9eac9d fix: preserve SSE event boundaries for Responses streams 2026-03-29 17:23:16 +08:00
Luis Pater
6d8de0ade4 feat(auth): implement weighted provider rotation for improved scheduling fairness 2026-03-29 13:49:01 +08:00
Luis Pater
1587ff5e74 Merge pull request #2389 from router-for-me/claude
fix(claude): add default max_tokens for models
2026-03-29 13:03:20 +08:00
hkfires
f033d3a6df fix(claude): enhance ensureModelMaxTokens to use registered max_completion_tokens and fallback to default 2026-03-29 13:00:43 +08:00
hkfires
145e0e0b5d fix(claude): add default max_tokens for models 2026-03-29 12:46:00 +08:00
Luis Pater
f8d1bc06ea Merge pull request #469 from router-for-me/plus
v6.9.5
v6.9.5-0
2026-03-29 12:40:26 +08:00
Luis Pater
d5930f4e44 Merge branch 'main' into plus 2026-03-29 12:40:17 +08:00
Luis Pater
9b7d7021af docs(readme): update LingtrueAPI link in all README translations 2026-03-29 12:30:24 +08:00
Luis Pater
e41c22ef44 docs(readme): add LingtrueAPI sponsorship details to all README translations 2026-03-29 12:23:37 +08:00
Ravi Tharuma
5fc2bd393e fix: retain codex thinking signature until item done 2026-03-28 14:41:25 +01:00
Luis Pater
55271403fb Merge pull request #2374 from VooDisss/codex-cache-clean
fix(codex): restore prompt cache continuity for Codex requests
2026-03-28 21:16:51 +08:00
Luis Pater
36fba66619 Merge pull request #2371 from RaviTharuma/docs/provider-specific-routes
docs: clarify provider-specific routing for aliased models
2026-03-28 21:11:29 +08:00
Ravi Tharuma
66eb12294a fix: clear stale thinking signature when no block is open 2026-03-28 14:08:31 +01:00
Ravi Tharuma
73b22ec29b fix: omit empty signature field from thinking blocks
Emit signature only when non-empty in both streaming content_block_start
and non-streaming thinking blocks. Avoids turning 'missing signature'
into 'empty/invalid signature' which Claude clients may reject.
2026-03-28 14:08:31 +01:00
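The "emit signature only when non-empty" behaviour described in commit 73b22ec29b can be illustrated with a minimal Go sketch. The `thinkingBlock` type and `encodeThinking` helper are hypothetical names made up for this example; the real translator may build its JSON differently. The sketch only shows how `omitempty` drops the field entirely rather than sending an empty string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// thinkingBlock is a hypothetical shape for a Claude-style thinking block.
// The omitempty option removes the "signature" key when the value is "".
type thinkingBlock struct {
	Type      string `json:"type"`
	Thinking  string `json:"thinking"`
	Signature string `json:"signature,omitempty"`
}

// encodeThinking marshals a thinking block; an empty signature produces
// JSON with no "signature" key at all instead of "signature":"".
func encodeThinking(text, signature string) string {
	b, err := json.Marshal(thinkingBlock{Type: "thinking", Thinking: text, Signature: signature})
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encodeThinking("step 1", ""))
	// {"type":"thinking","thinking":"step 1"}
	fmt.Println(encodeThinking("step 1", "sig-abc"))
	// {"type":"thinking","thinking":"step 1","signature":"sig-abc"}
}
```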
Ravi Tharuma
c31ae2f3b5 fix: retain previously captured thinking signature on new summary part 2026-03-28 14:08:31 +01:00
Ravi Tharuma
76b53d6b5b fix: finalize pending thinking block before next summary part 2026-03-28 14:08:31 +01:00
Ravi Tharuma
a34dfed378 fix: preserve Claude thinking signatures in Codex translator 2026-03-28 14:08:31 +01:00
Luis Pater
b9b127a7ea Merge pull request #2347 from edlsh/fix/codex-strip-stream-options
fix(codex): strip stream_options from Responses API requests
2026-03-28 21:03:01 +08:00
Luis Pater
2741e7b7b3 Merge pull request #2346 from pjpjq/codex/fix-codex-capacity-retry
fix(codex): Treat Codex capacity errors as retryable
2026-03-28 21:00:50 +08:00
Luis Pater
1767a56d4f Merge pull request #2343 from kongkk233/fix/proxy-transport-defaults
Preserve default transport settings for proxy clients
2026-03-28 20:58:24 +08:00
Luis Pater
779e6c2d2f Merge pull request #2231 from 7RPH/fix/responses-stream-multi-tool-calls
fix: preserve separate streamed tool calls in Responses API
2026-03-28 20:53:19 +08:00
Luis Pater
73c831747b Merge pull request #2133 from DragonFSKY/fix/2061-stale-modelstates
fix(auth): prevent stale runtime state inheritance from disabled auth entries
2026-03-28 20:50:57 +08:00
Luis Pater
b8b89f34f4 Merge pull request #442 from LuxVTZ/feat/gitlab-duo-panel-parity
Improve GitLab Duo gateway compatibility
Restore internal/runtime/executor/claude_executor.go to main during merge.
v6.9.4-2
2026-03-28 05:06:41 +08:00
Luis Pater
1fa094dac6 Merge pull request #461 from MrHuangJser/main
feat(cursor): Full Cursor provider with H2 streaming, MCP tools, multi-turn & multi-account
v6.9.4-1
2026-03-28 05:01:27 +08:00
Luis Pater
f55754621f Merge pull request #464 from router-for-me/plus
v6.9.4
v6.9.4-0
2026-03-28 04:51:27 +08:00
Luis Pater
ac26e7db43 Merge branch 'main' into plus 2026-03-28 04:51:18 +08:00
Luis Pater
10b824fcac fix(security): validate auth file names to prevent unsafe input 2026-03-28 04:48:23 +08:00
VooDisss
e5d3541b5a refactor(codex): remove stale affinity cleanup leftovers
Drop the last affinity-related executor artifacts so the PR stays focused on the minimal Codex continuity fix set: stable prompt cache identity, stable session_id, and the executor-only behavior that was validated to restore cache reads.
2026-03-27 20:40:26 +02:00
VooDisss
79755e76ea refactor(pr): remove forbidden translator changes
Drop the chat-completions translator edits from this PR so the branch complies with the repository policy that forbids pull-request changes under internal/translator. The remaining PR stays focused on the executor-level Codex continuity fix that was validated to restore cache reuse.
2026-03-27 19:34:13 +02:00
VooDisss
35f158d526 refactor(pr): narrow Codex cache fix scope
Remove the experimental auth-affinity routing changes from this PR so it stays focused on the validated Codex continuity fix. This keeps the prompt-cache repair while avoiding unrelated routing-policy concerns such as provider/model affinity scope, lifecycle cleanup, and hard-pin fallback semantics.
2026-03-27 19:06:34 +02:00
VooDisss
6962e09dd9 fix(auth): scope affinity by provider
Keep sticky auth affinity limited to matching providers and stop persisting execution-session IDs as long-lived affinity keys so provider switching and normal streaming traffic do not create incorrect pins or stale affinity state.
2026-03-27 18:52:58 +02:00
VooDisss
4c4cbd44da fix(auth): avoid leaking or over-persisting affinity keys
Stop using one-shot idempotency keys as long-lived auth-affinity identifiers and remove raw affinity-key values from debug logs so sticky routing keeps its continuity benefits without creating avoidable memory growth or credential exposure risks.
2026-03-27 18:34:51 +02:00
VooDisss
26eca8b6ba fix(codex): preserve continuity and safe affinity fallback
Restore Claude continuity after the continuity refactor, keep auth-affinity keys out of upstream Codex session identifiers, and only persist affinity after successful execution so retries can still rotate to healthy credentials when the first auth fails.
2026-03-27 18:27:33 +02:00
VooDisss
62b17f40a1 refactor(codex): align continuity helpers with review feedback
Align websocket continuity resolution with the HTTP Codex path, make auth-affinity principal keys use a stable string representation, and extract small helpers that remove duplicated continuity and affinity logic without changing the validated cache-hit behavior.
2026-03-27 18:11:57 +02:00
VooDisss
511b8a992e fix(codex): restore prompt cache continuity for Codex requests
Prompt caching on Codex was not reliably reusable through the proxy because repeated chat-completions requests could reach the upstream without the same continuity envelope. In practice this showed up most clearly with OpenCode, where cache reads worked in the reference client but not through CLIProxyAPI, although the root cause is broader than OpenCode itself.

The proxy was breaking continuity in several ways: executor-layer Codex request preparation stripped prompt_cache_retention, chat-completions translation did not preserve that field, continuity headers used a different shape than the working client behavior, and OpenAI-style Codex requests could be sent without a stable prompt_cache_key. When that happened, session_id fell back to a fresh random value per request, so upstream Codex treated repeated requests as unrelated turns instead of as part of the same cacheable context.

This change fixes that by preserving caller-provided prompt_cache_retention on Codex execution paths, preserving prompt_cache_retention when translating OpenAI chat-completions requests to Codex, aligning Codex continuity headers to session_id, and introducing an explicit Codex continuity policy that derives a stable continuity key from the best available signal. The resolution order prefers an explicit prompt_cache_key, then execution session metadata, then an explicit idempotency key, then stable request-affinity metadata, then a stable client-principal hash, and finally a stable auth-ID hash when no better continuity signal exists.

The same continuity key is applied to both prompt_cache_key in the request body and session_id in the request headers so repeated requests reuse the same upstream cache/session identity. The auth manager also keeps auth selection sticky for repeated request sequences, preventing otherwise-equivalent Codex requests from drifting across different upstream auth contexts and accidentally breaking cache reuse.

To keep the implementation maintainable, the continuity resolution and diagnostics are centralized in a dedicated Codex continuity helper instead of being scattered across executor flow code. Regression coverage now verifies retention preservation, continuity-key precedence, stable auth-ID fallback, websocket parity, translator preservation, and auth-affinity behavior. Manual validation confirmed prompt cache reads now occur through CLIProxyAPI when using Codex via OpenCode, and the fix should also benefit other clients that rely on stable repeated Codex request continuity.
2026-03-27 17:49:29 +02:00
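The continuity-key resolution order described in commit 511b8a992e can be sketched roughly as follows. This is not the repository's helper: the `continuitySignals` struct, its field names, the source labels, and the SHA-256 based `stableHash` are all assumptions chosen for illustration. Only the precedence order is taken from the commit message.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// continuitySignals groups the candidate inputs named in the commit message.
// Field names are hypothetical.
type continuitySignals struct {
	PromptCacheKey   string // explicit prompt_cache_key from the request
	ExecutionSession string // execution session metadata
	IdempotencyKey   string // explicit idempotency key
	RequestAffinity  string // stable request-affinity metadata
	ClientPrincipal  string // identity of the calling client
	AuthID           string // ID of the selected upstream auth
}

// stableHash derives a deterministic short key. The hash choice is an assumption.
func stableHash(prefix, value string) string {
	sum := sha256.Sum256([]byte(prefix + ":" + value))
	return hex.EncodeToString(sum[:8])
}

// resolveContinuityKey returns the first available signal in the stated order,
// plus a label naming which signal was used (useful for diagnostics).
func resolveContinuityKey(s continuitySignals) (key, source string) {
	switch {
	case s.PromptCacheKey != "":
		return s.PromptCacheKey, "prompt_cache_key"
	case s.ExecutionSession != "":
		return s.ExecutionSession, "execution_session"
	case s.IdempotencyKey != "":
		return s.IdempotencyKey, "idempotency_key"
	case s.RequestAffinity != "":
		return s.RequestAffinity, "request_affinity"
	case s.ClientPrincipal != "":
		return stableHash("principal", s.ClientPrincipal), "client_principal"
	case s.AuthID != "":
		return stableHash("auth", s.AuthID), "auth_id"
	}
	return "", "none"
}

func main() {
	key, source := resolveContinuityKey(continuitySignals{AuthID: "auth-1"})
	// Per the commit message, the same key would be written to both
	// prompt_cache_key (request body) and session_id (request header).
	fmt.Println(key, source)
}
```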
Luis Pater
7dccc7ba2f docs(readme): remove redundant whitespace in BmoPlus sponsorship section of Chinese README 2026-03-27 20:52:14 +08:00
Luis Pater
70c90687fd docs(readme): fix formatting in BmoPlus sponsorship section of Chinese README 2026-03-27 20:49:43 +08:00
Luis Pater
8144ffd5c8 Merge pull request #2370 from B3o/add-bmoplus-sponsor
docs: add BmoPlus sponsorship banners to READMEs
2026-03-27 20:48:22 +08:00
Ravi Tharuma
0ab977c236 docs: clarify provider path limitations 2026-03-27 11:13:08 +01:00
Ravi Tharuma
224f0de353 docs: neutralize provider-specific path wording 2026-03-27 11:11:06 +01:00
B3o
6b45d311ec add BmoPlus sponsorship banners to READMEs 2026-03-27 18:01:35 +08:00
Ravi Tharuma
d54de441d3 docs: clarify provider-specific routing for aliased models 2026-03-27 10:53:09 +01:00
MrHuangJser
7386a70724 feat(cursor): auto-identify accounts from JWT sub for multi-account support
Previously Cursor required a manual ?label=xxx parameter to distinguish
accounts (unlike Codex which auto-generates filenames from JWT claims).

Cursor JWTs contain a "sub" claim (e.g. "auth0|user_XXXX") that uniquely
identifies each account. Now we:

- Add ParseJWTSub() + SubToShortHash() to extract and hash the sub claim
- Refactor GetTokenExpiry() to share the new decodeJWTPayload() helper
- Update CredentialFileName(label, subHash) to auto-generate filenames
  from the sub hash when no explicit label is provided
  (e.g. "cursor.8f202e67.json" instead of always "cursor.json")
- Add DisplayLabel() for human-readable account identification
- Store "sub" in metadata for observability
- Update both management API handler and SDK authenticator

Same account always produces the same filename (deterministic), different
accounts get different files. Explicit ?label= still takes priority.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 17:40:02 +08:00
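The filename derivation described in commit 7386a70724 can be sketched as below. The lower-case function names mirror the helpers named in the commit message but are re-implemented here under assumptions: the short hash is taken as the first 4 bytes of SHA-256 (the actual algorithm is not stated), the labelled filename form `cursor.<label>.json` is a guess, and the token signature is not verified. `makeUnsignedToken` exists only to build demo input.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// parseJWTSub decodes the payload segment of a JWT and returns its "sub" claim.
// It does NOT verify the signature; it only reads the claim.
func parseJWTSub(token string) (string, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return "", errors.New("token is not a three-part JWT")
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return "", fmt.Errorf("decode payload: %w", err)
	}
	var claims struct {
		Sub string `json:"sub"`
	}
	if err := json.Unmarshal(payload, &claims); err != nil {
		return "", fmt.Errorf("parse claims: %w", err)
	}
	if claims.Sub == "" {
		return "", errors.New("sub claim missing")
	}
	return claims.Sub, nil
}

// subToShortHash maps a sub claim to 8 hex characters (assumed: SHA-256 prefix).
func subToShortHash(sub string) string {
	sum := sha256.Sum256([]byte(sub))
	return hex.EncodeToString(sum[:4])
}

// credentialFileName prefers an explicit label, then the sub hash,
// then the legacy single-account name.
func credentialFileName(label, subHash string) string {
	switch {
	case label != "":
		return "cursor." + label + ".json" // assumed naming for labelled accounts
	case subHash != "":
		return "cursor." + subHash + ".json"
	default:
		return "cursor.json"
	}
}

// makeUnsignedToken builds a demo token with a placeholder signature.
func makeUnsignedToken(sub string) string {
	enc := base64.RawURLEncoding
	payload, _ := json.Marshal(map[string]string{"sub": sub})
	return enc.EncodeToString([]byte(`{"alg":"none"}`)) + "." + enc.EncodeToString(payload) + ".sig"
}

func main() {
	sub, err := parseJWTSub(makeUnsignedToken("auth0|user_XXXX"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(credentialFileName("", subToShortHash(sub)))
}
```

Because the hash depends only on the sub claim, the same account always maps to the same filename, which matches the determinism the commit message calls out.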
白金
1821bf7051 docs: add BmoPlus sponsorship banners to READMEs 2026-03-27 17:39:29 +08:00
Luis Pater
d42b5d4e78 docs(readme): update QQ group information in Chinese README 2026-03-27 11:46:21 +08:00
MrHuangJser
1b7447b682 feat(cursor): implement StatusError for conductor cooldown integration
Cursor executor errors were plain fmt.Errorf — the conductor couldn't
extract HTTP status codes, so exhausted accounts never entered cooldown.

Changes:
- Add ConnectError struct to proto/connect.go: ParseConnectEndStream now
  returns *ConnectError with Code/Message fields for precise matching
- Add cursorStatusErr implementing StatusError + RetryAfter interfaces
- Add classifyCursorError() with two-layer classification:
  Layer 1: exact match on ConnectError.Code (gRPC standard codes)
    resource_exhausted → 429, unauthenticated → 401,
    permission_denied → 403, unavailable → 503, internal → 500
  Layer 2: fuzzy string match for H2 errors (RST_STREAM → 502)
- Log all ConnectError code/message pairs for observing real server
  error codes (we have no samples yet)
- Wrap Execute and ExecuteStream error returns with classifyCursorError

Now the conductor properly marks Cursor auths as cooldown on quota errors,
enabling exponential backoff and round-robin failover.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 11:42:22 +08:00
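The two-layer classification described in commit 1b7447b682 can be sketched as follows. The type and method names here (`connectError`, `statusErr`, `StatusCode`, `statusOf`) are illustrative assumptions and may not match the repository's interfaces; the RetryAfter side is left out. Only the code-to-status mapping and the RST_STREAM fallback come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// connectError is a hypothetical stand-in for the Connect end-stream error.
type connectError struct {
	Code    string
	Message string
}

func (e *connectError) Error() string { return e.Code + ": " + e.Message }

// statusErr carries an HTTP status so a caller can decide on cooldown or retry.
type statusErr struct {
	status int
	cause  error
}

func (e *statusErr) Error() string   { return fmt.Sprintf("cursor: status %d: %v", e.status, e.cause) }
func (e *statusErr) StatusCode() int { return e.status }
func (e *statusErr) Unwrap() error   { return e.cause }

// Layer 1 table: exact match on the Connect/gRPC-style code.
var codeToStatus = map[string]int{
	"resource_exhausted": http.StatusTooManyRequests,     // 429
	"unauthenticated":    http.StatusUnauthorized,        // 401
	"permission_denied":  http.StatusForbidden,           // 403
	"unavailable":        http.StatusServiceUnavailable,  // 503
	"internal":           http.StatusInternalServerError, // 500
}

// classifyCursorError wraps err with a status when one of the layers matches;
// otherwise it returns err unchanged.
func classifyCursorError(err error) error {
	if err == nil {
		return nil
	}
	var ce *connectError
	if errors.As(err, &ce) {
		if status, ok := codeToStatus[ce.Code]; ok {
			return &statusErr{status: status, cause: err}
		}
	}
	// Layer 2: fuzzy string match for HTTP/2 stream resets.
	if strings.Contains(err.Error(), "RST_STREAM") {
		return &statusErr{status: http.StatusBadGateway, cause: err}
	}
	return err
}

// statusOf extracts the status code, or 0 when err was not classified.
func statusOf(err error) int {
	var se *statusErr
	if errors.As(err, &se) {
		return se.StatusCode()
	}
	return 0
}

func main() {
	err := classifyCursorError(&connectError{Code: "resource_exhausted", Message: "quota"})
	fmt.Println(statusOf(err)) // 429
}
```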
MrHuangJser
40dee4453a feat(cursor): auto-migrate sessions to healthy account on quota exhaustion
When a Cursor account's quota is exhausted, sessions bound to it can now
seamlessly continue on a different account:

Layer 1 — Checkpoint decoupling:
  Key checkpoints by conversationId (not authID:conversationId). Store
  authID inside savedCheckpoint. On lookup, if auth changed, discard the
  stale checkpoint and flatten conversation history into userText.

Layer 2 — Cross-account session cleanup:
  When a request arrives for a conversation whose session belongs to a
  different (now-exhausted) auth, close the old H2 stream and remove
  the stale session to free resources.

Layer 3 — H2Stream.Err() exposure:
  New Err() method on H2Stream so callers can inspect RST_STREAM,
  GOAWAY, or other stream-level errors after closure.

Layer 4 — processH2SessionFrames error propagation:
  Returns error instead of bare return. Connect EndStream errors (quota,
  rate limit) are now propagated instead of being logged and swallowed.

Layer 5 — Pre-response transparent retry:
  If the stream fails before any data is sent to the client, return an
  error to the conductor so it retries with a different auth — fully
  transparent to the client.

Layer 6 — Post-response error logging:
  If the stream fails after data was already sent, log a warning. The
  conductor's existing cooldown mechanism ensures the next request routes
  to a healthy account.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 10:50:32 +08:00
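Layer 1 of commit 40dee4453a (checkpoints keyed by conversation ID, with the owning auth stored inside the checkpoint) can be sketched as below. The `checkpointStore` type, its methods, the `turn` type, and the `role: text` flattening format are assumptions for illustration; only the keying scheme and the discard-on-auth-change rule come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// turn is a hypothetical conversation entry.
type turn struct{ Role, Text string }

// savedCheckpoint records which auth produced the checkpoint.
type savedCheckpoint struct {
	AuthID string
	Data   []byte
}

// checkpointStore keys checkpoints by conversation ID only.
type checkpointStore struct {
	mu    sync.Mutex
	items map[string]savedCheckpoint
}

func newCheckpointStore() *checkpointStore {
	return &checkpointStore{items: make(map[string]savedCheckpoint)}
}

func (s *checkpointStore) save(conversationID, authID string, data []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[conversationID] = savedCheckpoint{AuthID: authID, Data: data}
}

// lookup returns the checkpoint data when it belongs to authID. When the
// checkpoint was saved under a different auth it is deleted and stale is true,
// signalling that the caller should fall back to flattened history.
func (s *checkpointStore) lookup(conversationID, authID string) (data []byte, stale bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	cp, ok := s.items[conversationID]
	if !ok {
		return nil, false
	}
	if cp.AuthID != authID {
		delete(s.items, conversationID)
		return nil, true
	}
	return cp.Data, false
}

// flattenHistory collapses prior turns into one text block for the new account.
func flattenHistory(history []turn) string {
	lines := make([]string, 0, len(history))
	for _, t := range history {
		lines = append(lines, t.Role+": "+t.Text)
	}
	return strings.Join(lines, "\n")
}

func main() {
	store := newCheckpointStore()
	store.save("conv-1", "auth-a", []byte("state"))
	history := []turn{{"user", "hi"}, {"assistant", "hello"}}
	if _, stale := store.lookup("conv-1", "auth-b"); stale {
		fmt.Println(flattenHistory(history))
	}
}
```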
MrHuangJser
8902e1cccb style(cursor): replace fmt.Print* with log package for consistent logging
Address Gemini Code Assist review feedback: use logrus log package
instead of fmt.Printf/Println in Cursor auth handlers and CLI for
unified log formatting and level control.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 17:03:32 +08:00
黄姜恒
de5fe71478 feat(cursor): multi-account routing with round-robin and session isolation
- Add cursor/filename.go for multi-account credential file naming
- Include auth.ID in session and checkpoint keys for per-account isolation
- Record authID in cursorSession, validate on resume to prevent cross-account access
- Management API /cursor-auth-url supports ?label= for creating named accounts
- Leverages existing conductor round-robin + failover framework

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 11:27:49 +08:00
黄姜恒
dcfbec2990 feat(cursor): add management API for Cursor OAuth authentication
- Add RequestCursorToken handler with PKCE + polling flow
- Register /v0/management/cursor-auth-url route
- Returns login URL + state for browser auth, polls in background
- Saves cursor.json with access/refresh tokens on success

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 11:10:07 +08:00