When multiple auth credentials are configured, requests from the same
session are now routed to the same credential, improving upstream prompt
cache hit rates and maintaining context continuity.
Core components:
- SessionAffinitySelector: wraps RoundRobin/FillFirst selectors with
session-to-auth binding; automatic failover when bound auth is
unavailable, re-binding via the fallback selector for even distribution
- SessionCache: TTL-based in-memory cache with background cleanup
goroutine, supporting per-session and per-auth invalidation
- StoppableSelector interface: lifecycle hook for selectors holding
resources, called during Manager.StopAutoRefresh()
Session ID extraction priority (extractSessionIDs):
1. metadata.user_id with Claude Code session format (old
user_{hash}_session_{uuid} and new JSON {session_id} format)
2. X-Session-ID header (generic client support)
3. metadata.user_id (non-Claude format, used as-is)
4. conversation_id field
5. Stable FNV hash from system prompt + first user/assistant messages
(fallback for clients with no explicit session ID); returns both a
full hash (primaryID) and a short hash without assistant content
(fallbackID) to inherit bindings from the first turn
Multi-format message hash covers OpenAI messages, Claude system array,
Gemini contents/systemInstruction, and OpenAI Responses API input items
(including inline messages with role but no type field).
Configuration (config.yaml routing section):
- session-affinity: bool (default false)
- session-affinity-ttl: duration string (default "1h")
- claude-code-session-affinity: bool (deprecated, alias for above)
All three fields trigger selector rebuild on config hot reload.
Side effect: Idempotency-Key header is no longer auto-generated with a
random UUID when absent — only forwarded when explicitly provided by the
client, to avoid polluting session hash extraction.
- Deleted `QwenAuthenticator`, internal `qwen_auth`, and `qwen_executor` implementations.
- Removed all Qwen-related OAuth flows, token handling, and execution logic.
- Cleaned up dependencies and references to Qwen across the codebase.
- Introduced `auth-auto-refresh-workers` config option to override default concurrency.
- Updated `authAutoRefreshLoop` to support customizable worker counts.
- Enhanced token refresh scheduling flexibility by aligning worker pool with runtime configurations.
- Abort update when SHA256 digest mismatch is detected instead of
logging a warning and proceeding (prevents MITM asset replacement)
- Cap asset download size to 10 MB via io.LimitReader (defense-in-depth
against OOM from oversized responses)
- Add `auto-update-panel` config option (default: false) to make the
periodic background updater opt-in; the panel is still downloaded
on first access when missing, but no longer silently auto-updated
every 3 hours unless explicitly enabled
feat(proxy): centralize proxy handling with `proxyutil` package and enhance test coverage
- Added `proxyutil` package to simplify proxy handling across the codebase.
- Refactored various components (`executor`, `cliproxy`, `auth`, etc.) to use `proxyutil` for consistent and reusable proxy logic.
- Introduced support for "direct" proxy mode to explicitly bypass all proxies.
- Updated tests to validate proxy behavior (e.g., `direct`, HTTP/HTTPS, and SOCKS5).
- Enhanced YAML configuration documentation for proxy options.
Default to generating a fresh random user_id per request instead of
reusing cached IDs. Add cache-user-id config option to opt in to the
previous caching behavior.
- Add CacheUserID field to CloakConfig
- Extract user_id cache logic to dedicated file
- Generate fresh user_id by default, cache only when enabled
- Add tests for both paths
- Introduced `passthrough-headers` option in configuration to control forwarding of upstream response headers.
- Updated handlers to respect the passthrough headers setting.
- Added tests to verify behavior when passthrough is enabled or disabled.
Update hardcoded X-Stainless-* and User-Agent defaults to match
Claude Code 2.1.44 / @anthropic-ai/sdk 0.74.0 (verified via
diagnostic proxy capture 2026-02-17).
Changes:
- X-Stainless-Os/Arch: dynamic via runtime.GOOS/GOARCH
- X-Stainless-Package-Version: 0.55.1 → 0.74.0
- X-Stainless-Timeout: 60 → 600
- User-Agent: claude-cli/1.0.83 (external, cli) → claude-cli/2.1.44 (external, sdk-cli)
Add claude-header-defaults config section so values can be updated
without recompilation when Claude Code releases new versions.
- Introduced a new `pprof` server to enable/debug HTTP profiling.
- Added configuration options for enabling/disabling and specifying the server address.
- Integrated pprof server lifecycle management with `Service`.
#1287
Introduce `Filter` rules in the payload configuration to remove specified JSON paths from the payload. Update related helper functions and add examples to `config.example.yaml`.
- Add ErrorLogsMaxFiles config field with default value 10
- Support hot-reload via config file changes
- Add Management API: GET/PUT/PATCH /v0/management/error-logs-max-files
- Maintain SDK backward compatibility with NewFileRequestLogger (3 params)
- Add NewFileRequestLoggerWithOptions for custom error log retention
When request logging is disabled, forced error logs are retained up to
the configured limit. Set to 0 to disable cleanup.
- Introduced `default-raw` and `override-raw` rules to handle raw JSON values.
- Enhanced `PayloadConfig` to validate and sanitize raw JSON payload rules.
- Updated executor logic to apply `default-raw` and `override-raw` rules.
- Extended example YAML to include usage of raw JSON rules.
- Updated `SDKConfig` to use `nonstream-keepalive-interval` (seconds) instead of the boolean `nonstream-keepalive`.
- Refactored handlers and logic to incorporate the new interval-based configuration.
- Updated config diff, tests, and example YAML to reflect the changes.
- Introduced `StartNonStreamingKeepAlive` to emit periodic blank lines during non-streaming responses.
- Added `nonstream-keepalive` configuration option in `SDKConfig`.
- Updated handlers to utilize `StartNonStreamingKeepAlive` and ensure proper cleanup.
- Extended config diff and tests to include `nonstream-keepalive` changes.
feat(cliproxy): support multiple aliases for OAuth model mappings
- Updated mapping logic to allow multiple aliases per upstream model name.
- Adjusted `SanitizeOAuthModelMappings` to ensure aliases remain unique within channels.
- Added test cases to validate multi-alias scenarios.
- Updated example config to clarify multi-alias support.