- Updated Gemini, Gemini CLI, and Antigravity logic to delete `thinkingConfig` when `ModeNone` is set, `Budget=0`, and `Level` is empty.
- Adjusted tests to validate this behavior across multiple scenarios and models with zero-allowed configurations.
- Extended test cases for additional coverage of mixed-model behavior.
Closes: #3138
- Deleted `geminicli` provider and related `Apply` logic.
- Removed all translator packages specific to Gemini CLI (Claude, Codex integrations).
- Purged associated test files for Gemini CLI translation.
- Removed `GeminiAuthenticator` and all associated authentication logic (OAuth flows, token handling, refresh logic).
- Deleted internal/executor Gemini OAuth support, including bearer token handling and runtime API logic.
- Purged all tests, configs, and command-line flags specific to Gemini OAuth flows.
- Updated documentation and aliases to reflect Gemini removal.
- Renamed `parseRetryDelay` to `ParseRetryDelay` and `deleteJSONField` to `DeleteJSONField`.
- Updated references in `antigravity_executor` and tests to use the new `helps` package.
- Adjusted import paths and test cases to ensure compatibility with the new location.
- Updated README files to reflect changes in the retry logic references.
- Updated `.github/ISSUE_TEMPLATE/bug_report.md` to remove deprecated Gemini CLI mention.
- Removed `examples/plugin/main.go` and `internal/pluginhost/loader_plugin.go` after migrating to a more modular system.
- Introduced `streamBridge` in `internal/pluginhost/stream_bridge.go` for efficient stream handling and communication.
- Added examples of `thinking` plugins written in both Rust and Go under `examples/plugin/thinking`.
- Enhanced test coverage for plugin host system changes, including stream chunk translation and thinking logic.
- Improved API compatibility and ensured backward-compatible upgrades for plugin execution.
- Implemented command-line flag registration and execution for plugins with priority-based conflict resolution.
- Enabled plugin-owned command-line flag execution and persistence of plugin-auth data.
- Added new `Host` methods to support command-line capabilities, including flag normalization, validation, and execution state management.
- Introduced unit tests to ensure coverage for command-line plugin functionality, including auth data persistence.
- Updated configs to normalize plugins during initialization.
- Introduced `SetTranslatedReasoningEffort` method in `UsageReporter` to capture and log reasoning efforts from translated payloads.
- Updated executors to incorporate the new reporting functionality for handling reasoning efforts across various providers.
- Enhanced logging for thinking level extraction with new helper function `ExtractTranslatedReasoningEffort`.
- Implemented `xAI` provider for thinking configurations with support for reasoning.effort levels.
- Registered `xAI` in available providers and updated relevant APIs for compatibility.
- Added unit tests for `xAI` provider functionality, including fallback logic for unsupported levels.
- Integrated `xAI` with executor handling and ensured conformance with OpenAI-compatible standards.
- Updated all references from v6 to v7 for `github.com/router-for-me/CLIProxyAPI`.
- Ensured consistency in imports within core libraries, tests, and integration tests.
- Added missing tests for new features in Redis Protocol integration.
- Deleted `iflow` provider implementation, including thinking configuration (`apply.go`) and authentication modules.
- Removed iFlow-specific tests, executors, and helpers across SDK and internal components.
- Updated all references to exclude iFlow functionality.
- Deleted `QwenAuthenticator`, internal `qwen_auth`, and `qwen_executor` implementations.
- Removed all Qwen-related OAuth flows, token handling, and execution logic.
- Cleaned up dependencies and references to Qwen across the codebase.
When adjustedBudget < minBudget, the previous fix blindly set
max_tokens = budgetTokens+1 which could exceed MaxCompletionTokens.
Now: cap max_tokens at MaxCompletionTokens, recalculate budget, and
disable thinking entirely if constraints are unsatisfiable.
Add unit tests covering raise, clamp, disable, and no-op scenarios.
1. Always include interleaved-thinking-2025-05-14 beta header so that
thinking blocks are returned correctly for all Claude models.
2. Remove status-code guard in AMP reverse proxy ModifyResponse so that
error responses (4xx/5xx) with hidden gzip encoding are decoded
properly — prevents garbled error messages reaching the client.
3. In normalizeClaudeBudget, when the adjusted budget falls below the
model minimum, set max_tokens = budgetTokens+1 instead of leaving
the request unchanged (which causes a 400 from the API).
Add support for Claude's "adaptive" and "auto" thinking modes using `output_config.effort`. Introduce support for new effort level "max" in adaptive thinking. Update thinking logic, validate model capabilities, and extend converters and handling to ensure compatibility with adaptive modes. Adjust static model data with supported levels and refine handling across translators and executors.
CPA-internal thinking levels like 'xhigh' and 'minimal' are not accepted
by OpenAI-format providers (MiniMax, etc.). The OpenAI applier now maps
non-standard levels to the nearest valid reasoning_effort value before
writing to the request body:
xhigh → high
minimal → low
auto → medium
- OAuth2 device authorization grant flow (RFC 8628) for authentication
- Streaming and non-streaming chat completions via OpenAI-compatible API
- Models: kimi-k2, kimi-k2-thinking, kimi-k2.5
- CLI `--kimi-login` command for device flow auth
- Token management with automatic refresh
- Thinking/reasoning effort support for thinking-enabled models
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Google official Gemini Python SDK sends thinking_level, thinking_budget,
and include_thoughts (snake_case) instead of thinkingLevel, thinkingBudget,
and includeThoughts (camelCase). This caused thinking configuration to be
ignored when using Python SDK.
Changes:
- Extract layer: extractGeminiConfig now reads snake_case as fallback
- Apply layer: Gemini/CLI/Antigravity appliers clean up snake_case fields
- Translator layer: Gemini->OpenAI/Claude/Codex translators support fallback
- Tests: Added 4 test cases for snake_case field coverage
Fixes#1426
Update ApplyThinking signature to accept fromFormat and toFormat parameters
instead of a single provider string. This enables:
- Proper level-to-budget conversion when source is level-based (openai/codex)
and target is budget-based (gemini/claude)
- Strict budget range validation when source and target formats match
- Level clamping to nearest supported level for cross-format requests
- Format alias resolution in SDK translator registry for codex/openai-response
Also adds ErrBudgetOutOfRange error code and improves iflow config extraction
to fall back to openai format when iflow-specific config is not present.