- Introduced the `gpt-image-2` model in Codex built-ins and updated visibility logic in the registry.
- Added direct proxy support for OpenAI image generation and editing endpoints.
- Implemented new execution paths for `/images/generations` and `/images/edits` that handle both JSON and multipart payloads.
- Expanded test coverage to validate the new model and direct proxy features, including streaming scenarios and error handling.
- Deleted `geminicli` provider and related `Apply` logic.
- Removed all translator packages specific to Gemini CLI (Claude, Codex integrations).
- Purged associated test files for Gemini CLI translation.
- Removed `GeminiAuthenticator` and all associated authentication logic (OAuth flows, token handling, refresh logic).
- Deleted `internal/executor` Gemini OAuth support, including bearer token handling and runtime API logic.
- Purged all tests, configs, and command-line flags specific to Gemini OAuth flows.
- Updated documentation and aliases to reflect Gemini removal.
- Renamed `parseRetryDelay` to `ParseRetryDelay` and `deleteJSONField` to `DeleteJSONField`.
- Updated references in `antigravity_executor` and tests to use the new `helps` package.
- Adjusted import paths and test cases to ensure compatibility with the new location.
- Updated README files to reflect changes in the retry logic references.
- Updated `.github/ISSUE_TEMPLATE/bug_report.md` to remove deprecated Gemini CLI mention.
- Added auth binding logic to tie video requests to specific authentication IDs.
- Enhanced video content handlers to support proxy configuration based on selected auth.
- Introduced helper functions for creating HTTP clients with direct or global proxy fallback.
- Expanded unit tests to validate auth binding, proxy usage, and fallback behavior.
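
The proxy fallback for video requests could look roughly like the sketch below. It assumes the selected auth carries an optional proxy URL and that a global proxy URL exists in config; `effectiveProxy`, `newProxyClient`, and both parameters are hypothetical names, not the helpers added in this commit.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// effectiveProxy picks the proxy configured on the selected auth when present,
// otherwise the global proxy. Both inputs are assumptions for this sketch.
func effectiveProxy(authProxy, globalProxy string) string {
	if p := strings.TrimSpace(authProxy); p != "" {
		return p
	}
	return strings.TrimSpace(globalProxy)
}

// newProxyClient builds an *http.Client that routes through the chosen proxy.
// With no proxy configured it returns a client that uses the default transport.
func newProxyClient(authProxy, globalProxy string) (*http.Client, error) {
	raw := effectiveProxy(authProxy, globalProxy)
	if raw == "" {
		return &http.Client{}, nil
	}
	u, err := url.Parse(raw)
	if err != nil {
		return nil, fmt.Errorf("parse proxy URL %q: %w", raw, err)
	}
	return &http.Client{Transport: &http.Transport{Proxy: http.ProxyURL(u)}}, nil
}

func main() {
	// The auth has no proxy of its own, so the global one is used.
	client, err := newProxyClient("", "http://127.0.0.1:7890")
	if err != nil {
		panic(err)
	}
	fmt.Println("custom transport:", client.Transport != nil)
}
```

Keeping the selection in a small pure function makes the fallback order easy to unit test without opening any connections.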
- Updated `openai_videos_handlers` to extract and set `video_url` from payloads when available.
- Enhanced unit tests to validate correct `video_url` extraction and inclusion in responses.
* feat(plugin): add ModelRouter before auth with single-slot routing targets
## Motivation
Plugins that need to change execution based on the **original inbound request**
(protocol format, raw body, headers, query, stream flag, metadata, etc.) often
resorted to virtual/trampoline models or routing inside interceptors. This
commit adds **ModelRouter**: a pluggable layer **before** model-to-provider
resolution and AuthManager credential selection, so plugins can declare who
executes a request without spoofing the client model name.
This is a **new capability**, not a bugfix on the existing chain. With no
ModelRouter plugins loaded, behavior matches upstream.
## Pipeline placement
- `execute`, `stream`, and `count` (and image paths via AuthManager) call
`applyModelRouter()` before building `coreexecutor.Request`.
- Routing runs **before** the request interceptor (before auth), so routers see
the client’s original context. After a plugin executor is chosen, the existing
**after-auth interceptor → response/stream interceptor** chain still applies.
- Internal `ExecuteModel` / `ExecuteModelStream` (host callbacks) support
`SkipRouterPluginID` so nested calls do not re-enter the same router.
## Routing API (single slot, mutually exclusive)
`ModelRouteResponse` uses **one target slot** to avoid ambiguity when both
`TargetExecutorPluginID` and `TargetProvider` were set and the host ignored one:
| Field | Meaning |
|-------|---------|
| `Handled` | `false`: this router declines; try the next router or default path |
| `TargetKind` | `self` \| `executor` \| `provider` (pick one) |
| `Target` | `self`/`executor`: plugin ID; `provider`: built-in provider key |
| `TargetModel` | Optional on `provider` only; empty keeps client `RequestedModel` |
| `Reason` | Optional diagnostic text |
- **self**: the router plugin’s own executor (`Target` normalized to the router’s plugin ID).
- **executor**: another plugin’s executor; host pre-checks with `executorPluginReady()`
(executor declared and provider identifier resolvable) to avoid handled routes that 500 at execution.
- **provider**: skip registry model resolution; fixed built-in AuthManager path; optional
`TargetModel` for execution model only—**does not** change outward requested-model metadata.
Routers run in **descending plugin priority** (tie-break: ascending plugin ID). Panic, error,
invalid target, or unavailable executor/provider → log and **fall through to the next router**;
if none handle, use the original provider+auth flow.
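
As a rough illustration of the ordering and fall-through rules above, here is a minimal sketch. The `router` and `routeResponse` types are hypothetical stand-ins rather than the real `pluginapi` types, and the executor/provider availability pre-checks are omitted.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// routeResponse mirrors the single-slot shape from the table (illustrative only).
type routeResponse struct {
	Handled    bool
	TargetKind string // "self", "executor", or "provider"
	Target     string
}

// router is a hypothetical stand-in for a loaded ModelRouter plugin.
type router struct {
	ID       string
	Priority int
	Route    func(model string) (routeResponse, error)
}

func validKind(k string) bool { return k == "self" || k == "executor" || k == "provider" }

// tryRoute calls one router and converts a panic into an error so the chain continues.
func tryRoute(r router, model string) (resp routeResponse, err error) {
	defer func() {
		if rec := recover(); rec != nil {
			err = fmt.Errorf("router %s panicked: %v", r.ID, rec)
		}
	}()
	return r.Route(model)
}

// resolve walks routers by descending priority (ties: ascending ID) and returns the
// first valid handled route. ok=false means: use the default provider+auth flow.
func resolve(routers []router, model string) (routeResponse, bool) {
	ordered := append([]router(nil), routers...)
	sort.SliceStable(ordered, func(i, j int) bool {
		if ordered[i].Priority != ordered[j].Priority {
			return ordered[i].Priority > ordered[j].Priority
		}
		return ordered[i].ID < ordered[j].ID
	})
	for _, r := range ordered {
		resp, err := tryRoute(r, model)
		if err != nil || !resp.Handled || !validKind(resp.TargetKind) {
			continue // a real host would log here, then fall through
		}
		if resp.TargetKind == "self" {
			resp.Target = r.ID // normalize self to the router's own plugin ID
		}
		return resp, true
	}
	return routeResponse{}, false
}

func main() {
	routers := []router{
		{ID: "b", Priority: 10, Route: func(string) (routeResponse, error) {
			return routeResponse{}, errors.New("boom") // higher priority, but fails
		}},
		{ID: "a", Priority: 5, Route: func(string) (routeResponse, error) {
			return routeResponse{Handled: true, TargetKind: "provider", Target: "codex"}, nil
		}},
	}
	resp, ok := resolve(routers, "some-model")
	fmt.Println(resp.TargetKind, resp.Target, ok)
}
```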
## Context exposed to routers
`ModelRouteRequest` includes:
- `SourceFormat`, `RequestedModel`, `Stream`
- `Headers`, `Query`, `Body` (defensive copies)
- `Metadata` (best-effort read-only context snapshot)
- `AvailableProviders`: built-in provider keys with at least one **non-disabled** auth
(`AuthManager.AvailableProviders()`). **Does not** reflect per-model cooldown or transient
unavailability—treat as an optimistic snapshot.
Adds `AuthManager.HasProviderAuth()` and `AvailableProviders()`, excluding `Disabled` and
`StatusDisabled` auths consistently with credential selection.
## Host and RPC
- Go plugins: `pluginapi.ModelRouter` + `RouteModel()`.
- RPC plugins: `pluginabi.MethodModelRoute` (`model.route`), capability flag `model_router`.
- `pluginhost.Host` implements `RouteModel` / `RouteModelExcept`; handlers use
`SetModelRouterHost` or a `PluginHost` type assertion; **direct executor** paths use
`ExecutePluginExecutor*` / `CountPluginExecutor`.
- No bundled example ModelRouter plugin; capability is active only when a third-party plugin
declares `model_router` and loads.
## Plugin RPC schema (policy A, upstream-aligned)
- `pluginabi.SchemaVersion` stays **1**: capability additions (`model_router`, `model.route`)
do not bump the number; increment only on breaking RPC JSON changes.
- Host sends `schema_version` at register; reject only if the plugin declares a **higher**
version than the host.
- No unpublished “ModelRouter requires schema ≥ 3” gate (v3 single-slot API was never public).
- Existing plugins and examples without `model_router` (`schema_version: 1`) need no changes.
- RPC ModelRouter: `schema_version: 1` + `model_router: true` + implement `model.route`.
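
The registration rule above (reject only a plugin that declares a newer schema than the host) reduces to a one-line comparison. The sketch below uses a hypothetical `registration` struct and `accept` function; the real register payload and host check may be shaped differently.

```go
package main

import "fmt"

// hostSchemaVersion stays at 1 under the policy described above.
const hostSchemaVersion = 1

// registration is a hypothetical view of what a plugin declares at register time.
type registration struct {
	SchemaVersion int
	ModelRouter   bool // capability flag; adding it does not bump the schema
}

// accept rejects a plugin only when it declares a schema newer than the host's.
func accept(reg registration) error {
	if reg.SchemaVersion > hostSchemaVersion {
		return fmt.Errorf("plugin schema_version %d is newer than host %d",
			reg.SchemaVersion, hostSchemaVersion)
	}
	return nil
}

func main() {
	fmt.Println(accept(registration{SchemaVersion: 1, ModelRouter: true}))
	fmt.Println(accept(registration{SchemaVersion: 2}))
}
```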
## Path consistency within this commit
- Provider routes reuse image-only model checks (e.g. `gpt-image-2`) on the normalized model,
same as the default AuthManager path.
- `count` aligned with execute/stream: `SkipRouterPluginID`, query/headers injection,
interceptor skip semantics.
- Handlers: `modelRoutersEnabled` treats hosts without `HasModelRouters` as disabled
(same as before ModelRouter existed); `pluginhost.Host` implements the detector.
- API docs: `ModelRouter` explicitly includes built-in **provider** targets (in addition to
plugin executors and the router’s own executor).
## Testing
go test ./internal/pluginhost ./sdk/api/handlers ./sdk/pluginapi ./sdk/pluginabi ./sdk/cliproxy/auth
go build -o test-output ./cmd/server && rm test-output
go test ./...
* fix(handlers): address ModelRouter review feedback
- Use modelExecutionQuery for plugin executor and AuthManager paths so
inbound URL query matches router/header behavior
- Guard queryFromContext when gin Request.URL is nil
- Read plugin executor stream chunks via nextStreamChunk to exit on cancel
- Drop redundant clonePluginMetadata on capability record meta
Tests cover query propagation, stream cancel, and nil URL safety.
* feat(plugin): add Claude web search router example
Add a Claude Code web_search ModelRouter example that can route matching Claude requests through Antigravity, Codex, xAI, or Tavily.
The plugin includes executor orchestration, backend fallback/penalty handling, Tavily API key support, Claude-compatible response assembly, stream forwarding, and focused unit coverage for detection, fallback routing, model resolution, penalties, stream forwarding, and Tavily behavior.
Verification: go test -count=1 ./... in examples/plugin/claude-web-search-router/go; go build -buildmode=c-shared for the plugin; go build ./cmd/server; live local CPA curl coverage for plugin load, four explicit routes, fallback, and Codex spark routing.
* fix(pluginhost): validate executor routes before fallback
* fix(pluginhost): skip oauth-only executor routes
- Introduced `videoAuthBindingStore` for managing mappings of video IDs to credentials with TTL support.
- Updated video creation and retrieval handlers to bind and utilize credentials for authentication.
- Enhanced response models to include upstream models and adjusted request preparation logic.
- Added test coverage for video auth binding, TTL configuration, and expiration handling.
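
A TTL-backed binding store of this kind could be sketched as follows. This is not the actual `videoAuthBindingStore`: the type, method names, and lazy-eviction strategy are assumptions chosen to keep the example short.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// binding records which credential created a video and when the mapping expires.
type binding struct {
	authID    string
	expiresAt time.Time
}

// bindingStore maps video IDs to auth IDs and forgets entries after ttl.
type bindingStore struct {
	mu   sync.Mutex
	ttl  time.Duration
	data map[string]binding
}

func newBindingStore(ttl time.Duration) *bindingStore {
	return &bindingStore{ttl: ttl, data: make(map[string]binding)}
}

// Bind remembers that videoID was created with authID.
func (s *bindingStore) Bind(videoID, authID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[videoID] = binding{authID: authID, expiresAt: time.Now().Add(s.ttl)}
}

// Lookup returns the bound auth ID, or false when missing or expired.
func (s *bindingStore) Lookup(videoID string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	b, ok := s.data[videoID]
	if !ok {
		return "", false
	}
	if !time.Now().Before(b.expiresAt) {
		delete(s.data, videoID) // lazily evict expired bindings
		return "", false
	}
	return b.authID, true
}

func main() {
	store := newBindingStore(time.Hour)
	store.Bind("video-123", "auth-a")
	authID, ok := store.Lookup("video-123")
	fmt.Println(authID, ok)
}
```

Lazy eviction on lookup avoids a background goroutine; a store that sees many writes and few reads would likely also need periodic cleanup.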
- Developed `XAIWebsocketsExecutor` for handling xAI Responses via WebSocket transport.
- Introduced session and state management with `codexWebsocketSessionStore` and `xaiWebsocketIDStateStore`.
- Added robust ID mapping for upstream and downstream request/response sequences.
- Enhanced error propagation and handling of WebSocket terminal events.
- Included utility methods for WebSocket request preparation, connection management, and state tracking.
- Added foundational support for compact and streamed responses via enhanced session tracking.
- Implemented `websocketDirectCaptureExecutor` for Codex websocket passthrough functionality.
- Added logic to bypass incremental state handling for passthrough models.
- Updated normalization, compaction, and replay handling to support passthrough mode.
- Introduced `responsesWebsocketUsesCodexWebsocketPassthrough` utility for model-specific passthrough determination.
- Expanded test coverage for websocket passthrough scenarios, including compaction and response validation.
- Introduced `/openai/v1/videos` endpoint to support OpenAI-specific video generation.
- Added error normalization and handling for OpenAI video resources, including detailed error propagation.
- Enhanced response structure to include OpenAI-specific fields for status, progress, and model mappings.
- Implemented new handlers for video content retrieval and error scenarios.
- Expanded test coverage to validate OpenAI video support, error handling, and backend compatibility.
- Enhanced Codex Websockets Executor to capture `response.done` as a terminal event, alongside `response.completed` and `error`.
- Improved error propagation for upstream websocket errors with comprehensive message handling.
- Introduced utility functions for recognizing terminal events and extracting error messages.
- Expanded tests to validate new websocket event logic, including terminal event handling and upstream error propagation.
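
The terminal-event check described above can be sketched like this. The three event type strings come from the commit; the JSON shape used to extract the error message (`error.message`, falling back to a top-level `message`) is an assumption, and the function names are illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// wsEvent holds only the fields this sketch reads; real events carry more.
type wsEvent struct {
	Type  string `json:"type"`
	Error *struct {
		Message string `json:"message"`
	} `json:"error"`
	Message string `json:"message"`
}

// isTerminalEvent reports whether an upstream event ends the response.
func isTerminalEvent(eventType string) bool {
	switch eventType {
	case "response.completed", "response.done", "error":
		return true
	}
	return false
}

// upstreamErrorMessage extracts a human-readable message from an "error" event.
// The second result is false when raw is not a parseable error event.
func upstreamErrorMessage(raw []byte) (string, bool) {
	var ev wsEvent
	if err := json.Unmarshal(raw, &ev); err != nil || ev.Type != "error" {
		return "", false
	}
	if ev.Error != nil && ev.Error.Message != "" {
		return ev.Error.Message, true
	}
	if ev.Message != "" {
		return ev.Message, true
	}
	return "upstream websocket error", true // generic fallback text (assumed)
}

func main() {
	fmt.Println(isTerminalEvent("response.done"))
	msg, ok := upstreamErrorMessage([]byte(`{"type":"error","error":{"message":"rate limited"}}`))
	fmt.Println(msg, ok)
}
```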
- Introduced test scenarios to validate `previous_response_id` injection during incremental and non-incremental requests.
- Verified behavior for pending tool calls, including proper inclusion or exclusion in websocket requests.
- Updated websocket handling logic to track `lastResponseID` and `pendingToolCallIDs`.
- Added utility functions for pending tool call validation and cleanup.
- Updated host model callback logic to skip originating plugin's interceptors during nested model executions.
- Added `SkipInterceptorPluginID` field to plugin API structs for controlling interceptor bypass behavior.
- Introduced supporting logic in host API handlers, plugin host registry, and callback contexts to identify and skip specific plugins.
- Enhanced unit tests across plugin host, API handlers, and execution paths to verify interceptor skipping behavior and plugin isolation.
- Revised documentation to clarify non-recursive behavior of host model callbacks and the use of `SkipInterceptorPluginID`.
- Added an example Go plugin, `host-model-callback`, to demonstrate host model callbacks.
- Implemented `cliproxy_plugin_init`, `cliproxyPluginCall`, and other plugin functions for callback handling.
- Introduced API handlers for `ModelExecution` and `ModelExecutionStream` with support for both streaming and non-streaming requests.
- Included unit tests (`model_execution_test.go`) to validate execution logic and streaming responses.
- Introduced `applyRequestAfterAuthInterceptor` to modify requests after credential selection and before executor translation.
- Added `InterceptRequestAfterAuth` method across plugin adapters with corresponding tests for context validation.
- Enhanced format resolution logic (`requestToFormat`) to support additional providers and formats.
- Updated JavaScript handler to include a new `on_after_auth_request` hook for post-auth request handling.
- Refactored interceptor methods for clarity and better encapsulation of request/response lifecycles.
- Added `opts.OriginalRequest` handling to `applyResponseInterceptors` for improved context passing.
- Introduced new test `TestApplyJSBeforeRequestUsesReturnedCtxBody` to validate JavaScript interceptor behavior.
- Updated JavaScript-based handler to safely rewrite sensitive content and headers in requests.
- Refined interceptor logic to ensure consistent state retention across request processing.
- Implemented `RequestInterceptor`, `ResponseInterceptor`, and `StreamChunkInterceptor` capabilities.
- Added `sanitizePluginMetadata` to clean metadata for RPC compatibility.
- Enhanced interceptor chaining, error handling, and test coverage.
- Updated plugin configuration to register and dispatch interceptor methods.
- Introduced support for file-backed logging of API requests and responses to handle large payloads efficiently.
- Refactored `attachWebsocketLogSources` to `attachRequestLogSources` for broader request and response handling.
- Added new methods for appending request/response data to file-backed sources and updated existing logging workflows for compatibility.
- Improved cleanup and merge logic for file-backed sources during request processing.
- Updated tests to cover newly introduced file-backed logging functionality.
Address review feedback: parse each item's type/id/call_id once with
gjson.GetManyBytes and reuse it across the dedupe loops instead of
rescanning every item up to five times. Behavior is unchanged.
The input item ID dedupe added in #3620 keeps only the last occurrence of
each item id. When an upstream reuses the same item id across a re-sent or
repaired tool call (so two function_call items share an id but carry
different call_ids), the last-wins rule can drop the function_call whose
call_id still has a matching function_call_output. The upstream then rejects
the request with HTTP 400 "No tool call found for function call output with
call_id ...", breaking every subsequent turn over the Codex WebSocket path.
Make the dedupe orphan-aware: when several input items share an id, never
replace an item whose call_id is still referenced by a tool-call output with
one that is not. This keeps a single item per id (preserving the original
intent) while ensuring retained tool calls stay paired with their outputs.
Adds a regression test covering two function_call items that share an id
where only the earlier call_id has a surviving output.
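
A minimal sketch of the orphan-aware rule, assuming a simplified `item` struct in place of the raw JSON items and treating only `function_call_output` as a tool-call output (the real code may recognize more output types):

```go
package main

import "fmt"

// item is a simplified view of one input item.
type item struct {
	Type   string
	ID     string
	CallID string
}

// dedupeByID keeps one item per ID. The last occurrence wins, except that an item
// whose call_id is still referenced by an output is never replaced by one that is not.
func dedupeByID(items []item) []item {
	referenced := map[string]bool{}
	for _, it := range items {
		if it.Type == "function_call_output" && it.CallID != "" {
			referenced[it.CallID] = true
		}
	}
	paired := func(it item) bool {
		return it.Type == "function_call" && referenced[it.CallID]
	}

	keep := map[string]int{} // item ID -> index of the retained item
	for i, it := range items {
		if it.ID == "" {
			continue
		}
		if prev, seen := keep[it.ID]; seen && paired(items[prev]) && !paired(it) {
			continue // replacing prev would orphan its output
		}
		keep[it.ID] = i
	}

	out := make([]item, 0, len(items))
	for i, it := range items {
		if it.ID == "" || keep[it.ID] == i {
			out = append(out, it)
		}
	}
	return out
}

func main() {
	items := []item{
		{Type: "function_call", ID: "fc_1", CallID: "call_a"},
		{Type: "function_call", ID: "fc_1", CallID: "call_b"}, // repaired call, same item id
		{Type: "function_call_output", CallID: "call_a"},
	}
	for _, it := range dedupeByID(items) {
		fmt.Println(it.Type, it.ID, it.CallID)
	}
}
```

In this example, plain last-wins would keep `call_b` and leave the `call_a` output without a matching call; the orphan-aware rule retains `call_a` instead.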
- Updated session handling to replace `Session_id` and `Conversation_id` headers with new logic ensuring consistent use of `Cache.ID` and prompt keys.
- Restored `Session_id` as a priority extraction source for `ExtractSessionID`.
- Added tests to validate case-sensitive and case-insensitive headers, canonical account header usage, and session key preservation.
- Removed legacy support for the deprecated `Conversation_id` header to clean up the API.
- Introduced `grok-imagine-video-1.5-preview` as a new xAI video model.
- Updated handlers, registry, and validation logic to include support for the new model.
- Enhanced test coverage to validate integration and functionality of the preview model.
- Added `applyCodexIdentityConfuse*` functions for remapping request and response payloads and headers to enhance security.
- Updated WebSocket and HTTP logic to handle identity state transformations seamlessly.
- Introduced unit tests to verify remapping and restoration of identity-related fields.
- Added `sanitizeDownstreamWebsocketFallbackRequest` to strip the `generate` field from the payload of HTTP fallback requests.
- Implemented tests to validate payload handling logic in WebSocket-to-HTTP transitions.
Closes: #3556
- Introduced `service_tier` metadata key to capture client-requested service tiers.
- Updated usage records, context propagation, and plugins to include service tier data.
- Added default handling logic for cases where `service_tier` is absent.
- Implemented tests for `service_tier` extraction, defaults, and updates across components.
- Updated WebSocket response repair tests to validate incremental preservation of response calls and outputs.
- Added new test cases for custom tool responses ensuring accurate handling of output cache and call cache.
- Refactored `repairResponsesWebsocketToolCallsWithCaches` to handle orphan outputs more consistently.
- Adjusted input filtering logic for clearer incremental repair behavior.
Closes: #3569
- Introduced `GPTImage2BaseModel` configuration for hosted image generation tools with validation for "gpt-" prefix.
- Added logic to dynamically resolve and apply the base model in Codex executor workflows.
- Enhanced server-sent events (SSE) implementation with keep-alive tickers and error events for stream reliability.
- Updated configuration file examples and internal documentation.
- Introduced `FileBodySource` to support large request log sections stored in temp files.
- Added file-backed support for WebSocket timeline and API WebSocket timeline logging.
- Updated `LogRequest` and middleware to integrate optional file-backed sources.
- Implemented clean-up mechanisms to manage temporary log files after processing.
- Added new reasoning levels: `none`, `minimal`, and `unsupported` to Codex model configurations.
- Introduced metadata sanitization and normalization for reasoning levels in API responses.
- Extended unit tests to cover reasoning level validation and metadata sanitization logic.
- Introduced OpenAI-compatible image model support in the API, enabling integration through image generation and editing endpoints.
- Added registry type for OpenAIImageModelType to classify and validate compatibility.
- Implemented request handling for OpenAI-compatible image models, including JSON and multipart formats.
- Enhanced executor methods to support OpenAI-compatible image streaming and non-streaming requests.
- Included tests to validate model registration, streaming behavior, and multipart payload formatting.
- Added APIs to store, retrieve, and clone upstream response headers in context for detailed logging.
- Updated `RecordAPIResponseMetadata`, `RecordAPIWebsocketHandshake`, and related methods to capture response headers.
- Extended `UsageReporter` to include response headers in published usage records.
- Enhanced payload tests to validate response headers' integrity and persistence.
- Refactored `usage.Record` to support optional `ResponseHeaders` field.
- Relocated Codex client model JSON and related logic from `openai` package to `registry` for better modularity.
- Updated references to use `registry.GetCodexClientModelsJSON()` in loading logic.
- Extended test cases to cover additional field removals (`upgrade`, `availability_nux`).
- Introduced Codex client models framework in `openai` package.
- Added JSON-based model definitions (`codex_client_models.json`) for Codex, including metadata, reasoning levels, and configuration options.
- Implemented handlers to load, clone, and build Codex client models with support for visibility overrides and metadata application.
- Enabled sorting and prioritization of models based on configuration or runtime criteria.
- Added utility functions for managing and validating model attributes.
- Introduced new xAI `grok-imagine-video` model for video generation with configurable options (e.g., duration, size, resolution).
- Implemented video-specific API endpoints (`/v1/videos`, `/v1/videos/generations`, `/v1/videos/edits`, `/v1/videos/extensions`), including request validation and model handling.
- Enhanced model registry with `xaiBuiltinVideoModelID` and metadata for video capabilities.
- Added unit tests to validate video model support, request structures, and API response handling.
- Extended `XAIExecutor` to integrate video generation and retrieval via runtime requests.
- Added new xAI Grok image models (`grok-imagine-image`, `grok-imagine-image-quality`) with high-fidelity and aspect ratio configurations.
- Extended `isSupportedImagesModel` logic to validate xAI models.
- Implemented API request builders for image generation/editing with customizable options (e.g., resolution, aspect ratio, response format).
- Enhanced `/v1/images` endpoints to handle xAI model capabilities, including response normalization and model-specific handlers.
- Updated unit tests to cover xAI model validation, request structure, and API integration.
- Introduced `ReadRequestBody` helper function to support decoding request bodies based on "Content-Encoding" (e.g., `zstd`).
- Replaced `c.GetRawData()` with `ReadRequestBody` across handlers to enable decoding.
- Added test case to validate `zstd` decoding for compact responses.
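
The shape of such a helper is a dispatch on `Content-Encoding`. The sketch below deliberately swaps in `gzip` from the standard library, because `zstd` needs a third-party decoder; a zstd branch would follow the same pattern. `readRequestBody` and `newGzipRequest` are illustrative names, not the project's `ReadRequestBody`.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// readRequestBody returns the decoded request body for the encodings this sketch knows.
func readRequestBody(r *http.Request) ([]byte, error) {
	defer r.Body.Close()
	switch enc := strings.ToLower(strings.TrimSpace(r.Header.Get("Content-Encoding"))); enc {
	case "", "identity":
		return io.ReadAll(r.Body)
	case "gzip": // stand-in for zstd in this example
		zr, err := gzip.NewReader(r.Body)
		if err != nil {
			return nil, fmt.Errorf("open gzip body: %w", err)
		}
		defer zr.Close()
		return io.ReadAll(zr)
	default:
		return nil, fmt.Errorf("unsupported Content-Encoding %q", enc)
	}
}

// newGzipRequest builds a compressed POST request for the demo (placeholder URL).
func newGzipRequest(payload string) (*http.Request, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(payload)); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, "http://localhost/example", &buf)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Encoding", "gzip")
	return req, nil
}

func main() {
	req, err := newGzipRequest(`{"input":"hi"}`)
	if err != nil {
		panic(err)
	}
	body, err := readRequestBody(req)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```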
- Consolidated `homeRuntimeAuths` to store a map of session-scoped auth maps, replacing `homeRuntimeAuthSessions` and `homeRuntimeAuthRefs`.
- Adjusted session cleanup logic to directly remove session-scoped auths without reference counting.
- Added `GetExecutionSessionAuthByID` to retrieve auths scoped to a specific execution session.
- Updated tests to reflect the new session-scoped caching behavior.
- Added new helper methods for OAuth session management (`RegisterOAuthSession`, `CompleteOAuthSession`, etc.).
- Introduced `WriteConfig` for persisting management configurations.
- Exported `Handler` type and `NewHandler` constructors for SDK consumers.
- Updated all references from v6 to v7 for `github.com/router-for-me/CLIProxyAPI`.
- Ensured consistency in imports within core libraries, tests, and integration tests.
- Added missing tests for new features in Redis Protocol integration.
- Introduced `UpstreamDisconnectChan` for Codex WebSocket sessions to notify downstream connections of upstream disconnections.
- Implemented `notifyUpstreamDisconnect` to signal errors and close channels on disconnect events.
- Added integration tests to validate WebSocket session behavior on upstream disconnect.
- Updated OpenAI WebSocket response handlers to properly close connections upon upstream disconnect notifications.
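
One way to signal an upstream disconnect exactly once is a buffered error channel guarded by `sync.Once`. The sketch below is an assumption about the pattern, not the actual session type; only the names `UpstreamDisconnectChan` and `notifyUpstreamDisconnect` come from the commit, and their real signatures may differ.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errUpstreamClosed = errors.New("upstream websocket closed")

// session is a hypothetical stand-in for a Codex WebSocket session.
type session struct {
	once                   sync.Once
	UpstreamDisconnectChan chan error // buffered so the notifier never blocks
}

func newSession() *session {
	return &session{UpstreamDisconnectChan: make(chan error, 1)}
}

// notifyUpstreamDisconnect delivers err at most once and then closes the channel.
// Later calls are no-ops, so concurrent read and write loops can both call it safely.
func (s *session) notifyUpstreamDisconnect(err error) {
	s.once.Do(func() {
		if err != nil {
			s.UpstreamDisconnectChan <- err
		}
		close(s.UpstreamDisconnectChan)
	})
}

func main() {
	s := newSession()
	s.notifyUpstreamDisconnect(errUpstreamClosed)
	s.notifyUpstreamDisconnect(errUpstreamClosed) // ignored

	// A downstream handler would select on this channel next to its own read loop.
	if err, ok := <-s.UpstreamDisconnectChan; ok {
		fmt.Println("closing downstream connection:", err)
	}
}
```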