* fix(translator): emit Claude server tool blocks for Codex web_search_call streams
Map Codex Responses streaming web_search_call events to Claude SSE
server_tool_use and web_search_tool_result blocks, with deduplication
and a focused stream regression test.
* fix(translator): stabilize Codex web_search fallback tool_use IDs
Reuse the active fallback web_search tool_use ID across later stream
events so tool_result blocks stay paired when upstream omits item IDs.
This is defensive hardening; live Codex streams already provide ws_* IDs.
* fix(translator): emit Codex web_search blocks from populated items
Wait for output_item.done before emitting Claude web_search tool_use
and tool_result blocks, and avoid deduping early added/completed events
that arrive before action.query is available. Matches live Responses
stream ordering seen in local tmux verification.
* fix(translator): map Codex web_search_call items in non-stream Claude responses
Emit server_tool_use and web_search_tool_result blocks from completed
response.output web_search_call items, matching the streaming translator.
* fix(translator): keep non-stream web_search on end_turn and dedupe output items
Do not treat server web_search_call items as client tool_use for stop_reason.
Skip duplicate or query-less open_page web_search output items in non-stream
translation, matching spark live behavior.
- Updated `openai_videos_handlers` to extract and set `video_url` from payloads when available.
- Enhanced unit tests to validate correct `video_url` extraction and inclusion in responses.
- Included CatAPI information in README files (EN, JA, CN) to acknowledge sponsorship.
- Added CatAPI logo and sign-up link with credit claim details.
- Updated project assets to include CatAPI logo.
* feat(plugin): add ModelRouter before auth with single-slot routing targets
## Motivation
Plugins that need to change execution based on the **original inbound request**
(protocol format, raw body, headers, query, stream flag, metadata, etc.) often
resorted to virtual/trampoline models or routing inside interceptors. This
commit adds **ModelRouter**: a pluggable layer **before** model-to-provider
resolution and AuthManager credential selection, so plugins can declare who
executes a request without spoofing the client model name.
This is a **new capability**, not a bugfix on the existing chain. With no
ModelRouter plugins loaded, behavior matches upstream.
## Pipeline placement
- `execute`, `stream`, and `count` (and image paths via AuthManager) call
`applyModelRouter()` before building `coreexecutor.Request`.
- Routing runs **before** the request interceptor (before auth), so routers see
the client’s original context. After a plugin executor is chosen, the existing
**after-auth interceptor → response/stream interceptor** chain still applies.
- Internal `ExecuteModel` / `ExecuteModelStream` (host callbacks) support
`SkipRouterPluginID` so nested calls do not re-enter the same router.
## Routing API (single slot, mutually exclusive)
`ModelRouteResponse` uses **one target slot**. This avoids the ambiguity that arises when both
`TargetExecutorPluginID` and `TargetProvider` are set and the host ignores one:
| Field | Meaning |
|-------|---------|
| `Handled` | `false`: this router declines; try the next router or default path |
| `TargetKind` | `self` \| `executor` \| `provider` (pick one) |
| `Target` | `self`/`executor`: plugin ID; `provider`: built-in provider key |
| `TargetModel` | Optional on `provider` only; empty keeps client `RequestedModel` |
| `Reason` | Optional diagnostic text |
- **self**: the router plugin’s own executor (`Target` normalized to the router’s plugin ID).
- **executor**: another plugin’s executor; host pre-checks with `executorPluginReady()`
(executor declared and provider identifier resolvable) to avoid handled routes that 500 at execution.
- **provider**: skip registry model resolution and use the fixed built-in AuthManager path. The optional
`TargetModel` sets the execution model only; it **does not** change the outward requested-model metadata.
Routers run in **descending plugin priority** (tie-break: ascending plugin ID). Panic, error,
invalid target, or unavailable executor/provider → log and **fall through to the next router**;
if none handle, use the original provider+auth flow.
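The ordering and fall-through rules above can be sketched roughly as follows. This is a minimal illustration, not the host's implementation: the `router` type, `safeRoute`, and `pickRoute` are hypothetical names, and only `Handled`, `TargetKind`, and `Target` come from the table above. The `executorPluginReady()` availability check is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// ModelRouteResponse is a hypothetical stand-in; only the field names
// Handled, TargetKind and Target are taken from the commit message.
type ModelRouteResponse struct {
	Handled    bool
	TargetKind string // "self", "executor" or "provider"
	Target     string
}

// router is an illustrative type, not the host's real plugin record.
type router struct {
	ID       string
	Priority int
	Route    func() (ModelRouteResponse, error)
}

// safeRoute turns a router panic into an error so the chain can continue.
func safeRoute(rt router) (resp ModelRouteResponse, err error) {
	defer func() {
		if rec := recover(); rec != nil {
			err = fmt.Errorf("router %s panicked: %v", rt.ID, rec)
		}
	}()
	return rt.Route()
}

// pickRoute returns the first usable route, or false when every router
// declines or fails and the default provider+auth flow should run.
func pickRoute(routers []router) (ModelRouteResponse, bool) {
	ordered := append([]router(nil), routers...)
	sort.SliceStable(ordered, func(i, j int) bool {
		if ordered[i].Priority != ordered[j].Priority {
			return ordered[i].Priority > ordered[j].Priority // higher priority first
		}
		return ordered[i].ID < ordered[j].ID // tie-break: ascending plugin ID
	})
	for _, rt := range ordered {
		resp, err := safeRoute(rt)
		if err != nil || !resp.Handled {
			continue // error, panic, or decline: try the next router
		}
		switch resp.TargetKind {
		case "self":
			resp.Target = rt.ID // normalized to the router's own plugin ID
		case "executor", "provider":
			if resp.Target == "" {
				continue // invalid target
			}
		default:
			continue // unknown kind
		}
		return resp, true
	}
	return ModelRouteResponse{}, false
}

func main() {
	routers := []router{
		{ID: "b", Priority: 10, Route: func() (ModelRouteResponse, error) {
			return ModelRouteResponse{Handled: true, TargetKind: "provider", Target: "codex"}, nil
		}},
		{ID: "a", Priority: 20, Route: func() (ModelRouteResponse, error) {
			panic("boom") // a misbehaving router must not break the request
		}},
	}
	resp, ok := pickRoute(routers)
	fmt.Println(resp.TargetKind, resp.Target, ok)
}
```

In this sketch the panicking router `a` is tried first because of its higher priority, and the request then falls through to router `b`.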
## Context exposed to routers
`ModelRouteRequest` includes:
- `SourceFormat`, `RequestedModel`, `Stream`
- `Headers`, `Query`, `Body` (defensive copies)
- `Metadata` (best-effort read-only context snapshot)
- `AvailableProviders`: built-in provider keys with at least one **non-disabled** auth
(`AuthManager.AvailableProviders()`). **Does not** reflect per-model cooldown or transient
unavailability, so treat it as an optimistic snapshot.
Adds `AuthManager.HasProviderAuth()` and `AvailableProviders()`, excluding `Disabled` and
`StatusDisabled` auths consistently with credential selection.
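As a rough sketch of the `AvailableProviders` filtering described above, assume a trimmed-down credential record. The `auth` type, its fields, and the `"disabled"` status string are hypothetical; the real AuthManager types are not shown in this commit message.

```go
package main

import (
	"fmt"
	"sort"
)

// auth is a hypothetical, trimmed-down credential record.
type auth struct {
	Provider string
	Disabled bool
	Status   string // assumed: "disabled" plays the role of StatusDisabled
}

// availableProviders returns the sorted provider keys that have at least
// one auth that is neither flagged Disabled nor in the disabled status.
func availableProviders(auths []auth) []string {
	seen := map[string]bool{}
	for _, a := range auths {
		if a.Provider == "" || a.Disabled || a.Status == "disabled" {
			continue
		}
		seen[a.Provider] = true
	}
	out := make([]string, 0, len(seen))
	for p := range seen {
		out = append(out, p)
	}
	sort.Strings(out)
	return out
}

func main() {
	auths := []auth{
		{Provider: "codex"},
		{Provider: "claude", Disabled: true},
		{Provider: "gemini", Status: "disabled"},
		{Provider: "codex", Disabled: true},
	}
	fmt.Println(availableProviders(auths))
}
```

Cooldowns and transient failures are deliberately not modeled, matching the "optimistic snapshot" caveat.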
## Host and RPC
- Go plugins: `pluginapi.ModelRouter` + `RouteModel()`.
- RPC plugins: `pluginabi.MethodModelRoute` (`model.route`), capability flag `model_router`.
- `pluginhost.Host` implements `RouteModel` / `RouteModelExcept`; handlers use
`SetModelRouterHost` or a `PluginHost` type assertion; **direct executor** paths use
`ExecutePluginExecutor*` / `CountPluginExecutor`.
- No bundled example ModelRouter plugin; capability is active only when a third-party plugin
declares `model_router` and loads.
## Plugin RPC schema (policy A, upstream-aligned)
- `pluginabi.SchemaVersion` stays **1**: capability additions (`model_router`, `model.route`)
do not bump the number; increment only on breaking RPC JSON changes.
- Host sends `schema_version` at register; reject only if the plugin declares a **higher**
version than the host.
- No unpublished “ModelRouter requires schema ≥ 3” gate (v3 single-slot API was never public).
- Existing plugins and examples without `model_router` (`schema_version: 1`) need no changes.
- RPC ModelRouter: `schema_version: 1` + `model_router: true` + implement `model.route`.
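The register-time version rule can be illustrated with a small check. This is a sketch under stated assumptions: `hostSchemaVersion` and `checkSchema` are hypothetical names, and only the rule itself (reject a plugin that declares a higher version than the host) comes from the text above.

```go
package main

import "fmt"

// hostSchemaVersion mirrors the statement that pluginabi.SchemaVersion stays at 1.
const hostSchemaVersion = 1

// checkSchema rejects a plugin only when it declares a newer schema than
// the host understands; capability flags such as model_router do not matter here.
func checkSchema(pluginVersion int) error {
	if pluginVersion > hostSchemaVersion {
		return fmt.Errorf("plugin schema_version %d is newer than host %d",
			pluginVersion, hostSchemaVersion)
	}
	return nil
}

func main() {
	fmt.Println(checkSchema(1)) // same version: accepted
	fmt.Println(checkSchema(2)) // newer than host: rejected
}
```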
## Path consistency within this commit
- Provider routes reuse image-only model checks (e.g. `gpt-image-2`) on the normalized model,
same as the default AuthManager path.
- `count` aligned with execute/stream: `SkipRouterPluginID`, query/headers injection,
interceptor skip semantics.
- Handlers: `modelRoutersEnabled` treats hosts without `HasModelRouters` as disabled
(same as before ModelRouter existed); `pluginhost.Host` implements the detector.
- API docs: `ModelRouter` explicitly includes built-in **provider** targets (in addition to
plugin executors and the router’s own executor).
## Testing
go test ./internal/pluginhost ./sdk/api/handlers ./sdk/pluginapi ./sdk/pluginabi ./sdk/cliproxy/auth
go build -o test-output ./cmd/server && rm test-output
go test ./...
* fix(handlers): address ModelRouter review feedback
- Use modelExecutionQuery for plugin executor and AuthManager paths so
inbound URL query matches router/header behavior
- Guard queryFromContext when gin Request.URL is nil
- Read plugin executor stream chunks via nextStreamChunk to exit on cancel
- Drop redundant clonePluginMetadata on capability record meta
Tests cover query propagation, stream cancel, and nil URL safety.
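The cancel-aware read can be pictured as a `select` over the chunk channel and a done channel. The sketch below is an assumption about the shape of such a helper, not the actual `nextStreamChunk`; `nextChunk` and its signature are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// nextChunk waits for the next chunk or for cancellation, whichever comes
// first. ok is false when done is closed or the chunk channel is closed.
func nextChunk(done <-chan struct{}, chunks <-chan []byte) (chunk []byte, ok bool) {
	select {
	case <-done:
		return nil, false
	case chunk, ok = <-chunks:
		return chunk, ok
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	chunks := make(chan []byte, 1)
	chunks <- []byte("data: hello")

	c, ok := nextChunk(ctx.Done(), chunks)
	fmt.Println(string(c), ok)

	cancel() // the client went away and the producer never sends again
	_, ok = nextChunk(ctx.Done(), chunks)
	fmt.Println(ok)
}
```

Without the `done` case, the second read would block forever on a producer that has stopped sending.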
* feat(plugin): add Claude web search router example
Add a Claude Code web_search ModelRouter example that can route matching Claude requests through Antigravity, Codex, xAI, or Tavily.
The plugin includes executor orchestration, backend fallback/penalty handling, Tavily API key support, Claude-compatible response assembly, stream forwarding, and focused unit coverage for detection, fallback routing, model resolution, penalties, stream forwarding, and Tavily behavior.
Verification: `go test -count=1 ./...` in `examples/plugin/claude-web-search-router/go`; `go build -buildmode=c-shared` for the plugin; `go build ./cmd/server`; live local CPA curl coverage for plugin load, four explicit routes, fallback, and Codex spark routing.
* fix(pluginhost): validate executor routes before fallback
* fix(pluginhost): skip oauth-only executor routes
Keep RPC streaming executor callback scopes alive until async streams close, detach nested host.model.execute_stream contexts from request cancellation, and clean up the stream bridge on stream completion.
- Added information about exclusive retail availability of Claude Max 200 and GPT Pro 200 premium accounts.
- Enhanced descriptions of VisionCoder's offerings in README files (EN, JA, CN).
- Introduced `disable-claude-cloak-mode` configuration to globally disable Claude cloak mode with credential-level overrides.
- Enhanced `getCloakConfigFromAuth` to support fallback to metadata for cloak settings.
- Updated cloak configuration precedence logic, integrating global, credential, and default modes.
- Updated config and watcher diff handling to include `disable-claude-cloak-mode`.
Closes: #2789
- Added `ConvertClaudeToolResultContent` to standardize Claude tool_result content, preserving JSON structure and splitting out base64-encoded images.
- Updated Gemini and Gemini-CLI translators to use the new utility for generating deterministic function responses and inline image parts.
- Added comprehensive test cases for content types and edge cases, ensuring correct handling of string, JSON, and image blocks.
Closes: #2781
- Added `sanitizeClaudeWebSearchDomains` to remove empty `allowed_domains` and `blocked_domains` fields for built-in web_search tools, addressing ambiguity errors from Anthropic.
- Integrated domain sanitization into the Claude message preparation pipeline.
- Added test cases to validate correct handling of empty and non-empty domain fields across various tool types.
Closes: #2681
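A minimal sketch of the domain sanitization, assuming tools arrive as decoded JSON objects and that built-in web search tools are identified by a `type` starting with `web_search_`. The function name and that detection rule are illustrative assumptions, not the real `sanitizeClaudeWebSearchDomains`.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeWebSearchDomains drops empty or null allowed_domains and
// blocked_domains from built-in web search tools, in place.
// Other tool types and non-empty lists are left untouched.
func sanitizeWebSearchDomains(tools []map[string]any) {
	for _, tool := range tools {
		typ, _ := tool["type"].(string)
		if !strings.HasPrefix(typ, "web_search_") { // assumed detection rule
			continue
		}
		for _, key := range []string{"allowed_domains", "blocked_domains"} {
			v, present := tool[key]
			if !present {
				continue
			}
			list, isList := v.([]any)
			if v == nil || (isList && len(list) == 0) {
				delete(tool, key)
			}
		}
	}
}

func main() {
	tools := []map[string]any{
		{
			"type":            "web_search_20250305",
			"name":            "web_search",
			"allowed_domains": []any{},
			"blocked_domains": []any{"example.com"},
		},
	}
	sanitizeWebSearchDomains(tools)
	fmt.Println(tools[0])
}
```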
- Introduced `videoAuthBindingStore` for managing mappings of video IDs to credentials with TTL support.
- Updated video creation and retrieval handlers to bind and utilize credentials for authentication.
- Enhanced response models to include upstream models and adjusted request preparation logic.
- Added test coverage for video auth binding, TTL configuration, and expiration handling.
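The idea behind a video-ID-to-credential binding with TTL can be sketched as a small in-memory map with lazy expiry. Everything below is hypothetical (`bindingStore`, `fakeClock`, the seconds-based constructor); the real `videoAuthBindingStore` is not shown in this log.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fakeClock is a test helper so expiry can be exercised without sleeping.
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time       { return c.t }
func (c *fakeClock) AdvanceSeconds(n int) { c.t = c.t.Add(time.Duration(n) * time.Second) }

type binding struct {
	authID    string
	expiresAt time.Time
}

// bindingStore maps a video ID to the credential that created it.
type bindingStore struct {
	mu    sync.Mutex
	ttl   time.Duration
	now   func() time.Time
	items map[string]binding
}

func newBindingStore(ttlSeconds int, now func() time.Time) *bindingStore {
	return &bindingStore{
		ttl:   time.Duration(ttlSeconds) * time.Second,
		now:   now,
		items: map[string]binding{},
	}
}

// Bind records which credential owns the video, refreshing the TTL.
func (s *bindingStore) Bind(videoID, authID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[videoID] = binding{authID: authID, expiresAt: s.now().Add(s.ttl)}
}

// Lookup returns the bound credential, deleting the entry once it has expired.
func (s *bindingStore) Lookup(videoID string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	b, ok := s.items[videoID]
	if !ok {
		return "", false
	}
	if !s.now().Before(b.expiresAt) {
		delete(s.items, videoID)
		return "", false
	}
	return b.authID, true
}

func main() {
	clk := &fakeClock{t: time.Now()}
	store := newBindingStore(60, clk.Now)
	store.Bind("video_1", "auth-a")
	fmt.Println(store.Lookup("video_1"))
	clk.AdvanceSeconds(61)
	fmt.Println(store.Lookup("video_1"))
}
```

Retrieval handlers would then look up the binding first, so a later status or content request is sent with the same credential that created the video.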
- Implemented helper methods `IsConfigAPIKeyAuth` and `toggleConfigAPIKeyExcludedAll` for managing config API key exclusions.
- Updated API request handling to support enabling/disabling config API key exclusion patterns.
- Added test coverage to validate exclusion toggling logic and persistence behavior.
- Refactored duplicate code for identifying config API key auth entries into reusable utilities.
- Added methods for managing and tracking WebSocket transcript state, including recording, prepending, and replacing transcript inputs.
- Implemented `executeCompactionTriggerFromWebsocketContext` to support compaction triggers using recorded transcript context.
- Enhanced upstream-downstream ID mapping with additional utilities and state synchronization.
- Expanded test coverage to validate transcript state management, compaction payload generation, and WebSocket response handling.
- Developed `XAIWebsocketsExecutor` for handling xAI Responses via WebSocket transport.
- Introduced session and state management with `codexWebsocketSessionStore` and `xaiWebsocketIDStateStore`.
- Added robust ID mapping for upstream and downstream request/response sequences.
- Enhanced error propagation and handling of WebSocket terminal events.
- Included utility methods for WebSocket request preparation, connection management, and state tracking.
- Added foundational support for compact and streamed responses via enhanced session tracking.
- Implemented `websocketDirectCaptureExecutor` for Codex websocket passthrough functionality.
- Added logic to bypass incremental state handling for passthrough models.
- Updated normalization, compaction, and replay handling to support passthrough mode.
- Introduced `responsesWebsocketUsesCodexWebsocketPassthrough` utility for model-specific passthrough determination.
- Expanded test coverage for websocket passthrough scenarios, including compaction and response validation.
- Introduced `/openai/v1/videos` endpoint to support OpenAI-specific video generation.
- Added error normalization and handling for OpenAI video resources, including detailed error propagation.
- Enhanced response structure to include OpenAI-specific fields for status, progress, and model mappings.
- Implemented new handlers for video content retrieval and error scenarios.
- Expanded test coverage to validate OpenAI video support, error handling, and backend compatibility.
- Enhanced Codex Websockets Executor to capture `response.done` as a terminal event, alongside `response.completed` and `error`.
- Improved error propagation for upstream websocket errors with comprehensive message handling.
- Introduced utility functions for recognizing terminal events and extracting error messages.
- Expanded tests to validate new websocket event logic, including terminal event handling and upstream error propagation.
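The terminal-event check described above amounts to a small classifier plus an error-message extractor. This is a sketch only: the three event names come from the bullet list, while the function names and the assumed error payload shape (`error.message` with a top-level `message` fallback) are illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isTerminalEvent reports whether an upstream websocket event ends the response.
func isTerminalEvent(eventType string) bool {
	switch eventType {
	case "response.completed", "response.done", "error":
		return true
	}
	return false
}

// errorMessage extracts a human-readable message from an error event.
// The payload shape is an assumption; it returns "" when nothing usable is found.
func errorMessage(payload []byte) string {
	var ev struct {
		Message string `json:"message"`
		Error   struct {
			Message string `json:"message"`
		} `json:"error"`
	}
	if err := json.Unmarshal(payload, &ev); err != nil {
		return ""
	}
	if ev.Error.Message != "" {
		return ev.Error.Message
	}
	return ev.Message
}

func main() {
	for _, t := range []string{"response.output_text.delta", "response.done", "error"} {
		fmt.Println(t, isTerminalEvent(t))
	}
	fmt.Println(errorMessage([]byte(`{"type":"error","error":{"message":"rate limited"}}`)))
}
```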
- Introduced `executeCompact` to handle non-streaming compact responses via the `/responses/compact` endpoint.
- Added `executeCompactionTriggerStream` for streaming responses triggered by `compaction_trigger`.
- Enhanced request preparation with `prepareResponsesRequestTo` for dynamic response formats.
- Updated logic to bypass streaming for `/responses/compact` and added fallback behaviors.
- Added comprehensive tests for compact response handling and event streaming validations.
- Introduced `auth_callbacks` for handling host authentication list, get, runtime, and save operations.
- Added extensive unit tests to validate functionality, including disk fallback and runtime-specific cases.
- Created example implementation in Go to demonstrate host callback integrations.
- Updated Antigravity Credits fallback to handle KV store unavailability as a service error.
- Enhanced signature caching mechanisms with request-time KV access and sliding expiration.
- Added and improved tests for KV client interactions, including error handling and expiration behaviors.
- Introduced `CacheSignatureBestEffort` for non-critical signature caching and clarified function flows with required context.
- Ensured consistent error reporting for missing or unavailable KV stores in various scenarios.
- Replaced direct `homekv` calls with injectable KV client interfaces for `antigravity` and `codex_reasoning_replay` modules.
- Improved error reporting and handling for KV operations, including `KVGet`, `KVSet`, `KVDel`, and `KVExpire`.
- Introduced dedicated fake KV clients for expanded and granular test coverage.
- Added new unit tests to validate KV client behaviors and error scenarios, ensuring robustness and sliding expiration functionality.
Adds a fourth value for the disable-image-generation setting:
- false: inject image_generation (unchanged)
- true: strip everywhere + 404 on /v1/images/* (unchanged)
- chat: strip on non-images endpoints, keep /v1/images/* (unchanged)
- passthrough: never inject and never strip on non-images endpoints
(the client payload is forwarded unchanged); behaves like
"chat" on /v1/images/* endpoints.
image_generation injection (codex executors) is already gated on the Off
mode, and the /v1/images/* 404 gate is already gated on the All mode, so
passthrough only required a change to the payload strip logic in
payload_helpers.go, now expressed via shouldStripImageGeneration().
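The four modes can be summarized as a small decision function. This is a hedged sketch of the behavior listed above, not the actual `shouldStripImageGeneration()`: the `shouldStrip` and `imagesEndpointBlocked` names and their signatures are assumptions, and modes are modeled as the raw setting strings.

```go
package main

import "fmt"

// shouldStrip reports whether image_generation is removed from the payload
// for a given disable-image-generation mode and endpoint type.
func shouldStrip(mode string, imagesEndpoint bool) bool {
	switch mode {
	case "true":
		return true // strip everywhere
	case "chat":
		return !imagesEndpoint // strip only on non-images endpoints
	default:
		return false // "false" and "passthrough" never strip
	}
}

// imagesEndpointBlocked reports whether /v1/images/* answers 404.
func imagesEndpointBlocked(mode string) bool {
	return mode == "true"
}

func main() {
	for _, mode := range []string{"false", "true", "chat", "passthrough"} {
		fmt.Printf("%-12s strip(chat endpoint)=%-5v strip(images endpoint)=%-5v images404=%v\n",
			mode, shouldStrip(mode, false), shouldStrip(mode, true), imagesEndpointBlocked(mode))
	}
}
```

Injection is not modeled here; per the text above it stays gated on the Off (`false`) mode, which is why `passthrough` differs from `false` even though neither strips.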
Closes #3831
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>