CLIProxyAPI

mirror of https://github.com/router-for-me/CLIProxyAPI.git synced 2026-06-11 00:36:08 +08:00

Author	SHA1	Message	Date
Luis Pater	959067edfb	feat(usage): introduce executor type tracking in usage reporting - Replaced `NewUsageReporter` with `NewExecutorUsageReporter` to include executor type in usage records. - Updated all executors to use the new reporter implementation. - Extended `UsageReporter` to track and publish executor type. - Added tests to validate proper executor type recording and handling. - Enhanced RedisQueue plugin and payload schema with executor type support.	2026-06-02 00:43:16 +08:00
sususu98	aee7a5fbc5	feat: intercept incompatible signature replay	2026-05-29 15:22:57 +08:00
Luis Pater	94c1b25146	feat(executor): add TTFT tracking and reporting for enhanced performance metrics - Introduced Time-To-First-Token (TTFT) measurement and reporting across major executors. - Added TTFT calculation to `UsageReporter`, including support for HTTP clients and WebSocket communication. - Updated tests to validate TTFT tracking in streamed and non-streamed scenarios. - Ensured integration with `usage` plugin and augmented usage records with TTFT data.	2026-05-28 02:59:24 +08:00
Luis Pater	11f0f906bd	feat(logging): add `SetTranslatedReasoningEffort` to track reasoning levels in usage reporting - Introduced `SetTranslatedReasoningEffort` method in `UsageReporter` to capture and log reasoning efforts from translated payloads. - Updated executors to incorporate the new reporting functionality for handling reasoning efforts across various providers. - Enhanced logging for thinking level extraction with new helper function `ExtractTranslatedReasoningEffort`.	2026-05-28 02:19:45 +08:00
Luis Pater	feebe6c7f2	feat(api): add OpenAI compatibility for image models - Introduced OpenAI-compatible image model support in the API, enabling integration through image generation and editing endpoints. - Added registry type for OpenAIImageModelType to classify and validate compatibility. - Implemented request handling for OpenAI-compatible image models, including JSON and multipart formats. - Enhanced executor methods to support OpenAI-compatible image streaming and non-streaming requests. - Included tests to validate model registration, streaming behavior, and multipart payload formatting.	2026-05-19 10:13:26 +08:00
Luis Pater	2007a89594	feat(runtime): enhance payload rule resolution with dynamic path support - Introduced `resolvePayloadRulePaths` function to dynamically resolve rule paths supporting array queries and complex logic. - Updated payload processing logic (`apply defaults`, `overrides`, `filters`) to handle resolved paths for better flexibility. - Added helper functions for path parsing, query matching, and logical resolution to improve modularity and reusability. - Introduced payload condition match logic, including `match`, `not-match`, `exist`, and `not-exist` rules in `PayloadConfig`. - Enhanced `payloadModelRulesMatch` function to support conditional checks at various levels. - Added helper methods for evaluating JSON path conditions and values. - Updated tests to validate new conditional rules against different payload scenarios.	2026-05-17 23:06:43 +08:00
Luis Pater	66c3dae06b	feat(home): implement `count` for home auth dispatch requests and enable usage statistics - Added `count` attribute to `homeAuthCount` requests to improve home message batching. - Enabled usage statistics for home mode by default and added config-level enforcement. - Adjusted failure logging to include detailed metadata in `UsageReporter`. - Updated multiple executors to pass error details to `PublishFailure` for better debugging. - Enhanced unit tests to validate `count` behavior and usage statistics enforcement across components.	2026-05-10 01:30:43 +08:00
Luis Pater	e50cabac4b	chore: upgrade CLIProxyAPI dependency to v7 across the project - Updated all references from v6 to v7 for `github.com/router-for-me/CLIProxyAPI`. - Ensured consistency in imports within core libraries, tests, and integration tests. - Added missing tests for new features in Redis Protocol integration.	2026-05-08 11:46:46 +08:00
Luis Pater	e4a93c02c5	fix(executor): enhance parsing of OpenAI stream data lines - Added trimming for stream input lines to prevent processing of unnecessary whitespace. - Improved handling of unsupported prefixes and malformed JSON responses, ensuring errors are recorded and propagated appropriately. Fixed: #2690	2026-05-04 23:42:26 +08:00
Luis Pater	162897e02a	Merge remote-tracking branch 'origin/pr/3205' into dev	2026-05-04 21:17:01 +08:00
Luis Pater	89d80bfff4	fix(executor): adjust ApplyThinking order and add payload override test - Moved `ApplyThinking` logic earlier in `openai_compat_executor` to align with configuration application sequence. - Added test to verify payload override precedence over Thinking suffix configuration.	2026-05-04 16:45:25 +08:00
1137043480	bf0e5c23f7	fix: prevent goroutine leaks in streaming executors via context-aware channel sends All streaming executors use bare channel sends (out <- chunk) inside goroutines that process upstream SSE responses. When the downstream consumer disconnects (client timeout, network drop, etc.), these sends block indefinitely, causing the goroutine and all associated resources (HTTP response body, scanner buffers, translation state) to leak permanently. Over time, leaked goroutines accumulate monotonically, leading to RSS growth from ~30MB to 3.7GB+ and eventual OOM kills on resource-constrained VPS hosts. Fix: Replace all bare 'out <- ...' sends with: select { case out <- ...: case <-ctx.Done(): return } This ensures goroutines terminate promptly when the request context is canceled, allowing GC to reclaim all associated resources. Affected executors (9 files, 36+ send sites): - antigravity_executor.go (5 sites) - gemini_cli_executor.go (6 sites) - gemini_vertex_executor.go (6 sites) - aistudio_executor.go (4 sites) - gemini_executor.go (3 sites) - openai_compat_executor.go (3 sites) - claude_executor.go (4 sites) - codex_executor.go (2 sites) - kimi_executor.go (3 sites)	2026-05-03 11:25:04 -04:00
Luis Pater	f56a19e5b8	feat: add tri-state support for `disable-image-generation` configuration - Introduced `DisableImageGenerationMode` with support for `false`, `true`, and `chat` values. - Updated payload handling to preserve `image_generation` on images endpoints when `chat` mode is enabled. - Modified OpenAI image handlers (`ImagesGenerations`, `ImagesEdits`) to respect tri-state logic. - Added unit tests for `DisableImageGenerationMode` behavior and endpoint-specific handling. - Enhanced configuration diff logging to support `DisableImageGenerationMode`.	2026-04-30 12:10:27 +08:00
Luis Pater	38573050aa	feat(config): add support for disabling OpenAI compatibility providers - Introduced a `Disabled` flag to OpenAI compatibility configurations. - Updated routing, auth selection, and API handling logic to respect the `Disabled` state. - Extended relevant APIs, YAML configurations, and data structures to include the `Disabled` field. - Adjusted all relevant loops and filters to skip disabled providers. Closes: #3060 #3059 #2977	2026-04-26 21:49:36 +08:00
James	65e9e892a4	Fix missing `response.completed.usage` for late-usage OpenAI-compatible streams	2026-04-04 05:58:04 +00:00
Luis Pater	d2c7e4e96a	refactor(runtime): move executor utilities to `helps` package and update references	2026-04-01 03:08:20 +08:00
Luis Pater	2bd646ad70	refactor: replace `sjson.Set` usage with `sjson.SetBytes` to optimize mutable JSON transformations	2026-03-19 17:58:54 +08:00
Zhenyu Qi	aec65e3be3	fix(openai_compat): add stream_options.include_usage for streaming usage tracking	2026-03-13 00:48:17 -07:00
Kirill Turanskiy	1f8f198c45	feat: passthrough upstream response headers to clients CPA previously stripped ALL response headers from upstream AI provider APIs, preventing clients from seeing rate-limit info, request IDs, server-timing and other useful headers. Changes: - Add Headers field to Response and StreamResult structs - Add FilterUpstreamHeaders helper (hop-by-hop + security denylist) - Add WriteUpstreamHeaders helper (respects CPA-set headers) - ExecuteWithAuthManager/ExecuteCountWithAuthManager now return headers - ExecuteStreamWithAuthManager returns headers from initial connection - All 11 provider executors populate Response.Headers - All handler call sites write filtered upstream headers before response Filtered headers (not forwarded): - RFC 7230 hop-by-hop: Connection, Transfer-Encoding, Keep-Alive, etc. - Security: Set-Cookie - CPA-managed: Content-Length, Content-Encoding	2026-02-18 00:16:22 +03:00
Luis Pater	a5a25dec57	refactor(translator, executor): remove redundant `bytes.Clone` calls for improved performance - Replaced all instances of `bytes.Clone` with direct references to enhance efficiency. - Simplified payload handling across executors and translators by eliminating unnecessary data duplication.	2026-02-06 03:26:29 +08:00
Luis Pater	09ecfbcaed	refactor(executor): optimize payload cloning and streamline SDK translator usage - Replaced unnecessary `bytes.Clone` calls for `opts.OriginalRequest` throughout executors. - Introduced intermediate variable `originalPayloadSource` to simplify payload processing. - Ensured better clarity and structure in request translation logic.	2026-02-06 01:44:20 +08:00
Shady Khalifa	53920b0399	fix(openai): drop stream for responses/compact	2026-01-27 18:27:34 +02:00
Shady Khalifa	95096bc3fc	feat(openai): add responses/compact support	2026-01-26 16:36:01 +02:00
hkfires	f30ffd5f5e	feat(executor): add request_id to error logs Extract error.message from JSON error responses when summarizing error bodies for debug logs	2026-01-25 21:31:46 +08:00
hkfires	ecc850bfb7	feat(executor): apply payload rules using requested model	2026-01-23 16:38:41 +08:00
hkfires	e641fde25c	feat(registry): support provider-specific model info lookup	2026-01-20 10:01:17 +08:00
hkfires	c7e8830a56	refactor(thinking): pass source and target formats to ApplyThinking for cross-format validation Update ApplyThinking signature to accept fromFormat and toFormat parameters instead of a single provider string. This enables: - Proper level-to-budget conversion when source is level-based (openai/codex) and target is budget-based (gemini/claude) - Strict budget range validation when source and target formats match - Level clamping to nearest supported level for cross-format requests - Format alias resolution in SDK translator registry for codex/openai-response Also adds ErrBudgetOutOfRange error code and improves iflow config extraction to fall back to openai format when iflow-specific config is not present.	2026-01-18 10:30:15 +08:00
hkfires	72f2125668	fix(executor): properly handle thinking application errors	2026-01-15 13:06:39 +08:00
hkfires	0b06d637e7	refactor: improve thinking logic	2026-01-15 13:06:39 +08:00
Luis Pater	e8e3bc8616	feat(executor): add HttpRequest support across executors for better http request handling	2026-01-10 16:25:25 +08:00
Luis Pater	af6bdca14f	Fixed: #942 fix(executor): ignore non-SSE lines in OpenAI-compatible streams	2026-01-09 23:41:50 +08:00
Luis Pater	2a663d5cba	feat(executor): enhance payload translation with original request context Refactored `applyPayloadConfig` to `applyPayloadConfigWithRoot`, adding support for default rule validation against the original payload when available. Updated all executors to use `applyPayloadConfigWithRoot` and incorporate an optional original request payload for translations.	2026-01-02 00:03:26 +08:00
hkfires	96340bf136	refactor(executor): resolve upstream model at conductor level before execution	2025-12-30 19:31:54 +08:00
hkfires	367a05bdf6	refactor(thinking): export thinking helpers Expose thinking/effort normalization helpers from the executor package so conversion tests use production code and stay aligned with runtime validation behavior.	2025-12-15 09:16:15 +08:00
Luis Pater	660aabc437	fix(executor): add `allowCompat` support for reasoning effort normalization Introduced `allowCompat` parameter to improve compatibility handling for reasoning effort in payloads across OpenAI and similar models.	2025-12-13 04:06:02 +08:00
huynguyen03.dev	15c3cc3a50	fix(openai-compat): prevent model alias from being overwritten by ResolveOriginalModel When using OpenAI-compatible providers with model aliases (e.g., glm-4.6-zai -> glm-4.6), the alias resolution was correctly applied but then immediately overwritten by ResolveOriginalModel, causing 'Unknown Model' errors from upstream APIs. This fix skips the ResolveOriginalModel override when a model alias has already been resolved, ensuring the correct model name is sent to the upstream provider. Co-authored-by: Amp <amp@ampcode.com>	2025-12-12 17:20:24 +07:00
Luis Pater	a74ee3f319	Merge pull request #481 from sususu98/fix/increase-buffer-size fix: increase buffer size for stream scanners to 50MB across multiple executors	2025-12-11 21:20:54 +08:00
hkfires	3a81ab22fd	fix(runtime): unify reasoning effort metadata overrides	2025-12-11 14:35:05 +08:00
hkfires	519da2e042	fix(runtime): validate reasoning effort levels	2025-12-11 12:36:54 +08:00
hkfires	3ffd120ae9	feat(runtime): add thinking config normalization	2025-12-11 11:51:33 +08:00
Luis Pater	423ce97665	feat(util): implement dynamic thinking suffix normalization and refactor budget resolution logic - Added support for parsing and normalizing dynamic thinking model suffixes. - Centralized budget resolution across executors and payload helpers. - Retired legacy Gemini-specific thinking handlers in favor of unified logic. - Updated executors to use metadata-based thinking configuration. - Added `ResolveOriginalModel` utility for resolving normalized upstream models using request metadata. - Updated executors (Gemini, Codex, iFlow, OpenAI, Qwen) to incorporate upstream model resolution and substitute model values in payloads and request URLs. - Ensured fallbacks handle cases with missing or malformed metadata to derive models robustly. - Refactored upstream model resolution to dynamically incorporate metadata for selecting and normalizing models. - Improved handling of thinking configurations and model overrides in executors. - Removed hardcoded thinking model entries and migrated logic to metadata-based resolution. - Updated payload mutations to always include the resolved model.	2025-12-11 03:10:50 +08:00
sususu	76c563d161	fix(executor): increase buffer size for stream scanners to 50MB across multiple executors	2025-12-10 23:20:04 +08:00
Luis Pater	d50b0f7524	refactor(executor): simplify Gemini CLI execution and remove internal retry logic - Removed nested retry handling for 429 rate limit errors. - Simplified request/response handling by cleaning redundant retry-related code. - Eliminated `parseRetryDelay` function and max retry configuration logic.	2025-11-20 17:49:37 +08:00
Luis Pater	db2d22c978	fix(runtime): simplify scanner buffer allocation in executor implementations	2025-11-18 10:59:49 +08:00
Luis Pater	fcd98f4f9b	feat(runtime): add payload configuration support for executors Introduce `PayloadConfig` in the configuration to define default and override rules for modifying payload parameters. Implement `applyPayloadConfig` and `applyPayloadConfigWithRoot` to apply these rules across various executors, ensuring consistent parameter handling for different models and protocols. Update all relevant executors to utilize this functionality.	2025-11-13 23:27:40 +08:00
Luis Pater	ef7e8206d3	fix(executor): ensure usage reporting for upstream responses lacking usage data Add `ensurePublished` to guarantee request counting even when usage fields (e.g., tokens) are absent in OpenAI-compatible executor responses, particularly for streaming paths.	2025-11-09 17:24:47 +08:00
hkfires	cfb9cb8951	feat(config): support HTTP headers across providers	2025-11-08 20:52:05 +08:00
hkfires	a517290726	refactor(executor): summarize API error bodies of html in debug logs	2025-10-31 06:58:38 +08:00
Luis Pater	a552a45b81	Fixed: #140 #133 #80 feat(translator): add token counting functionality for Gemini, Claude, and CLI - Introduced `TokenCount` handling across various Codex translators (Gemini, Claude, CLI) with respective implementations. - Added utility methods for token counting and formatting responses. - Integrated `tiktoken-go/tokenizer` library for tokenization. - Updated CodexExecutor with token counting logic to support multiple models including GPT-5 variants. - Refined go.mod and go.sum to include new dependencies. feat(runtime): add token counting functionality across executors - Implemented token counting in OpenAICompatExecutor, QwenExecutor, and IFlowExecutor. - Added utilities for token counting and response formatting using `tiktoken-go/tokenizer`. - Integrated token counting into translators for Gemini, Claude, and Gemini CLI. - Enhanced multiple model support, including GPT-5 variants, for token counting. docs: update environment variable instructions for multi-model support - Added details for setting `ANTHROPIC_DEFAULT_OPUS_MODEL`, `ANTHROPIC_DEFAULT_SONNET_MODEL`, and `ANTHROPIC_DEFAULT_HAIKU_MODEL` for version 2.x.x. - Clarified usage of `ANTHROPIC_MODEL` and `ANTHROPIC_SMALL_FAST_MODEL` for version 1.x.x. - Expanded examples for setting environment variables across different models including Gemini, GPT-5, Claude, and Qwen3.	2025-10-26 05:39:15 +08:00
Luis Pater	20985d1a10	Refactor executor error handling and usage reporting - Updated the Execute methods in various executors (GeminiCLIExecutor, GeminiExecutor, IFlowExecutor, OpenAICompatExecutor, QwenExecutor) to return a response and error as named return values for improved clarity. - Enhanced error handling by deferring failure tracking in usage reporters, ensuring that failures are reported correctly. - Improved response body handling by ensuring proper closure and error logging for HTTP responses across all executors. - Added failure tracking and reporting in the usage reporter to capture unsuccessful requests. - Updated the usage logging structure to include a 'Failed' field for better tracking of request outcomes. - Adjusted the logic in the RequestStatistics and Record methods to accommodate the new failure tracking mechanism.	2025-10-21 11:22:24 +08:00


63 Commits