canary: pin LLM backend via settings API + add LLM_API_KEY (root cause of CI freeze)

The auth-browser-consent google lane has been freezing on CI at
"Thinking (step 1)..." for the full chat-wait budget. Gateway logs
captured by the previous commit's pipe drainer show the root cause:

  ERROR Configured LLM backend is not usable.
        backend=openai_compatible reason=missing API key
  WARN  LLM_BACKEND env var is set but DB setting takes priority.
        db_value=nearai env_value=openai_compatible
  WARN  Active LLM backend fell back to NearAI default
        attempted=openai_compatible active=nearai

Two compounding issues:

1. The openai_compatible provider refuses to instantiate without an
   API key, even though the mock LLM ignores the value. Fix: set
   `LLM_API_KEY=mock-api-key` in `build_gateway_env`, matching what
   `tests/e2e/conftest.py` already does for the e2e suite.

2. IronClaw's DB-stored LLM settings take priority over env vars,
   and the freshly-seeded canary DB defaults `llm_backend` to
   `nearai`. So even with a clean env, the agent fell back to NearAI
   and entered an interactive auth flow that hangs indefinitely in
   CI (the "Thinking" never ends). This is the exact trap
   `tests/e2e/CLAUDE.md` documents: "do not rely on env-vs-DB
   precedence … pin the provider explicitly through /api/settings/...".
   Fix: pin `llm_backend`, `openai_compatible_base_url`, and
   `selected_model` via PUT /api/settings/<key> immediately after the
   gateway becomes healthy.
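
The interaction between the two issues can be modelled with a minimal
sketch. This is a hypothetical illustration only, not IronClaw's
actual resolution code: the function name `resolve_backend`, its
parameters, and the note strings are assumptions made for this
example. It assumes a DB value, when present, beats the env value, and
that `openai_compatible` without an API key falls back to `nearai`.

```python
from typing import Optional


def resolve_backend(
    db_value: Optional[str],
    env_value: Optional[str],
    api_key: Optional[str],
) -> tuple[str, list[str]]:
    """Hypothetical model of the precedence described above."""
    notes: list[str] = []

    # Issue 2: a DB-stored setting wins over the env var when both exist.
    if db_value is not None and env_value is not None and db_value != env_value:
        notes.append("env ignored: DB setting takes priority")

    if db_value is not None:
        chosen = db_value
    elif env_value is not None:
        chosen = env_value
    else:
        chosen = "nearai"

    # Issue 1: openai_compatible is unusable without an API key, so the
    # active backend falls back to the nearai default.
    if chosen == "openai_compatible" and not api_key:
        notes.append("openai_compatible unusable: missing API key")
        chosen = "nearai"

    return chosen, notes
```

Under these assumptions, fixing only one issue still resolves to
`nearai`: setting the API key does not help while the DB says
`nearai`, and pinning the DB setting does not help while the key is
missing. Only both fixes together keep `openai_compatible` active.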

Also revert the BROWSER_CASES["google"] change I made earlier. While
NearAI was driving, it emitted the WASM canonical tool name
(`gmail_tool`); the mock LLM, which now drives correctly, emits the
tool name from its own mapping (`gmail`). Restoring the original
`expected_tool_name="gmail"` / `expected_text="gmail"` matches what
the mock LLM actually produces.

Verified locally: all three browser_oauth / browser_chat /
responses_api probes now pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nikolay Pismenkov
2026-04-27 14:49:33 -07:00
parent f59981d362
commit 8733d3c036
2 changed files with 47 additions and 2 deletions

@@ -163,8 +163,8 @@ BROWSER_CASES: dict[str, BrowserProviderCase] = {
install_kind=None,
install_url=None,
trigger_prompt="check gmail unread",
-expected_tool_name="gmail_tool",
-expected_text="inbox",
+expected_tool_name="gmail",
+expected_text="gmail",
auth_extension_name="gmail",
),
"github": BrowserProviderCase(

@@ -297,6 +297,10 @@ def build_gateway_env(
"LLM_BACKEND": "openai_compatible",
"LLM_BASE_URL": mock_llm_url,
"LLM_MODEL": "mock-model",
# Even though the mock LLM ignores the API key, the
# openai_compatible provider refuses to instantiate without one.
# Without this the provider falls back to NearAI's DB default.
"LLM_API_KEY": "mock-api-key",
"DATABASE_BACKEND": "libsql",
"LIBSQL_PATH": str(db_path),
"SECRETS_MASTER_KEY": secrets_master_key,
@@ -315,6 +319,38 @@ def build_gateway_env(
return env
async def _pin_mock_llm_settings(
    base_url: str, gateway_token: str, mock_llm_url: str
) -> None:
    """Pin LLM backend/base_url/model via the settings API.

    Required because the gateway's DB settings take priority over the
    LLM_BACKEND / LLM_BASE_URL / LLM_MODEL env vars; the freshly-seeded
    DB defaults llm_backend to `nearai`, which sends the agent into an
    interactive auth flow that hangs in CI. See tests/e2e/CLAUDE.md.
    """
    import httpx  # local import: keep top-level import set unchanged

    headers = {"Authorization": f"Bearer {gateway_token}"}
    writes = [
        ("llm_backend", "openai_compatible"),
        ("openai_compatible_base_url", mock_llm_url),
        ("selected_model", "mock-model"),
    ]
    async with httpx.AsyncClient(timeout=15.0) as client:
        for key, value in writes:
            response = await client.put(
                f"{base_url}/api/settings/{key}",
                headers=headers,
                json={"value": value},
            )
            if response.status_code not in (200, 201, 204):
                raise CanaryError(
                    f"Failed to pin LLM setting {key}: "
                    f"{response.status_code} {response.text[:300]}"
                )

def _drain_to_file(stream: Any, path: Path) -> threading.Thread:
"""Drain a subprocess stdout/stderr stream to a file in a daemon thread.
@@ -435,6 +471,15 @@ async def start_gateway_stack(
_drain_to_file(gateway_proc.stdout, log_dir / "gateway.log")
base_url = f"http://127.0.0.1:{gateway_port}"
await wait_for_ready(f"{base_url}/api/health", timeout=60.0)
# Pin the LLM provider via the settings API. Setting LLM_BACKEND /
# LLM_BASE_URL / LLM_MODEL via env is not enough — IronClaw's DB
# setting takes priority over env, and the freshly-seeded DB
# defaults llm_backend to `nearai`, so the env config is ignored
# and the agent attempts an interactive NearAI auth flow that
# never completes in CI. Mirrors the pattern documented in
# tests/e2e/CLAUDE.md and used by test_v2_tool_activate_surface.py.
await _pin_mock_llm_settings(base_url, gateway_token, mock_llm_url)
return GatewayStack(
base_url=base_url,
gateway_token=gateway_token,