canary: pin LLM backend via settings API + add LLM_API_KEY (root cause of CI freeze)

The auth-browser-consent google lane has been freezing on CI at "Thinking (step 1)..." for the full chat-wait budget. Gateway logs captured by the previous commit's pipe drainer reveal the smoking gun:

    ERROR Configured LLM backend is not usable. backend=openai_compatible reason=missing API key
    WARN LLM_BACKEND env var is set but DB setting takes priority. db_value=nearai env_value=openai_compatible
    WARN Active LLM backend fell back to NearAI default attempted=openai_compatible active=nearai

Two compounding issues:

1. The openai_compatible provider refuses to instantiate without an API key, even though the mock LLM ignores the value.
   Fix: set `LLM_API_KEY=mock-api-key` in `build_gateway_env`, matching what `tests/e2e/conftest.py` already does for the e2e suite.

2. IronClaw's DB-stored LLM settings take priority over env vars, and the freshly-seeded canary DB defaults `llm_backend` to `nearai`. So even with a clean env, the agent fell back to NearAI and entered an interactive auth flow that hangs indefinitely in CI (the "Thinking" never ends). This is the exact trap `tests/e2e/CLAUDE.md` documents: "do not rely on env-vs-DB precedence … pin the provider explicitly through /api/settings/...".
   Fix: pin `llm_backend`, `openai_compatible_base_url`, and `selected_model` via PUT /api/settings/<key> immediately after the gateway becomes healthy.

Also revert the BROWSER_CASES["google"] case I touched earlier: when NearAI was driving it emitted the WASM canonical tool name (`gmail_tool`), but the mock LLM (now correctly driving) emits the tool name it knows from its mapping (`gmail`). Restoring the original `expected_tool_name="gmail"` / `expected_text="gmail"` matches what the mock LLM actually produces.

Verified locally: all three browser_oauth / browser_chat / responses_api probes now pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
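
The interaction between the two issues above can be modeled in a few lines. This is a simplified sketch, not IronClaw's actual resolution code: `resolve_backend`, `DEFAULT_BACKEND`, and the dict-based settings are hypothetical names chosen for illustration, and the only behavior assumed is what the log lines describe (a DB value shadows the env var, and `openai_compatible` without an API key falls back to `nearai`).

```python
from __future__ import annotations

# Assumption: the fallback backend named in the gateway logs.
DEFAULT_BACKEND = "nearai"


def resolve_backend(db_settings: dict[str, str], env: dict[str, str]) -> str:
    """Hypothetical model of the env-vs-DB precedence described above."""
    # A DB-stored value, when present, shadows the LLM_BACKEND env var.
    requested = (
        db_settings.get("llm_backend")
        or env.get("LLM_BACKEND")
        or DEFAULT_BACKEND
    )
    # openai_compatible is treated as unusable without an API key,
    # in which case the default backend becomes active instead.
    if requested == "openai_compatible" and not env.get("LLM_API_KEY"):
        return DEFAULT_BACKEND
    return requested


env_before = {"LLM_BACKEND": "openai_compatible"}
env_after = {"LLM_BACKEND": "openai_compatible", "LLM_API_KEY": "mock-api-key"}

# Freshly seeded DB: the env var alone never wins.
seeded = resolve_backend({"llm_backend": "nearai"}, env_after)
# Pinned through the settings API but no key: still falls back.
pinned_no_key = resolve_backend({"llm_backend": "openai_compatible"}, env_before)
# Both fixes applied: the mock backend is selected.
pinned_with_key = resolve_backend({"llm_backend": "openai_compatible"}, env_after)
```

Under this model, either fix on its own still resolves to `nearai`; only the combination of the pinned DB setting and the API key selects `openai_compatible`, which is why the commit applies both.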
2026-05-19 07:34:10 +08:00 · 2026-04-27 14:49:33 -07:00
parent f59981d362
commit 8733d3c036
2 changed files with 47 additions and 2 deletions
--- a/scripts/live_canary/auth_registry.py
+++ b/scripts/live_canary/auth_registry.py
@@ -163,8 +163,8 @@ BROWSER_CASES: dict[str, BrowserProviderCase] = {
        install_kind=None,
        install_url=None,
        trigger_prompt="check gmail unread",
-        expected_tool_name="gmail_tool",
-        expected_text="inbox",
+        expected_tool_name="gmail",
+        expected_text="gmail",
        auth_extension_name="gmail",
    ),
    "github": BrowserProviderCase(
--- a/scripts/live_canary/common.py
+++ b/scripts/live_canary/common.py
@@ -297,6 +297,10 @@ def build_gateway_env(
        "LLM_BACKEND": "openai_compatible",
        "LLM_BASE_URL": mock_llm_url,
        "LLM_MODEL": "mock-model",
+        # Even though the mock LLM ignores the API key, the
+        # openai_compatible provider refuses to instantiate without one.
+        # Without this the provider falls back to NearAI's DB default.
+        "LLM_API_KEY": "mock-api-key",
        "DATABASE_BACKEND": "libsql",
        "LIBSQL_PATH": str(db_path),
        "SECRETS_MASTER_KEY": secrets_master_key,
@@ -315,6 +319,38 @@ def build_gateway_env(
    return env


+async def _pin_mock_llm_settings(
+    base_url: str, gateway_token: str, mock_llm_url: str
+) -> None:
+    """Pin LLM backend/base_url/model via the settings API.
+
+    Required because the gateway's DB settings take priority over the
+    LLM_BACKEND / LLM_BASE_URL / LLM_MODEL env vars; the freshly-seeded
+    DB defaults llm_backend to `nearai`, which sends the agent into an
+    interactive auth flow that hangs in CI. See tests/e2e/CLAUDE.md.
+    """
+    import httpx  # local import: keep top-level import set unchanged
+
+    headers = {"Authorization": f"Bearer {gateway_token}"}
+    writes = [
+        ("llm_backend", "openai_compatible"),
+        ("openai_compatible_base_url", mock_llm_url),
+        ("selected_model", "mock-model"),
+    ]
+    async with httpx.AsyncClient(timeout=15.0) as client:
+        for key, value in writes:
+            response = await client.put(
+                f"{base_url}/api/settings/{key}",
+                headers=headers,
+                json={"value": value},
+            )
+            if response.status_code not in (200, 201, 204):
+                raise CanaryError(
+                    f"Failed to pin LLM setting {key}: "
+                    f"{response.status_code} {response.text[:300]}"
+                )
+
+
 def _drain_to_file(stream: Any, path: Path) -> threading.Thread:
    """Drain a subprocess stdout/stderr stream to a file in a daemon thread.

@@ -435,6 +471,15 @@ async def start_gateway_stack(
            _drain_to_file(gateway_proc.stdout, log_dir / "gateway.log")
        base_url = f"http://127.0.0.1:{gateway_port}"
        await wait_for_ready(f"{base_url}/api/health", timeout=60.0)
+
+        # Pin the LLM provider via the settings API. Setting LLM_BACKEND /
+        # LLM_BASE_URL / LLM_MODEL via env is not enough — IronClaw's DB
+        # setting takes priority over env, and the freshly-seeded DB
+        # defaults llm_backend to `nearai`, so the env config is ignored
+        # and the agent attempts an interactive NearAI auth flow that
+        # never completes in CI. Mirrors the pattern documented in
+        # tests/e2e/CLAUDE.md and used by test_v2_tool_activate_surface.py.
+        await _pin_mock_llm_settings(base_url, gateway_token, mock_llm_url)
        return GatewayStack(
            base_url=base_url,
            gateway_token=gateway_token,