Files
ironclaw/scripts/test-pre-commit-safety.sh
Illia Polosukhin 21e27b2247 fix(web): isolate cross-tenant SSE/WS status events and thread access (#3390)
* fix(web): isolate cross-tenant SSE/WS status events and thread access

Plug a multi-tenant leak where unscoped `sse.broadcast(...)` calls
from `GatewayChannel::send_status`, sandbox `JobEvent` dispatch, the
WASM/Slack OAuth completion handlers, and any producer that lost
`metadata.user_id` along the way fan out to every connected
subscriber — exposing another tenant's tool calls, tool output,
onboarding state, and job lifecycle to anyone with an open SSE/WS.

Changes
- Extract `dispatch_status_event(sse, multi_tenant_mode, user_id, ev)`
  from `Channel::send_status`. In multi-tenant mode an unscoped event
  is dropped (with a WARN naming the producer to fix); single-tenant
  keeps the global broadcast since there is one subscriber population.
- `IncomingMessage::new` now defaults `metadata` to `{"user_id": ...}`,
  and `with_metadata` preserves the key so downstream `send_status`
  consumers always have an owner to scope by.
- WASM/Slack OAuth completion broadcasts route through
  `broadcast_for_user(&owner_id, ...)`. Sandbox `JobEvent` dispatch
  in `main.rs` respects `multi_tenant_mode` for the empty-`user_id`
  fallback.
- New pre-commit check #10 (`MULTITENANT`) flags unscoped
  `sse.broadcast(...)` lines without a `// multi-tenant-safe: <reason>`
  marker or a transport-only exemption. Marker regex accepts the marker
  anywhere in a `//` comment so compound annotations on a single line
  work.

Tests
- `src/channels/web/tests/status_event_isolation.rs` — 5 unit tests
  covering both modes and the per-variant drop invariant.
- `src/channels/web/platform/sse.rs` — 2 quadrant tests for the
  `subscribe_raw` filter (scoped/unscoped × matching/mismatched).
- `tests/thread_isolation_integration.rs` — 9 HTTP-level checks that
  Bob cannot reach Alice's chat history (paginated and not), threads
  list, engine v2 detail/steps/events, or Responses GET, plus an
  unauthenticated-rejection guard.
- 8 new self-test cases for the `MULTITENANT` script check, including
  the compound projection-exempt + multi-tenant-safe annotation case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(web): pin cross-tenant boundaries on jobs, files, routines

Audit of the protected route surface for the same bug shape #3390 fixed
(handler that takes a user-controlled id and reads without an ownership
predicate) found that the implementations were correct but four
boundaries had no integration test. Lock them in before they regress.

- Sandbox job persisted-events history (`/api/jobs/{id}/events`):
  Bob → 404 on Alice's job; Alice → 200 on her own.
- Sandbox job workspace listing (`/api/jobs/{id}/files/list`):
  Bob → 404 on Alice's job.
- Sandbox job file read (`/api/jobs/{id}/files/read`): Bob → 404 on
  Alice's job; Alice → 200 on her own; Alice → 403/404 on
  `?path=../outside.txt` (path-traversal pin against the
  `canonicalize() + starts_with(base_canonical)` guard).
- Routine run history (`/api/routines/{id}/runs`): Bob → 404; Alice
  → 200 with at least one seeded run.

The OAuth-state and NEAR-nonce stores were also flagged in the audit
but neither is a real cross-tenant bug: both are pre-auth, single-use,
and the token IS the secret. Documenting here so a future audit
doesn't re-flag them.

Project-static (`/projects/{id}/...`) is left for a follow-up — it
relies on `ironclaw_base_dir()` which is a process-wide `LazyLock`,
making per-test override fragile in the integration runner.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(web): address PR #3390 review — forge-resistant metadata, OAuth toast routing

Addresses six review comments from gemini-code-assist, copilot, and
serrrfirat on PR #3390. False-positives and the perf nit on
`with_metadata` are explained in the reply thread, not changed in code.

- (HIGH, serrrfirat) `IncomingMessage::with_metadata` now ALWAYS sets
  `metadata.user_id` from `self.user_id`, dropping any caller-supplied
  value. A WASM channel emitting `{"user_id":"victim"}` via
  `apply_emitted_metadata` can no longer reroute downstream
  `ToolStarted` / `ToolResult` SSE events into another tenant's
  stream. New unit tests pin the forgery-resistance invariant.
- (MEDIUM, copilot + serrrfirat) Slack relay OAuth callback now
  broadcasts the completion toast to the resolved `oauth_user`
  (the IronClaw user who initiated the flow) rather than
  `state.owner_id`. In multi-tenant deployments those differ and the
  previous routing delivered the toast to the wrong browser tab.
  Extracted the lookup into `resolve_relay_oauth_user`; two unit
  tests cover the secret-present and secret-missing cases.
- (MEDIUM, gemini) `dispatch_status_event` treats empty-string
  `user_id` the same as `None` so producers that lost the field
  along the way fail-closed instead of falling through to a global
  broadcast in multi-tenant mode.
- (MEDIUM, gemini) `main.rs` sandbox JobEvent dispatch now reuses
  `dispatch_status_event` instead of duplicating the drop / WARN /
  broadcast policy. `dispatch_status_event` is bumped from
  `pub(crate)` to `pub` so the binary crate can call it.
- (LOW, copilot) Pre-commit `MULTITENANT` self-tests gain three
  cases (`state.sse.broadcast(`, `gw_state.sse.broadcast(`,
  annotated receiver-prefixed) to lock the existing boundary regex
  behaviour against future tightening.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(channels): preserve i64 metadata.user_id from Telegram in with_metadata

PR #3390's forge-resistance fix made `IncomingMessage::with_metadata`
*always* overwrite `metadata.user_id` with `self.user_id` as a String.
That broke the Telegram WASM channel: it persists Telegram's chat user
ID as `metadata.user_id: i64` and re-deserializes it into
`TelegramMessageMetadata { user_id: i64, ... }` in `on_respond` /
`on_status`. After the fix, `respond` blew up with
`invalid type: string "999", expected i64 at line 1 column 87`,
failing 3 Telegram integration tests in CI.

Narrow the carve-out: overwrite only when the existing `user_id` is a
String (or missing). Non-string values are channel-private and the
SSE routing layer reads via `as_str()` — non-strings already fail
closed in multi-tenant mode, so the forge threat (WASM emits
`{"user_id":"victim"}` as a string) is still mitigated, while
Telegram's i64 use case survives.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(web): redact WARN payload + harden dotdot traversal test (PR #3390)

Two follow-up fixes from the second pass of review on #3390:

1. `dispatch_status_event`'s WARN log used `?event`, which on
   `AppEvent::Response` / `Thinking` / `ToolResult` carries
   user-authored content into operator logs in a multi-tenant
   deployment. Replace with `event_kind = event.event_type()`
   (the wire-stable variant name) — enough to identify the
   misbehaving producer without leaking tenant data. Picked up via
   Copilot's review on `src/channels/web/mod.rs:666`.

2. `alice_job_file_read_rejects_dotdot_traversal` planted
   `outside.txt` under `outer.path()` (the `start_server_with_db`
   fixture's tempdir holding `test.db`) but probed
   `?path=../outside.txt` relative to `alice_proj` — a separate
   `tempfile::tempdir()` rooted at the OS temp directory. The two
   paths were unrelated, so the test could pass even if `..`
   traversal was permitted (probe just hit empty space). Build the
   directory tree by hand instead: `parent/alice_proj/` with the
   planted file at `parent/outside.txt`, so the probe deterministically
   resolves to the planted bytes. Add a body-content assertion that
   fails loudly if those bytes leak. Picked up via Copilot's review on
   `tests/cross_tenant_resource_isolation.rs:355`.

Plus a `cargo fmt` fix for `src/channels/channel.rs:1202` that was
breaking the Formatting CI check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(web): address PR #3390 follow-ups — multi-tenant fallback WARN, exhaustive variant pin

- `resolve_relay_oauth_user`: take `multi_tenant_mode`; emit WARN when the
  `relay:{ext}:oauth_user` secret is missing in multi-tenant mode so the
  unrecoverable-initiator case surfaces in operator logs. Single-tenant
  fallback stays silent (owner == only user).
- `dispatch_status_event`: doc note clarifying the function is `pub` only
  for the sandbox JobEvent rx loop in `main.rs`; not part of a stable
  public API.
- `_compile_time_appevent_variant_check`: exhaustive-match helper paired
  with `unscoped_drop_holds_for_every_status_variant_in_multi_tenant`.
  Adding a new `AppEvent` variant now fails the test build, prompting an
  update to both the helper and the runtime leak-candidate list.
- `tests/thread_isolation_integration.rs`: honest scope note on the
  engine-v2 thread tests — they pin handler shape (404/empty for
  unknown id), not the cross-tenant ownership branch. Cross-tenant
  engine-v2 coverage requires an `ENGINE_STATE` test fixture; tracked
  as a follow-up in the comment block.

Tests: 8 unit (5 status_event_isolation + 3 resolve_relay_oauth_user)
and 9 thread_isolation_integration pass; clippy clean; pre-commit
safety scripts pass (regression suite 27 cases).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(web): unconditionally consume relay:{ext}:oauth_user secret in OAuth callback

The previous cleanup site lived inside the `if let Some(pairing_store)`
branch of the result block, which was unreachable on three failure
paths:

1. `pairing_store` is None (no identity pairing wired up)
2. The result block `?`-short-circuits before reaching the `if let`
   (e.g. `set_setting` fails, `activate_stored_relay` fails, an inner
   `relay_config()` / `list_connections()` errors)
3. The `if let` body itself errors before reaching the delete (e.g.
   `list_connections` returns no matching team)

Leaving the secret behind lets a subsequent OAuth callback for the
same extension read a stale initiating user and misroute the
completion toast — Copilot review on PR #3390 (comment id 3211833864).

Move the delete to right after `resolve_relay_oauth_user` returns,
where it always runs once the value has been captured, regardless of
downstream failure mode. Updated the inner comment to document that
the secret is already gone by the time the pairing branch reads
`oauth_user`.

Regression test: `test_relay_oauth_callback_consumes_oauth_user_secret_on_failure_path`
seeds the secret, fires the callback against a fixture with no real
relay backend (so the result block deterministically errors), and
asserts the secret is gone afterward. Pre-fix this would have left
the secret behind on the failure path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 16:06:49 +00:00

244 lines
9.8 KiB
Bash
Executable File

#!/usr/bin/env bash
# Regression tests for the grep pipelines in `pre-commit-safety.sh`.
#
# The PROJECTION / DISPATCH / CREDNAME checks all pipe through
# `grep -nE '^\+' | grep -E <positive> | grep -vE <exclusions>`.
# A previous version of the exclusion regex used `^\+\+\+` to filter
# diff header lines (`+++ b/file.rs`), which silently never fired —
# `grep -n` prepends a `N:` line-number prefix, so `^` no longer
# anchors against the `+++` bytes. This test locks in the corrected
# `:\+\+\+ ` shape.
set -euo pipefail
cd "$(dirname "$0")/.."
PASS=0
FAIL=0
assert_filtered() {
local label="$1" input="$2" positive="$3" exclusions="$4"
# Emulate the production pipeline: `grep -n '^+'` adds the line-number
# prefix, then positive/negative filters run against that shape.
local result
if result=$(printf '%s\n' "$input" \
| grep -nE '^\+' \
| grep -E "$positive" \
| grep -vE "$exclusions" \
| head -5 || true); then :; fi
if [ -z "${result:-}" ]; then
echo "OK: $label (correctly filtered)"
PASS=$((PASS + 1))
else
echo "FAIL: $label — line leaked past exclusions:"
echo "$result" | sed 's/^/ /'
FAIL=$((FAIL + 1))
fi
}
assert_flagged() {
local label="$1" input="$2" positive="$3" exclusions="$4"
local result
if result=$(printf '%s\n' "$input" \
| grep -nE '^\+' \
| grep -E "$positive" \
| grep -vE "$exclusions" \
| head -5 || true); then :; fi
if [ -n "${result:-}" ]; then
echo "OK: $label (correctly flagged)"
PASS=$((PASS + 1))
else
echo "FAIL: $label — line not flagged by positive pattern"
FAIL=$((FAIL + 1))
fi
}
# ── PROJECTION ────────────────────────────────────────────────
# Positive: any `.broadcast_for_user(` (SseManager-unique method) or
# `sse.broadcast(` with a portable word boundary.
# Exclusions: `// projection-exempt: <category>, <detail>`, `// safety:`,
# and diff-header lines (`+++ b/path`) via `:\+\+\+ `.
PROJ_POS='(\.broadcast_for_user|(^|[^[:alnum:]_])sse\.broadcast)[[:space:]]*\('
PROJ_NEG='// projection-exempt: [^,]+,[[:space:]]*[^[:space:]]|// safety:|:\+\+\+ '
# Diff header lines must be filtered.
assert_filtered "PROJECTION: diff header line is filtered" \
"+++ b/src/bridge/router.rs" \
"$PROJ_POS" \
"$PROJ_NEG"
# A real broadcast call is flagged.
assert_flagged "PROJECTION: bare sse.broadcast_for_user is flagged" \
"+ sse.broadcast_for_user(&user, event);" \
"$PROJ_POS" \
"$PROJ_NEG"
# Chained receiver (state.sse.broadcast_for_user) is flagged.
assert_flagged "PROJECTION: chained state.sse.broadcast_for_user is flagged" \
"+ state.sse.broadcast_for_user(&user, event);" \
"$PROJ_POS" \
"$PROJ_NEG"
# A rustfmt-wrapped call is flagged.
assert_flagged "PROJECTION: rustfmt-wrapped .broadcast_for_user is flagged" \
"+ .broadcast_for_user(&user, event);" \
"$PROJ_POS" \
"$PROJ_NEG"
# Non-`sse` receiver must still fire — `broadcast_for_user` is unique to
# SseManager, so the method name alone is authoritative.
assert_flagged "PROJECTION: non-sse receiver .broadcast_for_user is flagged" \
"+ manager.broadcast_for_user(&user, event);" \
"$PROJ_POS" \
"$PROJ_NEG"
# Plain sse.broadcast call is flagged via the portable word boundary.
assert_flagged "PROJECTION: bare sse.broadcast is flagged" \
"+ sse.broadcast(event);" \
"$PROJ_POS" \
"$PROJ_NEG"
# The portable boundary must not fire on a longer identifier that ends
# in 'sse' (e.g. `usse.broadcast(...)` — not a real SseManager).
assert_filtered "PROJECTION: identifier ending in sse is not flagged" \
"+ usse.broadcast(event);" \
"$PROJ_POS" \
"$PROJ_NEG"
# Correctly annotated call is exempted.
assert_filtered "PROJECTION: annotated call with category+detail is exempt" \
"+ sse.broadcast_for_user(&user, event); // projection-exempt: bridge dispatcher, auth gate" \
"$PROJ_POS" \
"$PROJ_NEG"
# Bare `// projection-exempt: legacy` (no comma, no detail) does NOT exempt.
assert_flagged "PROJECTION: unnamed 'legacy' suppression still flagged" \
"+ sse.broadcast_for_user(&user, event); // projection-exempt: legacy" \
"$PROJ_POS" \
"$PROJ_NEG"
# Empty detail after the comma (`// projection-exempt: foo,`) does NOT
# exempt — the documented format requires a non-empty detail.
assert_flagged "PROJECTION: empty detail after comma still flagged" \
"+ sse.broadcast_for_user(&user, event); // projection-exempt: foo," \
"$PROJ_POS" \
"$PROJ_NEG"
# Trailing whitespace after the comma without a detail also does NOT exempt.
assert_flagged "PROJECTION: comma + whitespace-only detail still flagged" \
"+ sse.broadcast_for_user(&user, event); // projection-exempt: foo, " \
"$PROJ_POS" \
"$PROJ_NEG"
# ── DISPATCH ──────────────────────────────────────────────────
DISPATCH_POS='state\.(store|workspace|workspace_pool|extension_manager|skill_registry|session_manager)\.'
DISPATCH_NEG='// dispatch-exempt:|// safety:|:\+\+\+ '
assert_filtered "DISPATCH: diff header line is filtered" \
"+++ b/src/channels/web/handlers/foo.rs" \
"$DISPATCH_POS" \
"$DISPATCH_NEG"
assert_flagged "DISPATCH: direct state.store touch is flagged" \
"+ state.store.create_project(...)" \
"$DISPATCH_POS" \
"$DISPATCH_NEG"
# ── CREDNAME ──────────────────────────────────────────────────
# Portable word boundary: `(^|[^[:alnum:]_])` / `([^[:alnum:]_]|$)` —
# `grep -E`'s `\b` is a GNU extension and not recognised by BSD grep.
CREDNAME_POS='(^|[^[:alnum:]_])CredentialName([^[:alnum:]_]|$)'
CREDNAME_NEG='// web-identity-exempt:|// safety:|:\+\+\+ '
assert_filtered "CREDNAME: diff header line is filtered" \
"+++ b/src/channels/web/features/settings.rs" \
"$CREDNAME_POS" \
"$CREDNAME_NEG"
# A similarly-named but distinct identifier must not fire.
assert_filtered "CREDNAME: CredentialNameExt (different type) is not flagged" \
"+ let ext: CredentialNameExt = ...;" \
"$CREDNAME_POS" \
"$CREDNAME_NEG"
assert_flagged "CREDNAME: bare CredentialName reference is flagged" \
"+ let name: CredentialName = ...;" \
"$CREDNAME_POS" \
"$CREDNAME_NEG"
# ── MULTITENANT ───────────────────────────────────────────────
# A new unscoped `sse.broadcast(...)` must either be transport-only or
# carry an explicit `// multi-tenant-safe: <reason>` annotation.
# `broadcast_for_user(...)` is the safe path and must be exempt.
MT_POS='(^|[^[:alnum:]_])sse\.broadcast[[:space:]]*\('
MT_NEG='\.broadcast_for_user|// projection-exempt: transport-only,[[:space:]]*[^[:space:]]|//.*multi-tenant-safe: [^[:space:]]|// safety:|:\+\+\+ '
assert_filtered "MULTITENANT: diff header line is filtered" \
"+++ b/src/extensions/manager.rs" \
"$MT_POS" \
"$MT_NEG"
assert_filtered "MULTITENANT: broadcast_for_user is exempt" \
"+ sse.broadcast_for_user(&user, event);" \
"$MT_POS" \
"$MT_NEG"
assert_filtered "MULTITENANT: heartbeat (transport-only) is exempt" \
"+ sse.broadcast(AppEvent::Heartbeat); // projection-exempt: transport-only, heartbeat" \
"$MT_POS" \
"$MT_NEG"
# Receiver-prefixed call sites: the boundary regex matches `.sse.broadcast(`
# because the leading `.` is non-alnum-and-non-underscore, so the existing
# check covers production patterns like `state.sse.broadcast(`,
# `gw_state.sse.broadcast(`, and rustfmt-wrapped chains. These tests pin
# that behaviour against a future regex tightening.
assert_flagged "MULTITENANT: state.sse.broadcast (receiver-prefixed) is flagged" \
"+ state.sse.broadcast(event);" \
"$MT_POS" \
"$MT_NEG"
assert_flagged "MULTITENANT: gw_state.sse.broadcast (snake_case receiver) is flagged" \
"+ gw_state.sse.broadcast(event);" \
"$MT_POS" \
"$MT_NEG"
assert_filtered "MULTITENANT: state.sse.broadcast with annotation is exempt" \
"+ state.sse.broadcast(event); // multi-tenant-safe: single-tenant fallback" \
"$MT_POS" \
"$MT_NEG"
assert_filtered "MULTITENANT: explicit multi-tenant-safe annotation is exempt" \
"+ sse.broadcast(event); // multi-tenant-safe: only reached when multi_tenant_mode=false" \
"$MT_POS" \
"$MT_NEG"
# Compound annotation: a single `// ` comment can carry both
# `projection-exempt:` and `multi-tenant-safe:` because Rust line
# comments don't nest. The marker scanner must accept either marker
# anywhere in the trailing comment, not only when the comment opens
# with it. See `src/channels/web/mod.rs::dispatch_status_event` and
# `src/main.rs` sandbox JobEvent dispatcher.
assert_filtered "MULTITENANT: compound projection-exempt + multi-tenant-safe annotation is exempt" \
"+ sse.broadcast(event); // projection-exempt: bridge dispatcher, single-tenant unscoped status; multi-tenant-safe: only reached when multi_tenant_mode=false" \
"$MT_POS" \
"$MT_NEG"
assert_flagged "MULTITENANT: bare unscoped sse.broadcast is flagged" \
"+ sse.broadcast(event);" \
"$MT_POS" \
"$MT_NEG"
assert_flagged "MULTITENANT: unscoped broadcast with bridge-dispatcher projection-exempt is still flagged" \
"+ sse.broadcast(event); // projection-exempt: bridge dispatcher, status update" \
"$MT_POS" \
"$MT_NEG"
assert_flagged "MULTITENANT: unscoped broadcast with empty multi-tenant-safe detail is still flagged" \
"+ sse.broadcast(event); // multi-tenant-safe: " \
"$MT_POS" \
"$MT_NEG"
echo ""
echo "Passed: $PASS, Failed: $FAIL"
[ "$FAIL" -eq 0 ]