Commit Graph

6 Commits

Author SHA1 Message Date
Pamela Chia
47c084e51d refactor(studio): migrate telemetry to useTrack (#46140)
## Summary

I migrated every `useSendEventMutation` call site in `apps/studio` to
`useTrack`, deleted the legacy hook, and added a lint guardrail so it
can't return. `useTrack` is the type-safe replacement: it auto-injects
`groups: { project, organization }` from the selected project/org and
types `action` + `properties` against `TelemetryEvent`. Existing call
sites built groups manually and were not type-checked at the action
level. The migration covers 81 files (60 trivial swaps, 9 org-only, 3
pre-auth, 5 bespoke, 4 test mocks).

## Changes

- Migrated trivial call sites across `pages/project/[ref]`,
`components/interfaces/*` (Reports, Storage, Realtime/Inspector,
SQLEditor, Functions, EdgeFunctions, Integrations, ProjectAPIDocs,
Branching/BranchManagement, TableGridEditor, Connect, Docs, Auth,
Support, Home, ProjectHome, App), `components/layouts/*`, and
`components/ui/*`.
- Migrated org-only sites (`Organization/Documents/*`,
`Organization/BillingSettings/Subscription/*`,
`Organization/SecuritySettings.tsx`,
`Account/Preferences/DashboardSettingsToggles.tsx`) by dropping the
manual `groups: { organization: ... }` and letting `useTrack`
auto-inject. Verified `useSelectedProjectQuery` is disabled on org
routes (gates on URL `[ref]`).
- Migrated pre-auth sites (`SignInForm.tsx`, `sign-in-mfa.tsx`,
`profile.tsx`) where neither project nor org is resolved.
- Bespoke handling:
- `execute-sql-mutation.ts` and `table-row-create-mutation.ts`: pass `{
project: projectRef }` via `groupOverrides` since the mutation can
target a non-selected project ref.
- `useStudioCommandMenuTelemetry.ts`: kept a direct `sendTelemetryEvent`
call because studio groups must override pre-built event groups
(opposite of `useTrack`'s override direction).
- `AIAssistantOption.tsx`: passes sentinel-aware `groupOverrides` so
`NO_PROJECT_MARKER`/`NO_ORG_MARKER` continue to suppress group emission.
- `SidePanelEditor.utils.tsx`: utility functions `createTable` and
`updateTable` now take a `track: Track` parameter (threaded from
`SidePanelEditor.tsx`); dropped the `organizationSlug` arg since groups
are no longer assembled manually.
- Branch-event attribution: preserved `parentProjectRef` overrides on
`branch_updated`, `branch_merge_completed`, `branch_merge_failed`,
`branch_merge_submitted`, `branch_delete_button_clicked`,
`branch_review_with_assistant_clicked`, and
`branch_*_merge_request_button_clicked`. Original code grouped these
under the parent (production) project, not the branch ref;
auto-injection would have shifted them onto the branch.
- Switched 4 test mocks from `@/data/telemetry/send-event-mutation` to
`@/lib/telemetry/track`. Removed obsolete tests around manual groups and
`try/catch` on telemetry rejection.
- Deleted `apps/studio/data/telemetry/send-event-mutation.ts`. The
deleted module is its own guardrail: any reintroduction of the import
fails at TypeScript module resolution before lint runs.

## Testing

Tested on preview deploy:

- [x] SQL editor `CREATE TABLE` fires `table_created` with method
`sql_editor` and `groups.project` set to the mutation's `projectRef`.
- [x] Table editor creates a table from the side panel; `table_created`
fires from `SidePanelEditor.utils` via threaded `track`.
- [x] Help button (`/project/[ref]/...`) fires `help_button_clicked`
with auto-injected project + org groups.
- [x] Sign-in form fires `sign_in` with empty groups (pre-auth,
expected).
- [x] Org documents page (`/org/[slug]/documents`) fires
`document_view_button_clicked` with org group only, no stale project
ref.
- [x] Command menu (`Cmd+K`) inside a project still fires
`command_menu_opened` with studio's project/org overriding any
event-supplied groups.
- [x] Support form "Ask the Assistant" without selected org fires
`ai_assistant_in_support_form_clicked` with no project/org groups
(sentinels suppress).
- [x] On a branch, "Update branch" / "Merge branch" / "Close merge
request" events fire with `groups.project` set to the parent project
ref, not the branch ref.

Local checks:
- [x] 22/22 tests pass across the 4 updated test files
(`SidePanelEditor.utils.createTable`, `EdgeFunctionRenderer`,
`LayoutSidebar`, `PlanUpdateSidePanel`).
- [x] `rg useSendEventMutation apps/studio` returns 0 hits.

## Linear
- fixes GROWTH-860


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Standardized telemetry across the Studio to a unified tracking system;
events now send simplified payloads with less contextual/grouping data.
* No user-facing flows changed; UI behavior, permissions, and
interactions remain the same.
* **Tests**
* Updated telemetry mocks and tests to align with the new tracking
approach.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/supabase/supabase/pull/46140?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-27 15:19:54 +08:00
Saxon Fletcher
80da153450 Fix for AI Assistant query and deploy confirmation (#46052)
When Assistant requests confirmation to run a query or deploy an edge
function if the user doesn't skip or run and instead sends a follow-up
message it errors out. This allows follow-up messages and treats them as
"skips" which means adjusting confirmation message state as part of the
follow-up. This also uses toModelOutput to cleanse data based on
permissions.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **New Features**
* Enhanced tool approval workflow: pending approvals are now
automatically resolved as denied when submitting new messages
* Improved chat input state management with better handling of approval
states
  * Customizable loading messages for tool operations

* **Bug Fixes**
  * Fixed chat input availability during pending tool approval states
  * Improved tool execution feedback during approval workflows

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/supabase/supabase/pull/46052?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-20 09:09:28 +10:00
Matt Rossman
d143571586 feat(assistant): trace-level scorers + server-side tool execution with needsApproval (#45654)
## Motivation

When Assistant runs a potentially destructive tool like `execute_sql`,
it stops the LLM request and prompts for client-side approval and
execution of the tool. After approval, a second request kicks off under
a separate trace. This has made scoring and
[Topics](https://www.braintrust.dev/blog/topics) classification
challenging, as the generated `output` is split across stateless
requests. The [span-level
scoring](https://www.braintrust.dev/docs/evaluate/custom-code#score-spans)
approach we've used thusfar (after the LLM call, we massage the result
into an `output` payload that's stuck onto the root span) has been
cumbersome and led to invalid scores / topics where only part of the
assistant response is considered. It's also inefficient, as we're
duplicating potentially large info (like the `search_docs` output) that
already exists within the trace.

An alternative to scoring spans is to [score
traces](https://www.braintrust.dev/docs/evaluate/custom-code#score-traces).
Braintrust [best
practices](https://www.braintrust.dev/docs/evaluate/score-online#best-practices)
advise:

> Use span scope for evaluating individual operations or outputs. Use
trace scope for evaluating multi-turn conversations, overall workflow
completion, or when your scorer needs access to the full execution
context.

We've also received [direct
guidance](https://supabase.slack.com/archives/C05QYJBLX89/p1777925770927149?thread_ts=1777905716.911979&cid=C05QYJBLX89)
from their team to use this approach.

## Changes

Migrates eval scorers from custom `AssistantEvalOutput` shape to
trace-level scoring via `trace.getThread()` / `trace.getSpans()`, with
thread parsing that scores the full latest Assistant turn and passes
prior conversation separately where relevant.

Moves `execute_sql` and `deploy_edge_function` from client-side
execution after approval to AI SDK `needsApproval` + server-side
`execute()`. SQL results returned to the model are gated by AI opt-in
level, so row data is only included with `schema_and_log_and_data`;
otherwise the tool returns the no-data-permissions sentinel.

Adds `metadata.isFinalStep` to disambiguate multiple LLM requests within
an "assistant" turn due to tool call requests/responses. For online
evals, this means we should configure automations to only score traces
with `metadata.isFinalStep = true` to ensure we're judging the complete
generated response.

Other minor kaizen changes:
- Renamed `promptProviderOptions` to `systemProviderOptions` to clarify
that this is associated with the "system" message and disambiguate from
the root `providerOptions`
- Adds `evals/trace-utils.ts` to handle Zod validation of the `unknown`
span shapes from Braintrust, to more easily access typed inputs/output
on tool spans.
- Bumps AI SDK floor version `^6.0.116` → `^6.0.174`
- Tweaked the "Conciseness" scorer to not unfairly dock points for the
new `[called tool_name]` labels in serialized assistant response

## Verification

In the studio staging build, I asked Assistant to create a todos table
with 3 sample todos. I manually approved the `execute_sql` call and saw
Assistant generate text before & after the call.

In Braintrust I verified two traces were produced (see [filtered
logs](https://www.braintrust.dev/app/supabase.io/p/Assistant/logs?v=Staging&tvt=trace&search={%22filter%22:[{%22text%22:%22metadata.environment%2520%253D%2520%27staging%27%22,%22label%22:%22metadata.environment%2520%253D%2520%27staging%27%22,%22originType%22:%22btql%22},{%22text%22:%22%2560Chat%2520ID%2560%2520%253D%2520%25221cb2ac45-e5e7-458c-9da4-3bf6863b8842%2522%22,%22label%22:%22Chat%2520ID%2520equals%25201cb2ac45-e5e7-458c-9da4-3bf6863b8842%22,%22originType%22:%22form%22}]})),
the first with `metadata.isFinalStep = false` and the second with
`metadata.isFinalStep = true`.

In the Braintrust staging scorers, I ran the preview Completeness scorer
on the second trace and verified it sees the complete Assistant response
including markers for tool calls ([link to
trace](https://www.braintrust.dev/app/supabase.io/p/Assistant%20(Staging%20Scorers)/trace?object_type=project_logs&object_id=b5214b62-ad1e-4929-9d5b-40b1daebe948&r=0ed0a4f8-8aff-4a34-bb1d-1df1d88a5070&s=ff9015f8-6bf7-4ab3-83a9-ca4e69e27e82))

<img width="1193" height="960" alt="CleanShot 2026-05-07 at 11 27 10@2x"
src="https://github.com/user-attachments/assets/509d4858-c3a1-4068-986d-3aa4d5617d1a"
/>

I also tested the `deploy_edge_function` workflow and verified it still
prompts for permission and warns on deployment of existing functions.

**References**
- https://www.braintrust.dev/docs/evaluate/custom-code#score-traces
-
https://ai-sdk.dev/docs/ai-sdk-core/tools-and-tool-calling#tool-execution-approval

Supercedes https://github.com/supabase/supabase/pull/45556 and
https://github.com/supabase/supabase/pull/45339

Closes AI-473

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Tool actions (SQL execution, edge-function deploy) now require
explicit user Approve/Deny before proceeding.

* **Improvements**
* Assistant pauses for approval responses before sending follow-ups,
giving clearer control over risky actions.
  * Deploy/replace flows show confirmation and clearer replace warnings.
* Evaluation/scoring updated to use richer trace data for more accurate
assistant performance signals.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-12 15:24:21 -04:00
Charis
4a0bb36ca8 style: require sorted imports in studio/components (#44408)
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Ivan Vasilov <vasilov.ivan@gmail.com>
2026-04-01 10:22:37 +02:00
Ivan Vasilov
43cc61818c chore: Migrate all isPending uses in react-query (#40642)
* Bump react-query. Minor type and logic fixes.

* Migrate all use of isLoading to isPending in mutations.

* Fix type errors.
2025-11-20 16:44:53 +01:00
Saxon Fletcher
626eb30e77 Assistant action orientated approach (#38806)
* update onboarding

* update model and fix part issue

* action orientated assistant

* fix tool

* lock

* remove unused filter

* fix tests

* fix again

* update package

* update container

* fix tests

* refactor(ai assistant): break out message markdown and profile picture

* wip

* refactor(ai assistant): break up message component

* refactor: break ai assistant message down into multiple files

* refactor: simplify ReportBlock state

* fix: styling of draggable report block header

When the drag handle is showing, it overlaps with the block header.
Decrease the opacity of the header so the handle can be seen and the two
can be distinguished.

* fix: minor tweaks to tool ui

* refactor: simplify DisplayBlockRenderer state

* fix: remove double deploy button in edge function block

When the confirm footer is shown, the deploy button on the top right should be
hidden (not just disabled) to avoid confusion.

* refactor, test: message sanitization by opt-in level

Refactor the message sanitization to have more type safety and be more testable.
Add tests to ensure:

- Message sanitization always runs on generate-v4
- Message sanitization correctly works by opt-in level

* Fix conflicts in pnpm lock

* Couple of nits and refactors

* Revert casing for report block snippet

* adjust sanitised prompt

* Fix tests

---------

Co-authored-by: Charis Lam <26616127+charislam@users.noreply.github.com>
Co-authored-by: Joshen Lim <joshenlimek@gmail.com>
2025-09-29 03:57:36 +00:00