supabase/apps/www/middleware.ts
Pamela Chia cd52669f1f fix(docs): negotiate /guides/* markdown via shared helper (#45432)
## Summary

This brings docs `/guides/*` to full content negotiation for AI agents (GROWTH-811): RFC 9110 q-value parsing instead of a `.includes('text/markdown')` substring match, a 406 when the client rejects every type the route can produce, and markdown rewrites for known LLM user agents.

I implemented it by extracting the negotiation into a shared `common/markdown-negotiation` module consumed by both `apps/docs/middleware.ts` and `apps/www/middleware.ts`, rather than duplicating the helpers into docs and keeping them in sync by hand with www (#45394). This gives a single source of truth with no re-sync burden. www is refactored onto the shared helper with no behavior change.

## Changes

### docs `/guides/*` content negotiation (GROWTH-811)

- Replace the `.includes('text/markdown')` substring match with RFC 9110 q-value parsing.
- Return 406 (`Cache-Control: no-store`, `Vary: Accept`) when Accept excludes every type the route serves. Bypassed for LLM user agents, the `.md` suffix, and clients sending no Accept.
- Rewrite to `/api/guides-md/<slug>` for LLM user agents (Claude-User, Claude-Web, ChatGPT-User, PerplexityBot) regardless of Accept.
- Preserve the existing `.md` suffix routing and the entire `/reference/*` block.
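
The difference between a substring match and q-value parsing can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `common/markdown-negotiation`: `parseAccept` and `qualityFor` are hypothetical names, and treating a malformed or out-of-range `q` as 0 is an assumption about how such weights might be handled.

```typescript
// Hypothetical sketch of RFC 9110 Accept parsing; not the shared helper's actual code.
interface MediaRange {
  type: string // lowercased media range, e.g. "text/markdown", "text/*", "*/*"
  q: number // quality weight in [0, 1]
}

function parseAccept(header: string): MediaRange[] {
  const ranges: MediaRange[] = []
  for (const part of header.split(',')) {
    // Trimming each piece tolerates optional whitespace around "," and ";".
    const [type, ...params] = part.split(';').map((s) => s.trim().toLowerCase())
    if (!type) continue
    const qParam = params.find((p) => p.startsWith('q='))
    const parsed = qParam === undefined ? 1 : Number(qParam.slice(2))
    // Assumption: malformed or out-of-range weights count as 0 (rejected).
    const q = Number.isFinite(parsed) && parsed >= 0 && parsed <= 1 ? parsed : 0
    ranges.push({ type, q })
  }
  return ranges
}

// The most specific matching range wins: exact type, then "type/*", then "*/*".
function qualityFor(ranges: MediaRange[], mediaType: string): number {
  const major = mediaType.split('/')[0] ?? ''
  for (const candidate of [mediaType, `${major}/*`, '*/*']) {
    const match = ranges.find((r) => r.type === candidate)
    if (match) return match.q
  }
  return 0 // not listed at all, so not acceptable
}

const header = 'text/html;q=1.0, text/markdown;q=0.5'
const ranges = parseAccept(header)
console.log(qualityFor(ranges, 'text/html'), qualityFor(ranges, 'text/markdown'))
```

With this header a substring check would report a markdown match, while the parsed weights show HTML is preferred. Likewise, an Accept value such as `text/markdown-extra` contains the substring `text/markdown` but yields a weight of 0 for `text/markdown` here.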

### Shared negotiation helper

- New `packages/common/markdown-negotiation.ts`: `negotiateMarkdown(signals, route)` returns `'markdown' | 'not-acceptable' | 'pass'`. Internalizes q-value parsing, the LLM user-agent match, the UA-length cap, and the markdown-vs-html preference.
- `apps/www/middleware.ts`: refactored to consume the shared helper; its duplicated copy of the negotiation helpers (added in #45394) is removed. `.md` early-return, changelog routing, and first-referrer cookie stamping are unchanged (no behavior change, covered by its existing tests).
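
To make the three-way decision concrete, here is a rough sketch of how a helper with this shape could behave. Only the call shape (`acceptHeader`, `userAgent`, `hasMarkdownVariant`), the return values, and the list of LLM user agents come from this PR; `negotiateMarkdownSketch`, `isLlmUserAgent`, `quality`, the 512-character cap, the token-based UA match, and the tie-breaking rule are all assumptions made for illustration, not the real internals.

```typescript
// Hypothetical sketch; the real helper lives in packages/common/markdown-negotiation.ts.
type Decision = 'markdown' | 'not-acceptable' | 'pass'

interface Signals {
  acceptHeader: string
  userAgent: string
}

interface Route {
  hasMarkdownVariant: boolean
}

// LLM user agents named in the PR description.
const LLM_USER_AGENTS = ['Claude-User', 'Claude-Web', 'ChatGPT-User', 'PerplexityBot']
// Assumption: the actual cap value is not stated; 512 is a placeholder.
const MAX_UA_LENGTH = 512

// Assumption: match whole product tokens so "NotClaude-UserX/1.0" does not count.
function isLlmUserAgent(userAgent: string): boolean {
  if (userAgent.length > MAX_UA_LENGTH) return false
  const products = userAgent.split(/[\s;(),]+/).map((token) => token.split('/')[0] ?? '')
  return products.some((product) => LLM_USER_AGENTS.includes(product))
}

// Compact q-value lookup: exact type, then "type/*", then "*/*"; 0 if unlisted.
function quality(acceptHeader: string, mediaType: string): number {
  const weights = new Map<string, number>()
  for (const part of acceptHeader.split(',')) {
    const [type, ...params] = part.split(';').map((s) => s.trim().toLowerCase())
    if (!type) continue
    const qParam = params.find((p) => p.startsWith('q='))
    const q = qParam === undefined ? 1 : Number(qParam.slice(2))
    weights.set(type, Number.isFinite(q) && q >= 0 && q <= 1 ? q : 0)
  }
  const major = mediaType.split('/')[0] ?? ''
  for (const candidate of [mediaType, `${major}/*`, '*/*']) {
    const q = weights.get(candidate)
    if (q !== undefined) return q
  }
  return 0
}

function negotiateMarkdownSketch(signals: Signals, route: Route): Decision {
  // LLM user agents get markdown regardless of Accept (when a variant exists).
  if (isLlmUserAgent(signals.userAgent)) return route.hasMarkdownVariant ? 'markdown' : 'pass'
  // Clients sending no Accept header bypass negotiation.
  if (signals.acceptHeader.trim() === '') return 'pass'

  const html = quality(signals.acceptHeader, 'text/html')
  const markdown = route.hasMarkdownVariant ? quality(signals.acceptHeader, 'text/markdown') : 0
  if (html === 0 && markdown === 0) return 'not-acceptable'
  // Assumption: markdown wins only when strictly preferred, so "*/*" keeps HTML.
  return markdown > html ? 'markdown' : 'pass'
}
```

Under these assumptions, `Accept: text/markdown` on a route with a markdown variant yields `'markdown'`, a browser-style Accept yields `'pass'`, and an Accept listing only unrelated types yields `'not-acceptable'`, which the middleware turns into a 406.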

### Tests

- New `apps/docs/middleware.test.ts`: q-value priority, the 406 path, `.md` suffix, LLM UA override, browser default Accept, training-crawler and substring-embed exclusion, and the `/reference/*` exemption.
- New `packages/common/markdown-negotiation.test.ts`: the same decision matrix at the unit level (q-values, 406, LLM UAs, `.md`, `*/*`, training crawlers, OWS, out-of-range q).

## Testing (Vercel preview)

After Vercel posts a preview URL, save it once, then run the probe set.

```bash
echo 'PREVIEW_HOST' > /tmp/growth-811-host.txt
HOST=$(cat /tmp/growth-811-host.txt)

# 1) Browser-style Accept -> HTML 200
curl -sI -A "Mozilla/5.0" \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' \
  "https://$HOST/docs/guides/auth"

# 2) Accept: text/markdown -> markdown 200
curl -sI -H 'Accept: text/markdown' "https://$HOST/docs/guides/auth"

# 3) text/html;q=1.0, text/markdown;q=0.5 -> HTML 200
curl -sI -H 'Accept: text/html;q=1.0, text/markdown;q=0.5' "https://$HOST/docs/guides/auth"

# 4) unsupported Accept -> 406 + Cache-Control: no-store + Vary: Accept
curl -sI -H 'Accept: application/x-content-negotiation-probe' "https://$HOST/docs/guides/auth"

# 5) User-Agent: Claude-User/1.0 (any Accept) -> markdown 200
curl -sI -A 'Claude-User/1.0' "https://$HOST/docs/guides/auth"
```

### After merge

Run [acceptmarkdown.com/readiness-check](https://acceptmarkdown.com/readiness-check) against `https://supabase.com/docs/guides/auth`: expect 100/100.

## Linear

- fixes GROWTH-811
2026-06-09 17:49:35 +08:00

59 lines
1.9 KiB
TypeScript

import { stampFirstReferrerCookie } from 'common/first-referrer-cookie'
import { negotiateMarkdown } from 'common/markdown-negotiation'
import { NextResponse, type NextRequest } from 'next/server'

import { MD_PAGES } from './app/api-v2/md/content.generated'

export function middleware(request: NextRequest) {
  const { pathname } = request.nextUrl

  // Explicit `.md` suffix: rewrite straight to the markdown route when the
  // page is allowlisted; otherwise fall through to normal handling.
  if (pathname.endsWith('.md')) {
    // Drop the leading slash and the trailing ".md".
    const slug = pathname.slice(1, -3)
    if (MD_PAGES.has(slug)) {
      return NextResponse.rewrite(new URL(`/api-v2/md/${slug}`, request.nextUrl))
    }
  }

  // Strip trailing slash so /auth/ and /auth resolve to the same allowlist
  // entry; NextURL preserves trailing-slash style on rewrite targets.
  const slug = (pathname === '/' ? 'homepage' : pathname.slice(1)).replace(/\/$/, '')
  const isMdEligible = MD_PAGES.has(slug)
  const isChangelogEntry = slug === 'changelog' || /^changelog\/\d+/.test(slug)

  const decision = negotiateMarkdown(
    {
      acceptHeader: request.headers.get('accept') ?? '',
      userAgent: request.headers.get('user-agent') ?? '',
    },
    { hasMarkdownVariant: isMdEligible || isChangelogEntry }
  )

  // Accept rules out every type this route can produce.
  if (decision === 'not-acceptable') {
    return new NextResponse('Not Acceptable', {
      status: 406,
      headers: { 'Cache-Control': 'no-store', Vary: 'Accept' },
    })
  }

  if (decision === 'markdown') {
    if (isMdEligible) {
      return NextResponse.rewrite(new URL(`/api-v2/md/${slug}`, request.nextUrl))
    }
    // Changelog entries are static .md files in public/, not API routes.
    if (isChangelogEntry) {
      return NextResponse.rewrite(new URL(`/${slug}.md`, request.nextUrl))
    }
  }

  // Default: continue to the page and stamp the first-referrer cookie.
  const response = NextResponse.next()
  stampFirstReferrerCookie(request, response)
  return response
}

export const config = {
  matcher: [
    // MUST exclude _next/data to prevent full page reloads in multi-zone apps.
    '/((?!api|_next/static|_next/image|_next/data|favicon.ico|__nextjs).*)',
  ],
}