
Design Doc: MCP OAuth 2.0 Authorization Code + PKCE Flow

Overview

Add a standard OAuth 2.0 Authorization Code flow with PKCE to the existing McpModule so that third-party AI clients (ChatGPT, etc.) can authenticate users against FlowPOS and obtain MCP access tokens. This replaces the current manual fp_mcp_* bearer token model for external AI clients while keeping the existing token infrastructure intact.


Problem

ChatGPT's MCP connector requires a discoverable OAuth 2.0 server. The current MCP server:

  • Only accepts pre-issued fp_mcp_* bearer tokens via Authorization: Bearer header
  • Has no /.well-known/oauth-authorization-server discovery document
  • Has no /authorize, /token, or /refresh endpoints
  • Cannot be connected to ChatGPT because it fails OAuth discovery

Goals

  1. Implement the minimum viable OAuth 2.0 Authorization Code + PKCE surface required by ChatGPT's MCP connector spec
  2. Reuse the existing fp_mcp_* token generation and validation infrastructure
  3. Keep the existing V1 bearer token path (Claude Desktop, Cursor) fully intact — no breaking changes
  4. Store OAuth authorization codes in the database (multi-instance safe for Cloud Run)
  5. Support token refresh to address the 24-hour expiry friction

Architecture

Module Location

All new code lives inside apps/backend/src/mcp/, following the existing hexagonal layout:

apps/backend/src/mcp/
├── mcp.module.ts                                # Add new providers
├── application/
│   └── mcp-oauth.service.ts                     # NEW — OAuth use cases
├── domain/
│   ├── mcp-auth-code.domain.ts                  # NEW — AuthCode value object
│   └── mcp-oauth-code-repository.domain.ts      # NEW — repository port
├── infrastructure/
│   └── mcp-oauth-code.repository.ts             # NEW — Kysely DB implementation
└── interfaces/
    ├── mcp-oauth.controller.ts                  # NEW — OAuth HTTP endpoints
    └── mcp-well-known.controller.ts             # NEW — discovery document endpoint

Critical Architecture Constraints

1. No In-Memory Storage — Cloud Run is Multi-Instance

Cloud Run scales horizontally. An in-memory store on instance A is invisible to instance B, causing invalid_grant errors on token exchange. All auth codes must be persisted in the database using the mcp_auth_code table (see schema below). A scheduled cleanup job removes expired codes.

2. /.well-known/ Must Bypass Any NestJS Global Prefix

Before implementing, verify whether main.ts sets a global prefix via app.setGlobalPrefix(...). If it does, the prefix would be prepended to the discovery route and OAuth discovery would fail. In that case, register the well-known controller with @Controller({ path: '.well-known/oauth-authorization-server' }) and list that path in the exclude option of app.setGlobalPrefix(...), so the document is served from the domain root. Confirm with a real HTTP request that this works end-to-end before marking the task complete.

3. Exact Redirect URI Matching Only — No Wildcards

Wildcard matching on redirect URIs is an open redirect vulnerability. Use exact string equality only. The MCP_OAUTH_ALLOWED_REDIRECT_URIS env var contains a comma-separated list of fully qualified URIs. ChatGPT's callback URL is known and fixed (visible in the connector setup screen). Add it verbatim.
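
The exact-match rule above can be sketched as follows. This is a minimal illustration, not the project's implementation: the helper names parseAllowedRedirectUris and isAllowedRedirectUri are hypothetical, and trimming whitespace around each comma-separated entry is an assumption about how the env value is written.

```typescript
// Hypothetical helpers; names are illustrative and not part of the existing codebase.

/** Splits the comma-separated MCP_OAUTH_ALLOWED_REDIRECT_URIS value into a set of exact URIs. */
export function parseAllowedRedirectUris(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((uri) => uri.trim()) // assumption: tolerate spaces around commas
      .filter((uri) => uri.length > 0),
  );
}

/** Exact string equality only: no wildcards, no prefix matching, no normalization. */
export function isAllowedRedirectUri(candidate: string, allowed: ReadonlySet<string>): boolean {
  return allowed.has(candidate);
}
```

Using a Set lookup keeps the comparison to plain string equality, so a URI that merely extends an allowed entry is rejected.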

4. No Cookie-Based Login Detection on /authorize

The /authorize endpoint runs on the backend API domain (flowpos-backend-....run.app). The Firebase session cookie (flowpos-id-token) is set on the web-app domain (e.g., app.flowpos.com) and is never sent to the API domain. Any attempt to detect a logged-in user via cookie on /authorize will therefore always fail. Do not implement McpFirebaseOptionalGuard or any cookie-reading logic here.

Instead, use a two-step flow:

  1. GET /mcp/oauth/authorize — validates params only, then always redirects to /mcp-auth
  2. POST /mcp/oauth/authorize/complete — called by the /mcp-auth frontend after login; accepts the Firebase ID token explicitly in the JSON body; generates and persists the auth code; returns the redirect_url for the frontend to navigate to

This keeps Firebase token validation on the backend (correct) without any cross-domain cookie dependency.

5. Stricter Rate Limits on OAuth Endpoints

Apply @Throttle({ default: { limit: 10, ttl: 60_000 } }) explicitly on all OAuth controller endpoints to override the global 100 req/60s default. This mitigates auth code and refresh token brute-force attacks.


New Endpoints

All new endpoints are decorated with @IsPublic() to bypass the global Firebase AuthGuard.

1. Discovery Document

GET /.well-known/oauth-authorization-server

Returns a static JSON document. No auth required. All URL values are derived from the MCP_ISSUER_URL env var, which already includes the https:// scheme and has no trailing slash.

{
  "issuer": "<MCP_ISSUER_URL>",
  "authorization_endpoint": "<MCP_ISSUER_URL>/mcp/oauth/authorize",
  "token_endpoint": "<MCP_ISSUER_URL>/mcp/oauth/token",
  "response_types_supported": ["code"],
  "grant_types_supported": ["authorization_code", "refresh_token"],
  "code_challenge_methods_supported": ["S256"],
  "token_endpoint_auth_methods_supported": ["none"]
}

MCP_ISSUER_URL must be read from ConfigService, never hardcoded.
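
As a rough sketch of how the document could be assembled from the configured issuer: the function name buildDiscoveryDocument and the OAuthServerMetadata type are hypothetical, and in the real controller the issuer value would come from ConfigService rather than a function argument.

```typescript
// Hypothetical shape of the discovery document described above.
interface OAuthServerMetadata {
  issuer: string;
  authorization_endpoint: string;
  token_endpoint: string;
  response_types_supported: string[];
  grant_types_supported: string[];
  code_challenge_methods_supported: string[];
  token_endpoint_auth_methods_supported: string[];
}

/** Builds the metadata from the issuer URL (assumed to include the scheme). */
export function buildDiscoveryDocument(issuerUrl: string): OAuthServerMetadata {
  // Defensive: strip trailing slashes so endpoint URLs never contain "//".
  const issuer = issuerUrl.replace(/\/+$/, "");
  return {
    issuer,
    authorization_endpoint: `${issuer}/mcp/oauth/authorize`,
    token_endpoint: `${issuer}/mcp/oauth/token`,
    response_types_supported: ["code"],
    grant_types_supported: ["authorization_code", "refresh_token"],
    code_challenge_methods_supported: ["S256"],
    token_endpoint_auth_methods_supported: ["none"],
  };
}
```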


2a. Authorization Endpoint — Param Validation + Redirect

GET /mcp/oauth/authorize

Query params: client_id, redirect_uri, response_type=code, state, code_challenge, code_challenge_method=S256

This endpoint does nothing except validate params and redirect. It never touches Firebase or the database. Auth code generation happens in step 2b.

Validation (return 400 on failure, before any redirect):

  • response_type must equal code
  • code_challenge_method must equal S256
  • code_challenge must be present and non-empty
  • redirect_uri must exactly match one entry in MCP_OAUTH_ALLOWED_REDIRECT_URIS (exact string equality — no wildcards, no prefix matching)
  • client_id must be present

Flow:

  1. Run all validations above — reject with 400 on any failure
  2. Always redirect to the /mcp-auth login page, passing all original params:
    https://<FRONTEND_URL>/mcp-auth?client_id=<>&redirect_uri=<>&state=<>&code_challenge=<>&code_challenge_method=S256&response_type=code

The /mcp-auth page owns the login + business selection UX, then calls POST /mcp/oauth/authorize/complete when ready.
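
The validation and redirect steps for GET /mcp/oauth/authorize might look like the sketch below. The names validateAuthorizeQuery and buildLoginRedirect are hypothetical, and the frontend base URL is passed in as a plain argument because the document does not say which config key holds it.

```typescript
// Hypothetical helpers for the param-validation-and-redirect step.
const PARAM_KEYS = [
  "client_id",
  "redirect_uri",
  "response_type",
  "state",
  "code_challenge",
  "code_challenge_method",
] as const;

type AuthorizeQuery = Partial<Record<(typeof PARAM_KEYS)[number], string>>;

/** Returns an error message for the 400 response, or null when all checks pass. */
export function validateAuthorizeQuery(q: AuthorizeQuery, allowed: ReadonlySet<string>): string | null {
  if (q.response_type !== "code") return "response_type must equal code";
  if (q.code_challenge_method !== "S256") return "code_challenge_method must equal S256";
  if (!q.code_challenge) return "code_challenge is required";
  if (!q.redirect_uri || !allowed.has(q.redirect_uri)) return "redirect_uri is not allowed";
  if (!q.client_id) return "client_id is required";
  return null;
}

/** Builds the /mcp-auth URL, passing every original param through URL-encoded. */
export function buildLoginRedirect(frontendUrl: string, q: AuthorizeQuery): string {
  const url = new URL("/mcp-auth", frontendUrl);
  for (const key of PARAM_KEYS) {
    const value = q[key];
    if (value !== undefined) url.searchParams.set(key, value);
  }
  return url.toString();
}
```

Building the target with URLSearchParams rather than string concatenation ensures that redirect_uri and state survive the round trip even when they contain reserved characters.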


2b. Authorization Complete Endpoint — Auth Code Generation

POST /mcp/oauth/authorize/complete
Content-Type: application/json
@IsPublic()
@Throttle({ default: { limit: 10, ttl: 60_000 } })

Called by the /mcp-auth frontend page after the user has logged in and selected a business. Accepts the Firebase ID token explicitly in the body — no cookie dependency.

Request body:

{
"firebase_id_token": "<Firebase ID token from frontend SDK>",
"business_id": "<selected business UUID>",
"client_id": "<from original /authorize params>",
"redirect_uri": "<from original /authorize params>",
"code_challenge": "<from original /authorize params>",
"state": "<from original /authorize params>"
}

Flow:

  1. Re-validate redirect_uri against MCP_OAUTH_ALLOWED_REDIRECT_URIS (exact match) — return 400 if invalid
  2. Verify firebase_id_token using FirebaseService.verifyToken() — return 401 if invalid
  3. Resolve userId from the verified Firebase token
  4. Verify the user belongs to the supplied business_id via business_user table — return 403 if not
  5. Generate a cryptographically random 32-byte hex authorization code
  6. Persist to mcp_auth_code table with expires_at = now() + MCP_OAUTH_AUTH_CODE_TTL_MINUTES
  7. Return the redirect URL (do not redirect — let the frontend navigate):

Response:

{
"redirect_url": "<redirect_uri>?code=<code>&state=<state>"
}

The /mcp-auth page then executes window.location.href = redirect_url.
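
Steps 5 to 7 of the flow above could be sketched as follows. The helper names are hypothetical; the sketch assumes Node's built-in crypto module and takes the TTL (MCP_OAUTH_AUTH_CODE_TTL_MINUTES) as a plain number.

```typescript
import { randomBytes } from "node:crypto";

/** Step 5: a cryptographically random 32-byte code, hex-encoded (64 characters). */
export function generateAuthCode(): string {
  return randomBytes(32).toString("hex");
}

/** Step 6: expires_at = now + TTL in minutes. */
export function computeExpiry(now: Date, ttlMinutes: number): Date {
  return new Date(now.getTime() + ttlMinutes * 60_000);
}

/** Step 7: append code and state to the already-validated redirect_uri, URL-encoded. */
export function buildCallbackUrl(redirectUri: string, code: string, state?: string): string {
  const url = new URL(redirectUri);
  url.searchParams.set("code", code);
  if (state !== undefined) url.searchParams.set("state", state);
  return url.toString();
}
```

Going through the URL API instead of concatenating "?code=" keeps the result valid when the registered redirect_uri already carries a query string or when state contains characters such as & or spaces.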


3. Businesses Endpoint — List User's Businesses

GET /mcp/oauth/businesses
Authorization: Bearer <firebase_id_token>
@IsPublic()
@Throttle({ default: { limit: 10, ttl: 60_000 } })

Called by the /mcp-auth frontend page after Firebase login to determine whether to show the business selector (Step 2) or skip directly to Step 3. Accepts the Firebase ID token in the Authorization: Bearer header — not a session cookie.

Flow:

  1. Extract the token from the Authorization: Bearer header — return 401 if missing
  2. Verify the token using FirebaseService.verifyToken() — return 401 if invalid
  3. Resolve userId from the verified token
  4. Query business_user table for all businesses the user belongs to
  5. Return the list

Response:

{
"businesses": [
{ "id": "uuid", "name": "Mi Tienda Centro" },
{ "id": "uuid", "name": "Mi Tienda Zona 10" }
]
}

Returns only id and name — no other business data. If the list has exactly one item, the /mcp-auth page skips the selector and proceeds directly to Step 3.
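
Step 1 of this flow, extracting the token from the header, can be sketched with a small helper. The name extractBearerToken is hypothetical; the caller would return 401 when it yields null and otherwise pass the token to FirebaseService.verifyToken().

```typescript
/** Returns the token from an "Authorization: Bearer <token>" header value, or null if absent or malformed. */
export function extractBearerToken(header: string | undefined): string | null {
  // The scheme name is matched case-insensitively; the token must be a single non-empty run of characters.
  const match = /^Bearer\s+(\S+)$/i.exec(header?.trim() ?? "");
  return match?.[1] ?? null;
}
```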


4. Token Endpoint — Authorization Code Exchange

POST /mcp/oauth/token
Content-Type: application/x-www-form-urlencoded

Request body:

grant_type=authorization_code
&code=<authorization_code>
&redirect_uri=<redirect_uri>
&client_id=<client_id>
&code_verifier=<pkce_verifier>

Flow:

  1. Look up the authorization code in mcp_auth_code — return 400 invalid_grant if not found or expires_at < now()
  2. Verify redirect_uri exactly matches what was stored — return 400 invalid_grant if mismatch
  3. Compute base64url(SHA256(code_verifier)) and compare to stored code_challenge — return 400 invalid_grant if mismatch
  4. Delete the authorization code record immediately (single-use enforcement)
  5. Issue a new fp_mcp_* access token using existing token generation logic, scoped to stored userId + businessId
  6. Generate a 30-day refresh token (random 32-byte hex), store its SHA-256 hash in mcp_refresh_token
  7. Return the response

Response:

{
"access_token": "fp_mcp_...",
"token_type": "bearer",
"expires_in": 86400,
"refresh_token": "fp_mcp_refresh_..."
}
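
The PKCE check in step 3 of the exchange flow can be illustrated with Node's built-in crypto module. The function names are hypothetical, and the constant-time comparison is an added precaution rather than something this document mandates.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

/** base64url(SHA256(code_verifier)), without padding, as defined for the S256 method. */
export function computeS256Challenge(codeVerifier: string): string {
  return createHash("sha256").update(codeVerifier, "ascii").digest("base64url");
}

/** Compares the recomputed challenge with the stored one in constant time. */
export function verifyPkce(codeVerifier: string, storedChallenge: string): boolean {
  const expected = Buffer.from(computeS256Challenge(codeVerifier));
  const actual = Buffer.from(storedChallenge);
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}
```

With the example pair from RFC 7636 Appendix B, verifyPkce("dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk", "E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM") returns true; any other verifier yields false and should map to 400 invalid_grant.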

5. Token Endpoint — Refresh Token Grant

POST /mcp/oauth/token
Content-Type: application/x-www-form-urlencoded

grant_type=refresh_token
&refresh_token=<refresh_token>
&client_id=<client_id>

Flow:

  1. Hash the incoming token with SHA-256; look up by hash in mcp_refresh_token
  2. Return 400 invalid_grant if not found, revoked_at IS NOT NULL, or expires_at < now()
  3. Issue a new fp_mcp_* access token scoped to the stored userId + businessId
  4. Rotate the refresh token: set revoked_at = now() on the old record, insert a new record with a new token + new expires_at
  5. Return same response shape as above

Concurrent refresh note: if two refresh requests arrive simultaneously with the same token, the second will fail with invalid_grant after the first marks it revoked. This is correct per RFC 6749 §10.4 — do not attempt to de-duplicate.
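
The hash-lookup and rotation logic can be sketched as below. The Map is only an in-memory stand-in for the mcp_refresh_token table so the example runs on its own; the real implementation must use the database, as required by the Cloud Run constraint. RefreshTokenStore and hashToken are hypothetical names.

```typescript
import { createHash, randomBytes } from "node:crypto";

interface RefreshTokenRow {
  userId: string;
  businessId: string;
  expiresAt: Date;
  revokedAt: Date | null;
}

/** SHA-256 hash, hex-encoded; only this value is ever stored. */
export const hashToken = (raw: string): string =>
  createHash("sha256").update(raw).digest("hex");

// Illustration only: stands in for the mcp_refresh_token table.
export class RefreshTokenStore {
  private rows = new Map<string, RefreshTokenRow>();

  /** Issues a new raw token and stores only its hash. */
  issue(userId: string, businessId: string, ttlDays: number, now: Date = new Date()): string {
    const raw = randomBytes(32).toString("hex");
    const expiresAt = new Date(now.getTime() + ttlDays * 86_400_000);
    this.rows.set(hashToken(raw), { userId, businessId, expiresAt, revokedAt: null });
    return raw;
  }

  /** Returns a new raw token, or null when the caller should answer 400 invalid_grant. */
  rotate(raw: string, ttlDays: number, now: Date = new Date()): string | null {
    const row = this.rows.get(hashToken(raw));
    if (!row || row.revokedAt !== null || row.expiresAt.getTime() < now.getTime()) return null;
    row.revokedAt = now; // revoke the old record before issuing its replacement
    return this.issue(row.userId, row.businessId, ttlDays, now);
  }
}
```

In the database-backed version, the revoke and insert would need to run in one transaction so that the concurrent case described above resolves to exactly one success.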


Database Schema

New Table: mcp_auth_code

Replaces in-memory storage. Enables multi-instance safety on Cloud Run.

CREATE TABLE mcp_auth_code (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
code TEXT NOT NULL UNIQUE, -- raw 32-byte hex code
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
business_id UUID NOT NULL,
client_id TEXT NOT NULL,
redirect_uri TEXT NOT NULL,
code_challenge TEXT NOT NULL, -- base64url(SHA256(verifier))
expires_at TIMESTAMPTZ NOT NULL, -- now() + MCP_OAUTH_AUTH_CODE_TTL_MINUTES
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX mcp_auth_code_code_idx ON mcp_auth_code(code);
CREATE INDEX mcp_auth_code_expires_at_idx ON mcp_auth_code(expires_at);

A scheduled NestJS job (using existing @nestjs/schedule) runs every 10 minutes:

DELETE FROM mcp_auth_code WHERE expires_at < now();

New Table: mcp_refresh_token

CREATE TABLE mcp_refresh_token (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
token_hash TEXT NOT NULL UNIQUE, -- SHA-256 hash of the raw token
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
business_id UUID NOT NULL,
client_id TEXT NOT NULL,
expires_at TIMESTAMPTZ NOT NULL,
revoked_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX mcp_refresh_token_token_hash_idx ON mcp_refresh_token(token_hash);
CREATE INDEX mcp_refresh_token_user_id_idx ON mcp_refresh_token(user_id);

The raw refresh token is returned once and never stored. Only the SHA-256 hash is persisted.

Both tables require a Kysely migration and pnpm run generate:types after creation.


Environment Variables (add to Doppler — staging and production)

Backend

Variable | Description | Example
MCP_ISSUER_URL | Public base URL (no trailing slash) | https://flowpos-backend-723334209984.us-central1.run.app
MCP_OAUTH_ALLOWED_REDIRECT_URIS | Comma-separated exact URI allowlist | https://chatgpt.com/connector/oauth/my4E4RWQ1J9Z
MCP_OAUTH_REFRESH_TOKEN_TTL_DAYS | Refresh token lifetime in days | 30
MCP_OAUTH_AUTH_CODE_TTL_MINUTES | Authorization code TTL in minutes | 5

Web-App (Next.js — must be NEXT_PUBLIC_ prefixed)

Variable | Description | Example
NEXT_PUBLIC_MCP_ISSUER_URL | Same value as backend MCP_ISSUER_URL; used to construct the /authorize/complete API call URL | https://flowpos-backend-723334209984.us-central1.run.app
NEXT_PUBLIC_MCP_ALLOWED_REDIRECT_ORIGIN | Allowed origin for redirect_uri client-side validation (UX safeguard only; backend is authoritative) | https://chatgpt.com

Frontend: /mcp-auth Page (web-app Next.js)

A new standalone page in apps/web-app/src/app/mcp-auth/page.tsx. No FlowPOS shell or navigation — renders as a centered card.

On load, reads OAuth params from the query string: client_id, redirect_uri, state, code_challenge, code_challenge_method, response_type. These are passed through from the original /authorize redirect and must be preserved across all steps.

Step 1: Login

  • Shows FlowPOS logo + message: "Sign in to connect your FlowPOS account to an AI assistant"
  • Renders the existing Firebase login form (email + password)
  • On success: Firebase SDK returns an ID token — store it in component state
  • If the user belongs to exactly one business: skip to Step 3
  • If the user belongs to multiple businesses: proceed to Step 2

Step 2: Business Selector (multi-business users only)

  • Lists all businesses the authenticated user belongs to (fetched with GET /mcp/oauth/businesses, described under New Endpoints above)
  • On selection: store the chosen business_id in component state, proceed to Step 3

Step 3: Complete — Call /mcp/oauth/authorize/complete

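// Call the backend API domain directly; NEXT_PUBLIC_MCP_ISSUER_URL has no trailing slash.
const response = await fetch(
  `${process.env.NEXT_PUBLIC_MCP_ISSUER_URL}/mcp/oauth/authorize/complete`,
  {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      firebase_id_token: firebaseIdToken, // from Step 1
      business_id: selectedBusinessId, // from Step 1 or 2
      client_id,
      redirect_uri,
      code_challenge,
      state,
    }),
  },
);
if (!response.ok) {
  // 400 = bad redirect_uri, 401 = invalid Firebase token, 403 = not a member of the business
  throw new Error(`authorize/complete failed with status ${response.status}`);
}
const { redirect_url } = await response.json();
window.location.href = redirect_url; // sends user to ChatGPT callback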

Security — redirect_uri Validation on the Frontend

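Before calling /authorize/complete, validate that the origin of redirect_uri equals NEXT_PUBLIC_MCP_ALLOWED_REDIRECT_ORIGIN (see Environment Variables above). Compare parsed origins rather than raw string prefixes, because a prefix check would also accept a host such as chatgpt.com.evil.example. This is a UX safeguard only; the backend re-validates authoritatively.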
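
One way to implement this client-side check is to parse both values with the URL API and compare their origins. The helper name hasAllowedOrigin is hypothetical; this is a sketch of the safeguard, not a replacement for the backend's exact-match validation.

```typescript
/** True when redirectUri parses as a URL whose origin equals the allowed origin. */
export function hasAllowedOrigin(redirectUri: string, allowedOrigin: string): boolean {
  try {
    return new URL(redirectUri).origin === new URL(allowedOrigin).origin;
  } catch {
    // new URL() throws on malformed input; treat that as not allowed.
    return false;
  }
}
```

For example, hasAllowedOrigin("https://chatgpt.com.evil.example/cb", "https://chatgpt.com") returns false, whereas a naive startsWith check on the same strings would return true.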


ChatGPT Connector Setup (after implementation)

In ChatGPT → Settings → Apps & Connectors → New App:

  • MCP Server URL: https://<your-cloud-run-url>/mcp
  • Authentication: OAuth
  • Registration method: User-Defined OAuth Client
  • OAuth Client ID: chatgpt (static string — no secret needed, PKCE is used)
  • Token endpoint auth method: none
  • Auth URL / Token URL: auto-discovered from /.well-known/oauth-authorization-server

What Stays Unchanged

  • All existing fp_mcp_* bearer token validation logic
  • V1 bearer token path (Claude Desktop, Cursor — no changes needed)
  • McpPrincipal interface and activeBusinessId / authorizedBusinessIds[] pattern
  • All 32 existing MCP tools, Prompts, and Resources
  • MCP_TOKEN_SECRET usage for existing token signing

Out of Scope (V2)

  • Dynamic Client Registration (DCR)
  • OIDC / userinfo endpoint
  • Per-client scopes
  • Redis-backed storage (DB is sufficient for V1)
  • Admin UI for managing OAuth clients or revoking sessions
  • Skipping /mcp-auth login step for already-authenticated users (requires same-domain session sharing — not possible with current API/web-app domain split)

Implementation Order

  1. Kysely migrations: mcp_auth_code + mcp_refresh_token tables
  2. pnpm run generate:types
  3. McpAuthCode domain value object + repository port interface
  4. McpOAuthCodeRepository — Kysely DB implementation
  5. McpRefreshTokenRepository — Kysely DB implementation
  6. McpOAuthService — authorize (param validation only), complete (auth code generation), exchange, refresh use cases
  7. Scheduled cleanup job for expired mcp_auth_code rows (every 10 min)
  8. McpWellKnownController: GET /.well-known/oauth-authorization-server — verify global prefix bypass works
  9. McpOAuthController:
    • GET /mcp/oauth/authorize — param validation + redirect only — @IsPublic()
    • GET /mcp/oauth/businesses — Firebase token in header, returns business list — @IsPublic() + @Throttle
    • POST /mcp/oauth/authorize/complete — Firebase token + auth code generation — @IsPublic() + @Throttle
    • POST /mcp/oauth/token — code exchange + refresh grant — @IsPublic() + @Throttle
  10. Doppler env vars for backend (staging + production)
  11. Doppler env vars for web-app (NEXT_PUBLIC_MCP_ISSUER_URL, NEXT_PUBLIC_MCP_ALLOWED_REDIRECT_ORIGIN)
  12. apps/web-app/mcp-auth page (Step 1 login → Step 2 business selector → Step 3 POST to /authorize/complete → window.location)
  13. End-to-end test with ChatGPT connector

Unit Test Requirements

  • Domain: 100% — McpAuthCode value object (expiry validation, code format assertions)
  • Application: 90% — McpOAuthService: happy path for each grant type, all invalid_grant branches, expired code rejection, invalid Firebase token rejection in complete, business membership check failure, concurrent refresh scenario (second call fails after first revokes)
  • Infrastructure: 80% — McpOAuthCodeRepository (expired code rejection, single-use deletion), McpRefreshTokenRepository (hash storage, revocation flag)
  • Integration: At least one e2e test covering the full flow: GET /authorize (param validation + redirect) → GET /businesses (Firebase token, returns list) → POST /authorize/complete (Firebase token + business_id) → POST /token (code exchange) → POST /token (refresh) → second refresh (rotation produces new token, old is rejected)