MCP Session and Auth Troubleshooting Runbook

Use this runbook when MCP clients receive 400 or 401 responses, see tools missing from tools/list, or operate in an unexpected tenant/business context.

Scope covered by this runbook:

  • apps/backend/src/mcp/interfaces/mcp.controller.ts
  • apps/backend/src/mcp/interfaces/guards/mcp-auth.guard.ts
  • apps/backend/src/mcp/interfaces/guards/mcp-api-key.guard.ts
  • apps/backend/src/mcp/interfaces/guards/mcp-token.guard.ts
  • apps/backend/src/mcp/application/mcp-token.service.ts
  • apps/backend/src/mcp/application/mcp-session.service.ts

1. Symptom triage

Start from the first matching symptom:

| Symptom | Likely layer |
| --- | --- |
| 401 Authentication required: provide a valid MCP API key or access token on POST /mcp | McpAuthGuard (both API key and JWT rejected) |
| 401 on POST /mcp/token | Firebase ID token validation in McpTokenService.exchangeFirebaseToken |
| 401 on POST /mcp/token/refresh | Invalid MCP JWT or refresh outside 7-day grace window |
| 400 mcp-session-id header required | Request is not initialize and missing session header |
| 400 Unknown session ID | Session not found in in-memory map (often routing/affinity issue) |
| Tool missing from tools/list | Scope or role filtering in ToolRegistry.listFor |
| set_active_business has no effect | Session principal update not persisted or new session started |
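
For scripted triage (for example when scanning client logs), the HTTP rows of the table can be sketched as a lookup. This is a minimal illustration, not code from the backend: the `Observation` shape and the `likelyLayer` name are hypothetical, and the message matching assumes the error strings appear as shown in the table.

```typescript
// Hypothetical helper: maps an observed HTTP failure to the layer named in the triage table.
type Observation = { status: number; path: string; message?: string };

function likelyLayer(o: Observation): string {
  const msg = (o.message ?? "").toLowerCase();
  if (o.status === 401 && o.path === "/mcp/token") {
    return "Firebase ID token validation (McpTokenService.exchangeFirebaseToken)";
  }
  if (o.status === 401 && o.path === "/mcp/token/refresh") {
    return "Invalid MCP JWT or refresh outside 7-day grace window";
  }
  if (o.status === 401 && o.path === "/mcp") {
    return "McpAuthGuard (both API key and JWT rejected)";
  }
  if (o.status === 400 && msg.includes("header required")) {
    return "Non-initialize request without mcp-session-id";
  }
  if (o.status === 400 && msg.includes("unknown session id")) {
    return "Session missing from in-memory map (routing/affinity)";
  }
  // Missing tools and set_active_business issues are not HTTP errors; see the last two table rows.
  return "No HTTP match: check tool visibility and session principal";
}
```

The non-HTTP symptoms (missing tools, set_active_business having no effect) fall through to the final branch on purpose, since they need the principal checks in section 5 rather than a status-code match.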

2. Validate endpoint usage quickly

Expected flow

  1. POST /mcp/token (V2 only) or obtain V1 API key
  2. POST /mcp with initialize body and Authorization
  3. Save mcp-session-id response header
  4. POST /mcp tool calls with both Authorization and mcp-session-id
  5. Optional: GET /mcp SSE with both headers
  6. DELETE /mcp with both headers
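
The header requirements of the flow above can be sketched as a small request planner. This is an illustration under assumptions: `planRequest` and the step names are hypothetical, and the `Bearer` scheme is assumed for V2 tokens as well as V1 API keys.

```typescript
type Step = "initialize" | "toolCall" | "sse" | "close";

type PlannedRequest = {
  method: "POST" | "GET" | "DELETE";
  path: string;
  headers: Record<string, string>;
};

// Hypothetical planner: every step sends Authorization; only initialize may omit mcp-session-id.
function planRequest(step: Step, credential: string, sessionId?: string): PlannedRequest {
  const headers: Record<string, string> = { Authorization: `Bearer ${credential}` };
  if (step !== "initialize") {
    if (!sessionId) {
      throw new Error(`${step} requires the mcp-session-id returned by initialize`);
    }
    headers["mcp-session-id"] = sessionId;
  }
  const method = step === "sse" ? "GET" : step === "close" ? "DELETE" : "POST";
  return { method, path: "/mcp", headers };
}
```

For example, `planRequest("toolCall", token, savedSessionId)` yields a POST to /mcp carrying both headers, while calling it without a session ID fails before any network traffic is sent.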

Common misuse checks

  • GET /mcp and DELETE /mcp also require Authorization (same McpAuthGuard as POST /mcp)
  • Missing mcp-session-id is valid only for the initial initialize request
  • Reusing a session ID after instance restart or route change will fail with Unknown session ID
  • If clients auto-discover OAuth metadata, verify:
    • GET /.well-known/oauth-authorization-server
    • GET /.well-known/oauth-protected-resource
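
When client traffic has been captured (proxy log, HAR export), the first three checks can be applied mechanically. The sketch below assumes a hypothetical `CapturedRequest` shape filled in by whatever capture tool is in use; it does not inspect real traffic itself.

```typescript
type CapturedRequest = {
  method: "POST" | "GET" | "DELETE";
  host: string;
  isInitialize: boolean;
  hasAuthorization: boolean;
  sessionId?: string;
};

// Hypothetical checker: returns one finding per misuse pattern from the list above.
function findMisuse(requests: CapturedRequest[]): string[] {
  const findings: string[] = [];
  const hosts = new Set(requests.map((r) => r.host));
  if (hosts.size > 1) {
    findings.push(`requests span ${hosts.size} hosts; session lookups may fail with Unknown session ID`);
  }
  requests.forEach((r, i) => {
    if (!r.hasAuthorization) {
      findings.push(`#${i} ${r.method}: missing Authorization`);
    }
    if (!r.isInitialize && !r.sessionId) {
      findings.push(`#${i} ${r.method}: missing mcp-session-id`);
    }
  });
  return findings;
}
```

An empty result only means none of these patterns were found; a restart between initialize and the next call is not visible in client-side captures.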

3. 401 debugging checklist

A) V1 API key path

  1. Confirm header format:
     Authorization: Bearer fp_mcp_<hex>
  2. Verify key status through API:
     curl -s "https://api.flowpos.app/mcp/keys?businessId=<business-uuid>" \
       -H "Authorization: Bearer <firebase-id-token>"
  3. Confirm:
    • isActive = true
    • expiresAt is null or in the future
    • key scopes include required tool scopes
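
The three confirmations can be sketched as a single check over one key record. The field names `isActive`, `expiresAt`, and `scopes` come from the list above; everything else is an assumption, including the record shape, `expiresAt` being an ISO 8601 string, and the hex part of the key being case-insensitive.

```typescript
// Assumed shape of one entry returned by GET /mcp/keys; only the three field names are from this runbook.
type McpKeyRecord = { isActive: boolean; expiresAt: string | null; scopes: string[] };

const KEY_FORMAT = /^fp_mcp_[0-9a-f]+$/i;

// Hypothetical checker: returns a list of reasons the key could be rejected (empty if none found).
function checkKey(rawKey: string, record: McpKeyRecord, requiredScopes: string[], now: Date): string[] {
  const problems: string[] = [];
  if (!KEY_FORMAT.test(rawKey)) {
    problems.push("key does not match fp_mcp_<hex>");
  }
  if (!record.isActive) {
    problems.push("isActive is false");
  }
  if (record.expiresAt !== null && new Date(record.expiresAt).getTime() <= now.getTime()) {
    problems.push("expiresAt is in the past");
  }
  for (const scope of requiredScopes) {
    if (!record.scopes.includes(scope)) {
      problems.push(`missing scope ${scope}`);
    }
  }
  return problems;
}
```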

B) V2 token path

  1. Exchange Firebase token again:
     curl -s -X POST https://api.flowpos.app/mcp/token \
       -H "Content-Type: application/json" \
       -d '{"firebaseIdToken":"<firebase-id-token>"}'
  2. If token exchange fails:
    • 401: Firebase token invalid/expired
    • 403: user has no active memberships (business_user.is_active)
  3. If MCP token is expired, refresh:
     curl -s -X POST https://api.flowpos.app/mcp/token/refresh \
       -H "Content-Type: application/json" \
       -d '{"token":"<expired-or-valid-mcp-token>"}'
  4. Refresh constraints from code:
    • JWT must be validly signed
    • token may be expired, but by no more than 7 days
    • memberships are re-resolved from DB at refresh time
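
The 7-day constraint is plain arithmetic on the token's expiry time. The sketch below assumes the expiry is available as a standard JWT `exp` claim in seconds since the epoch and treats the 7-day boundary as inclusive; both are assumptions, and the signature check is out of scope here.

```typescript
const REFRESH_GRACE_SECONDS = 7 * 24 * 60 * 60; // 7-day grace window from the constraints above

// Hypothetical helper: `expSeconds` is the token's `exp` claim, `nowSeconds` the current epoch time.
function refreshEligibility(expSeconds: number, nowSeconds: number): { expired: boolean; refreshable: boolean } {
  const expired = nowSeconds >= expSeconds;
  // Unexpired tokens are refreshable; expired ones only within the grace window (inclusive by assumption).
  const refreshable = !expired || nowSeconds - expSeconds <= REFRESH_GRACE_SECONDS;
  return { expired, refreshable };
}
```

A token that expired 8 days ago therefore reports `refreshable: false`, and the client has to go back to POST /mcp/token with a fresh Firebase ID token.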

4. 400 Unknown session ID debugging checklist

This error is returned when McpSessionService.getSession(sessionId) cannot find an active in-memory session.

Step-by-step checks

  1. Verify client sends exact mcp-session-id returned by initialize response
  2. Verify the request includes Authorization (required before session lookup)
  3. Confirm no backend restart occurred between initialize and tool call
  4. On Cloud Run, confirm --session-affinity is enabled
  5. Confirm client is not sending initialize to one host and tool calls to another host/alias

Important constraint

Redis in V2 stores principal state (mcp:session:{sessionId}), not the full streamable transport. If in-memory session state is gone, Redis alone cannot resurrect the session; client must re-initialize.
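
A toy model makes the constraint concrete. This is not the real McpSessionService: `SessionModel` and its methods are hypothetical, and a second in-process map stands in for Redis. It only illustrates why a surviving principal record does not bring the session back.

```typescript
// Toy model only; names and shapes are illustrative, not taken from the backend.
class SessionModel {
  // Per-instance transport state, lost on restart or when a request lands on another instance.
  private transports = new Map<string, { open: boolean }>();
  // Stand-in for Redis keys of the form mcp:session:{sessionId}.
  private principals = new Map<string, { userId: string }>();

  initialize(sessionId: string, userId: string): void {
    this.transports.set(sessionId, { open: true });
    this.principals.set(`mcp:session:${sessionId}`, { userId });
  }

  restartInstance(): void {
    this.transports.clear(); // process memory is gone; the Redis stand-in is untouched
  }

  handle(sessionId: string): { status: number; body: string } {
    if (!this.transports.has(sessionId)) {
      return { status: 400, body: "Unknown session ID" };
    }
    return { status: 200, body: "ok" };
  }

  hasPrincipal(sessionId: string): boolean {
    return this.principals.has(`mcp:session:${sessionId}`);
  }
}
```

After `restartInstance()`, `hasPrincipal("s1")` is still true while `handle("s1")` returns the 400, which matches the recovery path described above: the client re-initializes and adopts the new session ID.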


5. Missing tools in tools/list

Tool visibility is filtered at session initialization by role/scopes.

Check principal shape

  • role: platform_operator | tenant_developer | merchant
  • scopes: includes required scopes (for example pos:intents)
  • authorizedBusinessIds: affects set_active_business visibility

Known visibility rules

  • set_active_business only appears if authorizedBusinessIds.length > 1
  • Intent tools require pos:intents and are never shown to tenant_developer
  • Operator tools only appear for platform_operator

If scopes changed on the backend, create a new session (initialize) so visibility is recalculated.
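
The three known rules can be written as a predicate to reason about what a given principal should see. This is a sketch of the rules as stated in this runbook, not the ToolRegistry.listFor implementation; the `ToolKind` grouping and the `isVisible` name are hypothetical.

```typescript
type Role = "platform_operator" | "tenant_developer" | "merchant";
type Principal = { role: Role; scopes: string[]; authorizedBusinessIds: string[] };
// Hypothetical grouping of tools into the three categories named in the rules above.
type ToolKind = "set_active_business" | "intent" | "operator";

function isVisible(kind: ToolKind, p: Principal): boolean {
  switch (kind) {
    case "set_active_business":
      return p.authorizedBusinessIds.length > 1;
    case "intent":
      return p.scopes.includes("pos:intents") && p.role !== "tenant_developer";
    case "operator":
      return p.role === "platform_operator";
  }
}
```

For example, a merchant with `pos:intents` and a single authorized business should see intent tools but neither set_active_business nor operator tools. If the actual tools/list disagrees with this predicate, compare the principal stored for the session against what was expected.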


6. Production-safe recovery actions

Use this order to minimize disruption:

  1. Regenerate token/key (credential reset only)
  2. Re-initialize MCP session (new mcp-session-id)
  3. Reconnect client transport (Cursor/Claude restart if needed)
  4. Validate Cloud Run session affinity and single host usage
  5. Rotate/revoke old API keys if compromise is suspected

7. Preventive practices

  • Always include both headers on non-initialize calls:
    • Authorization
    • mcp-session-id
  • Keep one canonical MCP base URL per environment
  • Re-initialize sessions after deploys/restarts
  • For V2 clients, implement proactive refresh before expiresIn hits zero
  • Record changes to MCP_TOKEN_SECRET and MCP_TOKEN_TTL_SECONDS in release notes
  • Re-open MCP sessions when role/scope assignments change (tool list is computed at initialize time)
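
Proactive refresh for V2 clients comes down to scheduling the refresh call a little before expiry. The sketch below assumes `expiresIn` is a lifetime in seconds counted from when the token was issued; `secondsUntilRefresh` and the 60-second default margin are hypothetical choices, not values from the backend.

```typescript
// Hypothetical helper: how long to wait before calling POST /mcp/token/refresh.
// `issuedAtSeconds` and `nowSeconds` are epoch seconds; `expiresIn` is assumed to be in seconds.
function secondsUntilRefresh(
  issuedAtSeconds: number,
  expiresIn: number,
  nowSeconds: number,
  marginSeconds: number = 60,
): number {
  const refreshAt = issuedAtSeconds + expiresIn - marginSeconds;
  return Math.max(0, refreshAt - nowSeconds); // 0 means refresh immediately
}
```

A client would typically feed the result into its own timer and recompute it after every successful refresh, since the TTL can change between releases.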