MCP Session and Auth Troubleshooting Runbook
Use this runbook when MCP clients fail with 400, 401, missing tools, or unexpected tenant/business context.
Scope covered by this runbook:
- apps/backend/src/mcp/interfaces/mcp.controller.ts
- apps/backend/src/mcp/interfaces/guards/mcp-auth.guard.ts
- apps/backend/src/mcp/interfaces/guards/mcp-api-key.guard.ts
- apps/backend/src/mcp/interfaces/guards/mcp-token.guard.ts
- apps/backend/src/mcp/application/mcp-token.service.ts
- apps/backend/src/mcp/application/mcp-session.service.ts
1. Symptom triage
Start from the first matching symptom:
| Symptom | Likely layer |
|---|---|
| 401 Authentication required: provide a valid MCP API key or access token on POST /mcp | McpAuthGuard (both API key and JWT rejected) |
| 401 on POST /mcp/token | Firebase ID token validation in McpTokenService.exchangeFirebaseToken |
| 401 on POST /mcp/token/refresh | Invalid MCP JWT or refresh outside 7-day grace window |
| 400 mcp-session-id header required | Request is not initialize and missing session header |
| 400 Unknown session ID | Session not found in in-memory map (often routing/affinity issue) |
| Tool missing from tools/list | Scope or role filtering in ToolRegistry.listFor |
| Tool appears but returns isError: true | Tool handler validation or owning module use-case error |
| set_active_business has no effect | Session principal update not persisted or new session started |
| AI client expects /mcp/oauth/authorize or /mcp/oauth/token | Client is following a proposed OAuth design, not the current backend implementation |
2. Validate endpoint usage quickly
Expected flow
- POST /mcp/token (V2 only) or obtain V1 API key
- POST /mcp with initialize body and Authorization
- Save mcp-session-id response header
- POST /mcp tool calls with both Authorization and mcp-session-id
- Optional: GET /mcp SSE with both headers
- DELETE /mcp with both headers
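The header rules in the flow above can be sketched as a small client-side helper. This is a minimal sketch, not backend code: `buildMcpHeaders`, `McpRequestInput`, and the `isInitialize` flag are hypothetical names introduced for illustration.

```typescript
// Hypothetical client-side helper; names are illustrative, not taken from the backend.
type McpMethod = "POST" | "GET" | "DELETE";

interface McpRequestInput {
  method: McpMethod;
  credential: string;     // V1 API key or V2 MCP token
  sessionId?: string;     // value of the mcp-session-id response header
  isInitialize?: boolean; // true only for the first POST /mcp
}

function buildMcpHeaders(input: McpRequestInput): Record<string, string> {
  // Authorization is required on every MCP request, including GET and DELETE.
  const headers: Record<string, string> = {
    Authorization: "Bearer " + input.credential,
  };
  // Only the initial POST /mcp initialize request may omit the session header.
  const initialize = input.method === "POST" && input.isInitialize === true;
  if (!initialize) {
    if (!input.sessionId) {
      throw new Error("mcp-session-id is required for every non-initialize request");
    }
    headers["mcp-session-id"] = input.sessionId;
  }
  return headers;
}
```

Failing fast on the client when the session ID is missing surfaces the problem before the backend answers with 400 mcp-session-id header required.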
Common misuse checks
- GET /mcp and DELETE /mcp also require Authorization (same McpAuthGuard as POST /mcp)
- Missing mcp-session-id is valid only for the initial initialize request
- Reusing a session ID after instance restart or route change will fail with Unknown session ID
- If clients auto-discover OAuth metadata, verify:
  - GET /.well-known/oauth-authorization-server
  - GET /.well-known/oauth-protected-resource
- Current discovery metadata points to POST /mcp/token; there is no current /mcp/oauth/authorize or /mcp/oauth/token controller.
3. 401 debugging checklist
A) V1 API key path
- Confirm header format:
Authorization: Bearer fp_mcp_<hex>
- Verify key status through API:
curl -s "https://api.flowandgrow.tech/mcp/keys?businessId=<business-uuid>" \
-H "Authorization: Bearer <firebase-id-token>"
- Confirm:
  - isActive = true
  - expiresAt is null or in the future
  - key scopes include required tool scopes
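The three confirmations above can be checked mechanically once the key record is in hand. The sketch below assumes a record shape with only the fields named in this runbook (`isActive`, `expiresAt`, scopes) and assumes `expiresAt` is an ISO 8601 string; the actual response shape of GET /mcp/keys may differ. `McpKeyRecord` and `keyProblems` are hypothetical names.

```typescript
// Assumed record shape: only isActive, expiresAt, and scopes are named in this runbook.
interface McpKeyRecord {
  isActive: boolean;
  expiresAt: string | null; // assumed ISO 8601 timestamp, or null for no expiry
  scopes: string[];
}

// Returns a list of human-readable problems; an empty list means the key looks usable.
function keyProblems(
  key: McpKeyRecord,
  requiredScopes: string[],
  now: Date = new Date()
): string[] {
  const problems: string[] = [];
  if (!key.isActive) {
    problems.push("key is not active");
  }
  if (key.expiresAt !== null && new Date(key.expiresAt).getTime() <= now.getTime()) {
    problems.push("key is expired");
  }
  for (const scope of requiredScopes) {
    if (key.scopes.indexOf(scope) === -1) {
      problems.push("missing scope: " + scope);
    }
  }
  return problems;
}
```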
B) V2 token path
- Exchange Firebase token again:
curl -s -X POST https://api.flowandgrow.tech/mcp/token \
-H "Content-Type: application/json" \
-d '{"firebaseIdToken":"<firebase-id-token>"}'
- If token exchange fails:
  - 401: Firebase token invalid/expired
  - 403: user has no active memberships (business_user.is_active)
- If MCP token is expired, refresh:
curl -s -X POST https://api.flowandgrow.tech/mcp/token/refresh \
-H "Content-Type: application/json" \
-d '{"token":"<expired-or-valid-mcp-token>"}'
- Refresh constraints from code:
- JWT must be validly signed
- token can be expired, but not more than 7 days
- memberships are re-resolved from DB at refresh time
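The expiry constraint can be checked on the client before attempting a refresh. This is a minimal sketch that covers only the 7-day grace window; it does not verify the signature, and whether the backend treats the exact 7-day boundary as inclusive is an assumption. `isWithinRefreshWindow` is a hypothetical name.

```typescript
// 7-day grace window stated in this runbook, expressed in seconds.
const REFRESH_GRACE_SECONDS = 7 * 24 * 60 * 60;

// expSeconds is the JWT "exp" claim (seconds since the Unix epoch).
// Returns true when the token is still valid or expired by no more than the grace window.
// Assumption: the boundary is treated as inclusive here.
function isWithinRefreshWindow(expSeconds: number, nowSeconds: number): boolean {
  return nowSeconds - expSeconds <= REFRESH_GRACE_SECONDS;
}
```

If this returns false, a refresh call is expected to fail with 401 and the client should go back to POST /mcp/token with a fresh Firebase ID token.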
4. 400 Unknown session ID debugging checklist
This error is returned when McpSessionService.getSession(sessionId) cannot find an active in-memory session.
Step-by-step checks
- Verify client sends exact mcp-session-id returned by initialize response
- Verify the request includes Authorization (required before session lookup)
- Confirm no backend restart occurred between initialize and tool call
- On Cloud Run, confirm --session-affinity is enabled
- Confirm client is not opening initialize on one host and tool calls on another host/alias
Important constraint
Redis in V2 stores principal state (mcp:session:{sessionId}), not the full streamable transport.
If in-memory session state is gone, Redis alone cannot resurrect the session; client must re-initialize.
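Because a lost in-memory session cannot be restored, a client needs to map each error to the right recovery step. The sketch below assumes the error text from the triage table appears somewhere in the response body; the exact body shape is not specified in this runbook. `RecoveryAction` and `recoveryFor` are hypothetical names.

```typescript
// Hypothetical client-side mapping from an MCP HTTP error to a recovery step.
type RecoveryAction =
  | "re-initialize"        // start a new session with POST /mcp initialize
  | "send-session-header"  // request was missing mcp-session-id
  | "refresh-credential"   // refresh the MCP token or replace the API key
  | "none";

// Assumption: the response body contains the error text listed in the triage table.
function recoveryFor(status: number, body: string): RecoveryAction {
  if (status === 401) {
    return "refresh-credential";
  }
  if (status === 400 && body.indexOf("Unknown session ID") !== -1) {
    return "re-initialize";
  }
  if (status === 400 && body.indexOf("mcp-session-id header required") !== -1) {
    return "send-session-header";
  }
  return "none";
}
```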
5. Missing tools in tools/list
Tool visibility is filtered at session initialization by role/scopes.
Check principal shape
- role: platform_operator | tenant_developer | merchant
- scopes: includes required scopes (for example pos:intents)
- authorizedBusinessIds: affects set_active_business visibility
Known visibility rules
- set_active_business only appears if authorizedBusinessIds.length > 1
- Intent tools require pos:intents and are never shown to tenant_developer
- Operator tools only appear for platform_operator
If scopes changed on the backend, re-create a new session (initialize) so visibility is recalculated.
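The visibility rules listed in this section can be expressed as a predicate for reasoning about what a given principal should see. This is a sketch of the documented rules only, not the ToolRegistry.listFor implementation; `ToolKind` and `isVisible` are hypothetical names, and the real registry may apply further filters.

```typescript
type Role = "platform_operator" | "tenant_developer" | "merchant";

interface Principal {
  role: Role;
  scopes: string[];
  authorizedBusinessIds: string[];
}

// Hypothetical tool categories, used only to illustrate the documented rules.
type ToolKind = "set_active_business" | "intent" | "operator";

function isVisible(kind: ToolKind, p: Principal): boolean {
  switch (kind) {
    case "set_active_business":
      // Only useful when there is more than one business to switch between.
      return p.authorizedBusinessIds.length > 1;
    case "intent":
      // Requires the pos:intents scope and is never shown to tenant_developer.
      return p.role !== "tenant_developer" && p.scopes.indexOf("pos:intents") !== -1;
    case "operator":
      return p.role === "platform_operator";
  }
}
```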
6. Tool call returns isError: true
At this point auth/session routing worked and the tool handler ran.
Common source-backed cases
- void_transaction without confirm: true intentionally returns a preview message. Call it again with the same transactionId and confirm: true only after user confirmation.
- log_hours requires a resolvable MCP principal userId. Merchant callers cannot pass userId; only platform_operator can override it.
- log_hours delegates to Implementation Portal time-entry rules, including hourly-only steps and minimum 0.25 hours.
- Date arguments use ISO 8601 date-times on tool schemas such as summarize_period, list_sales, and list_purchases.
- Detail-by-ID tools delegate to the owning module service. If the tool returns not found for an ID that exists, verify the owning service path and tenant expectations before changing MCP wrapper code.
Next checks
- Compare the AI-client arguments against the tool argument table in MCP API Reference.
- Re-run the same call with a minimal JSON-RPC payload and saved mcp-session-id.
- Inspect the owning module use case or repository path named in the tool factory under apps/backend/src/mcp/tools/.
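A minimal JSON-RPC payload for re-running a tool call might look like the sketch below. It uses the standard MCP tools/call request shape; the `toolCall` helper is hypothetical, and `<transaction-id>` is a placeholder to replace with a real ID. The two constants illustrate the void_transaction preview-then-confirm sequence described in this section.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that builds a minimal tools/call request body.
function toolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id: id, method: "tools/call", params: { name: name, arguments: args } };
}

// Step 1: no confirm flag, so the tool is expected to return a preview message.
const preview = toolCall(1, "void_transaction", { transactionId: "<transaction-id>" });

// Step 2: same transactionId plus confirm: true, sent only after user confirmation.
const confirmed = toolCall(2, "void_transaction", {
  transactionId: "<transaction-id>",
  confirm: true,
});
```

Send either body with POST /mcp together with the Authorization and mcp-session-id headers.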
7. Production-safe recovery actions
Use this order to minimize disruption:
- Regenerate token/key (credential reset only)
- Re-initialize MCP session (new mcp-session-id)
- Reconnect client transport (Cursor/Claude restart if needed)
- Validate Cloud Run session affinity and single host usage
- Rotate/revoke old API keys if compromise is suspected
8. Preventive practices
- Always include both headers on non-initialize calls:
  - Authorization
  - mcp-session-id
- Keep one canonical MCP base URL per environment
- Re-initialize sessions after deploys/restarts
- For V2 clients, implement proactive refresh before expiresIn hits zero
- Track MCP_TOKEN_SECRET and MCP_TOKEN_TTL_SECONDS changes as release notes
- Re-open MCP sessions when role/scope assignments change (tool list is computed at initialize time)
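Proactive refresh for V2 clients can be as simple as scheduling the refresh call at a fraction of the token lifetime. The sketch below assumes expiresIn is expressed in seconds (consistent with MCP_TOKEN_TTL_SECONDS, but not confirmed by this runbook); `refreshDelayMs` and `safetyFraction` are hypothetical names.

```typescript
// Returns how many milliseconds to wait before calling POST /mcp/token/refresh.
// Assumption: expiresInSeconds is the expiresIn value from the token response, in seconds.
// safetyFraction is a hypothetical tuning knob: 0.8 means "refresh at 80% of the lifetime".
function refreshDelayMs(expiresInSeconds: number, safetyFraction: number = 0.8): number {
  if (expiresInSeconds <= 0) {
    return 0; // already expired: refresh immediately
  }
  // Clamp the fraction to [0, 1] so the delay never exceeds the token lifetime.
  const fraction = Math.min(Math.max(safetyFraction, 0), 1);
  return Math.floor(expiresInSeconds * fraction * 1000);
}
```

A client would pass the result to its timer of choice and call the refresh endpoint when it fires, falling back to a full token exchange if the refresh returns 401.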