Print Bridge — Specification

Status: Specified, ready for implementation
Scope: New standalone app apps/print-bridge/ (local Docker agent + web UI)
Depends on: KDS & Printing infrastructure (see kds-tracking-model)


Problem

FlowPOS runs on GCP Cloud Run (serverless, Google's cloud). Thermal printers sit on the restaurant's local network at a private IP (e.g. 192.168.1.100:9100). There is no network path from GCP to a private LAN.

The backend's NetworkThermalPrinterAdapter sends raw TCP ESC/POS — this works in local development but will never reach a restaurant's printer in production.

The Print Bridge solves this by running inside the restaurant's network, acting as the last-mile delivery agent between the cloud and the physical printer.


Architecture

FlowPOS Cloud (GCP Cloud Run)
─────────────────────────────────
/restaurant WebSocket
PATCH /print-jobs/:id
│ ▲
│ │ (internet)
▼ │
Restaurant LAN
─────────────────────────────────
Print Bridge (Docker container)

├─── Printer A (TCP :9100) ──▶ Grill Station + Desserts Station
├─── Printer B (TCP :9100) ──▶ Salads Station
└─── Printer C (USB) ──▶ Bar Station + Cocktails Station

Authority boundaries

FlowPOS cloud is the authority for print job lifecycle (pending → printed / failed). Print Bridge is the authority for local transport state, printer settings, and ephemeral operational telemetry. The bridge's internal API must not become a second source of truth for job state: it reads job metadata from FlowPOS, delivers bytes to hardware, and reports results back.

Flow (per printer instance)

  1. Bridge authenticates with FlowPOS using a device pairing token (see Authentication).
  2. For each configured printer instance, connects to the /restaurant WebSocket namespace and subscribes to print_job.new for each of its station IDs. One printer may serve multiple stations; all subscriptions share the same physical transport.
  3. On each print_job.new event:
     a. Layer 1 check: skip if the job ID is already in-flight or recently completed in the local registry.
     b. Layer 2 check: computes the key orderItemId + stationId + (modifiedAt ?? createdAt) from the event payload; skips the job if this key already exists in the order item registry (marks the job printed with lastError: "duplicate suppressed by bridge"; see NFR #1). This check runs before the GET to avoid unnecessary network I/O.
     c. Max-age check: if the job is older than maxJobAgeMinutes (default 60), it is held in a pending-review set rather than auto-printed or auto-failed. The Status page shows a Recovery Banner prompting the operator to Print All, Discard All, or Review (see NFR #2).
     d. Marks the job as in-flight in the local registry.
     e. Fetches full job details via GET /print-jobs/:id?expand=details. This GET is required: the print_job.new WebSocket event payload contains only the base print_job record (id, orderItemId, stationId, status, attempts, createdAt, lastError, modifiedAt) and does not include ticket content. The expand=details response adds productName, tableNumber, and the nested orderItem needed to render the ESC/POS ticket. See Backend changes required for a recommended future optimisation that would eliminate this round-trip.
     f. Renders the ESC/POS ticket locally. The station name is printed on the ticket so kitchen staff can identify which logical area it belongs to even though all tickets come from the same printer.
     g. Sends the binary buffer to the printer (TCP or USB). Jobs from multiple stations are queued and delivered sequentially, with no interleaving. Success means: socket connected, all bytes sent, connection closed cleanly. It does not mean paper-in-hand.
     h. Calls PATCH /print-jobs/:id with { status: "printed" } on success, or { status: "failed", lastError: "..." } on error.
     i. Moves the job from in-flight to completed in the local registry and adds its Layer 2 key to the order item registry.
  4. Polls GET /print-jobs?stationIds[]=id1&stationIds[]=id2&status=pending every 30 s as a fallback for missed WebSocket events (batched — one request per printer instance, not per station). Jobs already in the local registry are skipped; jobs older than maxJobAgeMinutes are held for operator review.
  5. Reconnects automatically on WebSocket disconnect with exponential backoff (1 s → 2 s → 4 s → max 30 s).
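The reconnect schedule in step 5 can be sketched as a small pure function. This is a minimal illustration only; the function name reconnectDelayMs and its parameters are hypothetical and not part of the specified code base.

```typescript
// Hypothetical helper: delay before reconnect attempt `attempt` (0-based).
// Doubles on every attempt (1 s, 2 s, 4 s, ...) and is capped at maxMs.
function reconnectDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  const exponent = Math.max(0, Math.floor(attempt));
  return Math.min(maxMs, baseMs * 2 ** exponent);
}

// Delays for the first six attempts.
const schedule = [0, 1, 2, 3, 4, 5].map((n) => reconnectDelayMs(n));
console.log(schedule.join(", "));
```

A caller would reset the attempt counter to 0 after a successful connect so the next outage starts again at 1 s.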

Authentication

Two separate auth contexts

The bridge has two distinct authentication concerns that must not be confused:

| Context | Who authenticates | Method | Purpose |
| --- | --- | --- | --- |
| Web UI → Bridge API | Human operator | Firebase email/password (same as frontend-pwa) | Protects http://localhost:3456 from unauthorized LAN access |
| Bridge Agent → FlowPOS | The bridge process | Device pairing token | Allows the agent to subscribe to WebSocket events and update job status |

Web UI authentication (Firebase + local fallback)

The bridge web UI uses the same Firebase email/password login flow as apps/frontend-pwa as its primary auth method:

  • signInWithEmailAndPassword(auth, email, password) from firebase/auth
  • onIdTokenChanged listener manages the session, caches token in sessionStorage under flowpos:auth:cache, and auto-refreshes every 55 minutes
  • Every request from the React UI to the bridge Express API (/api/*) sends the Firebase ID token as Authorization: Bearer <token>
  • The Express server validates the token using the Firebase Admin SDK. Any unauthenticated request to /api/* returns 401 and the UI redirects to the login page.
  • The login page matches the PWA sign-in page: email/password form, same error messages, same glassmorphism card style.

Firebase config is a build-time variable. VITE_PUBLIC_FIREBASE_CONFIG is baked into the JS bundle at vite build — it cannot be passed as a Docker runtime env var after the image is built. The official ghcr.io/fixxrepo/flowpos-workspace/print-bridge image is pre-built with the production Firebase project config. For self-hosted builds, set VITE_PUBLIC_FIREBASE_CONFIG at image build time, not in docker-compose.yml.

Local fallback auth (offline access). Firebase authentication requires an outbound internet connection. If the restaurant loses internet — exactly when an operator may need to stop the agent or check status — Firebase login fails. To handle this, the bridge supports a local admin password as a fallback:

  • On first run, if no local password is set, the bridge prompts the operator to set one from the Config page (while internet is available).
  • The password is hashed with bcrypt and stored in config.json as localPasswordHash.
  • The login page shows a "Use local password" link below the Firebase form. Local login issues a short-lived signed JWT (signed with the derived BRIDGE_JWT_KEY — see key derivation below), accepted by the Express middleware alongside Firebase tokens.
  • Local login grants the same access as Firebase login. It is not a backdoor — it requires the operator to have previously set the password while online.

Emergency access. If neither Firebase nor a local password is available (e.g. factory reset, local password never set, internet down), the operator can add BRIDGE_EMERGENCY_TOKEN=<any-random-string> to the Docker environment and restart the container. The Express middleware accepts this token as a valid auth credential. Remove it immediately after recovery.

The bridge logs a prominent warning on every startup when BRIDGE_EMERGENCY_TOKEN is set:

⚠️  WARNING: BRIDGE_EMERGENCY_TOKEN is set. The bridge UI is accessible to
anyone with this token. Remove this variable as soon as access is restored.

The token has no built-in expiry — the operator is solely responsible for removing it.

Key derivation. BRIDGE_SECRET is the single master secret but must never be used directly for two different purposes. Two sub-keys are derived from it at startup using HKDF-SHA256:

BRIDGE_ENCRYPT_KEY = HKDF(BRIDGE_SECRET, salt="bridge-encrypt", length=32)
BRIDGE_JWT_KEY = HKDF(BRIDGE_SECRET, salt="bridge-jwt", length=32)

BRIDGE_ENCRYPT_KEY is used for AES-256-GCM encryption of the device token. BRIDGE_JWT_KEY is used for signing local fallback JWTs. Compromising one does not compromise the other.
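A minimal sketch of this derivation using Node's built-in crypto.hkdfSync. It mirrors the formulas above by passing the purpose label as the HKDF salt. The empty info argument, the deriveKey helper name, and the fallback secret are assumptions made for illustration.

```typescript
import { hkdfSync } from "node:crypto";

// Hypothetical helper: derives a 32-byte sub-key from the master secret.
// The purpose label is used as the HKDF salt, as in the formulas above;
// the empty `info` value is an assumption.
function deriveKey(masterSecret: string, purpose: string): Buffer {
  return Buffer.from(hkdfSync("sha256", masterSecret, purpose, "", 32));
}

// Fallback value is for illustration only; a real deployment must set BRIDGE_SECRET.
const masterSecret = process.env.BRIDGE_SECRET ?? "example-only-secret";

const encryptKey = deriveKey(masterSecret, "bridge-encrypt"); // AES-256-GCM key
const jwtKey = deriveKey(masterSecret, "bridge-jwt"); // local JWT signing key
```

Because the two labels differ, the two outputs differ, and neither sub-key can be computed from the other without the master secret.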

Agent authentication (device pairing token)

The bridge agent (not the UI) uses a device pairing model — the same mechanism used by KDS devices — to communicate with FlowPOS. This ensures printing survives password rotations and staff turnover.

Setup flow:

  1. In the FlowPOS Admin, navigate to Kitchen Stations → Print Bridge Devices → Generate Pairing Code.
  2. FlowPOS generates a short-lived code (e.g. AB12-XY99), valid for 10 minutes, stored in Redis as kds:pair:bridge:{code} → { businessId, locationId }. The code is scoped to the business and location of the admin who generated it.
  3. The logged-in operator enters the code into the Print Bridge Config page.
  4. The bridge sends POST /kitchen-stations/pair-bridge with { pairingCode }.
  5. FlowPOS validates the code, reads businessId + locationId from Redis, creates a kds_device record (type: print_bridge, scoped to that business/location), and returns a long-lived device token (SHA-256 hashed before storage).
  6. The bridge stores the device token encrypted in config.json. It is used as a Bearer token on all agent API calls and WebSocket auth.

This token survives container restarts and can be revoked from the FlowPOS Admin without affecting any user account.

Auth module interface (src/auth.ts)

interface AuthProvider {
  getToken(): Promise<string>; // returns valid token (refreshes if needed)
  invalidate(): void; // clear cached token on 401
  readonly status: "ok" | "error";
  readonly lastRefreshedAt: Date | null;
}

Multi-Station Support

One bridge container manages multiple printer instances. Each instance has its own connection settings, independent job queue, and WebSocket subscriptions — one per assigned station. A single physical printer can serve multiple stations. This avoids deploying one container per printer on a single Raspberry Pi or NUC.

Config structure

{
  "apiUrl": "https://api.flowpos.app",
  "deviceToken": "<encrypted>",
  "agentEnabled": true,
  "printers": [
    {
      "id": "printer-1",
      "stationIds": ["uuid-grill", "uuid-desserts"],
      "label": "Hot Line",
      "connectionType": "tcp",
      "printerUrl": "tcp://192.168.1.100:9100",
      "paperWidthMm": 80,
      "headerLines": ["My Restaurant"],
      "cutAfterEach": true,
      "copyCount": 1,
      "maxJobAgeMinutes": 60
    },
    {
      "id": "printer-2",
      "stationIds": ["uuid-bar", "uuid-cocktails"],
      "label": "Bar",
      "connectionType": "usb",
      "usbDevice": "/dev/serial/by-id/usb-Epson_TM-T20_...-port0",
      "paperWidthMm": 58,
      "headerLines": ["My Restaurant"],
      "cutAfterEach": true,
      "copyCount": 1,
      "maxJobAgeMinutes": 60
    }
  ]
}

stationIds is an array. A printer with a single station uses a one-element array. The station name is printed on each ticket header so staff can distinguish Grill from Desserts even when both print on the same machine.

Jobs from all assigned stations are placed into a single FIFO queue per printer and delivered sequentially. This prevents two tickets from interleaving on paper. The queue is in-memory; if the bridge restarts, the poll fallback re-discovers any missed jobs.

When copyCount > 1, the printer transport is held exclusively for all N copies before the next job is dequeued. Copy 1, copy 2, … copy N are sent back-to-back on the same logical queue slot — no other job can slip between them.
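The queueing rules above (one FIFO per printer, all copies of a job in one slot) could look roughly like the following. This is a sketch under stated assumptions: PrinterQueue and SendFn are hypothetical names, and the send function stands in for the real TCP or USB transport.

```typescript
// Hypothetical transport signature: resolves once all bytes are delivered.
type SendFn = (bytes: Uint8Array) => Promise<void>;

class PrinterQueue {
  private tail: Promise<void> = Promise.resolve();
  private readonly send: SendFn;

  constructor(send: SendFn) {
    this.send = send;
  }

  // Enqueues one ticket. All copies run inside the same queue slot,
  // so no other job can be delivered between copy 1 and copy N.
  enqueue(bytes: Uint8Array, copyCount = 1): Promise<void> {
    const run = async (): Promise<void> => {
      for (let copy = 0; copy < copyCount; copy++) {
        await this.send(bytes);
      }
    };
    const result = this.tail.then(run);
    // A failed job must not block the jobs queued behind it.
    this.tail = result.catch(() => undefined);
    return result;
  }
}
```

Chaining every job onto a single promise keeps delivery strictly sequential without a separate worker loop. The returned promise still rejects for the failing job, so the caller can report it as failed.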

UI impact

The Printer Settings page allows adding, editing, and removing printer instances, with a multi-select station dropdown populated from GET /api/stations. Stations already assigned to another printer instance are shown as disabled in the dropdown — a station can only belong to one printer at a time. The Status page shows one card per printer listing all its assigned stations. The job feed is filterable by printer or by individual station.

Station uniqueness is enforced at the API level. POST /api/printers and PATCH /api/printers/:id reject any request where a stationId is already assigned to a different printer instance, returning HTTP 409 with a message identifying which printer holds the conflicting station. This prevents duplicate subscriptions that would cause every ticket to print twice.
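The uniqueness check behind that 409 response can be sketched as a pure function. The names findStationConflict, PrinterConfig, and StationConflict are hypothetical; only the id, label, and stationIds fields come from the config structure above.

```typescript
interface PrinterConfig {
  id: string;
  label: string;
  stationIds: string[];
}

interface StationConflict {
  stationId: string;
  printerId: string;
  printerLabel: string;
}

// Returns the first station that is already assigned to another printer,
// or null when the requested assignment is valid.
function findStationConflict(
  printers: PrinterConfig[],
  stationIds: string[],
  selfId?: string, // set on PATCH so a printer never conflicts with itself
): StationConflict | null {
  for (const printer of printers) {
    if (printer.id === selfId) continue;
    const stationId = stationIds.find((id) => printer.stationIds.includes(id));
    if (stationId !== undefined) {
      return { stationId, printerId: printer.id, printerLabel: printer.label };
    }
  }
  return null;
}
```

A route handler would call this before saving and, on a non-null result, respond with HTTP 409 and a message built from printerLabel and stationId.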


App Structure

apps/print-bridge/
├── src/
│   ├── index.ts             # Entry: starts Express server + agent
│   ├── agent.ts             # Multi-instance orchestrator: one PrinterAgent per config entry
│   ├── printer-agent.ts     # WebSocket client + polling + print loop for one printer instance (multiple stations)
│   ├── auth.ts              # AuthProvider interface + device token implementation
│   ├── printer/
│   │   ├── index.ts         # Transport factory (returns TCP or USB adapter)
│   │   ├── tcp.ts           # Raw TCP socket sender + optional status poll (DLE EOT)
│   │   └── usb.ts           # USB/serial sender via @node-escpos/usb
│   ├── renderer.ts          # ESC/POS ticket builder (reuses backend logic)
│   ├── registry.ts          # In-flight + recently-completed job registry (de-dup, per instance)
│   ├── diagnostics.ts       # Internal state object + structured log
│   ├── config.ts            # Read/write /app/data/config.json
│   ├── jobs.ts              # In-memory job log (last 50 per instance), reprint logic
│   └── routes.ts            # Express REST API for the web UI
├── ui/
│   └── src/
│       ├── main.tsx
│       ├── App.tsx
│       ├── pages/
│       │   ├── StatusPage.tsx            # Live status cards per printer + job feed + search + reprint + start/stop
│       │   ├── PrinterSettingsPage.tsx   # Add / edit / remove printer instances
│       │   └── ConfigPage.tsx            # Pairing code entry + API URL
│       └── components/
├── Dockerfile               # Multi-stage: build UI → copy into Node image
├── docker-compose.yml       # Includes Watchtower for auto-updates
└── package.json

Web UI

Status page (/)

Always-visible live view intended to stay open on a kitchen display or manager's tablet.

| Section | Contents |
| --- | --- |
| Connection status | Badge: Connected (green) · Reconnecting (amber) · Error (red) · Paused (grey). Shows FlowPOS API URL and last connected timestamp. |
| Printer cards | One card per configured printer instance. Each shows: station name, Online / Offline / Unknown badge, connection type, address, last successful print timestamp, last error if any. |
| Paper/cover status | If the printer supports ESC/POS real-time status (DLE EOT), shows additional badges: Paper OK · Paper Low · Paper Out · Cover Open. |
| Auth status | Badge: OK · Error. Shows last token refresh timestamp. Error state prompts the user to go to Config and re-pair. |
| Agent controls | Start / Stop toggle button. Activates or deactivates all printer agents without restarting the container. Stop waits for any in-flight jobs to complete before pausing. |
| Actions | Test Print button per printer. Sends a dated ESC/POS test slip. Available only when agent is running and printer is online. |
| Job feed | Last 50 print jobs across all printers (or filtered by printer): item name, order number, table, timestamp, status badge (pending / printed / failed / skipped). Auto-updates via Server-Sent Events. Each row has a Reprint (↩) button. |
| Search | Inline text filter. Filters job feed by item name or order number (client-side). |

Setup order

The bridge must be paired before printers can be configured. The Printer Settings page depends on GET /api/stations (which proxies FlowPOS using the device token) to populate station dropdowns. If no device token exists, GET /api/stations returns 401 and the dropdown is empty.

The UI enforces this: any navigation to /printers or / while unpaired redirects to /config with the message "Complete pairing before adding printers." Once pairing is complete, the redirect lifts automatically.

Printer settings page (/printers)

Manage all printer instances. Each instance can be added, edited, or removed.

| Field | Description |
| --- | --- |
| Label | Human-readable name (e.g. "Grill Station") |
| Stations | Multi-select dropdown populated from GET /api/stations |
| Connection type | TCP · USB |
| Printer URL (TCP) | tcp://192.168.1.100:9100 |
| USB device (USB) | Dropdown of detected /dev/serial/by-id/... paths (persistent) and /dev/ttyUSB* (ephemeral) |
| Paper width | 58 mm · 80 mm |
| Header lines | Free text, one per line |
| Cut after each | Checkbox |
| Copies | 1–5. Each copy is produced by opening a new TCP socket connection and sending the full ESC/POS buffer again (or re-opening the USB device). ESC/POS-level multi-copy commands are not used; separate sends are more reliable across printer models. |
| Save | Persists changes; reconnects that printer if URL/type changed |
| Test Print | Fires a test slip immediately after save |
| Remove | Removes the printer instance from config and stops its agent |

Config page (/config)

Set once at installation. Used for pairing the bridge with FlowPOS.

| Field | Description |
| --- | --- |
| FlowPOS API URL | e.g. https://api.flowpos.app or http://localhost:4000 |
| Pairing code | Short-lived code from FlowPOS Admin (e.g. AB12-XY99). Exchanged for a long-lived device token. |
| Device status | Shows whether the bridge is paired, the station name, and token age. Allows re-pairing. |
| Timezone | IANA timezone identifier for the restaurant (e.g. America/Guatemala, America/Mexico_City). Used to format all timestamps in the web UI, test slip, job feed, and diagnostics in the restaurant's local time. Defaults to the bridge machine's system timezone if not set. |

Printer Support

TCP / Network (primary)

Any ESC/POS thermal printer reachable on the local network via raw TCP (port 9100 default).

Compatible brands: Epson TM series, Star Micronics TSP/SP series, Bixolon SRP series, Citizen CT series, SNBC BTP series, Rongta RP series, Xprinter, MUNBYN, and any generic ESC/POS network printer.

Requirement: Printer must have an ethernet or Wi-Fi network interface. USB-only printers are not supported by this path.

URL format: tcp://192.168.1.100:9100

Paper/cover status polling: If supported by the model, the bridge sends DLE EOT 1 (0x10 0x04 0x01) on a dedicated status connection every 10 seconds, but only when the print queue is idle (no job in-flight). Polling is suspended while a job is being sent to avoid interleaving status bytes with print data on the TCP stream. Results populate the Status page badges (Paper OK / Paper Out / Cover Open).

USB / Serial

USB thermal printers that expose a serial or raw USB interface.

Docker requirement: The host USB device must be passed through:

docker run -d \
-p 3456:3456 \
--device /dev/ttyUSB0:/dev/ttyUSB0 \
-v print-bridge-data:/app/data \
ghcr.io/fixxrepo/flowpos-workspace/print-bridge

Persistent device paths: Use /dev/serial/by-id/... instead of /dev/ttyUSB0. USB-to-serial adapters can change enumeration index if unplugged and replugged; the by-id path is stable across reboots and port swaps:

docker run -d \
--device /dev/serial/by-id/usb-Epson_TM-T20_XXXXXXXX-port0:/dev/ttyUSB0 \
...

The Printer Settings UI shows both paths — by-id entries are labeled Stable and ttyUSB* entries are labeled Ephemeral.

Library: @node-escpos/usb (libusb bindings).

Bluetooth — not in v1

Bluetooth SPP printers require OS-level pairing and rfcomm bind before Docker can see the device. This is too fragile for a production kitchen environment. May be added in a future version.


Docker Deployment

Single-command install

docker run -d \
--name flowpos-print-bridge \
--restart unless-stopped \
-p 3456:3456 \
-v print-bridge-data:/app/data \
--label com.centurylinklabs.watchtower.enable=true \
ghcr.io/fixxrepo/flowpos-workspace/print-bridge

Then open http://localhost:3456 on the same machine to configure it.

docker-compose.yml

Includes Watchtower for automatic image updates scoped only to the bridge container. The watchtower.enable=true label prevents Watchtower from accidentally restarting unrelated local services during service hours.

services:
  print-bridge:
    image: ghcr.io/fixxrepo/flowpos-workspace/print-bridge
    restart: unless-stopped
    stop_grace_period: 30s # matches quiesce max-wait; Docker default is 10 s
    ports:
      - "3456:3456"
    volumes:
      - print-bridge-data:/app/data
    environment:
      # Master secret: two sub-keys (encrypt + JWT) are derived from this via HKDF at startup.
      # Generate with: openssl rand -hex 32
      - BRIDGE_SECRET=your-32-byte-hex-secret-here
      # Emergency access only; remove after recovery. The value set here is accepted as a valid auth token.
      # - BRIDGE_EMERGENCY_TOKEN=
      # Note: VITE_PUBLIC_FIREBASE_CONFIG is a BUILD-TIME variable. It is already baked
      # into the official ghcr.io/fixxrepo/flowpos-workspace/print-bridge image. Do not set it here at runtime.
    labels:
      - "com.centurylinklabs.watchtower.enable=true"
    # Uncomment and adjust to pass through a USB printer.
    # Use /dev/serial/by-id/... for stable paths (recommended):
    # devices:
    #   - /dev/serial/by-id/usb-Epson_TM-T20_XXXXXXXX-port0:/dev/ttyUSB0

  watchtower:
    image: containrrr/watchtower
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    # Runs at 3:00 AM daily, which avoids updating during dinner service.
    # Watchtower expects a 6-field cron expression (seconds first).
    command: --label-enable --schedule "0 0 3 * * *"
    environment:
      - WATCHTOWER_CLEANUP=true

volumes:
  print-bridge-data:

Dockerfile strategy

Multi-stage build:

  1. Stage 1: build the React UI (vite build → dist/)
  2. Stage 2: compile the TypeScript agent (tsc → dist/)
  3. Stage 3: final Node.js 22 Alpine image, copies both build outputs. Express serves the UI as static files and the API under /api.

Single container, no nginx, no separate frontend server.

Important: Always mount a persistent Docker volume for /app/data. This volume contains config.json and the .secret encryption key. Losing the volume requires full re-pairing and re-configuration.


Express API (internal — used by web UI)

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/status | Agent state, all printer states, auth state, last 50 jobs |
| GET | /api/diagnostics | Full structured diagnostic snapshot for support |
| GET | /api/config | Current config (device token redacted) |
| POST | /api/config | Save API URL + trigger re-pair flow |
| POST | /api/pair | Exchange pairing code for device token |
| GET | /api/printers | List configured printer instances |
| POST | /api/printers | Add a new printer instance |
| PATCH | /api/printers/:id | Update a printer instance; reconnects if URL/type changed |
| DELETE | /api/printers/:id | Remove a printer instance and stop its agent |
| POST | /api/test-print/:printerId | Send test ESC/POS slip to a specific printer |
| POST | /api/start | Start all printer agents |
| POST | /api/stop | Stop all printer agents; waits for in-flight jobs to finish |
| GET | /api/jobs | List jobs with optional ?q=<search>&printerId=<id> filters |
| POST | /api/jobs/:id/reprint | Re-send ESC/POS bytes; bypasses de-dup registry |
| GET | /api/devices/usb | List detected USB/serial devices (both by-id and ttyUSB*) |
| GET | /api/stations | Proxy GET /kitchen-stations from FlowPOS |
| GET | /events | Server-Sent Events stream; pushes status and job updates in real time |

Test Print

A test print sends a short ESC/POS slip directly to the configured printer without creating a print_job record in FlowPOS. It is used to verify connectivity, paper alignment, and print settings after installation or after any printer change.

Trigger points

| Where | When |
| --- | --- |
| Status page: Test Print button (per printer) | On demand; only enabled when agent is running and printer is online |
| Printer Settings page: Test Print button | Automatically offered after saving new settings; can also be triggered manually |
| POST /api/test-print/:printerId | Direct API call (useful for scripting or remote diagnostics) |

Test slip contents

================================
[header lines from config]
================================
** TEST PRINT **
================================
Bridge: flowpos-print-bridge
Station: Grill Station
Printer: tcp://192.168.1.100:9100
Paper: 80 mm
Date: 2026-04-06 14:32:05 (America/Guatemala)
================================
If you can read this, the
printer is working correctly.
================================
[cut]

The slip is built locally by @point-of-sale/receipt-printer-encoder — no network round-trip to FlowPOS is required. The date is formatted in the configured timezone (falls back to system timezone if not set). lastTestPrintAt in the diagnostics object is updated on every attempt.

API response

{ "success": true, "message": "Test slip sent to tcp://192.168.1.100:9100" }

or on failure:

{ "success": false, "error": "TCP connect ECONNREFUSED 192.168.1.100:9100" }

Non-Functional Requirements

These requirements must be addressed in the implementation, not deferred.

1. Duplicate-print prevention

The bridge enforces two independent de-duplication layers. Both must pass before a job is sent to the printer.

Layer 1 — Job ID registry (guards against WebSocket + poll race)

Each printer instance maintains a local in-flight + recently-completed registry keyed by print_job.id. A job is excluded if:

  • It is currently in-flight (being printed).
  • It appears in the completed registry (successfully delivered since last restart).
  • Its status in FlowPOS is already printed (checked on poll path).

Layer 2 — Order item registry (guards against duplicate print_job records for the same item)

The bridge maintains a registry keyed by orderItemId + stationId + (modifiedAt ?? createdAt). If a second print_job record arrives for the same key, the bridge skips it, logs a warning, and calls PATCH /print-jobs/:id with { status: "printed", lastError: "duplicate suppressed by bridge" } — marking it complete without re-printing.

The modifiedAt component is critical: if an order item is modified after the original ticket printed (e.g. "no onions" → "add jalapeños"), a new print_job arrives with a different modifiedAt timestamp, producing a different key. Layer 2 allows it through and the kitchen sees the modification. An exact re-submission with identical timestamps is still suppressed.
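The Layer 2 key can be sketched as follows. The field names come from the event payload described in this spec; the layer2Key function name, the PrintJobEvent type, and the "|" separator are assumptions for illustration.

```typescript
// Subset of the print_job.new event payload that the key depends on.
interface PrintJobEvent {
  id: string;
  orderItemId: string;
  stationId: string;
  createdAt: string;
  modifiedAt?: string | null;
}

// Builds the Layer 2 de-dup key: orderItemId + stationId + (modifiedAt ?? createdAt).
// The "|" separator is an assumption; any delimiter that cannot appear in the parts works.
function layer2Key(job: PrintJobEvent): string {
  return [job.orderItemId, job.stationId, job.modifiedAt ?? job.createdAt].join("|");
}

const original: PrintJobEvent = {
  id: "job-1", orderItemId: "item-7", stationId: "uuid-grill",
  createdAt: "2026-04-06T14:30:00Z", modifiedAt: null,
};
// Exact re-submission: new job ID, identical timestamps, so the key is the same.
const resubmitted: PrintJobEvent = { ...original, id: "job-2" };
// Modified item: a new modifiedAt yields a new key, so the ticket prints.
const modified: PrintJobEvent = { ...original, id: "job-3", modifiedAt: "2026-04-06T14:35:00Z" };
```

Here layer2Key(original) equals layer2Key(resubmitted), while layer2Key(modified) differs from both.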

modifiedAt source: The Layer 2 key is computed as soon as the print_job.new event is received, before the GET /print-jobs/:id?expand=details fetch in step 3e. This is intentional: the de-dup check must run before any network I/O. The modifiedAt value is therefore taken directly from the print_job record in the event payload, not from ticketData (which is only available after the GET). This requires the backend to include a modifiedAt column on print_job (see Backend changes required).

Persistence: The registry is kept in memory and batch-written to /app/data/dedup.json every 60 seconds, not on every print. This avoids per-print disk I/O and prevents file corruption from a mid-write crash. On startup, the file is loaded with best-effort recovery — if the JSON is corrupt, the bridge starts with an empty registry and logs a warning rather than crashing.

Periodic pruning: An in-process timer prunes entries older than 24 hours every hour while the bridge is running, preventing unbounded memory growth on long-running deployments. The same 24-hour window is applied when loading the file on startup.
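A minimal sketch of the pruning pass, assuming the registry is held as a Map from de-dup key to the epoch-millisecond time the key was recorded. That in-memory shape and the pruneRegistry name are hypothetical.

```typescript
const DEDUP_TTL_MS = 24 * 60 * 60 * 1000; // 24-hour retention window

// Removes entries older than ttlMs and returns how many were dropped.
// Deleting from a Map while iterating over it is safe in JavaScript.
function pruneRegistry(
  registry: Map<string, number>,
  nowMs: number,
  ttlMs: number = DEDUP_TTL_MS,
): number {
  let removed = 0;
  for (const [key, seenAtMs] of registry) {
    if (nowMs - seenAtMs > ttlMs) {
      registry.delete(key);
      removed++;
    }
  }
  return removed;
}
```

The same function could serve both call sites: once after loading dedup.json on startup, and then from an hourly setInterval while the bridge is running.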

The registry is scoped per station. If the same item is legitimately routed to two different stations (e.g. Grill and Expo), each station prints it once independently.

Reprint bypass

Explicit reprint via POST /api/jobs/:id/reprint bypasses both layers — it is a deliberate user action, not automatic re-delivery. The UI requires confirmation before reprinting to prevent accidental duplicates.

2. Ghost printer / backlog prevention

If the agent is stopped and then restarted, it must not silently flood the printer with a backlog of stale orders, nor silently discard them.

On restart, the bridge checks for pending jobs that arrived while it was offline. If any are found:

  • Jobs older than maxJobAgeMinutes (default 60, configurable per printer) are not auto-printed and are not silently marked failed.
  • Instead, the Status page shows a Recovery Banner: "X jobs queued while the bridge was offline. What would you like to do?" with three options: Print All, Discard All, and Review (shows the job list so the operator can cherry-pick).
  • Jobs younger than maxJobAgeMinutes are printed normally without prompting.
  • If the Recovery Banner is not acted on, stale jobs are auto-discarded after a timeout equal to the affected printer's own maxJobAgeMinutes (e.g. if maxJobAgeMinutes: 60, the banner auto-discards after 60 minutes of inaction). This keeps age sensitivity consistent per printer — a short-fuse station like Grill auto-discards faster than a lenient Bar station. Jobs are marked failed with lastError: "auto-discarded after recovery timeout". There is no separate global recoveryBannerTimeoutMinutes field.

This hands the decision to the operator rather than silently losing tickets or flooding the printer with cold orders.
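The split between jobs that print immediately and jobs held for the Recovery Banner can be sketched like this. The triageByAge name and the result shape are hypothetical, and the sketch assumes job age is measured from createdAt.

```typescript
interface PendingJob {
  id: string;
  createdAt: string; // ISO 8601 timestamp
}

interface TriageResult {
  printNow: PendingJob[];
  holdForReview: PendingJob[];
}

// Jobs older than maxJobAgeMinutes are held for operator review;
// everything else is printed without prompting.
function triageByAge(jobs: PendingJob[], maxJobAgeMinutes: number, nowMs: number): TriageResult {
  const maxAgeMs = maxJobAgeMinutes * 60 * 1000;
  const result: TriageResult = { printNow: [], holdForReview: [] };
  for (const job of jobs) {
    const ageMs = nowMs - Date.parse(job.createdAt);
    (ageMs > maxAgeMs ? result.holdForReview : result.printNow).push(job);
  }
  return result;
}
```

Because maxJobAgeMinutes is configured per printer, the function would be called once per printer instance with that printer's own threshold.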

3. Authentication lifecycle

The auth module sits behind an AuthProvider interface. The device-token implementation handles:

  • Token stored encrypted in config.json using the machine-local .secret key.
  • 401 response from FlowPOS: Invalidate cached token, attempt re-authentication once. If re-auth fails, set authStatus: error, stop agent, and prompt the user to re-pair.
  • Token revoked from admin: Same as 401 handling.
  • Token expired mid-job: Complete the in-flight job before handling the auth error.

4. Printer delivery semantics

"Printed" means the bridge successfully delivered bytes to the printer transport — not guaranteed paper output.

| Transport | Success definition |
| --- | --- |
| TCP | Socket connected; all bytes written; connection closed without error within 5 s timeout |
| USB/serial | Device opened; payload written and flushed; device closed without error |

Paper-out and cover-open handling: When DLE EOT status polling detects paper out or cover open on a TCP printer, the bridge pauses the print queue for that printer and shows a prominent alert in the Status page. Jobs already in-flight are allowed to complete (or fail). New jobs accumulate in the queue but are not sent. When the printer reports paper ok + cover closed, the queue resumes automatically. Jobs that were queued during the pause are delivered normally — they were not marked as failed, so no recovery prompt is needed.

5. Config and secret lifecycle

| Event | Expected behavior |
| --- | --- |
| Normal restart | config.json and .secret persist on Docker volume; bridge starts in same state |
| Container replacement (new image) | Volume re-mounted; config and secret survive |
| Docker volume migration (new host) | Copy volume to new host; bridge starts normally |
| Local secret lost (volume damage) | Bridge cannot decrypt device token. Clears deviceToken, preserves all printer settings, prompts re-pairing. |
| Full factory reset | Delete Docker volume. All config is lost. Bridge shows first-run setup screen. |

Non-sensitive fields (apiUrl, all printer instance settings) are preserved even when the device token is cleared.

6. Agent stop/quiesce behavior

The same quiesce sequence runs for both POST /api/stop (user-initiated) and SIGTERM (Docker shutdown). This ensures in-flight jobs and dedup state are never lost during container restarts or image updates.

Quiesce sequence:

  1. Stop accepting new WebSocket events and new poll-cycle jobs.
  2. Wait for all in-flight print jobs to complete (success or failure). Maximum wait: 30 s, then force-fail any remaining in-flight jobs.
  3. Flush dedup.json to disk immediately (do not wait for the 60-second batch timer).
  4. Persist agentEnabled: false to config.json (user-stop only; SIGTERM leaves agentEnabled unchanged so the agent resumes on next container start).
  5. Exit cleanly (SIGTERM path) or show Paused badges (user-stop path).

SIGTERM handler (src/index.ts):

process.on("SIGTERM", async () => {
await agent.quiesce(); // steps 1–3 above
process.exit(0);
});

Docker's default grace period is 10 seconds. The Dockerfile should set STOPSIGNAL SIGTERM (default) and the docker-compose.yml should set stop_grace_period: 30s to match the maximum quiesce wait.

On POST /api/start, the agent resumes from the poll fallback path to catch any jobs missed during the paused window (subject to max-age check).

7. Poll batching

When one printer instance serves multiple stations, the bridge must batch all station IDs into a single poll request rather than one request per station:

GET /print-jobs?stationIds[]=id1&stationIds[]=id2&status=pending

This requires a minor FlowPOS backend change (see Backend changes required). Without batching, a printer with 4 stations generates 4 API calls every 30 seconds — at 10 printers, that's 40 calls per bridge per 30 s, which will hit rate limits.
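Building the batched request URL might look like the sketch below. The buildPollUrl name is hypothetical. The query string is assembled by hand because URLSearchParams would percent-encode the square brackets in stationIds[].

```typescript
// Builds one poll URL covering every station served by a printer instance.
function buildPollUrl(apiUrl: string, stationIds: string[]): string {
  const params = stationIds.map((id) => `stationIds[]=${encodeURIComponent(id)}`);
  params.push("status=pending");
  // Strip trailing slashes so the path is joined with exactly one "/".
  return `${apiUrl.replace(/\/+$/, "")}/print-jobs?${params.join("&")}`;
}

console.log(buildPollUrl("https://api.flowpos.app", ["id1", "id2"]));
// prints "https://api.flowpos.app/print-jobs?stationIds[]=id1&stationIds[]=id2&status=pending"
```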

8. Print queue safety

Each printer instance maintains a bounded in-memory FIFO queue.

| Parameter | Value | Notes |
| --- | --- | --- |
| Max queue depth | 200 jobs | If exceeded, oldest unstarted jobs are dropped with failed status |
| Per-job send timeout | 5 s (TCP) / 10 s (USB) | If the printer doesn't accept bytes within this window, the job fails |
| Stuck-job watchdog | 30 s | If a job has been in-flight longer than 30 s, it is force-failed and removed from the in-flight registry |

The stuck-job watchdog prevents a single hung TCP connection from stalling the entire queue indefinitely.
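
A minimal sketch of the queue bound and the watchdog check follows. The class and function names (`BoundedPrintQueue`, `isStuck`) are hypothetical; only the limits (200 jobs, 30 s) come from the table above. The queue holds unstarted jobs only, so dropping from the front never touches a job that is already in flight.

```typescript
interface QueuedJob {
  jobId: string;
  stationId: string;
}

// Hypothetical bounded FIFO of jobs that have not started printing yet.
class BoundedPrintQueue {
  private readonly pending: QueuedJob[] = [];
  private readonly maxDepth: number;

  constructor(maxDepth = 200) {
    this.maxDepth = maxDepth;
  }

  // Adds a job and returns any jobs dropped to stay within maxDepth.
  // The caller is expected to report the dropped jobs as failed.
  enqueue(job: QueuedJob): QueuedJob[] {
    this.pending.push(job);
    const dropped: QueuedJob[] = [];
    while (this.pending.length > this.maxDepth) {
      dropped.push(this.pending.shift()!); // oldest unstarted job goes first
    }
    return dropped;
  }

  // Takes the next job in FIFO order; it is then considered in-flight.
  next(): QueuedJob | undefined {
    return this.pending.shift();
  }

  get depth(): number {
    return this.pending.length;
  }
}

// Watchdog predicate: true once a job has been in-flight longer than limitMs.
function isStuck(startedAtMs: number, nowMs: number, limitMs = 30_000): boolean {
  return nowMs - startedAtMs > limitMs;
}
```

A periodic timer could call `isStuck` for each entry in the in-flight registry and force-fail the ones that return true.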

9. Diagnostics and observability

GET /api/diagnostics returns a full structured snapshot. Required fields per printer instance:

| Field | Description |
| --- | --- |
| agentEnabled | Whether the agent is running |
| wsStatus | connected · reconnecting · disconnected · error |
| wsLastConnectedAt | Timestamp of last successful WebSocket connect |
| wsLastDisconnectedAt | Timestamp + reason of last disconnect |
| authStatus | ok · error |
| authLastRefreshedAt | Timestamp of last successful token check |
| printerStatus | online · offline · unknown |
| printerPaperStatus | ok · low · out · unknown (from DLE EOT if supported) |
| printerCoverStatus | closed · open · unknown |
| printerLastSuccessAt | Timestamp of last successful delivery |
| printerLastErrorAt | Timestamp + message of last failure |
| lastTestPrintAt | Timestamp + result of last test print |
| pollLastRunAt | Timestamp of last poll cycle |
| pollLastJobsFound | Total jobs found across all station polls in last cycle |
| jobsInFlight | Array of { jobId, stationId } currently being processed |
| jobsCompletedCount | Total delivered since last restart |
| jobsFailedCount | Total failed since last restart |
| jobsSkippedCount | Total skipped (too old) since last restart |
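
One possible TypeScript shape for a single printer instance in the snapshot is sketched below. The field names come from the table; the value encodings (ISO 8601 strings, the `TimestampedNote` pair, `null` for "never happened") and the `attentionReasons` helper are assumptions made for illustration.

```typescript
type Timestamp = string; // assumed ISO 8601

// Assumed encoding for the "Timestamp + reason/message/result" fields.
interface TimestampedNote {
  at: Timestamp;
  message: string;
}

interface PrinterDiagnostics {
  agentEnabled: boolean;
  wsStatus: "connected" | "reconnecting" | "disconnected" | "error";
  wsLastConnectedAt: Timestamp | null;
  wsLastDisconnectedAt: TimestampedNote | null;
  authStatus: "ok" | "error";
  authLastRefreshedAt: Timestamp | null;
  printerStatus: "online" | "offline" | "unknown";
  printerPaperStatus: "ok" | "low" | "out" | "unknown";
  printerCoverStatus: "closed" | "open" | "unknown";
  printerLastSuccessAt: Timestamp | null;
  printerLastErrorAt: TimestampedNote | null;
  lastTestPrintAt: TimestampedNote | null;
  pollLastRunAt: Timestamp | null;
  pollLastJobsFound: number;
  jobsInFlight: { jobId: string; stationId: string }[];
  jobsCompletedCount: number;
  jobsFailedCount: number;
  jobsSkippedCount: number;
}

// Hypothetical helper: lists the reasons a printer instance needs attention.
function attentionReasons(d: PrinterDiagnostics): string[] {
  const reasons: string[] = [];
  if (!d.agentEnabled) reasons.push("agent paused");
  if (d.wsStatus !== "connected") reasons.push(`websocket ${d.wsStatus}`);
  if (d.authStatus !== "ok") reasons.push("auth error");
  if (d.printerStatus === "offline") reasons.push("printer offline");
  if (d.printerPaperStatus === "out") reasons.push("paper out");
  if (d.printerCoverStatus === "open") reasons.push("cover open");
  return reasons;
}
```

A Status page could render the returned list as alert badges, with an empty list meaning the instance is healthy.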

Config file schema (/app/data/config.json)

{
  "apiUrl": "https://api.flowpos.app",
  "deviceToken": "<AES-256-GCM encrypted>",
  "localPasswordHash": "<bcrypt hash>",
  "agentEnabled": true,
  "timezone": "America/Guatemala",
  "printers": [
    {
      "id": "printer-1",
      "stationIds": ["uuid-grill", "uuid-desserts"],
      "label": "Hot Line",
      "connectionType": "tcp",
      "printerUrl": "tcp://192.168.1.100:9100",
      "usbDevice": null,
      "paperWidthMm": 80,
      "headerLines": ["My Restaurant"],
      "cutAfterEach": true,
      "copyCount": 1,
      "maxJobAgeMinutes": 60
    }
  ]
}
  • deviceToken — encrypted with AES-256-GCM using BRIDGE_ENCRYPT_KEY (derived from BRIDGE_SECRET). Never returned by the API.
  • localPasswordHash — bcrypt hash of the operator-set local fallback password. Used when Firebase is unreachable. Set from the Config page. Never returned by the API.
  • timezone — IANA timezone identifier (e.g. "America/Guatemala"). All timestamps shown in the web UI, test slip, job feed, and diagnostics are rendered in this timezone. Defaults to the system timezone if omitted.
  • Recommended: pass BRIDGE_SECRET as a Docker environment variable. Two sub-keys (BRIDGE_ENCRYPT_KEY, BRIDGE_JWT_KEY) are derived from it at startup via HKDF-SHA256. Generate once: openssl rand -hex 32.
  • If BRIDGE_SECRET is not set and /app/data/.secret does not exist, the bridge generates a new master secret, writes it to /app/data/.secret, and logs a warning recommending migration to the env var approach.
  • /app/data/dedup.json — persisted order item de-dup registry, batch-written every 60 s, pruned to 24 h window on startup and hourly.
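
The key derivation and token encryption described above could look roughly like this with Node's built-in `node:crypto` module. This is a sketch under stated assumptions: the HKDF `info` labels (`"bridge-encrypt"`, `"bridge-jwt"`), the empty salt, and the `iv | tag | ciphertext` base64 layout are illustrative choices, not something the spec defines.

```typescript
import { createCipheriv, createDecipheriv, hkdfSync, randomBytes } from "node:crypto";

// Derives the two sub-keys from the hex master secret (openssl rand -hex 32).
// Assumption: distinct HKDF "info" labels separate the keys; the labels are made up.
function deriveKeys(bridgeSecretHex: string): { encryptKey: Buffer; jwtKey: Buffer } {
  const master = Buffer.from(bridgeSecretHex, "hex");
  const salt = Buffer.alloc(0);
  const derive = (info: string): Buffer =>
    Buffer.from(hkdfSync("sha256", master, salt, info, 32));
  return { encryptKey: derive("bridge-encrypt"), jwtKey: derive("bridge-jwt") };
}

// Encrypts with AES-256-GCM; output layout (assumed): 12-byte IV, 16-byte tag, ciphertext.
function encryptToken(plain: string, key: Buffer): string {
  const iv = randomBytes(12); // fresh IV per encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plain, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

// Throws if the key is wrong or the stored value was tampered with.
function decryptToken(encoded: string, key: Buffer): string {
  const raw = Buffer.from(encoded, "base64");
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

A decryption failure here corresponds to the "Local secret lost" scenario: the bridge would clear deviceToken and prompt re-pairing rather than crash.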

Error handling

| Scenario | Behavior |
| --- | --- |
| Printer TCP timeout (5 s) | Mark job failed, update printerLastErrorAt, retry on next poll cycle |
| Duplicate job (WebSocket + poll) | Layer 1 (job ID) prevents double print; server-side PATCH is also idempotent |
| Duplicate job for already-printed item | Layer 2 (orderItemId + stationId + (modifiedAt ?? createdAt)) suppresses it; job marked printed without re-printing |
| Modified order item reprint | A different modifiedAt produces a new Layer 2 key, so the job always prints through and the kitchen sees the change |
| dedup.json corrupt on startup | Bridge starts with empty registry and logs a warning; does not crash |
| Job older than maxJobAgeMinutes on startup/resume | Held in pending-review set; Recovery Banner shown; operator decides Print All / Discard All / Review |
| WebSocket disconnect | Exponential backoff reconnect; update wsStatus in diagnostics |
| Device token 401 | Invalidate token; attempt re-auth once; if that fails, set authStatus: error and stop agent |
| Device token revoked | Same as 401 |
| USB device path changed | If using ephemeral ttyUSB* path, device may not be found after replug. Use by-id paths to avoid this. |
| USB device disconnected | Mark printer offline; retry on next poll |
| Reprint job not in memory | POST /api/jobs/:id/reprint returns 404; UI disables button for jobs outside the buffer |
| Agent stop requested | In-flight jobs complete before pause; agentEnabled: false persisted |
| Local secret lost | Clear deviceToken; preserve printer settings and localPasswordHash; prompt re-pairing |
| Firebase unreachable (no internet) | Login page shows "Use local password" fallback; issues BRIDGE_JWT_KEY-signed JWT valid for 8 h |
| Local password not set + Firebase down | Bridge UI is inaccessible; agent keeps running; operator can add BRIDGE_EMERGENCY_TOKEN env var and restart container to regain access |
| Paper out / cover open | Detected via DLE EOT status poll (TCP only, if supported); pauses the print queue for that printer; shown as alert in Status page; queue resumes automatically when paper ok + cover closed is detected |
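
For the paper-out / cover-open row, the sketch below shows how DLE EOT status bytes might be decoded. It assumes the bit layout commonly documented for Epson-compatible ESC/POS printers (n = 2 for offline cause, n = 4 for the paper roll sensor); other models may differ or not answer at all, so the masks should be checked against the printer's manual. All function names are hypothetical.

```typescript
// Builds the real-time status request: DLE (0x10), EOT (0x04), n.
function dleEot(n: 1 | 2 | 3 | 4): Uint8Array {
  return Uint8Array.of(0x10, 0x04, n);
}

type PaperStatus = "ok" | "low" | "out";
type CoverStatus = "closed" | "open";

// Response to DLE EOT 4 (assumed layout): bits 5-6 set means paper end,
// bits 2-3 set means paper near-end.
function parsePaperStatus(statusByte: number): PaperStatus {
  if ((statusByte & 0x60) !== 0) return "out";
  if ((statusByte & 0x0c) !== 0) return "low";
  return "ok";
}

// Response to DLE EOT 2 (assumed layout): bit 2 set means the cover is open.
function parseCoverStatus(statusByte: number): CoverStatus {
  return (statusByte & 0x04) !== 0 ? "open" : "closed";
}

// The queue pauses on paper out or cover open and resumes when both clear.
function shouldPauseQueue(paper: PaperStatus, cover: CoverStatus): boolean {
  return paper === "out" || cover === "open";
}
```

When a printer does not respond to the request, the diagnostics fields would stay at unknown rather than pausing the queue.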

Tech Stack

| Layer | Technology |
| --- | --- |
| Agent runtime | Node.js 22 |
| HTTP server | Express 5 |
| WebSocket client | socket.io-client v4 |
| ESC/POS encoding | @point-of-sale/receipt-printer-encoder (same as backend) |
| TCP transport | Node.js net module (same as backend adapter) |
| USB transport | @node-escpos/usb + @node-escpos/serialport |
| Agent auth | Device pairing token (exchanged via POST /kitchen-stations/pair-bridge) |
| UI auth (primary) | Firebase email/password (same SDK as apps/frontend-pwa) |
| UI auth (fallback) | Local bcrypt password + BRIDGE_JWT_KEY-signed JWT (derived via HKDF from BRIDGE_SECRET); works offline |
| Web UI | React 19 + Vite + Tailwind CSS + Shadcn/ui |
| Config storage | JSON file + AES-256-GCM encryption |
| Container | Docker, Node 22 Alpine, multi-stage build |
| Auto-updates | Watchtower (containrrr/watchtower) scoped by label |

Out of scope (v1)

  • Bluetooth printer support
  • Receipt printing for sales (POS receipts) — this spec covers kitchen tickets only
  • Auto-discovery of printers on the LAN
  • Remote management from the FlowPOS admin UI
  • Persistent job history beyond last-50 in-memory buffer per printer (next extension point)
  • Priority ordering across stations on the same printer (jobs are delivered FIFO regardless of station)

Backend changes required

The following FlowPOS backend changes are needed to support this spec:

| Change | Description |
| --- | --- |
| POST /kitchen-stations/pair-bridge | Validates pairing code from Redis, reads businessId + locationId from the stored value, creates a kds_device record (type: "print_bridge", scoped to that business/location), returns long-lived device token. |
| Pairing code generation | Admin UI trigger (or API) stores kds:pair:bridge:{code} → { businessId, locationId } in Redis with 10-minute TTL. Code is scoped to the generating admin's business/location. |
| Token auth for bridge | The existing KdsDeviceGuard or a new BridgeDeviceGuard validates the device token on bridge API calls. |
| GET /print-jobs/:id?expand=details | Ensure this single-job expand endpoint exists (current implementation uses the list endpoint). Required by the bridge's per-job fetch in flow step 3e. |
| print_job.new event augmentation (optional optimisation) | Include ticketData from the associated kitchen_ticket in the WebSocket event payload. This would eliminate the per-job GET round-trip in the bridge, reducing latency by one network call per ticket. Not required for v1 but recommended before high-volume deployments. |
| GET /print-jobs batch station filter | Accept stationIds[] query param (multiple station IDs) so the bridge can poll all stations in one request per printer instance rather than one request per station. |
| Add modifiedAt to print_job record | Add a modifiedAt column to the print_job table, populated at job creation time from order_item.updated_at. Include this field in the print_job.new WebSocket event payload (alongside the existing base fields) so the bridge can build the Layer 2 de-dup key before issuing the GET /print-jobs/:id?expand=details round-trip. |
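
To illustrate why modifiedAt belongs in the event payload, the sketch below builds the Layer 2 key directly from a print_job.new event and prunes the registry to the 24 h window. The `PrintJobEvent` type lists only the fields the key needs; the `:` separator, the `Map<string, number>` registry, and both function names are hypothetical.

```typescript
// Subset of the base print_job fields carried by print_job.new.
interface PrintJobEvent {
  id: string;
  orderItemId: string;
  stationId: string;
  createdAt: string;
  modifiedAt?: string | null;
}

// Layer 2 key: orderItemId + stationId + (modifiedAt ?? createdAt).
// The ":" separator is an arbitrary choice for this sketch.
function layer2Key(e: PrintJobEvent): string {
  return `${e.orderItemId}:${e.stationId}:${e.modifiedAt ?? e.createdAt}`;
}

// Removes registry entries older than the retention window (24 h by default).
// The registry maps each Layer 2 key to the time it was recorded, in epoch ms.
function pruneRegistry(
  registry: Map<string, number>,
  nowMs: number,
  windowMs = 24 * 60 * 60 * 1000,
): void {
  for (const [key, seenAtMs] of registry) {
    if (nowMs - seenAtMs > windowMs) registry.delete(key);
  }
}
```

Because the key uses only event fields, the duplicate check can run before the GET /print-jobs/:id?expand=details call, and an edited order item (new modifiedAt) yields a different key and prints again.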