Print Bridge — Specification
Status: Specified, ready for implementation
Scope: New standalone app apps/print-bridge/ (local Docker agent + web UI)
Depends on: KDS & Printing infrastructure (see kds-tracking-model)
Problem
FlowPOS runs on GCP Cloud Run (serverless, Google's cloud). Thermal printers sit on the restaurant's local network at a private IP (e.g. 192.168.1.100:9100). There is no network path from GCP to a private LAN.
The backend's NetworkThermalPrinterAdapter sends raw TCP ESC/POS — this works in local development but will never reach a restaurant's printer in production.
The Print Bridge solves this by running inside the restaurant's network, acting as the last-mile delivery agent between the cloud and the physical printer.
Architecture
FlowPOS Cloud (GCP Cloud Run)
─────────────────────────────────
/restaurant WebSocket
PATCH /print-jobs/:id
│ ▲
│ │ (internet)
▼ │
Restaurant LAN
─────────────────────────────────
Print Bridge (Docker container)
│
├─── Printer A (TCP :9100) ──▶ Grill Station + Desserts Station
├─── Printer B (TCP :9100) ──▶ Salads Station
└─── Printer C (USB) ──▶ Bar Station + Cocktails Station
Authority boundaries
FlowPOS cloud is the authority for print job lifecycle (pending → printed / failed).
Print Bridge is the authority for local transport state, printer settings, and ephemeral operational telemetry. The bridge's internal API must not become a second source of truth for job state — it reads job metadata from FlowPOS, delivers bytes to hardware, and reports results back.
Flow (per printer instance)
1. The bridge authenticates with FlowPOS using a device pairing token (see Authentication).
2. For each configured printer instance, it connects to the /restaurant WebSocket namespace and subscribes to print_job.new for each of its station IDs. One printer may serve multiple stations; all subscriptions share the same physical transport.
3. On each print_job.new event:
   a. Layer 1 check: skip if the job ID is already in-flight or recently completed in the local registry.
   b. Layer 2 check: compute the key orderItemId + stationId + (modifiedAt ?? createdAt) from the event payload; skip if this key already exists in the order item registry (the job is marked printed with lastError: "duplicate suppressed by bridge"; see NFR #1). This check runs before the GET to avoid unnecessary network I/O.
   c. Max-age check: if the job is older than maxJobAgeMinutes (default 60), it is held in a pending-review set rather than auto-printed or auto-failed. The Status page shows a Recovery Banner prompting the operator to Print All, Discard All, or Review (see NFR #2).
   d. Marks the job as in-flight in the local registry.
   e. Fetches full job details via GET /print-jobs/:id?expand=details. This GET is required. The print_job.new WebSocket event payload contains only the base print_job record (id, orderItemId, stationId, status, attempts, createdAt, lastError, modifiedAt); it does not include ticket content. The expand=details response adds productName, tableNumber, and the nested orderItem needed to render the ESC/POS ticket. See Backend changes required for a recommended future optimisation that would eliminate this round-trip.
   f. Renders the ESC/POS ticket locally. The station name is printed on the ticket so kitchen staff can identify which logical area it belongs to even though all tickets come from the same printer.
   g. Sends the binary buffer to the printer (TCP or USB). Jobs from multiple stations are queued and delivered sequentially, with no interleaving. Success means: socket connected, all bytes sent, connection closed cleanly. It does not mean paper-in-hand.
   h. Calls PATCH /print-jobs/:id with { status: "printed" } on success, or { status: "failed", lastError: "..." } on error.
   i. Moves the job from in-flight to completed in the local registry and adds its Layer 2 key to the order item registry.
4. Polls GET /print-jobs?stationIds[]=id1&stationIds[]=id2&status=pending every 30 s as a fallback for missed WebSocket events (batched: one request per printer instance, not per station). Jobs already in the local registry are skipped; jobs older than maxJobAgeMinutes are held for operator review.
5. Reconnects automatically on WebSocket disconnect with exponential backoff (1 s → 2 s → 4 s → max 30 s).
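The reconnect schedule in the last step can be sketched as a pure function. This is only an illustration: reconnectDelayMs and its parameter names are hypothetical, and the base and cap are taken from the figures above.

```typescript
// Hypothetical helper: delay before the Nth reconnect attempt (attempt starts at 0).
// The delay doubles from 1 s and is capped at 30 s, as in the schedule above.
function reconnectDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  const exponent = Math.max(0, Math.floor(attempt));
  // Clamp the exponent so the multiplication stays small during very long outages.
  return Math.min(maxMs, baseMs * 2 ** Math.min(exponent, 30));
}

const schedule = [0, 1, 2, 3, 4, 5, 6].map((n) => reconnectDelayMs(n));
console.log(schedule); // [1000, 2000, 4000, 8000, 16000, 30000, 30000]
```

The attempt counter would be reset to 0 after a successful connect so that a later disconnect starts again at 1 s.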
Authentication
Two separate auth contexts
The bridge has two distinct authentication concerns that must not be confused:
| Context | Who authenticates | Method | Purpose |
|---|---|---|---|
| Web UI → Bridge API | Human operator | Firebase email/password (same as frontend-pwa) | Protects http://localhost:3456 from unauthorized LAN access |
| Bridge Agent → FlowPOS | The bridge process | Device pairing token | Allows the agent to subscribe to WebSocket events and update job status |
Web UI authentication (Firebase + local fallback)
The bridge web UI uses the same Firebase email/password login flow as apps/frontend-pwa as its primary auth method:
- signInWithEmailAndPassword(auth, email, password) from firebase/auth
- An onIdTokenChanged listener manages the session, caches the token in sessionStorage under flowpos:auth:cache, and auto-refreshes every 55 minutes
- Every request from the React UI to the bridge Express API (/api/*) sends the Firebase ID token as Authorization: Bearer <token>
- The Express server validates the token using the Firebase Admin SDK. Any unauthenticated request to /api/* returns 401 and the UI redirects to the login page.
- The login page matches the PWA sign-in page: email/password form, same error messages, same glassmorphism card style.
Firebase config is a build-time variable. VITE_PUBLIC_FIREBASE_CONFIG is baked into the JS bundle at vite build — it cannot be passed as a Docker runtime env var after the image is built. The official ghcr.io/fixxrepo/flowpos-workspace/print-bridge image is pre-built with the production Firebase project config. For self-hosted builds, set VITE_PUBLIC_FIREBASE_CONFIG at image build time, not in docker-compose.yml.
Local fallback auth (offline access). Firebase authentication requires an outbound internet connection. If the restaurant loses internet — exactly when an operator may need to stop the agent or check status — Firebase login fails. To handle this, the bridge supports a local admin password as a fallback:
- On first run, if no local password is set, the bridge prompts the operator to set one from the Config page (while internet is available).
- The password is hashed with bcrypt and stored in config.json as localPasswordHash.
- The login page shows a "Use local password" link below the Firebase form. Local login issues a short-lived signed JWT (signed with the derived BRIDGE_JWT_KEY; see key derivation below), accepted by the Express middleware alongside Firebase tokens.
- Local login grants the same access as Firebase login. It is not a backdoor: it requires the operator to have previously set the password while online.
Emergency access. If neither Firebase nor a local password is available (e.g. factory reset, local password never set, internet down), the operator can add BRIDGE_EMERGENCY_TOKEN=<any-random-string> to the Docker environment and restart the container. The Express middleware accepts this token as a valid auth credential. Remove it immediately after recovery.
The bridge logs a prominent warning on every startup when BRIDGE_EMERGENCY_TOKEN is set:
⚠️ WARNING: BRIDGE_EMERGENCY_TOKEN is set. The bridge UI is accessible to
anyone with this token. Remove this variable as soon as access is restored.
The token has no built-in expiry — the operator is solely responsible for removing it.
Key derivation. BRIDGE_SECRET is the single master secret but must never be used directly for two different purposes. Two sub-keys are derived from it at startup using HKDF-SHA256:
BRIDGE_ENCRYPT_KEY = HKDF(BRIDGE_SECRET, salt="bridge-encrypt", length=32)
BRIDGE_JWT_KEY = HKDF(BRIDGE_SECRET, salt="bridge-jwt", length=32)
BRIDGE_ENCRYPT_KEY is used for AES-256-GCM encryption of the device token. BRIDGE_JWT_KEY is used for signing local fallback JWTs. Compromising one does not compromise the other.
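A minimal sketch of the derivation and the token encryption, assuming Node's built-in crypto module. The helper names (deriveKey, encryptToken, decryptToken), the empty HKDF info value, the use of the secret string as-is for key material, and the dot-separated storage format are all assumptions for illustration, not part of this specification.

```typescript
import { createCipheriv, createDecipheriv, hkdfSync, randomBytes } from "node:crypto";

// Derive a 32-byte sub-key from the master secret. The label is passed as the HKDF
// salt, as in the derivation above; the empty `info` argument is an assumption.
function deriveKey(masterSecret: string, label: string): Buffer {
  return Buffer.from(hkdfSync("sha256", masterSecret, label, "", 32));
}

// Hypothetical storage format: base64(iv).base64(authTag).base64(ciphertext)
function encryptToken(key: Buffer, plaintext: string): string {
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return [iv, cipher.getAuthTag(), ciphertext].map((b) => b.toString("base64")).join(".");
}

function decryptToken(key: Buffer, stored: string): string {
  const [iv, tag, ciphertext] = stored.split(".").map((s) => Buffer.from(s, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // final() throws if the stored value was tampered with
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

const master = "example-master-secret"; // stand-in for BRIDGE_SECRET
const encryptKey = deriveKey(master, "bridge-encrypt");
const jwtKey = deriveKey(master, "bridge-jwt");
console.log(encryptKey.equals(jwtKey)); // false: the two sub-keys differ
console.log(decryptToken(encryptKey, encryptToken(encryptKey, "device-token"))); // device-token
```

HKDF implementations more commonly separate purposes through the info parameter than through the salt; the sketch follows the salt-based form written above.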
Agent authentication (device pairing token)
The bridge agent (not the UI) uses a device pairing model — the same mechanism used by KDS devices — to communicate with FlowPOS. This ensures printing survives password rotations and staff turnover.
Setup flow:
- In the FlowPOS Admin, navigate to Kitchen Stations → Print Bridge Devices → Generate Pairing Code.
- FlowPOS generates a short-lived code (e.g. AB12-XY99), valid for 10 minutes, stored in Redis as kds:pair:bridge:{code} → { businessId, locationId }. The code is scoped to the business and location of the admin who generated it.
- The logged-in operator enters the code into the Print Bridge Config page.
- The bridge sends POST /kitchen-stations/pair-bridge with { pairingCode }.
- FlowPOS validates the code, reads businessId + locationId from Redis, creates a kds_device record (type: print_bridge, scoped to that business/location), and returns a long-lived device token (SHA-256 hashed before storage).
- The bridge stores the device token encrypted in config.json. It is used as a Bearer token on all agent API calls and WebSocket auth.
This token survives container restarts and can be revoked from the FlowPOS Admin without affecting any user account.
Auth module interface (src/auth.ts)
interface AuthProvider {
getToken(): Promise<string>; // returns valid token (refreshes if needed)
invalidate(): void; // clear cached token on 401
readonly status: "ok" | "error";
readonly lastRefreshedAt: Date | null;
}
Multi-Station Support
One bridge container manages multiple printer instances. Each instance has its own connection settings, independent job queue, and WebSocket subscriptions — one per assigned station. A single physical printer can serve multiple stations. This avoids deploying one container per printer on a single Raspberry Pi or NUC.
Config structure
{
"apiUrl": "https://api.flowpos.app",
"deviceToken": "<encrypted>",
"agentEnabled": true,
"printers": [
{
"id": "printer-1",
"stationIds": ["uuid-grill", "uuid-desserts"],
"label": "Hot Line",
"connectionType": "tcp",
"printerUrl": "tcp://192.168.1.100:9100",
"paperWidthMm": 80,
"headerLines": ["My Restaurant"],
"cutAfterEach": true,
"copyCount": 1,
"maxJobAgeMinutes": 60
},
{
"id": "printer-2",
"stationIds": ["uuid-bar", "uuid-cocktails"],
"label": "Bar",
"connectionType": "usb",
"usbDevice": "/dev/serial/by-id/usb-Epson_TM-T20_...-port0",
"paperWidthMm": 58,
"headerLines": ["My Restaurant"],
"cutAfterEach": true,
"copyCount": 1,
"maxJobAgeMinutes": 60
}
]
}
stationIds is an array. A printer with a single station uses a one-element array. The station name is printed on each ticket header so staff can distinguish Grill from Desserts even when both print on the same machine.
Print queue per printer
Jobs from all assigned stations are placed into a single FIFO queue per printer and delivered sequentially. This prevents two tickets from interleaving on paper. The queue is in-memory; if the bridge restarts, the poll fallback re-discovers any missed jobs.
When copyCount > 1, the printer transport is held exclusively for all N copies before the next job is dequeued. Copy 1, copy 2, … copy N are sent back-to-back on the same logical queue slot — no other job can slip between them.
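The queueing rule above can be sketched with a promise chain. PrinterQueue and SendFn are hypothetical names; the sketch assumes the transport exposes a send function that resolves once all bytes are written.

```typescript
// Hypothetical transport signature: resolves once all bytes are written.
type SendFn = (payload: Uint8Array) => Promise<void>;

class PrinterQueue {
  private tail: Promise<void> = Promise.resolve();
  private readonly send: SendFn;
  private readonly copyCount: number;

  constructor(send: SendFn, copyCount = 1) {
    this.send = send;
    this.copyCount = copyCount;
  }

  // Appends a job; the returned promise settles after every copy has been sent.
  enqueue(payload: Uint8Array): Promise<void> {
    const job = this.tail.then(async () => {
      // All N copies run inside one queue slot, so no other job can slip between them.
      for (let copy = 0; copy < this.copyCount; copy++) {
        await this.send(payload);
      }
    });
    // A failed job must not block the jobs queued behind it.
    this.tail = job.catch(() => undefined);
    return job;
  }
}
```

One queue instance per printer keeps tickets from different stations in arrival order, and a rejected send only fails its own job.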
UI impact
The Printer Settings page allows adding, editing, and removing printer instances, with a multi-select station dropdown populated from GET /api/stations. Stations already assigned to another printer instance are shown as disabled in the dropdown — a station can only belong to one printer at a time. The Status page shows one card per printer listing all its assigned stations. The job feed is filterable by printer or by individual station.
Station uniqueness is enforced at the API level. POST /api/printers and PATCH /api/printers/:id reject any request where a stationId is already assigned to a different printer instance, returning HTTP 409 with a message identifying which printer holds the conflicting station. This prevents duplicate subscriptions that would cause every ticket to print twice.
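A minimal sketch of the uniqueness check behind the 409 response. findStationConflict and the StationConflict shape are hypothetical; only id, label, and stationIds come from the config structure above.

```typescript
interface PrinterConfig {
  id: string;
  label: string;
  stationIds: string[];
}

interface StationConflict {
  stationId: string;
  printerId: string;
  printerLabel: string;
}

// Returns the first requested station already held by a different printer, or null.
// candidateId is the printer being updated (null when creating a new one).
function findStationConflict(
  printers: PrinterConfig[],
  candidateId: string | null,
  stationIds: string[],
): StationConflict | null {
  for (const printer of printers) {
    if (printer.id === candidateId) continue; // a PATCH may keep its own stations
    for (const stationId of stationIds) {
      if (printer.stationIds.includes(stationId)) {
        return { stationId, printerId: printer.id, printerLabel: printer.label };
      }
    }
  }
  return null;
}

const printers: PrinterConfig[] = [
  { id: "printer-1", label: "Hot Line", stationIds: ["uuid-grill", "uuid-desserts"] },
  { id: "printer-2", label: "Bar", stationIds: ["uuid-bar", "uuid-cocktails"] },
];
console.log(findStationConflict(printers, "printer-2", ["uuid-bar", "uuid-grill"]));
// { stationId: 'uuid-grill', printerId: 'printer-1', printerLabel: 'Hot Line' }
```

A route handler would return HTTP 409 with the conflicting printer's label whenever the result is not null.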
App Structure
apps/print-bridge/
├── src/
│ ├── index.ts # Entry: starts Express server + agent
│ ├── agent.ts # Multi-instance orchestrator: one PrinterAgent per config entry
│ ├── printer-agent.ts # WebSocket client + polling + print loop for one printer instance (multiple stations)
│ ├── auth.ts # AuthProvider interface + device token implementation
│ ├── printer/
│ │ ├── index.ts # Transport factory (returns TCP or USB adapter)
│ │ ├── tcp.ts # Raw TCP socket sender + optional status poll (DLE EOT)
│ │ └── usb.ts # USB/serial sender via @node-escpos/usb
│ ├── renderer.ts # ESC/POS ticket builder (reuses backend logic)
│ ├── registry.ts # In-flight + recently-completed job registry (de-dup, per instance)
│ ├── diagnostics.ts # Internal state object + structured log
│ ├── config.ts # Read/write /app/data/config.json
│ ├── jobs.ts # In-memory job log (last 50 per instance), reprint logic
│ └── routes.ts # Express REST API for the web UI
├── ui/
│ └── src/
│ ├── main.tsx
│ ├── App.tsx
│ ├── pages/
│ │ ├── StatusPage.tsx # Live status cards per printer + job feed + search + reprint + start/stop
│ │ ├── PrinterSettingsPage.tsx # Add / edit / remove printer instances
│ │ └── ConfigPage.tsx # Pairing code entry + API URL
│ └── components/
├── Dockerfile # Multi-stage: build UI → copy into Node image
├── docker-compose.yml # Includes Watchtower for auto-updates
└── package.json
Web UI
Status page (/)
Always-visible live view intended to stay open on a kitchen display or manager's tablet.
| Section | Contents |
|---|---|
| Connection status | Badge: Connected (green) · Reconnecting (amber) · Error (red) · Paused (grey). Shows FlowPOS API URL and last connected timestamp. |
| Printer cards | One card per configured printer instance. Each shows: assigned station names, Online / Offline / Unknown badge, connection type, address, last successful print timestamp, last error if any. |
| Paper/cover status | If the printer supports ESC/POS real-time status (DLE EOT), shows additional badges: Paper OK · Paper Low · Paper Out · Cover Open. |
| Auth status | Badge: OK · Error. Shows last token refresh timestamp. Error state prompts the user to go to Config and re-pair. |
| Agent controls | Start / Stop toggle button — activates or deactivates all printer agents without restarting the container. Stop waits for any in-flight jobs to complete before pausing. |
| Actions | Test Print button per printer — sends a dated ESC/POS test slip. Available only when agent is running and printer is online. |
| Job feed | Last 50 print jobs across all printers (or filtered by printer): item name, order number, table, timestamp, status badge (pending / printed / failed / skipped). Auto-updates via Server-Sent Events. Each row has a Reprint (↩) button. |
| Search | Inline text filter — filters job feed by item name or order number (client-side). |
Setup order
The bridge must be paired before printers can be configured. The Printer Settings page depends on GET /api/stations (which proxies FlowPOS using the device token) to populate station dropdowns. If no device token exists, GET /api/stations returns 401 and the dropdown is empty.
The UI enforces this: any navigation to /printers or / while unpaired redirects to /config with the message "Complete pairing before adding printers." Once pairing is complete, the redirect lifts automatically.
Printer settings page (/printers)
Manage all printer instances. Each instance can be added, edited, or removed.
| Field | Description |
|---|---|
| Label | Human-readable name (e.g. "Grill Station") |
| Stations | Multi-select dropdown populated from GET /api/stations; stations already assigned to another printer are shown as disabled |
| Connection type | TCP · USB |
| Printer URL (TCP) | tcp://192.168.1.100:9100 |
| USB device (USB) | Dropdown of detected /dev/serial/by-id/... paths (persistent) and /dev/ttyUSB* (ephemeral) |
| Paper width | 58 mm · 80 mm |
| Header lines | Free text, one per line |
| Cut after each | Checkbox |
| Copies | 1–5. Each copy is produced by opening a new TCP socket connection and sending the full ESC/POS buffer again (or re-opening the USB device). ESC/POS-level multi-copy commands are not used — separate sends are more reliable across printer models. |
| Save | Persists changes; reconnects that printer if URL/type changed |
| Test Print | Fires a test slip immediately after save |
| Remove | Removes the printer instance from config and stops its agent |
Config page (/config)
Set once at installation. Used for pairing the bridge with FlowPOS.
| Field | Description |
|---|---|
| FlowPOS API URL | e.g. https://api.flowpos.app or http://localhost:4000 |
| Pairing code | Short-lived code from FlowPOS Admin (e.g. AB12-XY99). Exchanged for a long-lived device token. |
| Device status | Shows whether the bridge is paired, the station name, and token age. Allows re-pairing. |
| Timezone | IANA timezone identifier for the restaurant (e.g. America/Guatemala, America/Mexico_City). Used to format all timestamps in the web UI, test slip, job feed, and diagnostics in the restaurant's local time. Defaults to the bridge machine's system timezone if not set. |
Printer Support
TCP / Network (primary)
Any ESC/POS thermal printer reachable on the local network via raw TCP (port 9100 default).
Compatible brands: Epson TM series, Star Micronics TSP/SP series, Bixolon SRP series, Citizen CT series, SNBC BTP series, Rongta RP series, Xprinter, MUNBYN, and any generic ESC/POS network printer.
Requirement: Printer must have an ethernet or Wi-Fi network interface. USB-only printers are not supported by this path.
URL format: tcp://192.168.1.100:9100
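Paper/cover status polling: If supported by the model, the bridge sends ESC/POS real-time status requests (DLE EOT n, bytes 0x10 0x04 n) on a dedicated status connection every 10 seconds, but only when the print queue is idle (no job in-flight). In the Epson ESC/POS command set, n = 1 returns the general printer status, n = 2 the offline cause (including cover open), and n = 4 the roll-paper sensor state (paper low, paper out), so DLE EOT 1 (0x10 0x04 0x01) alone is not enough to drive the paper and cover badges. Polling is suspended while a job is being sent to avoid interleaving status bytes with print data on the TCP stream. Results populate the Status page badges (Paper OK / Paper Low / Paper Out / Cover Open).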
USB / Serial
USB thermal printers that expose a serial or raw USB interface.
Docker requirement: The host USB device must be passed through:
docker run -d \
-p 3456:3456 \
--device /dev/ttyUSB0:/dev/ttyUSB0 \
-v print-bridge-data:/app/data \
ghcr.io/fixxrepo/flowpos-workspace/print-bridge
Persistent device paths: Use /dev/serial/by-id/... instead of /dev/ttyUSB0. USB-to-serial adapters can change enumeration index if unplugged and replugged; the by-id path is stable across reboots and port swaps:
docker run -d \
--device /dev/serial/by-id/usb-Epson_TM-T20_XXXXXXXX-port0:/dev/ttyUSB0 \
...
The Printer Settings UI shows both paths — by-id entries are labeled Stable and ttyUSB* entries are labeled Ephemeral.
Library: @node-escpos/usb (libusb bindings).
Bluetooth — not in v1
Bluetooth SPP printers require OS-level pairing and rfcomm bind before Docker can see the device. This is too fragile for a production kitchen environment. May be added in a future version.
Docker Deployment
Single-command install
docker run -d \
--name flowpos-print-bridge \
--restart unless-stopped \
-p 3456:3456 \
-v print-bridge-data:/app/data \
--label com.centurylinklabs.watchtower.enable=true \
ghcr.io/fixxrepo/flowpos-workspace/print-bridge
Then open http://localhost:3456 on the same machine to configure it.
docker-compose.yml
Includes Watchtower for automatic image updates scoped only to the bridge container. The watchtower.enable=true label prevents Watchtower from accidentally restarting unrelated local services during service hours.
services:
  print-bridge:
    image: ghcr.io/fixxrepo/flowpos-workspace/print-bridge
    restart: unless-stopped
    stop_grace_period: 30s # matches quiesce max-wait; Docker default is 10 s
    ports:
      - "3456:3456"
    volumes:
      - print-bridge-data:/app/data
    environment:
      # Master secret: two sub-keys (encrypt + JWT) are derived from this via HKDF at startup.
      # Generate with: openssl rand -hex 32
      - BRIDGE_SECRET=your-32-byte-hex-secret-here
      # Emergency access only; remove after recovery. The value set here is accepted as a valid auth token.
      # - BRIDGE_EMERGENCY_TOKEN=
      # Note: VITE_PUBLIC_FIREBASE_CONFIG is a BUILD-TIME variable. It is already baked
      # into the official ghcr.io/fixxrepo/flowpos-workspace/print-bridge image. Do not set it here at runtime.
    labels:
      - "com.centurylinklabs.watchtower.enable=true"
    # Uncomment and adjust to pass through a USB printer.
    # Use /dev/serial/by-id/... for stable paths (recommended):
    # devices:
    #   - /dev/serial/by-id/usb-Epson_TM-T20_XXXXXXXX-port0:/dev/ttyUSB0
  watchtower:
    image: containrrr/watchtower
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    # Runs at 3:00 AM daily to avoid updating during dinner service.
    # Watchtower expects a 6-field cron expression (leading seconds field).
    command: --label-enable --schedule "0 0 3 * * *"
    environment:
      - WATCHTOWER_CLEANUP=true
volumes:
  print-bridge-data:
Dockerfile strategy
Multi-stage build:
- Stage 1: build the React UI (vite build → dist/)
- Stage 2: compile the TypeScript agent (tsc → dist/)
- Stage 3: final Node.js 22 Alpine image, copies both build outputs. Express serves the UI as static files and the API under /api.
Single container, no nginx, no separate frontend server.
Important: Always mount a persistent Docker volume for /app/data. This volume contains config.json and the .secret encryption key. Losing the volume requires full re-pairing and re-configuration.
Express API (internal — used by web UI)
| Method | Path | Description |
|---|---|---|
| GET | /api/status | Agent state, all printer states, auth state, last 50 jobs |
| GET | /api/diagnostics | Full structured diagnostic snapshot for support |
| GET | /api/config | Current config (device token redacted) |
| POST | /api/config | Save API URL + trigger re-pair flow |
| POST | /api/pair | Exchange pairing code for device token |
| GET | /api/printers | List configured printer instances |
| POST | /api/printers | Add a new printer instance |
| PATCH | /api/printers/:id | Update a printer instance; reconnects if URL/type changed |
| DELETE | /api/printers/:id | Remove a printer instance and stop its agent |
| POST | /api/test-print/:printerId | Send test ESC/POS slip to a specific printer |
| POST | /api/start | Start all printer agents |
| POST | /api/stop | Stop all printer agents; waits for in-flight jobs to finish |
| GET | /api/jobs | List jobs with optional ?q=<search>&printerId=<id> filters |
| POST | /api/jobs/:id/reprint | Re-send ESC/POS bytes; bypasses de-dup registry |
| GET | /api/devices/usb | List detected USB/serial devices (both by-id and ttyUSB*) |
| GET | /api/stations | Proxy GET /kitchen-stations from FlowPOS |
| GET | /events | Server-Sent Events stream: pushes status and job updates in real time |
Test Print
A test print sends a short ESC/POS slip directly to the configured printer without creating a print_job record in FlowPOS. It is used to verify connectivity, paper alignment, and print settings after installation or after any printer change.
Trigger points
| Where | When |
|---|---|
| Status page — Test Print button (per printer) | On demand; only enabled when agent is running and printer is online |
| Printer Settings page — Test Print button | Automatically offered after saving new settings; can also be triggered manually |
| POST /api/test-print/:printerId | Direct API call (useful for scripting or remote diagnostics) |
Test slip contents
================================
[header lines from config]
================================
** TEST PRINT **
================================
Bridge: flowpos-print-bridge
Station: Grill Station
Printer: tcp://192.168.1.100:9100
Paper: 80 mm
Date: 2026-04-06 14:32:05 (America/Guatemala)
================================
If you can read this, the
printer is working correctly.
================================
[cut]
The slip is built locally by @point-of-sale/receipt-printer-encoder — no network round-trip to FlowPOS is required. The date is formatted in the configured timezone (falls back to system timezone if not set). lastTestPrintAt in the diagnostics object is updated on every attempt.
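The timestamp formatting can be sketched with the standard Intl API. formatLocalTimestamp is a hypothetical helper; the output layout mirrors the Date line of the sample slip.

```typescript
// Formats a date as "YYYY-MM-DD HH:mm:ss (Zone)" in the given IANA timezone.
// When timeZone is undefined, Intl falls back to the system timezone.
function formatLocalTimestamp(date: Date, timeZone?: string): string {
  const formatter = new Intl.DateTimeFormat("en-CA", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
    second: "2-digit",
    hourCycle: "h23", // 24-hour clock, midnight rendered as 00
  });
  const parts: Record<string, string> = {};
  for (const part of formatter.formatToParts(date)) {
    parts[part.type] = part.value;
  }
  const zone = timeZone ?? formatter.resolvedOptions().timeZone;
  return `${parts.year}-${parts.month}-${parts.day} ${parts.hour}:${parts.minute}:${parts.second} (${zone})`;
}

console.log(formatLocalTimestamp(new Date("2026-04-06T20:32:05Z"), "America/Guatemala"));
// 2026-04-06 14:32:05 (America/Guatemala)
```

An invalid timezone identifier makes the Intl.DateTimeFormat constructor throw a RangeError, so the Config page could use the same call to validate the Timezone field before saving.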
API response
{ "success": true, "message": "Test slip sent to tcp://192.168.1.100:9100" }
or on failure:
{ "success": false, "error": "TCP connect ECONNREFUSED 192.168.1.100:9100" }
Non-Functional Requirements
These requirements must be addressed in the implementation, not deferred.
1. Duplicate-print prevention
The bridge enforces two independent de-duplication layers. Both must pass before a job is sent to the printer.
Layer 1 — Job ID registry (guards against WebSocket + poll race)
Each printer instance maintains a local in-flight + recently-completed registry keyed by print_job.id. A job is excluded if:
- It is currently in-flight (being printed).
- It appears in the completed registry (successfully delivered since last restart).
- Its status in FlowPOS is already printed (checked on poll path).
Layer 2 — Order item registry (guards against duplicate print_job records for the same item)
The bridge maintains a registry keyed by orderItemId + stationId + (modifiedAt ?? createdAt). If a second print_job record arrives for the same key, the bridge skips it, logs a warning, and calls PATCH /print-jobs/:id with { status: "printed", lastError: "duplicate suppressed by bridge" } — marking it complete without re-printing.
The modifiedAt component is critical: if an order item is modified after the original ticket printed (e.g. "no onions" → "add jalapeños"), a new print_job arrives with a different modifiedAt timestamp, producing a different key. Layer 2 allows it through and the kitchen sees the modification. An exact re-submission with identical timestamps is still suppressed.
modifiedAt source: The Layer 2 key is computed as soon as the print_job.new event is received, before the GET /print-jobs/:id?expand=details fetch in step 3e of the flow. This is intentional: the de-dup check must run before any network I/O. The modifiedAt value is therefore taken directly from the print_job record in the event payload, not from ticketData (which is only available after the GET). This requires the backend to include a modifiedAt column on print_job (see Backend changes required).
Persistence: The registry is kept in memory and batch-written to /app/data/dedup.json every 60 seconds, not on every print. This avoids per-print disk I/O and narrows the window in which a crash could corrupt the file mid-write. On startup, the file is loaded with best-effort recovery: if the JSON is corrupt, the bridge starts with an empty registry and logs a warning rather than crashing.
Periodic pruning: An in-process timer prunes entries older than 24 hours every hour while the bridge is running, preventing unbounded memory growth on long-running deployments. The same 24-hour window is applied when loading the file on startup.
The registry is scoped per station. If the same item is legitimately routed to two different stations (e.g. Grill and Expo), each station prints it once independently.
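A minimal sketch of the Layer 2 registry described above, covering key construction and 24-hour pruning. OrderItemRegistry, layer2Key, and the "|" separator are assumptions for illustration; persistence to dedup.json is left out.

```typescript
// Fields taken from the print_job.new event payload.
interface PrintJobEvent {
  id: string;
  orderItemId: string;
  stationId: string;
  createdAt: string;
  modifiedAt?: string | null;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// The "|" separator is an assumption; any delimiter that cannot occur in the IDs works.
function layer2Key(job: PrintJobEvent): string {
  return [job.orderItemId, job.stationId, job.modifiedAt ?? job.createdAt].join("|");
}

class OrderItemRegistry {
  // key -> time the key was recorded (ms since epoch)
  private readonly seen = new Map<string, number>();

  isDuplicate(job: PrintJobEvent): boolean {
    return this.seen.has(layer2Key(job));
  }

  record(job: PrintJobEvent, nowMs: number): void {
    this.seen.set(layer2Key(job), nowMs);
  }

  // Drops entries older than maxAgeMs and returns how many were removed.
  prune(nowMs: number, maxAgeMs = DAY_MS): number {
    let removed = 0;
    this.seen.forEach((recordedAt, key) => {
      if (nowMs - recordedAt > maxAgeMs) {
        this.seen.delete(key);
        removed++;
      }
    });
    return removed;
  }
}
```

Because stationId is part of the key, the same order item routed to two stations produces two distinct keys, and a changed modifiedAt produces a new key for the same item and station.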
Reprint bypass
Explicit reprint via POST /api/jobs/:id/reprint bypasses both layers — it is a deliberate user action, not automatic re-delivery. The UI requires confirmation before reprinting to prevent accidental duplicates.
2. Ghost printer / backlog prevention
If the agent is stopped and then restarted, it must not silently flood the printer with a backlog of stale orders, nor silently discard them.
On restart, the bridge checks for pending jobs that arrived while it was offline. If any are found:
- Jobs older than maxJobAgeMinutes (default 60, configurable per printer) are not auto-printed and are not silently marked failed.
- Instead, the Status page shows a Recovery Banner: "X jobs queued while the bridge was offline. What would you like to do?" with three options: Print All, Discard All, and Review (shows the job list so the operator can cherry-pick).
- Jobs younger than maxJobAgeMinutes are printed normally without prompting.
- If the Recovery Banner is not acted on, stale jobs are auto-discarded after a timeout equal to the affected printer's own maxJobAgeMinutes (e.g. with maxJobAgeMinutes: 60, the banner auto-discards after 60 minutes of inaction). This keeps age sensitivity consistent per printer: a short-fuse station like Grill auto-discards faster than a lenient Bar station. Jobs are marked failed with lastError: "auto-discarded after recovery timeout". There is no separate global recoveryBannerTimeoutMinutes field.
This hands the decision to the operator rather than silently losing tickets or flooding the printer with cold orders.
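The split between jobs printed immediately and jobs held for the Recovery Banner can be sketched as a pure function. triagePendingJobs and the Triage shape are hypothetical names; job age is measured here from createdAt, which is an assumption.

```typescript
interface PendingJob {
  id: string;
  createdAt: string; // ISO 8601 timestamp
}

interface Triage {
  printNow: PendingJob[];
  heldForReview: PendingJob[];
}

// Jobs older than maxJobAgeMinutes go to the review set; the rest print normally.
function triagePendingJobs(jobs: PendingJob[], maxJobAgeMinutes: number, nowMs: number): Triage {
  const maxAgeMs = maxJobAgeMinutes * 60 * 1000;
  const result: Triage = { printNow: [], heldForReview: [] };
  for (const job of jobs) {
    const ageMs = nowMs - Date.parse(job.createdAt);
    if (ageMs > maxAgeMs) {
      result.heldForReview.push(job);
    } else {
      result.printNow.push(job);
    }
  }
  return result;
}

const now = Date.parse("2026-04-06T12:00:00Z");
const triage = triagePendingJobs(
  [
    { id: "fresh", createdAt: "2026-04-06T11:30:00Z" }, // 30 min old
    { id: "stale", createdAt: "2026-04-06T10:00:00Z" }, // 120 min old
  ],
  60,
  now,
);
console.log(triage.printNow.map((j) => j.id), triage.heldForReview.map((j) => j.id));
// [ 'fresh' ] [ 'stale' ]
```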
3. Authentication lifecycle
The auth module sits behind an AuthProvider interface. The device-token implementation handles:
- Token stored encrypted in config.json using the machine-local .secret key.
- 401 response from FlowPOS: Invalidate cached token, attempt re-authentication once. If re-auth fails, set authStatus: error, stop agent, and prompt the user to re-pair.
- Token revoked from admin: Same as 401 handling.
- Token expired mid-job: Complete the in-flight job before handling the auth error.
4. Printer delivery semantics
"Printed" means the bridge successfully delivered bytes to the printer transport — not guaranteed paper output.
| Transport | Success definition |
|---|---|
| TCP | Socket connected; all bytes written; connection closed without error within 5 s timeout |
| USB/serial | Device opened; payload written and flushed; device closed without error |
Paper-out and cover-open handling: When DLE EOT status polling detects paper out or cover open on a TCP printer, the bridge pauses the print queue for that printer and shows a prominent alert in the Status page. Jobs already in-flight are allowed to complete (or fail). New jobs accumulate in the queue but are not sent. When the printer reports paper ok + cover closed, the queue resumes automatically. Jobs that were queued during the pause are delivered normally — they were not marked as failed, so no recovery prompt is needed.
5. Config and secret lifecycle
| Event | Expected behavior |
|---|---|
| Normal restart | config.json and .secret persist on Docker volume; bridge starts in same state |
| Container replacement (new image) | Volume re-mounted; config and secret survive |
| Docker volume migration (new host) | Copy volume to new host; bridge starts normally |
| Local secret lost (volume damage) | Bridge cannot decrypt device token. Clears deviceToken, preserves all printer settings, prompts re-pairing. |
| Full factory reset | Delete Docker volume. All config is lost. Bridge shows first-run setup screen. |
Non-sensitive fields (apiUrl, all printer instance settings) are preserved even when the device token is cleared.
6. Agent stop/quiesce behavior
The same quiesce sequence runs for both POST /api/stop (user-initiated) and SIGTERM (Docker shutdown). This ensures in-flight jobs and dedup state are never lost during container restarts or image updates.
Quiesce sequence:
1. Stop accepting new WebSocket events and new poll-cycle jobs.
2. Wait for all in-flight print jobs to complete (success or failure). Maximum wait: 30 s, then force-fail any remaining in-flight jobs.
3. Flush dedup.json to disk immediately (do not wait for the 60-second batch timer).
4. Persist agentEnabled: false to config.json (user-stop only; SIGTERM leaves agentEnabled unchanged so the agent resumes on next container start).
5. Exit cleanly (SIGTERM path) or show Paused badges (user-stop path).
SIGTERM handler (src/index.ts):
process.on("SIGTERM", async () => {
await agent.quiesce(); // steps 1–3 above
process.exit(0);
});
Docker's default grace period is 10 seconds. The Dockerfile should set STOPSIGNAL SIGTERM (default) and the docker-compose.yml should set stop_grace_period: 30s to match the maximum quiesce wait.
On POST /api/start, the agent resumes from the poll fallback path to catch any jobs missed during the paused window (subject to max-age check).
7. Poll batching
When one printer instance serves multiple stations, the bridge must batch all station IDs into a single poll request rather than one request per station:
GET /print-jobs?stationIds[]=id1&stationIds[]=id2&status=pending
This requires a minor FlowPOS backend change (see Backend changes required). Without batching, a printer with 4 stations generates 4 API calls every 30 seconds; with 10 such printers, that is 40 calls per bridge every 30 s, which risks hitting rate limits.
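Building the batched poll URL can be sketched as follows. buildPollUrl is a hypothetical helper; the brackets are written literally to match the request shown above, and only the station ID values are percent-encoded.

```typescript
// Builds one poll URL covering every station served by a printer instance.
function buildPollUrl(apiUrl: string, stationIds: string[]): string {
  const query = stationIds.map((id) => `stationIds[]=${encodeURIComponent(id)}`);
  query.push("status=pending");
  const base = apiUrl.replace(/\/+$/, ""); // tolerate a trailing slash in the config
  return `${base}/print-jobs?${query.join("&")}`;
}

console.log(buildPollUrl("https://api.flowpos.app", ["id1", "id2"]));
// https://api.flowpos.app/print-jobs?stationIds[]=id1&stationIds[]=id2&status=pending
```

With this shape, the number of poll requests per cycle equals the number of printer instances, regardless of how many stations each one serves.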
8. Print queue safety
Each printer instance maintains a bounded in-memory FIFO queue.
| Parameter | Value | Notes |
|---|---|---|
| Max queue depth | 200 jobs | If exceeded, oldest unstarted jobs are dropped with failed status |
| Per-job send timeout | 5 s (TCP) / 10 s (USB) | If the printer doesn't accept bytes within this window, the job fails |
| Stuck-job watchdog | 30 s | If a job has been in-flight longer than 30 s, it is force-failed and removed from the in-flight registry |
The stuck-job watchdog prevents a single hung TCP connection from stalling the entire queue indefinitely.
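A bounded FIFO queue with drop-oldest overflow and a timeout wrapper might look like the sketch below. The class and function names are hypothetical, and the sketch only models the queueing and timeout rules from the table; reporting dropped jobs as failed to FlowPOS is left to the caller.

```typescript
interface QueuedJob {
  jobId: string;
  stationId: string;
}

// Holds only jobs that have not started printing; the job being sent has already been taken out.
class BoundedPrintQueue {
  private readonly jobs: QueuedJob[] = [];
  private readonly maxDepth: number;

  constructor(maxDepth: number = 200) {
    this.maxDepth = maxDepth;
  }

  // Appends a job and returns the oldest unstarted jobs dropped to stay within maxDepth.
  enqueue(job: QueuedJob): QueuedJob[] {
    this.jobs.push(job);
    const dropped: QueuedJob[] = [];
    while (this.jobs.length > this.maxDepth) {
      dropped.push(this.jobs.shift()!);
    }
    return dropped;
  }

  // Removes and returns the next job in FIFO order, or undefined when empty.
  next(): QueuedJob | undefined {
    return this.jobs.shift();
  }

  get depth(): number {
    return this.jobs.length;
  }
}

// Rejects if `work` has not settled within `ms`; usable for the per-job send timeout
// (5 s TCP / 10 s USB) and, with a larger value, as the 30 s stuck-job watchdog.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const t = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(t); resolve(value); },
      (err) => { clearTimeout(t); reject(err); },
    );
  });
}
```

Note that a timeout only stops waiting; the underlying socket or USB handle still has to be destroyed by the transport so the next job starts on a clean connection.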
9. Diagnostics and observability
GET /api/diagnostics returns a full structured snapshot. Required fields per printer instance:
| Field | Description |
|---|---|
| agentEnabled | Whether the agent is running |
| wsStatus | connected · reconnecting · disconnected · error |
| wsLastConnectedAt | Timestamp of last successful WebSocket connect |
| wsLastDisconnectedAt | Timestamp + reason of last disconnect |
| authStatus | ok · error |
| authLastRefreshedAt | Timestamp of last successful token check |
| printerStatus | online · offline · unknown |
| printerPaperStatus | ok · low · out · unknown (from DLE EOT if supported) |
| printerCoverStatus | closed · open · unknown |
| printerLastSuccessAt | Timestamp of last successful delivery |
| printerLastErrorAt | Timestamp + message of last failure |
| lastTestPrintAt | Timestamp + result of last test print |
| pollLastRunAt | Timestamp of last poll cycle |
| pollLastJobsFound | Total jobs found across all station polls in last cycle |
| jobsInFlight | Array of { jobId, stationId } currently being processed |
| jobsCompletedCount | Total delivered since last restart |
| jobsFailedCount | Total failed since last restart |
| jobsSkippedCount | Total skipped (too old) since last restart |
Config file schema (/app/data/config.json)
{
  "apiUrl": "https://api.flowpos.app",
  "deviceToken": "<AES-256-GCM encrypted>",
  "localPasswordHash": "<bcrypt hash>",
  "agentEnabled": true,
  "timezone": "America/Guatemala",
  "printers": [
    {
      "id": "printer-1",
      "stationIds": ["uuid-grill", "uuid-desserts"],
      "label": "Hot Line",
      "connectionType": "tcp",
      "printerUrl": "tcp://192.168.1.100:9100",
      "usbDevice": null,
      "paperWidthMm": 80,
      "headerLines": ["My Restaurant"],
      "cutAfterEach": true,
      "copyCount": 1,
      "maxJobAgeMinutes": 60
    }
  ]
}
- deviceToken: encrypted with AES-256-GCM using BRIDGE_ENCRYPT_KEY (derived from BRIDGE_SECRET). Never returned by the API.
- localPasswordHash: bcrypt hash of the operator-set local fallback password. Used when Firebase is unreachable. Set from the Config page. Never returned by the API.
- timezone: IANA timezone identifier (e.g. "America/Guatemala"). All timestamps shown in the web UI, test slip, job feed, and diagnostics are rendered in this timezone. Defaults to the system timezone if omitted.
- Recommended: pass BRIDGE_SECRET as a Docker environment variable. Two sub-keys (BRIDGE_ENCRYPT_KEY, BRIDGE_JWT_KEY) are derived from it at startup via HKDF-SHA256. Generate once: openssl rand -hex 32.
- If BRIDGE_SECRET is not set and /app/data/.secret does not exist, the bridge generates a new master secret, writes it to /app/data/.secret, and logs a warning recommending migration to the env var approach.
- /app/data/dedup.json: persisted order item de-dup registry, batch-written every 60 s, pruned to a 24 h window on startup and hourly.
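The key derivation and token encryption described above can be sketched with Node's built-in crypto module. Several details here are assumptions rather than part of the spec: the HKDF info labels ("bridge-encrypt", "bridge-jwt"), the empty salt, and the iv.tag.ciphertext storage format are all illustrative choices.

```typescript
import { createCipheriv, createDecipheriv, hkdfSync, randomBytes } from "node:crypto";

// Derives a 32-byte sub-key from the hex master secret via HKDF-SHA256.
// The `info` label separates the sub-keys; the labels used below are assumptions.
function deriveKey(masterSecretHex: string, info: string): Buffer {
  const ikm = Buffer.from(masterSecretHex, "hex");
  return Buffer.from(hkdfSync("sha256", ikm, Buffer.alloc(0), info, 32));
}

// Encrypts the device token with AES-256-GCM using a fresh 12-byte IV per call.
// Output format (assumed): base64(iv).base64(authTag).base64(ciphertext)
function encryptToken(token: string, key: Buffer): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(token, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

// Decrypts a payload produced by encryptToken; throws if the key is wrong or the data was altered.
function decryptToken(payload: string, key: Buffer): string {
  const parts = payload.split(".").map((part) => Buffer.from(part, "base64"));
  if (parts.length !== 3) throw new Error("malformed encrypted token");
  const [iv, tag, ciphertext] = parts as [Buffer, Buffer, Buffer];
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

// Example with a throwaway secret (the real one comes from BRIDGE_SECRET or /app/data/.secret).
const masterSecret = randomBytes(32).toString("hex");
const encryptKey = deriveKey(masterSecret, "bridge-encrypt");
const jwtKey = deriveKey(masterSecret, "bridge-jwt");
const stored = encryptToken("example-device-token", encryptKey);
console.log(decryptToken(stored, encryptKey) === "example-device-token"); // true
console.log(encryptKey.equals(jwtKey)); // false: distinct info labels give distinct keys
```

Because GCM authenticates the ciphertext, a failed decryption after the master secret changes surfaces as a thrown error, which is the point at which the bridge would clear deviceToken and prompt re-pairing.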
Error handling
| Scenario | Behavior |
|---|---|
| Printer TCP timeout (5 s) | Mark job failed, update printerLastErrorAt, retry on next poll cycle |
| Duplicate job (WebSocket + poll) | Layer 1 (job ID) prevents double print; server-side PATCH is also idempotent |
| Duplicate job for already-printed item | Layer 2 (orderItemId + stationId + modifiedAt) suppresses it; job marked printed without re-printing |
| Modified order item reprint | Different modifiedAt produces a new Layer 2 key — always prints through; kitchen sees the change |
| dedup.json corrupt on startup | Bridge starts with empty registry and logs a warning; does not crash |
| Job older than maxJobAgeMinutes on startup/resume | Held in pending-review set; Recovery Banner shown; operator decides Print All / Discard All / Review |
| WebSocket disconnect | Exponential backoff reconnect; update wsStatus in diagnostics |
| Device token 401 | Invalidate token; attempt re-auth once; if fails set authStatus: error and stop agent |
| Device token revoked | Same as 401 |
| USB device path changed | If using ephemeral ttyUSB* path, device may not be found after replug. Use by-id paths to avoid this. |
| USB device disconnected | Mark printer offline; retry on next poll |
| Reprint job not in memory | POST /api/jobs/:id/reprint returns 404; UI disables button for jobs outside the buffer |
| Agent stop requested | In-flight jobs complete before pause; agentEnabled: false persisted |
| Local secret lost | Clear deviceToken; preserve printer settings and localPasswordHash; prompt re-pairing |
| Firebase unreachable (no internet) | Login page shows "Use local password" fallback; issues BRIDGE_JWT_KEY-signed JWT valid for 8 h |
| Local password not set + Firebase down | Bridge UI is inaccessible; agent keeps running; operator can add BRIDGE_EMERGENCY_TOKEN env var and restart container to regain access |
| Paper out / cover open | Detected via DLE EOT status poll (TCP only, if supported); pauses the print queue for that printer; shown as alert in Status page; queue resumes automatically when paper ok + cover closed is detected |
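The Layer 2 rows and the corrupt dedup.json row in the table above can be illustrated with a short sketch. The key separator, the registry shape (key mapped to the epoch-millisecond time it was recorded), and the function names are assumptions; the spec only fixes the key components and the 24 h window.

```typescript
// Subset of the print_job.new event payload that the Layer 2 key needs.
interface PrintJobEvent {
  id: string;
  orderItemId: string;
  stationId: string;
  createdAt: string;
  modifiedAt?: string | null;
}

// orderItemId + stationId + (modifiedAt ?? createdAt); the ":" separator is an assumption.
function layer2Key(job: PrintJobEvent): string {
  return `${job.orderItemId}:${job.stationId}:${job.modifiedAt ?? job.createdAt}`;
}

// Assumed on-disk shape of dedup.json: { "<layer2Key>": <epoch ms when recorded> }
type DedupRegistry = Record<string, number>;

// Parses dedup.json content, falling back to an empty registry when it is corrupt,
// and drops entries older than the retention window (24 h by default).
function parseDedup(raw: string, now: number, windowMs: number = 24 * 60 * 60 * 1000): DedupRegistry {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    console.warn("dedup.json is corrupt; starting with an empty registry");
    return {};
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) return {};
  const fresh: DedupRegistry = {};
  for (const [key, seenAt] of Object.entries(parsed)) {
    if (typeof seenAt === "number" && now - seenAt <= windowMs) fresh[key] = seenAt;
  }
  return fresh;
}

const original: PrintJobEvent = {
  id: "job-1", orderItemId: "item-9", stationId: "uuid-grill",
  createdAt: "2025-01-01T12:00:00Z", modifiedAt: null,
};
const edited: PrintJobEvent = { ...original, id: "job-2", modifiedAt: "2025-01-01T12:05:00Z" };
console.log(layer2Key(original) === layer2Key({ ...original, id: "job-3" })); // true: suppressed
console.log(layer2Key(original) === layer2Key(edited)); // false: the modified item prints
```

Because the job ID is not part of the key, a second job for the same unmodified item at the same station is suppressed, while any change to modifiedAt yields a new key and prints through.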
Tech Stack
| Layer | Technology |
|---|---|
| Agent runtime | Node.js 22 |
| HTTP server | Express 5 |
| WebSocket client | socket.io-client v4 |
| ESC/POS encoding | @point-of-sale/receipt-printer-encoder (same as backend) |
| TCP transport | Node.js net module (same as backend adapter) |
| USB transport | @node-escpos/usb + @node-escpos/serialport |
| Agent auth | Device pairing token (exchanged via POST /kitchen-stations/pair-bridge) |
| UI auth (primary) | Firebase email/password — same SDK as apps/frontend-pwa |
| UI auth (fallback) | Local bcrypt password + BRIDGE_JWT_KEY-signed JWT (derived via HKDF from BRIDGE_SECRET) — works offline |
| Web UI | React 19 + Vite + Tailwind CSS + Shadcn/ui |
| Config storage | JSON file + AES-256-GCM encryption |
| Container | Docker, Node 22 Alpine, multi-stage build |
| Auto-updates | Watchtower (containrrr/watchtower) scoped by label |
Out of scope (v1)
- Bluetooth printer support
- Receipt printing for sales (POS receipts) — this spec covers kitchen tickets only
- Auto-discovery of printers on the LAN
- Remote management from the FlowPOS admin UI
- Persistent job history beyond last-50 in-memory buffer per printer (next extension point)
- Priority ordering across stations on the same printer (jobs are delivered FIFO regardless of station)
Backend changes required
The following FlowPOS backend changes are needed to support this spec:
| Change | Description |
|---|---|
| POST /kitchen-stations/pair-bridge | Validates pairing code from Redis, reads businessId + locationId from the stored value, creates a kds_device record (type: "print_bridge", scoped to that business/location), returns long-lived device token. |
| Pairing code generation | Admin UI trigger (or API) stores kds:pair:bridge:{code} → { businessId, locationId } in Redis with 10-minute TTL. Code is scoped to the generating admin's business/location. |
| Token auth for bridge | The existing KdsDeviceGuard or a new BridgeDeviceGuard validates the device token on bridge API calls. |
| GET /print-jobs/:id?expand=details | Ensure this single-job expand endpoint exists (current implementation uses the list endpoint). Required by the bridge per-job fetch in flow step 3e. |
| print_job.new event augmentation (optional optimisation) | Include ticketData from the associated kitchen_ticket in the WebSocket event payload. This would eliminate the per-job GET round-trip in the bridge, reducing latency by one network call per ticket. Not required for v1 but recommended before high-volume deployments. |
| GET /print-jobs batch station filter | Accept stationIds[] query param (multiple station IDs) so the bridge can poll all stations in one request per printer instance rather than one request per station. |
| Add modifiedAt to print_job record | Add a modifiedAt column to the print_job table, populated at job creation time from order_item.updated_at. Include this field in the print_job.new WebSocket event payload (alongside the existing base fields) so the bridge can build the Layer 2 de-dup key before issuing the GET /print-jobs/:id?expand=details round-trip. |