FEL Network Troubleshooting Guide
Problem Summary
The FEL service is experiencing ETIMEDOUT errors when trying to connect to external FEL provider endpoints. This indicates network-level connectivity issues between Google Cloud Run and the FEL provider's servers.
Error Details
- Error Code: ETIMEDOUT
- HTTP Status: 502 Bad Gateway
- Message: "Network timeout - Could not reach FEL provider"
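- Observed Behavior: Requests fail after ~550 ms, far sooner than the configured 30 s timeout, which suggests the failure happens while the connection is being established rather than while waiting for a response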
Solution 1: Code Changes (Implemented ✅)
Updated HTTP/HTTPS Agent Configuration
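fel.modules.ts now sets explicit socket timeouts and keep-alive behavior on both the HTTP and HTTPS agents:
// Imports the snippet needs (assuming HttpModule comes from @nestjs/axios)
import * as http from "node:http";
import * as https from "node:https";
import { HttpModule } from "@nestjs/axios";

HttpModule.register({
  timeout: 30000, // Axios request timeout (ms)
  maxRedirects: 5,
  httpAgent: new http.Agent({
    keepAlive: true,
    keepAliveMsecs: 30000, // initial delay for TCP keep-alive probes (ms)
    timeout: 30000, // socket timeout (ms)
    scheduling: "lifo",
  }),
  httpsAgent: new https.Agent({
    keepAlive: true,
    keepAliveMsecs: 30000, // initial delay for TCP keep-alive probes (ms)
    timeout: 30000, // socket timeout (ms)
    scheduling: "lifo",
    rejectUnauthorized: true, // keep TLS certificate verification on
  }),
})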
Benefits:
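- Explicit 30 s socket-level timeout on both agents, separate from the request timeout
- Keep-alive connections, which avoid a new TCP and TLS handshake on every request
- LIFO scheduling, which reuses the most recently used socket and so lowers the chance of picking one the provider has already closed for inactivity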
Solution 2: Network Configuration Checks
Step 1: Verify FEL Provider Endpoint
First, verify that the FEL provider's endpoint is accessible:
# Check if the FEL provider endpoint is reachable
# Quote the URL: unquoted, the shell treats & and | as control operators
curl -v 'https://<FEL_PROVIDER_ENDPOINT>/sharedInfo?NIT=000017195594&DATA1=SHARED_GETINFONITcom&DATA2=NIT|17195594&USERNAME=<USERNAME>' \
-H "Authorization: Bearer <TOKEN>"
Step 2: Test from Cloud Run Container
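Cloud Run does not offer shell access to a running container, and gcloud run services proxy only forwards HTTP requests to the service. A service whose container just sleeps also fails to start, because nothing listens on the configured port. Run the connectivity test as a one-off Cloud Run job instead. If flowpos-backend uses a VPC connector, give the job the same --vpc-connector and --vpc-egress flags so the test follows the same network path.
# Create and run a one-off job that issues the request from Cloud Run
gcloud run jobs create fel-connectivity-test \
--image=gcr.io/google.com/cloudsdktool/cloud-sdk:alpine \
--region=us-central1 \
--project=barto-dev \
--command=/bin/sh \
--args=-c,"apk add --no-cache curl && curl -v https://<FEL_PROVIDER_ENDPOINT>/sharedInfo" \
--execute-now \
--wait
Then read the job output:
# curl's verbose output is written to Cloud Logging
gcloud logging read 'resource.type=cloud_run_job AND resource.labels.job_name=fel-connectivity-test' \
--limit=50 \
--project=barto-dev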
Step 3: Check Cloud Run Network Settings
3.1 Check Egress Settings
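Check whether the service routes outbound traffic through a VPC connector. These settings are stored as annotations on the revision template (run.googleapis.com/vpc-access-connector and run.googleapis.com/vpc-access-egress):
# Show the revision template annotations, including VPC egress settings
gcloud run services describe flowpos-backend \
--region=us-central1 \
--project=barto-dev \
--format=json | jq '.spec.template.metadata.annotations'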
3.2 Configure VPC Connector (if needed)
If the FEL provider requires VPC connectivity:
# Create a VPC connector
gcloud compute networks vpc-access connectors create fel-connector \
--region=us-central1 \
--network=default \
--range=10.8.0.0/28 \
--project=barto-dev
# Update Cloud Run service to use the VPC connector
gcloud run services update flowpos-backend \
--vpc-connector=fel-connector \
--vpc-egress=all-traffic \
--region=us-central1 \
--project=barto-dev
3.3 Check Firewall Rules
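VPC firewall rules affect Cloud Run traffic only when it is routed through a VPC connector (Step 3.2). In that case, ensure no egress rule blocks outbound traffic: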
# List firewall rules
gcloud compute firewall-rules list --project=barto-dev
# If needed, create a rule to allow outbound traffic
gcloud compute firewall-rules create allow-fel-outbound \
--direction=EGRESS \
--priority=1000 \
--network=default \
--action=ALLOW \
--rules=tcp:443,tcp:80 \
--destination-ranges=0.0.0.0/0 \
--project=barto-dev
Step 4: Configure Cloud NAT (Recommended)
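Cloud NAT gives outbound traffic a fixed source IP address that the FEL provider can whitelist. It applies only to traffic that leaves through the VPC, so the service must use the VPC connector from Step 3.2 with --vpc-egress=all-traffic. Reserve a static address rather than auto-allocating one, because auto-allocated NAT addresses can change:
# Reserve a static external IP address for NAT
gcloud compute addresses create fel-nat-ip \
--region=us-central1 \
--project=barto-dev
# Create a Cloud Router
gcloud compute routers create fel-router \
--network=default \
--region=us-central1 \
--project=barto-dev
# Create a Cloud NAT configuration that uses the reserved address
gcloud compute routers nats create fel-nat \
--router=fel-router \
--region=us-central1 \
--nat-external-ip-pool=fel-nat-ip \
--nat-all-subnet-ip-ranges \
--enable-logging \
--project=barto-dev
# Get the reserved NAT IP address
gcloud compute addresses describe fel-nat-ip \
--region=us-central1 \
--project=barto-dev \
--format="get(address)"
Provide this IP address to the FEL provider for whitelisting.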
Step 5: DNS Resolution Check
Verify DNS resolution works correctly:
# Check DNS resolution for FEL provider
nslookup <FEL_PROVIDER_DOMAIN>
dig <FEL_PROVIDER_DOMAIN>
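# From Cloud Run: there is no shell access, and gcloud run services proxy only
# forwards HTTP to the service, so run the lookup as a one-off Cloud Run job
gcloud run jobs create fel-dns-test \
--image=gcr.io/google.com/cloudsdktool/cloud-sdk:alpine \
--region=us-central1 \
--project=barto-dev \
--command=nslookup \
--args=<FEL_PROVIDER_DOMAIN> \
--execute-now \
--wait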
Step 6: Check for SSL/TLS Issues
If the FEL provider uses custom certificates:
# Test SSL certificate
openssl s_client -connect <FEL_PROVIDER_ENDPOINT>:443 -servername <FEL_PROVIDER_DOMAIN>
# Check certificate validity
curl -v https://<FEL_PROVIDER_ENDPOINT>
Step 7: Enable Cloud Run Logging
Ensure detailed logging is enabled to capture network issues:
# Update Cloud Run service with more verbose logging
gcloud run services update flowpos-backend \
--region=us-central1 \
--project=barto-dev \
--set-env-vars="LOG_LEVEL=debug"
Step 8: Monitor and Alert
Set up monitoring for FEL endpoint availability:
# Create an uptime check in Cloud Monitoring
gcloud monitoring uptime create fel-provider-check \
--resource-type=uptime-url \
--resource-labels=host=<FEL_PROVIDER_DOMAIN>,project_id=barto-dev \
--protocol=https \
--path=/sharedInfo \
--project=barto-dev
Checklist for Network Configuration
- Verify FEL provider endpoint is accessible from your location
- Test connectivity from Cloud Run container
- Check Cloud Run egress settings
- Verify firewall rules allow outbound traffic
- Configure VPC connector if private network access is needed
- Set up Cloud NAT for stable outbound IP
- Provide NAT IP addresses to FEL provider for whitelisting
- Verify DNS resolution works correctly
- Check SSL/TLS certificate validity
- Enable detailed logging
- Set up uptime monitoring and alerts
Common Issues and Solutions
Issue 1: FEL Provider Blocks Cloud Run IPs
Solution: Use Cloud NAT to provide a stable outbound IP and have it whitelisted.
Issue 2: Intermittent Timeouts
Solution:
- Enable keepAlive connections (already implemented)
- Increase retry attempts with exponential backoff
- Use connection pooling
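The retry suggestion above could look roughly like the following. This is a minimal sketch, not code from the FEL service: the helper names (`backoffDelayMs`, `isRetryable`, `withRetry`) and the list of retryable error codes are assumptions chosen for illustration.

```typescript
// Hypothetical helpers; names and the retryable-code list are assumptions.
interface RetryOptions {
  retries: number; // number of retries after the first attempt
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number; // upper bound for any single delay
}

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, opts: RetryOptions): number {
  return Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** attempt);
}

// Assumption: only these network-level error codes are worth retrying.
const RETRYABLE_CODES = new Set(["ETIMEDOUT", "ECONNRESET", "ECONNREFUSED", "EAI_AGAIN"]);

function isRetryable(error: unknown): boolean {
  const code = (error as { code?: unknown } | null | undefined)?.code;
  return typeof code === "string" && RETRYABLE_CODES.has(code);
}

// Runs `operation`, retrying retryable failures with exponential backoff.
async function withRetry<T>(
  operation: () => Promise<T>,
  opts: RetryOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up when retries are exhausted or the error is not network-level.
      if (attempt >= opts.retries || !isRetryable(error)) throw error;
      await sleep(backoffDelayMs(attempt, opts));
    }
  }
}
```

With `{ retries: 3, baseDelayMs: 200, maxDelayMs: 2000 }` the waits would be 200 ms, 400 ms and 800 ms. Retrying only network-level codes keeps provider rejections, such as the 406 case described later, from being retried pointlessly.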
Issue 3: DNS Resolution Failures
Solution:
- Use Cloud DNS for reliable DNS resolution
- Add custom DNS configuration to Cloud Run
Issue 4: Certificate Validation Errors
Solution:
- Ensure the FEL provider uses valid SSL certificates
- If using self-signed certificates, configure trust store
Testing the Fix
After implementing the changes:
- Test from Postman/curl:
curl --location --request POST 'https://flowpos-backend-723334209984.us-central1.run.app/fel/get-shared-info' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <YOUR_TOKEN>' \
--data '{
"businessId": "097f8743-a317-4169-a793-c2a0db8fba2b",
"data1": "SHARED_GETINFONITcom",
"data2": "NIT|17195594"
}'
- Monitor logs:
gcloud logging read "resource.type=cloud_run_revision AND resource.labels.service_name=flowpos-backend AND severity>=ERROR" \
--limit=50 \
--project=barto-dev \
--format=json
- Check for successful responses:
- Look for HTTP 200 status codes
- Verify no ETIMEDOUT errors in logs
- Confirm FEL provider returns expected data
Additional Resources
Contact
If issues persist after following this guide:
- Check with the FEL provider for any service outages
- Review their API documentation for any network requirements
- Contact their support to verify your IP addresses are whitelisted
- Check their rate limiting policies
URL routing and 406 errors on getSharedInfo
This section covers the June 2026 production incident where POST /fel/get-shared-info returned 406 for business 51ebb168 with NIT|3435555.
Root cause summary
Three compounding defects:
| # | Defect | Symptom |
|---|---|---|
| 1 | getSharedInfo ignored USE_RPA_FEL_API | Always hit direct Digifact, never the RPA proxy |
| 2 | Direct Digifact URL hardcoded to test host | Wrong endpoint even without the proxy |
| 3 | Catch block read errorDetails?.Mensaje; Digifact returns REQUEST[0].Mensaje | "Error fetching shared info" instead of the real provider message |
URL resolution (how it works after the fix)
POST /fel/get-shared-info
        │
        ▼
configService.get("USE_RPA_FEL_API") === "true"?
        │
   YES  │  NO
   ┌────┘  └────────────────────────────────────────┐
   ▼                                                ▼
getSharedInfoViaRpaFelApi()             getCertifierApiUrl(certifier, NODE_ENV)
  → ProviderRpaFelApiService              + optional DIGIFACT_API_URL override
  → ${baseUrl}/QueryPayerInfo                       │
   │                                                ▼
   │  baseUrl resolved by:              NODE_ENV=production|beta → felgtaws.digifact.com.gt
   │  RPA_FEL_API_URL override          anything else → felgttestaws.digifact.com.gt
   │  or NODE_ENV=production
   │    → fel.rpapos.com/api/fel
   │  else
   │    → fel-dev.rpapos.com/api/fel
Doppler prd: USE_RPA_FEL_API=true, NODE_ENV=production → RPA path, fel.rpapos.com.
Doppler stg: same flag values → RPA path, fel.rpapos.com.
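The decision tree can also be read as code. The sketch below mirrors the diagram only; it is not the backend implementation. The function name `resolveSharedInfoTarget`, the `https://` scheme on the hosts, and the choice to treat an empty override as unset are assumptions, and the real direct path goes through getCertifierApiUrl(certifier, NODE_ENV), which may append an API path.

```typescript
// Sketch only: mirrors the diagram, not the actual backend code.
type Env = Record<string, string | undefined>;

interface ResolvedTarget {
  path: "rpa" | "direct";
  baseUrl: string;
}

// Hosts are taken from the diagram; the https:// scheme is an assumption.
const RPA_PROD = "https://fel.rpapos.com/api/fel";
const RPA_DEV = "https://fel-dev.rpapos.com/api/fel";
const DIGIFACT_PROD = "https://felgtaws.digifact.com.gt";
const DIGIFACT_TEST = "https://felgttestaws.digifact.com.gt";

function resolveSharedInfoTarget(env: Env): ResolvedTarget {
  const nodeEnv = env.NODE_ENV;

  if (env.USE_RPA_FEL_API === "true") {
    // RPA path: explicit override first, then the NODE_ENV default.
    const baseUrl = env.RPA_FEL_API_URL || (nodeEnv === "production" ? RPA_PROD : RPA_DEV);
    return { path: "rpa", baseUrl };
  }

  // Direct path: explicit override first, then production|beta vs. everything else.
  const isProdLike = nodeEnv === "production" || nodeEnv === "beta";
  const baseUrl = env.DIGIFACT_API_URL || (isProdLike ? DIGIFACT_PROD : DIGIFACT_TEST);
  return { path: "direct", baseUrl };
}
```

Passing the environment in as a plain object, rather than reading it globally, makes it easy to check a Doppler config by hand: paste the values for a given environment and see which host comes back.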
Emitter NIT vs certifier registry NIT
These are different NITs and the distinction matters when diagnosing provider errors.
| Field | Where it lives | Purpose |
|---|---|---|
| Emitter NIT | business.tax_id | The merchant's tax ID — who is issuing the document |
| Certifier NIT | fel_certifier.nit | Digifact's own SAT-registered NIT |
The error "El NIT 000017677254, no cuenta con acceso API" refers to the emitter NIT (17677254). Digifact's API requires the issuing merchant to be explicitly enabled for API access on their platform. This is separate from having a valid token.
When you see this error:
- The token and URL are correct.
- Digifact has not enabled API access for that specific emitter NIT on production.
- Fix: contact Digifact support and ask them to enable API access for emitter NIT 17677254 on felgtaws.digifact.com.gt.
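When matching a provider message against business.tax_id, note that the error above shows the NIT zero-padded to 12 digits (000017677254) while the emitter NIT is 17677254. A small sketch of normalizing it, assuming the NIT follows the word "NIT" in the message and may end in a K check character; the function name is hypothetical:

```typescript
// Hypothetical helper: pulls the NIT out of a provider message and strips zero padding.
function extractNitFromMessage(message: string): string | null {
  // Assumption: the NIT is the first digit run (optionally ending in K) after "NIT".
  const match = /NIT\s+([0-9]+[Kk]?)/.exec(message);
  if (match === null) return null;
  const raw = match[1] ?? "";
  const stripped = raw.replace(/^0+/, "");
  // An all-zero value would strip to an empty string; keep a single zero instead.
  return stripped === "" ? "0" : stripped;
}
```

The result can then be compared directly with the emitter NIT stored for the business to confirm which merchant the provider is rejecting.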
Reading Digifact error responses
Digifact returns errors in a REQUEST array, not at the top level:
{
  "REQUEST": [
    {
      "Mensaje": "El NIT 000017677254, no cuenta con acceso API",
      "Codigo": "1",
      "Procesador": "Digifact",
      "Descripcion": "NIT sin acceso API",
      "Fecha": "2026-06-12"
    }
  ]
}
extractFelProviderErrorDetails() in apps/backend/src/fel/domain/fel-provider-error.utils.ts parses this and surfaces REQUEST[0].Mensaje as the details field in the 406 response body. The PWA's formatApiErrorForToast then shows it as the toast title instead of the generic "FEL API request failed".
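The parsing step can be sketched as follows. This illustrates reading REQUEST[0].Mensaje defensively; it is not the actual body of extractFelProviderErrorDetails(), and the function name `extractProviderMessage` and its fallback parameter are assumptions.

```typescript
// Sketch of the parsing idea; the real extractFelProviderErrorDetails() may differ.
interface DigifactErrorEntry {
  Mensaje?: unknown;
  Codigo?: unknown;
  Descripcion?: unknown;
}

function extractProviderMessage(
  body: unknown,
  fallback: string = "Error fetching shared info",
): string {
  // The provider body must be an object with a non-empty REQUEST array.
  if (typeof body !== "object" || body === null) return fallback;
  const request = (body as { REQUEST?: unknown }).REQUEST;
  if (!Array.isArray(request) || request.length === 0) return fallback;

  // Only the first entry is read, matching REQUEST[0].Mensaje in the text.
  const first = request[0] as DigifactErrorEntry | null | undefined;
  const message = first?.Mensaje;
  return typeof message === "string" && message.trim() !== "" ? message : fallback;
}
```

Falling back to the generic string whenever the shape is unexpected keeps the 406 response well-formed even if the provider changes its error format.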
Diagnosing a 406 on production
# 1. Check what the backend actually returned
curl -s -w "\nHTTP %{http_code}\n" \
-X POST 'https://api.flowandgrow.tech/fel/get-shared-info' \
-H 'Authorization: Bearer <token>' \
-H 'Content-Type: application/json' \
-d '{"businessId":"51ebb168-fcf1-4f9d-9428-4b28f6ffc102","data1":"SHARED_GETINFONITcom","data2":"NIT|3435555"}'
# 2. Read the response body — check the `details` field
# If details == "Error fetching shared info" → error parsing bug (Phase 1 not deployed)
# If details == "El NIT ... no cuenta con acceso API" → Digifact API access not enabled (ops issue)
# If details mentions URL / connection → check USE_RPA_FEL_API and RPA_FEL_API_URL
# 3. Check Cloud Logging for the full provider response
gcloud logging read \
'resource.type=cloud_run_revision AND resource.labels.service_name=flowpos-backend AND jsonPayload.message:"getSharedInfo"' \
--limit=10 --project=barto-dev --format=json | jq '.[].jsonPayload'
Emergency URL override
If Digifact changes their production host before the next deploy, override without a code change:
# Doppler prd — emergency override
doppler secrets set DIGIFACT_API_URL=https://felgtaws.digifact.com.gt/gt.com.fel.api.v3/api \
--project flowpos --config prd
This is only used on the direct path (USE_RPA_FEL_API=false). When routing through RPA, use RPA_FEL_API_URL instead.
Deploy order
- Phase 1 (error parsing + PWA): safe to ship independently, improves error messages immediately
- Ops prerequisite: Digifact enables API access for emitter NIT 17677254 on production
- Phase 2 (RPA routing): ship after NIT access is confirmed; this is the actual fix
Rollback
If Phase 2 causes regressions, set USE_RPA_FEL_API=false in Doppler prd and redeploy. The direct path code is unchanged and will continue to work (modulo the Digifact API access requirement).
To force the test host temporarily (direct path only):
doppler secrets set DIGIFACT_API_URL=https://felgttestaws.digifact.com.gt/gt.com.fel.api.v3/api \
--project flowpos --config prd