Skip to content

fix(api): reduce ERROR-log noise from expected client failures#2

Open
Danswar wants to merge 1 commit into
dfx-develop-basefrom
fix/reduce-dfx-api-error-noise
Open

fix(api): reduce ERROR-log noise from expected client failures#2
Danswar wants to merge 1 commit into
dfx-develop-basefrom
fix/reduce-dfx-api-error-noise

Conversation

@Danswar

@Danswar Danswar commented Jun 25, 2026

Copy link
Copy Markdown
Owner

Goal

Reduce recurring ERROR-level log noise in dfx-api from two cases that are actually expected client/business outcomes, not server faults. Together these account for ~50+ ERROR lines/day in prod and pollute error dashboards/alerts.

Reviewing in this fork first (dfx-develop-base is a clean mirror of DFXswiss/api@develop). Retarget to DFXswiss/api:develop once approved.

Changes

1. realunit — Aktionariat upstream 4xx (realunit.service.ts)

When requestPaymentInstructions failed, an upstream client/business 400 (e.g. "User must have…") was:

  • self-logged at ERROR, and
  • re-thrown as a ServiceUnavailableException (503) → the global ApiExceptionFilter (status >= 500) logged it at ERROR again.

So one upstream 4xx produced two ERROR lines and was misclassified as a server outage.

Fix: log at WARN, and when the upstream status is a 4xx, surface it as BadRequestException (4xx) instead of 503. Upstream 5xx / network errors still throw ServiceUnavailableException.

⚠️ Behavior note for review: for the upstream-4xx case the client-facing HTTP status changes 503 → 400. This is arguably more correct (it is a client/business error), but flagging it in case any client/monitor keys on 503 for /v1/realunit/buy/:id/confirm. The response body message is preserved.

2. buy-crypto — expected duplicate-KYC ConflictException (buy-crypto-preparation.service.ts)

doAmlCheck() logs Error during buy-crypto <id> AML check at ERROR. The common case is a ConflictException: A user with this KYC file ID already exists — an expected guard (the AML path allocates kycFileId = MAX+1, which races against the unique constraint). It's already handled (the loop continues); only the log level was wrong. This is also the source of the matching duplicate key value violates unique constraint lines in api-postgres.

Fix: log ConflictException at WARN; all other exceptions stay ERROR. Follows the existing instanceof ConflictException idiom used in bank-tx.service.ts, liquidity-management.service.ts, etc. No AML/business behavior change — the kycFileId is simply not reassigned this run, exactly as today.

Validation

  • prettier --checkeslint --no-fixnest build
  • jest realunit buy-crypto-preparation tracing → 6 suites / 77 tests pass ✅

Not addressed (intentionally)

The bulk of dfx-api ERRORs are genuine external-dependency failures (blockchain RPC flaps, CoinGecko ETIMEDOUT, Olky 403) that the team monitors — those are root-cause/infra items, not log-level noise, so left untouched.

- realunit: an upstream Aktionariat 4xx (e.g. business 400) was wrapped in a
  503 ServiceUnavailableException and self-logged at ERROR, producing two ERROR
  lines per occurrence. Log at WARN and surface upstream 4xx as BadRequestException
  so it is not counted as a server fault.
- buy-crypto: the expected ConflictException (duplicate KYC file ID) raised during
  the AML check is now logged at WARN instead of ERROR; all other failures stay ERROR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant