feat(gs): replace raw-SQL /gs/debug with structured allowlist endpoint#3971
Draft
davidleomay wants to merge 7 commits into
Draft
feat(gs): replace raw-SQL /gs/debug with structured allowlist endpoint#3971davidleomay wants to merge 7 commits into
davidleomay wants to merge 7 commits into
Conversation
POST /gs/debug now takes a JSON description of the query (table + select + where + group/order/limit) instead of a SQL string. The service translates the DTO into SQL with identifiers pulled from a per-table column allowlist (DebugAllowedColumns) and values bound as parameters via TypeORM. No raw SQL is parsed, walked, or interpolated. Replaces the parser-walker approach (PR #3967 / commits 74085a9 + 1fc0be9) entirely. The walker had at least six documented bypasses (simple-CASE operand never walked, aggregate FILTER/OVER ignored, LIMIT substring scan beat by string literals and trailing line comments, LIMIT cap bypassed by arithmetic expressions) — closed by construction here because no parser is involved. DTO (src/subdomains/generic/gs/dto/debug-query.dto.ts): - select items: column | jsonb | aggregate, each with discriminator + regex/enum - where tree: leaf | and | or | not, recursive with depth cap 5 / 50 predicates - jsonb path: dot-separated, each segment regex-validated server-side - limits: identifier regex, IN-list cap, string-value length cap Service (gs.service.ts): - ~660 lines of AST walkers + blocklist + post-execution masker deleted - ~210 lines of structured emitter added - defense-in-depth re-checks for the four DTO-only gates the security review flagged (`as` regex, `aggregate` enum, `op` enum, `direction`) Allowlist (dto/gs.dto.ts): - DebugAllowedColumns covers ~85 tables with explicit column lists including FK IDs (recommenderId, userDataId, etc.) so the script's --referral-* modes work end-to-end - DebugBlockedCols / DebugBlockedSchemas / DebugDangerousFunctions removed - node-sql-parser dependency no longer used by GsService db-debug.sh: - every predefined mode (--anomalies / --balance / --stats / --asset-history / --referral-chain / --referral-tree / default) migrated to JSON payloads built with `jq -n` - custom-SQL pass-through removed (was the script's only raw-SQL surface) Tests: 134 specs covering allowlist, jsonb, WHERE composition, GROUP/ORDER/LIMIT, the eight query shapes db-debug.sh issues, SQL- injection probes against every string field, and the four service-side defense-in-depth gates.
…ghtening
Adversarial review surfaced two real correctness bugs and several worth-fixing
defense-in-depth gaps. Allowlist auditors flagged a handful of columns that
slipped through. All fixed.
Service correctness:
- Prototype-chain table names (`__proto__`, `constructor`, …) used to return
`Object.prototype` from the allowlist lookup, pass the truthy guard, then
TypeError downstream — 500 instead of 400. Use `Object.hasOwn`.
- `{kind: 'jsonb'}` without `jsonbPath` used to TypeError in
`defaultDebugSelectAlias`'s `split('.')` — 500 instead of 400. Push kind-
specific required-field checks ahead of alias derivation.
Defense in depth:
- LIMIT also floored at 1 (was only upper-clamped). DTO `@Min(1)` is the
primary check; the service now never emits `LIMIT 0` or `LIMIT -n` even if
ValidationPipe is ever bypassed.
- OFFSET floored at 0 for the same reason.
- Audit log now redacts WHERE leaf `value` fields. Bound parameters keep PII
out of the SQL string; the verbose-channel audit was undoing that by
serializing the raw DTO. Structure (table/select/where/op) stays for
forensics; values become `<scalar>` / `<array:N>`.
Allowlist:
- Removed: `payment_link_payment.note`, `trading_order.errorMessage`,
`user.apiFilterCT`, `fiat_output.creditInstitution`, `fiat_output.info`,
`liquidity_management_order.errorMessage`, `fiat.ibanCountryConfig`.
- Fixed: `bank_tx.subfamilyCode` → `subFamilyCode` (typo).
- Fixed: `liquidity_management_order.eagerId` → `pipelineId` (misparsed FK
from `@ManyToOne({ eager: true })`).
DTO:
- Dropped unused `IsBoolean` / `IsNotEmpty` imports.
Tests: 141 pass (was 134 + 7 new). New regressions pin the prototype-key
behavior, the kind-without-required-field behavior, the limit floor, and
the WHERE-value redaction.
Three sibling scripts (compare-balance-logs.sh, sum-asset-balances.sh,
inspect-asset-balance.sh) still POSTed `{sql: ...}` to /gs/debug — they
would have silently 400'd after merge. Migrated each to the structured DTO
shape with `jq -n` payload builders and a `{keys, rows}` reshape to keep
the downstream jq filters unchanged.
Allowlist:
- Removed `payment_link_payment.deviceCommand` — `text` column for free-form
merchant-supplied POS commands; same risk class as the already-removed
`note`.
- Documented `setting.key` inclusion (knob NAMES revealed; `value` stays out).
Test coverage:
- Added regression specs for the two db-debug.sh shapes that were missing
from the smoke-test suite: default (`SELECT id, name, blockchain FROM
asset ...`) and the asset resolver used by `--asset-history Blockchain/Name`.
- Expanded the "rejects sensitive column" `it.each` from 9 pairs to ~130,
covering every entry from the previous `DebugBlockedCols` plus the
columns the auditors flagged this round. A future migration that
accidentally surfaces e.g. `totpSecret`, `apiKeyCT`, or `recipientMail`
now fails CI before it ships.
Test count: 268 (was 141).
Flagged by github-code-quality on commit e6e92fe. The test asserts on `result.keys` directly; the captured spy handle was dead.
9a289db to
bc76ab6
Compare
Final adversarial review surfaced three actionable items and an allowlist gap. All addressed. - Audit log moved BEFORE emit/validate so a malformed/probing request is still recorded. Previously a DEBUG-role attacker could enumerate the column allowlist via 400s without leaving a trace in App Insights. - New per-level cap `DebugQueryMaxAndOrChildren = 10` replaces the prior `@ArrayMaxSize(DebugQueryMaxPredicates = 50)` on `where.children`. With depth 5 the validator can no longer be made to walk 50^5 internal nodes on a single oversized body; combined with the leaf-count cap of 50 this bounds class-validator CPU work tightly. - Dropped `node-sql-parser` from package.json + lockfile; no remaining src/ references. - `support_note.userDataId` added — the FK was missing from the existing allowlist entry, which left the table debuggable but blind to ownership. Tests: 273 (was 272). New regression pins that the audit fires even when emit throws — a refactor that pushed it back below emit would let probing become invisible again.
…SAGE Two things, packaged together so CI is green and the data-leak finding is closed in one commit. 1) Prettier reformat. CI's `format:check` step required the `DebugAllowedColumns` arrays to wrap (the single-line-per-table style I used was not prettier-compatible). Reformatted automatically; no semantic changes, just whitespace. 2) Final data-leak review surfaced a real (medium) cross-user audit-log leak: a DEBUG user can call /gs/debug/logs with `template: 'traces-by-message', messageFilter: 'Debug-query by 0xother'` and read another DEBUG user's full debug-query audit history (shape, tables, columns, ops — values are already redacted in the audit). The existing `ALL_TRACES` template already filtered the `LogQueryAuditPrefix`; mirror the same pattern for `TRACES_BY_MESSAGE` and `TRACES_BY_OPERATION`, and extend all three to also filter the new `DebugQueryAuditPrefix` constant. Service uses `DebugQueryAuditPrefix` for both the success-path and failure-path audit lines. New regression spec pins the service-emits-prefix and templates-filter-prefix invariants together — so a future refactor that desynchronizes them fails CI. Tests: 274 (was 273).
Three items from the final 5-angle review round. All non-blocking on their
own, packaged together to keep the PR diff compact.
- DoS reviewer flagged that `DebugQueryMaxAndOrChildren = 10` combined with
depth 5 lets class-validator instantiate up to 100k internal nodes before
the service caps fire. Tightening to 5 brings worst case to 3125 — still
far more branching than any realistic debug query needs.
- Integration reviewer flagged `scripts/sync-prod-logs.js` still posting
`{sql: …}` to /gs/debug; it would 400 the moment this PR merges.
Migrated to the structured DTO with a `{keys, rows}` → array-of-objects
reshape so the rest of the script's `row.id` / `row.message` access
stays unchanged.
- Updated `scripts/DEBUG-ENDPOINT.md` to describe the structured DTO,
remove the stale raw-SQL examples, document the killswitch + denylist
ops paths the auth reviewer's report covered, and refresh the
troubleshooting section.
Tests: 274 (unchanged — the cap change moves a project-wide DoS knob, not
behavior covered by unit tests).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
POST /gs/debugnow takes a structured JSON body (table+select+where+groupBy/orderBy/limit) instead of a SQL string. The service emits SQL via TypeORM with identifiers pulled fromDebugAllowedColumnsand values bound as parameters.74085a920+1fc0be9ed) entirely. The walker is gone; so areDebugBlockedCols,DebugBlockedSchemas,DebugDangerousFunctions, the post-execution masker, thenode-sql-parserdependency in this module, and ~660 lines of recursive AST code.db-debug.shis migrated: every predefined mode (--anomalies,--balance,--stats,--asset-history,--referral-chain,--referral-tree, default) now sends JSON. The custom-SQL pass-through is dropped.Why this design
The previous approach was "parse the SQL, walk the AST, reject anything that looks dangerous." The independent adversarial review on the row-bypass branch found six exploitable bypasses (simple-CASE operand never walked, aggregate
FILTER/OVERignored, LIMIT substring scan beat by string literals and trailing--comments, LIMIT cap bypassed byLIMIT 9000+9000). Any one parser field the walker forgot opened a hole.The structured DTO closes that surface by construction:
A second independent adversarial review on this branch found no exploitable bypass. It did flag four DTO-only gates (
as/aggregate/op/direction) where the service trusted ValidationPipe without re-checking; this PR includes service-side re-checks for all four plus a regression test per check.What's exposed
DebugAllowedColumns(indto/gs.dto.ts) covers ~85 tables. Each entry lists the columns selectable / filterable / sortable on that table. PII / secrets / free-form text are not in the allowlist; safe columns (IDs, FK IDs, timestamps, enums, status discriminators, numeric amounts, public on-chain refs) are.log.messageis the only column flagged for jsonb path access — needed for the financial-balance debugging the script does.FK ID columns (
recommenderId,recommendedId,userDataId,assetId, etc.) are included so the script's referral-chain and referral-tree walkers work end-to-end.Allowed query language (after this PR)
DebugAllowedColumnscount,sum,min,max,avgover an explicit column)= != < <= > >= IN NOT IN LIKE ILIKE IS NULL IS NOT NULLAnything outside that — JOINs, subqueries, CTEs, UNION, INTO, window functions, CASE, arbitrary functions, jsonb on non-jsonb columns — is unrepresentable in the DTO.
Migration
db-debug.shcontinues to work transparently for the existing user-facing flags. Direct API users who were posting{"sql": "..."}need to switch to the structured body — there is a curl example in the script's--helpoutput.Test plan
npx tsc --noEmitcleannpx jest src/subdomains/generic/gs/__tests__/gs.service.spec.ts— 134 / 134 pass (includes 8 db-debug.sh-shape smoke tests, full WhereOp coverage, SQL-injection probes against every string field, defense-in-depth regressions for the four hardened gates)db-debug.sh --statsagainst a non-prod environmentdb-debug.sh --anomalies 5against a non-prod environmentdb-debug.sh --referral-chain <id>against a non-prod environment