fix(gs): reject whole-row references in debug SQL#3967
Open
davidleomay wants to merge 2 commits into
Open
Conversation
Per-column blocklist + post-execution masker can be bypassed by any expression that yields a whole row from a table: - `to_jsonb(t.*)` / `to_json(t)` / `row_to_json(t)` / `jsonb_agg(t)` - `array_to_json(array_agg(t))` - bare `SELECT t FROM ... t` (alias as row reference) - `t::text` / `CAST(t AS text)` - the same wrapped in a subquery or CTE `columnList` surfaces these as either a wildcard `(.*)` (skipped by `findBlockedColumnInQuery`) or as the alias/table name (not recognized as a column on any table). Once the row reaches the executor as a tuple or JSON blob the post-execution masker can no longer reach the inner fields — it only walks top-level result keys. Add a recursive AST walker that reports any whole-row reference (via wildcard or via a bare identifier matching a FROM-clause alias / table name) for tables that have any blocked columns. Walks WITH-clause CTEs and FROM-clause subqueries too. Regression coverage: 13 bypass patterns + one allow-case (`asset`, no blocked columns).
Adversarial review of the first pass found three patterns that still let
row-shaped data past the masker:
1. Quoted/aliased table identifiers (`SELECT t.* FROM "user" t`) crashed
the walker because `node.table` is `{ type, value }` for quoted refs,
not a plain string. Normalize via `identifierName()` so both shapes
resolve to the same alias key.
2. FROM-clause function expressions, e.g. `LATERAL to_jsonb(user_data)`,
were never visited — the walker only inspected `stmt.columns`. Walk
`stmt.from` items whose `expr` is a function/expression too, using
the outer scope's row identifiers.
3. Chained CTEs (`WITH a AS (SELECT * FROM T), b AS (SELECT to_jsonb(a.*)
FROM a) SELECT * FROM b`) bypassed the check because `a` is a CTE
name, not a base-table name, so `tableHasBlockedColumns('a')` was
false. Propagate CTE-name → base-table via a `cteSources` map walked
alongside the AST recursion. `resolveSubqueryRowSource` also takes
the map so it can recognize transitive forwarders.
Regression coverage: 3 new spec cases, all pass. Total 50 tests green.
7 tasks
davidleomay
added a commit
that referenced
this pull request
Jun 26, 2026
POST /gs/debug now takes a JSON description of the query (table + select + where + group/order/limit) instead of a SQL string. The service translates the DTO into SQL with identifiers pulled from a per-table column allowlist (DebugAllowedColumns) and values bound as parameters via TypeORM. No raw SQL is parsed, walked, or interpolated. Replaces the parser-walker approach (PR #3967 / commits 74085a9 + 1fc0be9) entirely. The walker had at least six documented bypasses (simple-CASE operand never walked, aggregate FILTER/OVER ignored, LIMIT substring scan beat by string literals and trailing line comments, LIMIT cap bypassed by arithmetic expressions) — closed by construction here because no parser is involved. DTO (src/subdomains/generic/gs/dto/debug-query.dto.ts): - select items: column | jsonb | aggregate, each with discriminator + regex/enum - where tree: leaf | and | or | not, recursive with depth cap 5 / 50 predicates - jsonb path: dot-separated, each segment regex-validated server-side - limits: identifier regex, IN-list cap, string-value length cap Service (gs.service.ts): - ~660 lines of AST walkers + blocklist + post-execution masker deleted - ~210 lines of structured emitter added - defense-in-depth re-checks for the four DTO-only gates the security review flagged (`as` regex, `aggregate` enum, `op` enum, `direction`) Allowlist (dto/gs.dto.ts): - DebugAllowedColumns covers ~85 tables with explicit column lists including FK IDs (recommenderId, userDataId, etc.) so the script's --referral-* modes work end-to-end - DebugBlockedCols / DebugBlockedSchemas / DebugDangerousFunctions removed - node-sql-parser dependency no longer used by GsService db-debug.sh: - every predefined mode (--anomalies / --balance / --stats / --asset-history / --referral-chain / --referral-tree / default) migrated to JSON payloads built with `jq -n` - custom-SQL pass-through removed (was the script's only raw-SQL surface) Tests: 134 specs covering allowlist, jsonb, WHERE composition, GROUP/ORDER/LIMIT, the eight query shapes db-debug.sh issues, SQL- injection probes against every string field, and the four service-side defense-in-depth gates.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Close a per-column-blocklist bypass on
POST /v1/gs/debug. Any expression that yields a whole row from a table slips past pre-execution validation (the parser surfaces it as a wildcard or a bare alias) and the post-execution masker can't reach the inner fields (the row arrives as a tuple or a JSON blob).Adds a recursive AST walker that rejects whole-row references for tables with any blocked columns. Covers WITH-clause CTEs and FROM-clause subqueries.
Regression coverage in
gs.service.spec.ts— 13 bypass patterns blocked, one allow-case (table with no blocked columns) still passes.tsc --noEmitclean. All 44 spec tests pass.