perf: GIN index for trace attribute filtering #47
Merged
The traces attribute filter (attributes @> '{...}') seq-scanned the spans table,
so an unbounded or wide-window query could exceed statement_timeout and return a 500.
- Add spans_attrs_gin (gin jsonb_path_ops), mirroring logs_attrs_gin. The
migration sets statement_timeout=0 locally so the one-time build isn't aborted
by the pool's 60s cap; the build briefly write-locks spans (span ingest pauses
once — metrics/logs ingest is unaffected).
- Restructure the attr filter from `HAVING bool_or(attributes @> $6)` (can't use
an index) to `WHERE trace_id IN (SELECT trace_id FROM spans WHERE attributes @>
$6)`, which the GIN serves — selecting whole traces that contain a matching
span, so the per-trace aggregates stay correct.
Spans are lower write-volume than metrics, so the index's ingest cost is modest.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Follow-up to #46. The traces attr filter (`attributes @> '{...}'`) seq-scanned the `spans` table. That is fine for narrow time windows, but an unbounded or wide-window query could exceed the 60s `statement_timeout` and return a 500 (the timeout caught it gracefully, but it's slow).

- Add `spans_attrs_gin` (`gin jsonb_path_ops`), mirroring `logs_attrs_gin`. The migration sets `statement_timeout = 0` locally so the one-time build isn't aborted by the pool's 60s cap. The build briefly write-locks `spans`, so span ingest pauses once while it builds; metrics/logs ingest is unaffected (their inserts don't touch `spans`), and the collector buffers.
- Restructure the attr filter from `HAVING bool_or(attributes @> $6)` (can't use an index, because it runs after aggregation) to `WHERE trace_id IN (SELECT trace_id FROM spans WHERE attributes @> $6)`, which the GIN serves. It selects whole traces containing a matching span, so per-trace aggregates stay correct.

Spans are lower write-volume than the 5.3M-row metrics table, so the index's ingest cost is modest (the reason we're comfortable adding it here but removed the unused metrics index earlier).
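For orientation, here is a rough sketch of what the migration described above might look like. The migration file itself is not shown on this page, so the explicit transaction wrapper and `IF NOT EXISTS` are assumptions; the index name, table, column, and operator class come from the description.

```sql
-- Sketch only: assumes the migration runs inside a single transaction.
BEGIN;

-- SET LOCAL applies until the end of this transaction, so the pool's
-- 60s statement_timeout is back in effect once it commits.
SET LOCAL statement_timeout = 0;

-- A plain (non-CONCURRENTLY) CREATE INDEX blocks writes to spans while
-- it builds; reads are still allowed. jsonb_path_ops supports the @>
-- containment operator that the attr filter uses.
CREATE INDEX IF NOT EXISTS spans_attrs_gin
    ON spans USING gin (attributes jsonb_path_ops);

COMMIT;
```

`CREATE INDEX CONCURRENTLY` would avoid the write pause, but it cannot run inside a transaction block, so it would not combine with `SET LOCAL` in this form. Note also that `jsonb_path_ops` does not support the key-existence operators (`?`, `?|`, `?&`), which is fine as long as the filter only uses `@>`.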
34 tests pass; the existing `traces_filter_by_name_attr_errors_and_duration` covers the restructured attr path.

🤖 Generated with Claude Code
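The restructured attr path can be illustrated with a simplified pair of queries. This is a sketch, not the project's actual query: the aggregate column `start_time` and the literal attribute `{"http.method": "GET"}` are hypothetical, and the real query binds the filter as `$6` alongside other parameters.

```sql
-- Before (simplified): the predicate sits in HAVING, which is evaluated
-- after grouping, so every span in the window is read and aggregated first.
SELECT trace_id,
       count(*)        AS span_count,
       min(start_time) AS started_at   -- start_time is a hypothetical column
FROM spans
GROUP BY trace_id
HAVING bool_or(attributes @> '{"http.method": "GET"}'::jsonb);

-- After (simplified): the subquery's containment test is a plain WHERE
-- predicate, so the planner can use spans_attrs_gin to find matching spans.
-- The outer query then aggregates ALL spans of each matching trace.
SELECT trace_id,
       count(*)        AS span_count,
       min(start_time) AS started_at
FROM spans
WHERE trace_id IN (
    SELECT trace_id
    FROM spans
    WHERE attributes @> '{"http.method": "GET"}'::jsonb
)
GROUP BY trace_id;
```

Putting `attributes @> ...` directly in the outer `WHERE` would also be indexable, but it would discard non-matching spans before aggregation, so `span_count` and similar aggregates would describe only the matching spans rather than the whole trace. Running `EXPLAIN` on the second query is one way to check that the index is actually chosen on a given dataset.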