
perf: GIN index for trace attribute filtering #47

Merged: thejefflarson merged 1 commit into main from spans-attr-gin on Jun 8, 2026
Conversation

@thejefflarson

Owner

Follow-up to #46. The traces attribute filter (attributes @> '{...}') seq-scanned the spans table. That is fine for narrow time windows, but an unbounded or wide-window query could exceed the 60s statement_timeout and return a 500 (the timeout catches it gracefully, but the request is still slow).

  • spans_attrs_gin (gin jsonb_path_ops), mirroring logs_attrs_gin. The migration sets statement_timeout = 0 locally so the one-time build isn't aborted by the pool's 60s cap. The build briefly write-locks spans, so span ingest pauses once while it builds — metrics/logs ingest is unaffected (their inserts don't touch spans), and the collector buffers.
  • Restructured the filter from HAVING bool_or(attributes @> $6) (can't use an index — it runs after aggregation) to WHERE trace_id IN (SELECT trace_id FROM spans WHERE attributes @> $6), which the GIN serves. It selects whole traces containing a matching span, so per-trace aggregates stay correct.
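
A minimal sketch of the two changes above, for readers who want to see the shape of the SQL. Only `spans`, `trace_id`, `attributes`, and `spans_attrs_gin` come from this PR. The use of `SET LOCAL` inside a transaction, the `span_count` aggregate, and the literal `'{"http.method": "GET"}'` (standing in for the `$6` parameter) are assumptions for illustration, not the actual migration or query.

```sql
-- Migration sketch: lift the timeout for this transaction only, then build the index.
-- Assumes the migration runs inside a transaction, so SET LOCAL reverts at COMMIT.
BEGIN;
SET LOCAL statement_timeout = 0;
-- A plain CREATE INDEX (no CONCURRENTLY) blocks writes to spans while it builds.
CREATE INDEX IF NOT EXISTS spans_attrs_gin
    ON spans USING gin (attributes jsonb_path_ops);
COMMIT;

-- Before: the containment test runs inside an aggregate after GROUP BY,
-- so every span is read and the GIN index cannot be used.
SELECT trace_id, count(*) AS span_count          -- span_count is hypothetical
FROM spans
GROUP BY trace_id
HAVING bool_or(attributes @> '{"http.method": "GET"}'::jsonb);

-- After: the subquery is a plain containment predicate the GIN index can serve.
-- The outer query still aggregates over every span of each selected trace.
SELECT trace_id, count(*) AS span_count
FROM spans
WHERE trace_id IN (
    SELECT trace_id
    FROM spans
    WHERE attributes @> '{"http.method": "GET"}'::jsonb
)
GROUP BY trace_id;
```

Filtering the outer query directly with `WHERE attributes @> ...` would also be indexable, but it would drop the non-matching spans before aggregation and change the per-trace aggregates, which is why the filter goes through `trace_id IN (...)`. Running `EXPLAIN` on the second query is one way to check whether the planner actually picks `spans_attrs_gin`.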

Spans are lower write-volume than the 5.3M-row metrics table, so the index's ingest cost is modest (the reason we're comfortable adding it here but removed the unused metrics index earlier).

34 tests pass; the existing traces_filter_by_name_attr_errors_and_duration covers the restructured attr path.

🤖 Generated with Claude Code

The traces attribute filter (attributes @> '{...}') seq-scanned the spans table,
so an unbounded or wide-window query could exceed statement_timeout and 500.

- Add spans_attrs_gin (gin jsonb_path_ops), mirroring logs_attrs_gin. The
  migration sets statement_timeout=0 locally so the one-time build isn't aborted
  by the pool's 60s cap; the build briefly write-locks spans (span ingest pauses
  once — metrics/logs ingest is unaffected).
- Restructure the attr filter from `HAVING bool_or(attributes @> $6)` (can't use
  an index) to `WHERE trace_id IN (SELECT trace_id FROM spans WHERE attributes @>
  $6)`, which the GIN serves — selecting whole traces that contain a matching
  span, so the per-trace aggregates stay correct.

Spans are lower write-volume than metrics, so the index's ingest cost is modest.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@thejefflarson thejefflarson merged commit 4ae8227 into main Jun 8, 2026
5 checks passed
@thejefflarson thejefflarson deleted the spans-attr-gin branch June 8, 2026 05:30