api: bound unbounded trace endpoints + temp traceparent debug log #49
Merged
list_traces and service_red did a full-table GROUP BY when called with no time window (the UI always passes one, but a bare API call would scan the whole spans table and hit statement_timeout → 500). Default the lower bound to now()-24h when absent so they can't blow up; explicit ranges are unchanged. Two tests that queried unbounded over ancient timestamps now use recent data / expect the recent-window default.
Also add a temporary debug log in otel_request_span recording whether an incoming `traceparent` reached the app, to settle whether the missing traefik→watcher edge is traefik not injecting it vs the linkerd sidecar stripping it. Removed in a follow-up once diagnosed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
thejefflarson added a commit that referenced this pull request Jun 9, 2026
The debug log from #49 served its purpose. The missing "traefik -> watcher" edge turned out to be a non-issue: watcher is fronted by its own cloudflared tunnel, not traefik, so there's no traced upstream to parent to, and root spans are expected. The trace-context extraction (otel_request_span) stays for when a tracing-aware proxy does front it; only the debug log is removed.
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
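For readers unfamiliar with what "an incoming traceparent reached the app" means in practice, here is a minimal sketch of that check, assuming the four-field W3C Trace Context header form (version 00). It is not the code in otel_request_span; the function name `parent_trace_id` and the plain header mapping are hypothetical, chosen only for illustration.

```python
import re
from typing import Mapping, Optional

# W3C Trace Context header shape: version-traceid-parentid-flags, lowercase hex.
# Assumption: only the four-field form is accepted; anything else counts as absent.
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def parent_trace_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the upstream trace id, or None if the request should start a root span."""
    # Header names are case-insensitive, so compare on the lowercased name.
    value = next((v for k, v in headers.items() if k.lower() == "traceparent"), None)
    if value is None:
        return None  # no upstream context reached the app
    match = _TRACEPARENT.fullmatch(value.strip())
    if match is None:
        return None  # malformed header: treat as absent
    version, trace_id, parent_id, _flags = match.groups()
    # The spec marks version "ff" and all-zero ids as invalid.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return trace_id
```

With no tracing-aware proxy in front of the service, a check like this returns None for every request, which matches the conclusion above that root spans are expected.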
Two things, from the service-map investigation.
Bound the unbounded trace endpoints (fixes the 500s)
`list_traces` and `service_red` did a full-table `GROUP BY` / percentile aggregate when called with no time window. The UI always passes one, but a bare `/api/traces` or `/api/services` would scan the whole spans table and hit `statement_timeout` → 500 (and the heavy scan starved the pool, slowing ingest). Default the lower bound to `now() - 24h` when absent so they can't blow up. Explicit ranges are unchanged; `/api/logs` already uses an indexed `ORDER BY time DESC LIMIT` so it didn't need it. Two tests that queried unbounded over 1970-era timestamps now use recent data and expect the recent-window default.
Temporary traceparent debug log
A one-line `info` in `otel_request_span` recording whether an incoming `traceparent` reached the app, to settle traefik-not-injecting vs linkerd-stripping for the missing `traefik → watcher` edge. Will be removed in a follow-up once diagnosed.
34 tests pass.
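The lower-bound default from the first section can be sketched as below. This is an illustration under stated assumptions, not the handler code in this PR: `resolve_window` and `DEFAULT_LOOKBACK` are hypothetical names, and the sketch assumes the default applies whenever the start is absent, whether or not an end is given.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# Assumption: the 24h lookback from the description, expressed as a constant.
DEFAULT_LOOKBACK = timedelta(hours=24)


def resolve_window(
    start: Optional[datetime],
    end: Optional[datetime],
    now: Optional[datetime] = None,
) -> Tuple[datetime, Optional[datetime]]:
    """Return (lower, upper) query bounds; only a missing lower bound is defaulted."""
    if now is None:
        now = datetime.now(timezone.utc)
    # An explicit start passes through unchanged, however old it is.
    lower = start if start is not None else now - DEFAULT_LOOKBACK
    return lower, end
```

Because the lower bound is always set, the aggregate query always carries a time predicate and can no longer scan the full spans table on a bare API call.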
🤖 Generated with Claude Code