feat(trace-utils)!: add v1 decoder#2174
Conversation
📚 Documentation Check Results📦
|
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
🔒 Cargo Deny Results📦
|
|
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5434161c71
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if metadata_matches_v1(dest, src) { | ||
| dest.chunks.append(&mut src.chunks); |
There was a problem hiding this comment.
Keep V1 payloads separate when metadata differs
When coalesce_send_data finds two SendData entries with the same target it always returns true after calling append, so dedup_by drops the source entry. In the V1 branch, a metadata mismatch makes this if false and no chunks are moved, which means all traces from the dropped source payload are lost whenever same-target V1 payloads carry different env/runtime/host/etc. metadata. The append path needs to either merge the chunks safely or let the caller keep both payloads.
Useful? React with 👍 / 👎.
| "V1 chunk trace_id must be 16 bytes, got {len}" | ||
| ))); | ||
| } | ||
| let bytes = buf.try_slice_and_advance(16).ok_or_else(|| { |
There was a problem hiding this comment.
Return an error for truncated V1 trace IDs
For owned decoding, BytesData::try_slice_and_advance indexes the requested range before it can return None, so a malformed V1 payload that declares a 16-byte trace_id but has fewer bytes left will panic here instead of producing the intended DecodeError. Since decode_to_trace_chunks(..., TraceEncoding::V1) decodes external request bodies, truncated /v1.0/traces input can unwind the caller rather than being rejected cleanly.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
@anais-raison I don't think this applies only to trace-id, it's anything that calls try_slice_and_advance (including the v04 decoder). Could you just add a guard in that function?
if bytes > buf.len() {
return None;
}| ("tracer_version", dest.tracer_version == src.tracer_version), | ||
| ("runtime_id", dest.runtime_id == src.runtime_id), | ||
| ("env", dest.env == src.env), | ||
| ("hostname", dest.hostname == src.hostname), | ||
| ("app_version", dest.app_version == src.app_version), |
There was a problem hiding this comment.
Compare V1 payload attributes before appending chunks
This metadata check ignores TracerPayload.attributes, but the append path keeps only dest.attributes while moving src.chunks. If two same-target V1 payloads have identical string metadata but different payload-level attributes such as _dd.apm_mode or git commit data, coalescing sends the source chunks under the destination attributes and drops the source attributes, corrupting the V1 payload metadata.
Useful? React with 👍 / 👎.
| span.attributes.mark_deduped(); | ||
|
|
||
| Ok(span) |
There was a problem hiding this comment.
Validate mandatory V1 span fields before returning
The chunk decoder rejects missing required fields, but the span decoder returns success even when mandatory fields that the V1 encoder always emits, such as span_id and start, are absent. For malformed V1 input this silently forwards a different span using default values (span_id = 0, start = 0) instead of rejecting the payload, which can corrupt traces that pass through the new V1 decode/re-encode path.
Useful? React with 👍 / 👎.
| span.start = decode::read_int::<u64, _>(buf.as_mut_slice()).map_err(|_| { | ||
| DecodeError::InvalidFormat("V1 span start u64 read failure".to_owned()) | ||
| })? as i64; |
There was a problem hiding this comment.
Reject out-of-range V1 timestamps instead of wrapping
When a V1 payload contains a start value larger than i64::MAX, this as i64 cast wraps it to a negative timestamp; the V1 encoder later treats negative starts as invalid and replaces them with the current time. Since this decoder handles external V1 request bodies, out-of-range unsigned timestamps should be rejected with a decode error instead of silently changing trace timing; the adjacent duration cast has the same wrapping behavior.
Useful? React with 👍 / 👎.
| map.insert(key, value); | ||
| } | ||
|
|
||
| map.mark_deduped(); |
There was a problem hiding this comment.
Do not mark flat V1 attributes as deduped
V1 attributes are decoded from a flat array of triplets, not an actual msgpack map, so an incoming payload can contain the same key more than once. Marking the VecMap as deduped here bypasses the encoder's defensive last-write-wins deduplication on re-encode, causing duplicate attributes from external V1 input to be forwarded instead of being normalized or rejected.
Useful? React with 👍 / 👎.
ajgajg1134
left a comment
There was a problem hiding this comment.
Another thing we can do here to build confidence in the decoder is take the working dd-trace-go implementation and generate a bunch of various traces and put them just in to flat files, then decode them with this implementation and the trace-agent implementation to make sure they align? (Maybe there's an easier way to check this though)
| payload.attributes = span::read_attributes_map(buf, table)?; | ||
| } | ||
| unknown => { | ||
| return Err(DecodeError::InvalidFormat(format!( |
There was a problem hiding this comment.
Don't error out on an unknown payload key, we want to make sure we're forward compatible in case we ever add new keys to this format. If it's an unknown key we should silently drop it
| // Integer keys used by the V1 wire format. Kept in sync with the encoder side | ||
| // (`msgpack_encoder::v1::{trace_key, chunk_key, SpanKey, SpanLinkKey, SpanEventKey, AnyValueKey}`). | ||
|
|
||
| pub(super) mod trace_key { |
There was a problem hiding this comment.
Looks like the ContainerID key is missing here
There was a problem hiding this comment.
The encoder (v1 and v04->V1) is also not setting ContainerID in-body.
| endpoint.as_ref(), | ||
| )); | ||
| } | ||
| TracerPayloadCollection::V1(payload) => { |
There was a problem hiding this comment.
Relevant to this thread, if we're going to set ContainerId in-body, you probably want to strip it out of the headers? It's one of the default ones. Should there be a single set of default headers for all protocols? CC @ajgajg1134
There was a problem hiding this comment.
I think we should actually keep all the normal standard headers here, they're useful in cases where the payload fails to decode for any variety of reasons so that we can tag the metrics with stuff from the HTTP headers which are more likely to make it across successfully
What does this PR do?
Adds a V1 msgpack decoder in
libdd-trace-utilsand wires the V1 variant through the trace pipeline so any V1 payload can be decoded, inspected, and re-encoded.Motivation
APMSP-2813 : Prerequisite for the sidecar V1-native path. Splitting out the
libdd-trace-utils/libdd-data-pipelineplumbing so it can land before the sidecar wiring.Additional Notes
trace_serializer.rsnow dispatches on(TraceChunks, OutputFormat); the existing v0.4 → V1 cross-encode path is preserved.trace_utils::collect_trace_chunksrenamed toconvert_trace_chunks_v04_to_v05.