Add E2E test verifying OTEL tracing across Zenko services #2378
francoisferrand left a comment
Overall, I'm not sure where to stand here. There is a fundamental compromise: do we want to test the "production" case first and foremost (i.e. without tracing), or is it safe enough to enable OTel all the time (i.e. no longer testing the production configuration, which could hide crashes or introduce subtle delays)?
SylvainSenechal left a comment
Side question: here the Jaeger server will be killed at the end of the CI run. Would it be possible to retain the trace data, so that we could export it in the artifacts and spin up a server to explore it with a UI?
Moved to a ratio of 0% (tracing enabled, but no traces sampled), and injecting a forcefully enabled trace using a […] This should bring no performance overhead, no overlay reconfiguration, and nicely isolate our test trace.
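To illustrate why a 0% ratio can still yield exactly one trace, here is a minimal sketch of a parent-based sampling decision. It assumes the usual OpenTelemetry `ParentBased(TraceIdRatioBased)` semantics; the function and type names are hypothetical, and this is not cloudserver's actual implementation.

```typescript
// Hypothetical sketch of a parent-based sampler decision.
// `parent` is the remote context extracted from an incoming request, if any.
type ParentContext = { sampled: boolean } | undefined;

function shouldSample(parent: ParentContext, rootRatio: number, traceId: string): boolean {
    // A parent is present: follow its sampled flag and ignore the ratio.
    if (parent !== undefined) {
        return parent.sampled;
    }
    // No parent (root span): fall back to a ratio-based decision.
    // The first 8 hex characters of the trace ID are read as a 32-bit number
    // and compared against ratio * 2^32, so ratio 0 never samples and ratio 1 always does.
    const bucket = parseInt(traceId.slice(0, 8), 16);
    return bucket < rootRatio * 0x100000000;
}
```

Under these assumptions, ordinary CI requests (no parent, ratio 0) are never sampled, while a request that arrives with a sampled parent context is always traced.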
Includes OTEL tracing support needed by the E2E test in this PR. Issue: ZENKO-5258
- Deploy Jaeger all-in-one (memory-only, OTLP-enabled) in the kind CI cluster alongside existing dependencies
- Patch the Zenko CR with `spec.otel` (enabled, sampling ratio 1.0) so every request is traced during CI; this also acts as a smoke test that OTEL doesn't break existing @premerge tests
- Add a new @premerge CTST scenario that puts an S3 object and then polls the Jaeger query API to assert a trace exists with spans from both cloudserver and vault

Issue: ZENKO-5258
      dashboard: cloudserver/cloudserver-dashboards
      image: cloudserver
    - tag: 9.3.9
    + tag: 9.4.0-preview.1
9.4.0-preview.1 is a pre-release tag. deps.yaml defines the component versions for the solution — the review criteria require pinning to a concrete released tag or content hash. If this PR is meant to land only after cloudserver 9.4.0 ships, consider gating the merge on that release and updating the tag then. If the preview tag is intentional for CI validation, document that constraint so it isn't accidentally shipped.
Summary
- Deploy Jaeger all-in-one (`1.76.0`) in the kind CI cluster alongside existing dependencies (Keycloak, Prometheus, etc.)
- Enable tracing in the Zenko CR (`spec.tracing`, ratio `"0"`) so the cluster runs in production mode (no ambient traces); the test sends a single PUT carrying a W3C `traceparent` header through a port-forward that bypasses nginx (which strips `traceparent`). Cloudserver's parent-based sampler honors the header → only that one request is traced.
- Add a new `@PreMerge` CTST scenario that asserts the trace is present in Jaeger and contains spans from both connector-cloudserver and connector-vault
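As an illustration of the header the test relies on, the sketch below builds and parses a W3C `traceparent` value (version `00`, 16-byte trace ID, 8-byte parent ID, flags `01` meaning "sampled"). The helper names are hypothetical and are not taken from `steps/otel-tracing.ts`; `Math.random` is used only to keep the example dependency-free.

```typescript
// Returns `byteCount` random bytes as lowercase hex (illustrative, not cryptographically strong).
function randomHex(byteCount: number): string {
    let out = "";
    for (let i = 0; i < byteCount; i++) {
        const b = Math.floor(Math.random() * 256);
        out += (b < 16 ? "0" : "") + b.toString(16);
    }
    return out;
}

// Builds a version-00 traceparent with the sampled flag set.
// All-zero IDs are invalid per the spec; the chance of generating one here is negligible.
function buildSampledTraceparent(): { traceId: string; header: string } {
    const traceId = randomHex(16);
    const parentId = randomHex(8);
    return { traceId, header: `00-${traceId}-${parentId}-01` };
}

// Parses a traceparent header; returns null when it is malformed.
function parseTraceparent(header: string): { traceId: string; parentId: string; sampled: boolean } | null {
    const m = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header);
    if (!m || m[1] === "ff") return null;                         // version ff is forbidden
    if (/^0+$/.test(m[2]) || /^0+$/.test(m[3])) return null;      // all-zero IDs are invalid
    return { traceId: m[2], parentId: m[3], sampled: (parseInt(m[4], 16) & 0x01) === 1 };
}
```

Keeping the generated `traceId` around is what lets the test later look up exactly that trace in Jaeger.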
What changed
| File | Change |
|---|---|
| `solution/deps.yaml` | `vault` `8.11.6 → 8.11.7` (OTEL support) and `cloudserver` `9.3.9 → 9.4.0-preview.1` (OTEL support, parent-based sampler) |
| `configs/zenko.yaml` | `spec.tracing` (enabled, ratio `"0"`, jaeger endpoint) |
| `configs/jaeger.yaml` | Jaeger all-in-one (`1.76.0`, memory-only, requests/limits) |
| `install-kind-dependencies.sh` | `kubectl apply -f configs/jaeger.yaml` |
| `setup-e2e-env.sh` | Port-forward `svc/${ZENKO_NAME}-connector-cloudserver:8000` + `JaegerQueryEndpoint` / `InternalCloudserverEndpoint` in world params |
| `world/Zenko.ts` | […] |
| `features/otel-tracing.feature` | `@PreMerge` scenario |
| `steps/otel-tracing.ts` | […] `traceparent`, signs PUT via SDK middleware injecting the header, polls `GET /api/traces/<id>` until the trace appears, asserts cloudserver + vault spans |

Why
Parent ticket OS-1072
tracks adding OpenTelemetry tracing across the Zenko stack. This PR
adds CI coverage to verify that traces actually propagate end-to-end
(cloudserver → vault) once tracing is enabled.
Issue: ZENKO-5258
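
For reference, the "poll Jaeger until the trace shows spans from both services" step could look roughly like the sketch below. It assumes the JSON shape commonly returned by Jaeger's query API for `GET /api/traces/<id>` (a `data` array of traces, each with `spans` and a `processes` map carrying `serviceName`). The function names, the injected `fetchTrace` callback, and the retry defaults are all hypothetical, not the code in this PR.

```typescript
// Subset of the Jaeger query API response used here (assumed shape).
interface JaegerTrace {
    traceID: string;
    spans: { spanID: string; processID: string }[];
    processes: { [processID: string]: { serviceName: string } };
}
interface JaegerResponse { data: JaegerTrace[] | null; }

// Collects the service names that emitted at least one span in the trace.
function servicesInTrace(trace: JaegerTrace): Set<string> {
    const services = new Set<string>();
    for (const span of trace.spans) {
        const proc = trace.processes[span.processID];
        if (proc) services.add(proc.serviceName);
    }
    return services;
}

// Returns the required services that have no span in the trace.
function missingServices(trace: JaegerTrace, required: string[]): string[] {
    const present = servicesInTrace(trace);
    return required.filter((name) => !present.has(name));
}

// Polls until the trace exists and contains every required service, or gives up.
// `fetchTrace` is injected so the caller decides how to reach Jaeger
// (it should resolve to null when the trace is not found yet).
async function pollForTrace(
    fetchTrace: () => Promise<JaegerResponse | null>,
    required: string[],
    attempts = 30,
    delayMs = 1000,
): Promise<JaegerTrace> {
    for (let i = 0; i < attempts; i++) {
        const res = await fetchTrace();
        const trace = res && res.data && res.data[0];
        if (trace && missingServices(trace, required).length === 0) return trace;
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
    throw new Error(`trace did not contain spans from ${required.join(", ")} after ${attempts} attempts`);
}
```

Polling rather than a single lookup matters because spans are exported asynchronously, so the vault span may reach Jaeger later than the cloudserver one.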