Skip to content

feat(resources): isolate gateway and OTel collector memory from PostgreSQL#409

Open
xgerman wants to merge 5 commits into
documentdb:mainfrom
xgerman:feat/sidecar-resource-isolation
Open

feat(resources): isolate gateway and OTel collector memory from PostgreSQL#409
xgerman wants to merge 5 commits into
documentdb:mainfrom
xgerman:feat/sidecar-resource-isolation

Conversation

@xgerman

@xgerman xgerman commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

What

Treats spec.resource.memory as the total pod memory envelope and carves sidecar memory out of it, so a gateway or OTel-collector memory leak is OOM-isolated to its own container instead of crowding out PostgreSQL.

For a pod with spec.resource.memory: 16Gi:

Container Memory (request == limit) Source
documentdb-gateway min(18.75% × envelope, 32Gi) = 3Gi carved out (Guaranteed)
otel-collector (when monitoring.enabled) request 48Mi / limit 128Mi carved out
postgres envelope − gateway − otel = ~13Gi remainder

PostgreSQL's memory-aware GUCs (shared_buffers, etc.) are recomputed from the carved value, not the full envelope.

Why

The gateway sidecar (and the OTel collector) previously ran BestEffort (no limits). The gateway's async runtime also spawns one worker thread per visible CPU, so on large nodes a leak or thread blow-up could create pod/node memory pressure that starves the co-located PostgreSQL. Giving each sidecar a bounded, isolated footprint contains that blast radius. Defaults follow the production sizing model (18.75% of the envelope, capped at 32Gi).

What's included

  • API: spec.resource.{gateway,database,otel} ComponentResources overrides (explicit wins over the auto-split).
  • Carve-out: ComputeResourceSplit (pure, unit-tested against production SKU rows) wired into the CNPG cluster spec; PostgreSQL GUCs recomputed from the carved memory; sidecar resources reconciled on the existing sync path (so existing clusters re-split on next reconcile).
  • Plugin: the sidecar-injector sets Resources on the gateway and otel containers and adds GOMEMLIMIT (≈80% of the otel limit).
  • OTel: a memory_limiter processor (cgroup-aware limit_percentage: 80) is added first in the pipeline.
  • Helm: operator-level defaults (operator.sidecarResources.*) wired to env vars; helm-unittest coverage.
  • Docs: CHANGELOG behavioral-change entry + user docs; design doc under docs/designs/.

Configuration

spec:
  resource:
    memory: "16Gi"          # total pod envelope
    gateway:  { memory: "3Gi", cpu: "1" }      # optional explicit override
    otel:     { memory: "128Mi" }              # optional
    database: { memory: "12Gi" }               # optional

Fleet-wide defaults via Helm: operator.sidecarResources.{gatewayMemoryFraction,gatewayMemoryCap,otelMemoryRequest,otelMemoryLimit,otelCpuRequest,gatewayCpuLimit}.

Rollout

Ships default-on. Existing clusters adopt the split — and take a one-time rolling restart to recompute PostgreSQL memory — on their next reconcile. Called out in the CHANGELOG.

Testing

  • Unit: split math against production rows (M20/M50/M60/M200-capped, monitoring carve-out, overrides), plugin param round-trip + GOMEMLIMIT, OTel memory_limiter config. go test ./... (operator + plugin) green.
  • Helm: helm unittest (83 passing) incl. the new operator env assertions.
  • E2E (new tests/resources area): stood up a kind cluster with the images built from this branch and asserted the live container split:
    • monitoring off → gateway 192Mi, postgres 832Mi, no collector
    • monitoring on → gateway 192Mi, otel 48Mi/128Mi + GOMEMLIMIT, postgres 704Mi
    • Result: 2/2 specs passed.

Notes

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

German Eichberger and others added 3 commits June 23, 2026 16:10
…tants, split)

Introduce the contract for splitting the pod memory envelope across the
PostgreSQL, gateway, and OTel collector containers:
- spec.resource.{gateway,database,otel} ComponentResources overrides
- operator-level env/defaults (18.75% gateway fraction, 32Gi cap, otel 48Mi/128Mi)
- ComputeResourceSplit pure function + production-row unit tests

Part of the sidecar resource isolation work (Phase 0).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: German Eichberger <geeichbe@microsoft.com>
…reSQL

Treat spec.resource.memory as the total pod envelope and carve sidecar
memory out of it so a gateway or collector leak is OOM-isolated to its own
container instead of crowding out PostgreSQL:

- gateway: requests==limits == min(18.75% x memory, 32Gi) (production model)
- otel collector (when monitoring enabled): request 48Mi / limit 128Mi, plus
  a memory_limiter processor (limit_percentage 80) and GOMEMLIMIT (80% of limit)
- PostgreSQL gets the remainder; memory-aware GUCs recomputed from it

All three components are independently overridable via
spec.resource.{gateway,database,otel}; fleet defaults via operator Helm values
(gatewayMemoryFraction, gatewayMemoryCap, otel*). Sidecar resources are plumbed
through the sidecar-injector plugin and reconciled on the existing sync path, so
existing clusters adopt the split (one rolling restart) on next reconcile.

Adds unit tests (split math against production rows, plugin params, otel config)
and an e2e 'resources' area asserting the live container split.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: German Eichberger <geeichbe@microsoft.com>
The OTel collector requires a valid pipeline (an exporter) to start; without
one it crash-loops and the cluster never becomes healthy. Configure a
prometheus exporter on port 9464 (avoiding CNPG's built-in metrics port 9187)
so the monitoring-enabled e2e spec exercises a realistic, healthy three-way
container split.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: German Eichberger <geeichbe@microsoft.com>
Copilot AI review requested due to automatic review settings June 23, 2026 23:55

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces sidecar resource isolation for DocumentDB pods by treating spec.resource.memory as a total pod memory envelope, carving out bounded memory for the gateway and (optionally) the OTel collector, and assigning the remainder to PostgreSQL so that leaks are OOM-contained to the leaking sidecar container and PostgreSQL tuning is computed from the carved database memory.

Changes:

  • Add a resource-splitting model (ComputeResourceSplit) plus new CRD/API fields spec.resource.{gateway,database,otel} to override per-container sizing.
  • Plumb the resolved resources into the CNPG sidecar-injector plugin (gateway + OTel), including GOMEMLIMIT for the collector and an OTel memory_limiter processor.
  • Add Helm defaults/env wiring, unit + helm-unittest coverage, and new E2E “resources” area tests validating the live container split.

Reviewed changes

Copilot reviewed 27 out of 28 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
test/e2e/tests/resources/sidecar_resources_test.go New E2E specs asserting live gateway/otel/postgres memory split with monitoring on/off.
test/e2e/tests/resources/resources_suite_test.go New Ginkgo suite bootstrap for the E2E “resources” area.
test/e2e/tests/resources/helpers_test.go Area helpers to create clusters from templates and inspect pod container resources/env.
test/e2e/manifests/mixins/sidecar_resources.yaml.template New manifest mixin to set envelope memory + monitoring exporter for resources tests.
test/e2e/labels.go Adds the new resources area label.
operator/src/internal/utils/constants.go Adds env var names and plugin param keys for sidecar resource isolation defaults.
operator/src/internal/otel/config.go Wires memory_limiter into the metrics pipeline processor order.
operator/src/internal/otel/config_test.go Validates embedded memory_limiter config and pipeline processor ordering.
operator/src/internal/otel/base_config.yaml Adds memory_limiter processor to the static collector config.
operator/src/internal/cnpg/resource_split.go New split algorithm + env-driven defaults for gateway/otel carve-out and DB remainder.
operator/src/internal/cnpg/resource_split_test.go Unit tests validating split math, monitoring carve-out, and override precedence.
operator/src/internal/cnpg/cnpg_sync.go Syncs additional plugin parameters for sidecar resources (add/update/remove).
operator/src/internal/cnpg/cnpg_cluster.go Applies carved postgres resources, passes sidecar resource params to plugin, recomputes PG GUCs from carved DB memory.
operator/src/internal/cnpg/cnpg_cluster_test.go Adds tests asserting carved resources + plugin params + recomputed PG parameters.
operator/src/config/crd/bases/documentdb.io_dbs.yaml CRD schema updates adding resource.gateway/database/otel component resource stanzas.
operator/src/api/preview/zz_generated.deepcopy.go Generated deepcopy updates for new API fields.
operator/src/api/preview/documentdb_types.go Adds API types/docs for per-component resource overrides.
operator/documentdb-helm-chart/values.yaml Adds operator-level sidecarResources defaults.
operator/documentdb-helm-chart/tests/09_operator_deployment_test.yaml Adds helm-unittest coverage for default sidecar env var wiring (partial).
operator/documentdb-helm-chart/templates/09_documentdb_operator.yaml Wires Helm values into operator env vars for split defaults.
operator/documentdb-helm-chart/crds/documentdb.io_dbs.yaml Helm chart CRD copy updated to match generated CRD schema.
operator/cnpg-plugins/sidecar-injector/internal/lifecycle/lifecycle.go Sets container Resources for gateway/otel and adds GOMEMLIMIT for collector.
operator/cnpg-plugins/sidecar-injector/internal/lifecycle/lifecycle_test.go Adds unit test verifying injected resources + GOMEMLIMIT from plugin params.
operator/cnpg-plugins/sidecar-injector/internal/config/config.go Adds parsing/validation/serialization of new resource plugin parameters.
operator/cnpg-plugins/sidecar-injector/internal/config/config_test.go Adds tests for parameter parsing and round-trip serialization of new fields.
docs/operator-public-documentation/postgresql-tuning.md Documents envelope semantics, sidecar carve-out, overrides, and Helm defaults.
docs/designs/vertical-autoscaling-design.md New design doc covering vertical autoscaling roadmap and the carve-out as phase 0.
CHANGELOG.md Adds behavioral-change note describing the new default-on split and restart implications.
Files not reviewed (1)
  • operator/src/api/preview/zz_generated.deepcopy.go: Generated file

Comment thread operator/src/api/preview/documentdb_types.go Outdated
Comment thread docs/operator-public-documentation/postgresql-tuning.md
Comment thread operator/src/internal/cnpg/resource_split.go
Comment thread operator/src/internal/cnpg/resource_split.go
Comment thread operator/cnpg-plugins/sidecar-injector/internal/config/config.go Outdated
@documentdb-triage-tool

Copy link
Copy Markdown

🤖 Auto-triaged by documentdb-triage-tool.

Applied: go, test, documentation, enhancement
Project fields suggested: Component controllers · Priority P2 · Effort XL · Status Needs Review
Confidence: 0.88 (mixed)

Reasoning

component from path globs (controllers, test, docs, api, manifests); effort from diff stats (2054+132 LOC, 28 files); LLM: Cross-component feature (API types, controller reconciliation logic, sidecar injector plugin, Helm chart, OTel config, and docs) that introduces a new resource-splitting model with schema changes and behavioral impact on all existing clusters.

If a label is wrong, remove it manually and ping @patty-chow so the rules can be tuned. The bot will not re-label items that already have component labels.

@documentdb-triage-tool documentdb-triage-tool Bot added documentation Improvements or additions to documentation enhancement New feature or request go Pull requests that update go code test labels Jun 24, 2026
… containers)

Treat spec.resource.memory and spec.resource.cpu as an OPTIONAL pod envelope.
For each dimension:
- envelope set  -> operator carves it (sidecars reserved, PostgreSQL remainder)
- envelope omitted, gateway+database both set -> effective envelope = their sum
- envelope omitted, nothing set -> dimension left unmanaged (no limits)
- envelope omitted, partially specified -> rejected by the validating webhook

CPU is now carved symmetrically with memory (sidecar cpu reservations are
subtracted from the envelope so resolved container CPUs sum to the envelope
instead of exceeding it).

Adds cnpg.ValidateResources (wired into the DocumentDB validating webhook) which
rejects partial-without-envelope configs and configurations where sidecar
reservations leave nothing for PostgreSQL or explicit values exceed the envelope.
Adds unit tests (split + validation), webhook tests, and an e2e spec asserting
the envelope is derived from per-container memory when omitted.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: German Eichberger <geeichbe@microsoft.com>
@xgerman

xgerman commented Jun 24, 2026

Copy link
Copy Markdown
Collaborator Author

Update: the pod resource envelope is now optional

Refined the model so the envelope (spec.resource.memory / spec.resource.cpu) no longer has to be specified — it can be derived as the sum of the per-container values. For each dimension independently:

Configuration Behavior
Envelope set Operator carves it (sidecars reserved, PostgreSQL gets the remainder)
Envelope omitted, gateway + database both set Effective envelope = sum of the containers (OTel uses its default)
Envelope omitted, nothing set Dimension left unmanaged (no limits) — backward compatible
Envelope omitted, partially specified Rejected by the validating webhook (gateway fraction / PG remainder can't be derived)
Envelope set but reservations exhaust it, or explicit values exceed it Rejected by the validating webhook

Also made CPU carve symmetrically with memory (sidecar CPU reservations are subtracted from the envelope so the resolved container CPUs sum to the envelope rather than exceeding it).

New cnpg.ValidateResources is wired into the DocumentDB validating webhook.

Tests added/updated: split + validation unit tests, webhook admission tests (accept/reject paths), and a new e2e spec asserting the envelope is derived from per-container memory when omitted. The full tests/resources e2e area (3/3 specs) passes against images built from this branch (incl. the new envelope-omitted case: gateway 256Mi + database 1Gi explicit, no top-level envelope → containers sized exactly).

@xgerman xgerman force-pushed the feat/sidecar-resource-isolation branch from 1e58f79 to 6d5e4d7 Compare June 24, 2026 15:05
xgerman pushed a commit to xgerman/documentdb-kubernetes-operator that referenced this pull request Jun 24, 2026
Reduce duplication in the sidecar resource carve-out and act on the
review feedback on PR documentdb#409:

- resource_split.go: collapse the three identical *CPUOverride helpers
  into one componentCPU(); add ComponentResource.setMemory/setCPU and use
  them at every request==limit site; reuse the shared componentMemSet/
  isSet predicates instead of inline checks.
- resource_validation.go: factor the structurally identical memory and
  cpu validation blocks into a single validateDimension() driven by a
  dimension descriptor (error wording and field paths unchanged).
- parseFloatOr: allow a 0 fraction (intentionally disabling the gateway
  carve-out) and clamp the result to [0,1]; fall back to the documented
  default on parse errors.
- parseQuantityOr (was parseQuantityOrZero): fall back to the documented
  default on an invalid non-empty value instead of silently disabling the
  cap; clamp negatives to 0.
- sidecar-injector config ToParameters(): omit empty optional resource
  parameters to avoid noisy defaulting diffs in the CNPG Cluster spec.
- helm unittest: assert the OTel memory/CPU request env vars and the
  default omission (and configured presence) of the gateway CPU limit.
- docs: describe spec.resource.cpu as the total pod CPU envelope carved
  symmetrically with memory (postgresql-tuning.md and the CPU field doc
  comment in documentdb_types.go).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: German Eichberger <geeichbe@microsoft.com>
@xgerman

xgerman commented Jun 24, 2026

Copy link
Copy Markdown
Collaborator Author

Addressed review feedback + reduced duplication (aca9c66)

Pushed a refactor: commit that resolves all six Copilot review comments and removes duplication flagged in the self-review. No behavior change to the carve-out math; the validation error wording and field paths are unchanged (webhook tests still green).

Review comments addressed

  • parseFloatOr now allows 0 (disables the gateway carve-out) and clamps the fraction to [0,1]; falls back to the default on parse errors.
  • parseQuantityOrZeroparseQuantityOr(value, fallback): invalid non-empty values fall back to the documented DEFAULT_GATEWAY_MEMORY_CAP instead of silently disabling the cap; negatives clamp to 0.
  • Plugin ToParameters() omits empty optional resource params (no more noisy empty-string diffs from the defaulting webhook).
  • Helm unittest now asserts the OTel memory/CPU request env vars and the gateway CPU limit default-omission / configured-presence.
  • Docs (postgresql-tuning.md + documentdb_types.go CPU field) now describe spec.resource.cpu as the total pod CPU envelope, carved symmetrically with memory.

Duplication removed

  • resource_split.go: three identical *CPUOverride helpers collapsed into one componentCPU(); added ComponentResource.setMemory/setCPU and used them at every request==limit site; reuse the shared componentMemSet/isSet predicates instead of inline != "" && != "0" checks.
  • resource_validation.go: the structurally identical memory and CPU validation blocks are now a single validateDimension() driven by a dimension descriptor (unit + noun + formatter), called once per dimension.

Verification: go build ./... + go test ./... pass in both the operator (operator/src) and plugin (operator/cnpg-plugins/sidecar-injector) modules; helm unittest 85/85.

@WentingWu666666

Copy link
Copy Markdown
Collaborator

🟡 Minor (API consistency): otel collector's QoS class silently flips between Burstable and Guaranteed depending on whether resource.otel.memory is set

What I'm seeing

The OTel collector's memory QoS changes based on how it's configured, not what value it ends up with:

otel config request limit QoS
default (no override) 48Mi 128Mi Burstable
spec.resource.otel.memory: 128Mi 128Mi 128Mi Guaranteed

Root cause

It's an expressiveness mismatch between the two config layers:

  • The default comes from two separate values (otelMemoryRequest: 48Mi / otelMemoryLimit: 128Mi), so it can be Burstable.
  • The user override flows through ComponentResources.Memory — a single scalar — applied via setMemory(), which pins request = limit. A single field can't express request ≠ limit, so any override is forced to Guaranteed.
// override path — forces request == limit
func (c *ComponentResource) setMemory(q string) { c.MemoryRequest = q; c.MemoryLimit = q }

// default path — two values, stays Burstable
split.OTel.MemoryRequest = cfg.OTelMemoryRequest  // 48Mi
split.OTel.MemoryLimit   = cfg.OTelMemoryLimit    // 128Mi

Why it matters (silent footgun)

A user who reads "otel limit defaults to 128Mi" and sets otel.memory: 128Mi to "pin the documented value" unknowingly raises the request from 48Mi → 128Mi — reserving 128Mi per pod instead of 48Mi (a fleet-wide density regression) and dropping the burst behavior — with no signal that the semantics changed. The value looks identical; only the QoS differs.

Suggested resolution (pick one)

  1. Document the flip (minimum). Note in the OTel field doc + design doc that setting otel.memory makes the collector Guaranteed (request==limit), and the Burstable 48/128 default only applies when unset. Cheap; removes the surprise.
  2. Treat an otel override as the limit, keeping request at the default (e.g. min(defaultRequest, override)), so an override preserves Burstable semantics. Contains the special-casing to otel.
  3. Add explicit request/limit fields to ComponentResources (e.g. memoryRequest/memoryLimit). Principled fix — lets any container express Burstable consistently (and would also enable a soft-CPU PostgreSQL option). Costs API surface.

I'd consider (1) a merge-blocker-lite (the silent 48→128 request jump is a real operational surprise) and (2)/(3) optional depending on whether we want users to express request ≠ limit at all.

Guiding principle: a component's QoS class shouldn't change silently just because the user pins its value. Today otel violates that.

@WentingWu666666

Copy link
Copy Markdown
Collaborator

🟡 Minor (memory carve-out): two behaviors worth pinning down — gateway override bypasses the cap, and the intentional Guaranteed/Burstable asymmetry is undocumented

Following up on the memory-split design with two related points.

1. An explicit resource.gateway.memory bypasses both the fraction and the 32Gi cap

When gateway.memory is set, the split uses it verbatim — the gatewayMemoryFraction (18.75%) and the gatewayMemoryCap (32Gi) are skipped entirely:

if componentMemSet(res.Gateway) {
    split.Gateway.setMemory(res.Gateway.Memory)            // verbatim; cap NOT applied
} else {
    gatewayBytes = gatewayMemoryReservationBytes(env, cfg) // min(fraction×env, cap)
}

So gateway.memory: 40Gi is honored even though it exceeds the 32Gi cap; the only guard is the webhook rule that gateway + otel < envelope (so PostgreSQL gets a remainder).

Question: is the cap intended as a default-only heuristic, or as a safety ceiling? If the latter, an explicit override should arguably still be clamped to (or validated against) gatewayMemoryCap, otherwise the cap gives a false sense of an upper bound. Either way, worth a one-line doc note clarifying that overrides bypass the fraction/cap and that PostgreSQL absorbs the reduction (with shared_buffers recomputed from the smaller remainder).

2. Document the intentional memory QoS asymmetry across containers

The memory model is sound, but the QoS treatment is deliberately non-uniform, and that's currently invisible to readers:

Container Default memory QoS Rationale
postgres Guaranteed (req==limit) data path; OOM protection
gateway Guaranteed (req==limit) leak isolation
otel Burstable (48Mi/128Mi) spiky telemetry; limit is pre-carved from the envelope so a burst can't starve PG; better density

This is a good design — the carve subtracts otel's limit (128Mi) from the envelope, so otel's burst headroom can't impact PostgreSQL — but nothing states it. A short note in the design doc / field docs ("all container memory is request==limit except the OTel collector, which is intentionally Burstable because its limit is reserved from the envelope and telemetry memory is bursty") would prevent the same confusion a reviewer/operator will otherwise hit.

Neither is a blocker — both are "make the intent explicit (and decide whether the cap is a real ceiling)" requests.

@WentingWu666666

Copy link
Copy Markdown
Collaborator

Question / discussion (CPU model): could you walk through the reasoning behind the CPU treatment? A few defaults surprised me.

The memory carve-out reads really cleanly. I want to make sure I understand the CPU side, since it behaves differently from memory and I may be missing the rationale. Here's what each container gets for CPU on the default path:

Container CPU request CPU limit Effective CPU QoS
postgres remainder == request Guaranteed / hard
gateway none none BestEffort
otel 50m none Burstable

(From ComputeResourceSplit — postgres via setCPU (req==limit); gateway gets CPU only if gateway.cpu is overridden or GatewayCPULimit is set; otel = firstNonEmpty(override, 50m) request, no limit unless overridden.)

A few things I'd like to understand:

  1. Gateway is BestEffort on CPU by default — is that intended? With no CPU request, the gateway has no floor and the lowest CFS weight under contention (below otel's 50m). Since it's the request-handling hot path, I'd have expected it to want a guaranteed floor. Was a default gateway CPU request considered, or is BestEffort deliberate (e.g. to keep it from reserving CPU it usually doesn't need)?

  2. For CPU specifically, was request==limit a deliberate choice over "request-only"? CPU is compressible, so a limit means throttling rather than OOM — and the common DB guidance is to set a CPU request but no limit to avoid throttling a bursty workload (postgres: checkpoints, autovacuum, parallel workers; the gateway: thread-per-CPU runtime). I suspect you may be intentionally following CNPG's literal Guaranteed recommendation (req==limit for both dims) — if so, it'd be worth a comment noting that trade-off was chosen knowingly. Did you weigh throttling vs. the Guaranteed-QoS benefit here?

  3. The PR describes CPU as "carved symmetrically with memory" — does that match the default? subtractCPU(env, split.Gateway.CPURequest, split.OTel.CPURequest) uses the gateway CPU request, which is empty by default, so postgres ends up with env − 0 − 50m and the gateway reserves no CPU from the envelope unless GatewayCPULimit is set. Memory always carves a gateway slice; CPU seems to only when opted in. Just want to confirm that's the intended asymmetry vs. a gap.

  4. Bigger-picture: because the API exposes a single CPU scalar (applied as request==limit via setCPU), there's currently no way for a user to express "CPU request floor, no limit" for postgres or the gateway. If soft CPU is something we'd want to support later, is that on the roadmap, or intentionally out of scope for Phase 0?

None of these are blockers — mostly I want to confirm the intent and, where it's deliberate, capture the reasoning in a comment/design-doc note so the next reader (me included!) doesn't read it as an oversight.

@WentingWu666666

Copy link
Copy Markdown
Collaborator

Small correction / clarification to my CPU note above 🟡

I wrote that otel's CPU has "no limit unless overridden." That parenthetical isn't quite right — on closer reading, otel never gets a CPU limit, even when spec.resource.otel.cpu is set.

Tracing it through:

  • resource_split.go computes split.OTel.CPULimit = componentCPU(res.OTel) // limit only when explicitly set
  • …but that value doesn't appear to be consumed anywhere downstream: there's no OTEL_CPU_LIMIT plugin param, no corresponding field in the sidecar-injector config.go, and lifecycle.go passes "" as the otel container's CPU limit.

So an otel.cpu override only raises the request; the limit stays empty and otel's CPU remains soft in all cases. That actually reinforces the consistency point I was making — otel CPU is uniformly soft, no QoS flip there (unlike memory).

One small follow-up question for you, @xgerman: is split.OTel.CPULimit intended to be wired up later (i.e. a placeholder), or is it effectively dead today? If it's not meant to be plumbed through yet, the inline // limit only when explicitly set comment might be a little misleading, since the value is dropped before it reaches the container. Not blocking — just flagging in case it's an oversight.

Reduce duplication in the sidecar resource carve-out and act on the
review feedback on PR documentdb#409:

- resource_split.go: collapse the three identical *CPUOverride helpers
  into one componentCPU(); add ComponentResource.setMemory/setCPU and use
  them at every request==limit site; reuse the shared componentMemSet/
  isSet predicates instead of inline checks.
- resource_validation.go: factor the structurally identical memory and
  cpu validation blocks into a single validateDimension() driven by a
  dimension descriptor (error wording and field paths unchanged).
- parseFloatOr: allow a 0 fraction (intentionally disabling the gateway
  carve-out) and clamp the result to [0,1]; fall back to the documented
  default on parse errors.
- parseQuantityOr (was parseQuantityOrZero): fall back to the documented
  default on an invalid non-empty value instead of silently disabling the
  cap; clamp negatives to 0.
- sidecar-injector config ToParameters(): omit empty optional resource
  parameters to avoid noisy defaulting diffs in the CNPG Cluster spec.
- helm unittest: assert the OTel memory/CPU request env vars and the
  default omission (and configured presence) of the gateway CPU limit.
- docs: describe spec.resource.cpu as the total pod CPU envelope carved
  symmetrically with memory (postgresql-tuning.md and the CPU field doc
  comment in documentdb_types.go).
- otel CPU is now bounded: wire the OTel collector CPU limit end-to-end
  (new DOCUMENTDB_OTEL_CPU_LIMIT env / otelCpuLimit plugin param, default
  200m). Previously split.OTel.CPULimit was computed but dropped before
  reaching the container, so the collector ran with no CPU ceiling. An
  explicit spec.resource.otel.cpu still pins request==limit (Guaranteed);
  the unset default stays Burstable (50m request / 200m limit) and only
  the request is carved from the CPU envelope.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: German Eichberger <geeichbe@microsoft.com>
@xgerman xgerman force-pushed the feat/sidecar-resource-isolation branch from aca9c66 to afce496 Compare June 26, 2026 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation enhancement New feature or request go Pull requests that update go code test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants