[codex] add embedded Langfuse client#3
Conversation
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughAdds a new Langfuse ingestion exporter module for durable harness observations, including client construction, batch building, and async sending. Exports the new types from observability mod and crate root, updates README docs, and makes reqwest a non-optional dependency while emptying the openai feature list. ChangesLangfuse export feature
Sequence Diagram(s)sequenceDiagram
participant Caller
participant LangfuseClient
participant LangfuseAPI
Caller->>LangfuseClient: send_observations(trace, observations)
LangfuseClient->>LangfuseClient: build_ingestion_batch(trace, observations)
LangfuseClient->>LangfuseAPI: POST ingestion endpoint with auth
LangfuseAPI-->>LangfuseClient: response (status, body)
LangfuseClient-->>Caller: parsed Value or error
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes Poem
✨ Finishing Touches📝 Generate docstrings
Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6f898fbfa4
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| "startTime": timestamp, | ||
| "endTime": timestamp, |
There was a problem hiding this comment.
Preserve real call start times
When a batch contains the normal ModelStarted at t0 followed by ModelCompleted at t1, this creates the Langfuse generation with both startTime and endTime set to the completion timestamp, so Langfuse records zero latency for every model call even though the durable observations include the real start time. Pre-index the started observations by call_id and use that timestamp (and model name) when emitting the generation; the tool branch below has the same zero-duration issue.
Useful? React with 👍 / 👎.
| .await | ||
| .map_err(|e| TinyAgentsError::Model(format!("Langfuse response read failed: {e}")))?; | ||
| let parsed = serde_json::from_str(&body).unwrap_or_else(|_| json!({ "message": body })); | ||
| if !status.is_success() && status.as_u16() != 207 { |
There was a problem hiding this comment.
Report per-event failures in 207 responses
Langfuse's ingestion endpoint returns HTTP 207 with a result body that can include errors for individual rejected events, but this treats every 207 as Ok(parsed). In cases such as an invalid generated id or a Langfuse deployment rejecting trace/observation events, callers of send_observations will see success while some or all telemetry was dropped unless they know to inspect the raw JSON themselves; check the parsed errors array and return an error when it is non-empty.
Useful? React with 👍 / 👎.
Implements gap #3 (reasoning channel): providers can stream reasoning separately from visible text, round-tripping through the existing stream and AgentEvent::ModelDelta so UIs render it from events alone. - MessageDelta gains a reasoning: String channel (kept out of the final message text) alongside text and tool_call; MessageDelta::text/reasoning constructors. - StreamAccumulator accumulates reasoning into a side buffer exposed via reasoning(); the merged ModelResponse text excludes it. - Updated all MessageDelta literals (providers, openai, testkit, accumulator default stream) for the new field. - Test: reasoning side-channel accumulation kept out of final text.
Summary
LangfuseClient,LangfuseAuth, andLangfuseTraceConfigexportsAgentObservationrecords into Langfuse trace/generation/tool/event ingestion batchesValidation
Summary by CodeRabbit
New Features
Documentation
Chores