LCORE-2310: Query agent run by asimurka · Pull Request #1865 · lightspeed-core/lightspeed-stack

asimurka · 2026-06-08T06:55:45Z

Description

Wired pydantic agents into non-streaming query flow.
Conversation compaction is temporarily disconnected and will be reconnected in a follow-up PR.

Type of change

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

Related Issue #
Closes # https://redhat.atlassian.net/browse/LCORE-2310

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Please provide detailed steps to perform tests related to this code change.
How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

Refactor
- Modernized query endpoint to leverage improved agent-based architecture
- Enhanced tool handling with consistent error reporting for MCP integrations
- Refactored test mocking infrastructure for better coverage of agent workflows

coderabbitai · 2026-06-08T06:55:53Z

Warning

Review limit reached

@asimurka, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 41 minutes and 49 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 21ba7fe8-d022-4eea-b54e-0cd31b9c7b49

📥 Commits

Reviewing files that changed from the base of the PR and between be9f7dd and 70949de.

📒 Files selected for processing (7)

src/app/endpoints/query.py
src/utils/agents/query.py
tests/integration/conftest.py
tests/integration/endpoints/test_query_byok_integration.py
tests/integration/endpoints/test_query_integration.py
tests/unit/app/endpoints/test_query.py
tests/unit/utils/agents/test_tool_processor.py

Walkthrough

The /query endpoint switches response retrieval from retrieve_response to retrieve_agent_response (now accepting an optional _original_input parameter) and delegates topic summary generation to maybe_get_topic_summary. The legacy retrieve_response is marked deprecated. Integration and unit tests are migrated from responses.create/OpenAIResponseObject mocks to a new mock_query_agent fixture and pydantic-ai agent-run result builders. MCP tool-result dispatch tests are updated to include an error field in payloads.

Changes

Query endpoint agent-based retrieval, topic summary delegation, MCP dispatch fix, and test migration

Layer / File(s)	Summary
Core source: agent retrieval signature, endpoint wiring, topic summary, and deprecation `src/utils/agents/query.py`, `src/app/endpoints/query.py`	`retrieve_agent_response` gains an optional `_original_input: Optional[ResponseInput]` parameter. The `/query` endpoint switches `turn_summary` production to `retrieve_agent_response`, refactors topic-summary delegation to `maybe_get_topic_summary` with a `should_generate` boolean, and annotates `retrieve_response` with `@deprecated`.
Integration test helpers: agent run result builders and `mock_query_agent` fixture `tests/integration/conftest.py`	Lifts pydantic-ai type imports to module scope and adds `create_agent_run_result`, `create_file_search_agent_run_result`, `create_mcp_list_tools_agent_run_result`, `create_multi_tool_agent_run_result`, `set_query_agent_run` helpers, and the `mock_query_agent` fixture that patches `utils.agents.query.build_agent`.
Integration tests migrated to `query_agent.run` mocking `tests/integration/endpoints/test_query_integration.py`, `tests/integration/endpoints/test_query_byok_integration.py`	Removes `OpenAIResponseObject` mocks and replaces them with `mock_query_agent.run.return_value` assignments using the new builder helpers. Tool-presence assertions shift to `build_agent_mock` call args. BYOK inline RAG assertions validate context injection in the agent prompt instead of inspecting `responses.create` inputs.
Unit tests updated for `retrieve_agent_response`, `maybe_get_topic_summary`, and MCP error-field dispatch `tests/unit/app/endpoints/test_query.py`, `tests/unit/utils/agents/test_tool_processor.py`	Patches `retrieve_agent_response` and `maybe_get_topic_summary` instead of the old helpers across all affected test cases. MCP payload mocks consistently include the `error` field; a new regression test confirms error-field-only payloads are not misrouted to the list-tools summarizer.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

lightspeed-core/lightspeed-stack#1796: Directly aligned with this PR's addition of _original_input to retrieve_agent_response and the compaction-mode original_input threading through query turn retrieval.
lightspeed-core/lightspeed-stack#1855: This PR updates MCP tool-result dispatch tests for error field handling in test_tool_processor.py, directly related to the MCP/list-tools dispatch logic introduced in that PR.
lightspeed-core/lightspeed-stack#1880: Directly related — this PR extends the retrieve_agent_response introduced there with the new _original_input parameter and migrates the endpoint and tests to use it.

Suggested reviewers

jrobertboos
tisnik
max-svistunov

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'LCORE-2310: Query agent run' directly matches the main objective of the PR: implementing integration of pydantic agents into the non-streaming query flow.
Docstring Coverage	✅ Passed	Docstring coverage is 95.16% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

✨ Simplify code

Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

asimurka · 2026-06-09T07:28:38Z

Will be rebased on #1880

asimurka · 2026-06-12T12:25:36Z

    responses_params: ResponsesApiParams,
    moderation_result: ShieldModerationResult,
    endpoint_path: str,
+    _original_input: Optional[ResponseInput] = None,


Compacted turn persistence is disconnected and will be part of a separate PR as it will require additional logic.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/utils/agents/query.py (1)

304-310: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the original pre-rewrite input when persisting blocked turns.

Line 304 handles moderation-blocked persistence, but it always writes responses_params.input. When _original_input is provided (Line 285), this can persist rewritten synthetic input instead of the user’s actual query.

Proposed fix

 async def retrieve_agent_response(
@@
 ) -> TurnSummary:
@@
-    if moderation_result.decision == "blocked":
+    turn_input = (
+        _original_input if _original_input is not None else responses_params.input
+    )
+    if moderation_result.decision == "blocked":
         await append_turn_items_to_conversation(
             client,
             responses_params.conversation,
-            responses_params.input,
+            turn_input,
             [moderation_result.refusal_response],
         )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/utils/agents/query.py` around lines 304 - 310, When persisting a
moderation-blocked turn in the condition checking if moderation_result.decision
== "blocked", the append_turn_items_to_conversation call currently passes
responses_params.input which may be a rewritten or synthetic version. Instead,
use the original pre-rewrite input when available. Check if _original_input is
provided and pass that to append_turn_items_to_conversation instead of
responses_params.input to ensure the user's actual query is persisted rather
than any rewritten version.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/integration/conftest.py`:
- Around line 271-275: The NativeToolReturnPart instantiation in the
create_mcp_list_tools_agent_run_result function is missing an explicit error
field. Since MCP tool-result handling uses error is None to classify success or
failure, add error=None as an explicit parameter to the NativeToolReturnPart
constructor to match the runtime contract and ensure the integration fixture
accurately represents the expected payload structure.

In `@tests/unit/utils/agents/test_tool_processor.py`:
- Around line 499-510: The test
test_mcp_call_with_error_field_not_routed_to_list_tools currently uses a payload
with both output and error fields, but the test name and intent suggest it
should specifically test the error-only edge case. In the content dictionary
passed to NativeToolReturnPart, remove the output field and instead set error to
an actual error message or exception object (rather than None) to create a truly
error-only payload. This will properly exercise the misrouting edge case the
test is meant to guard against, rather than duplicating coverage of mixed
output/error scenarios already tested elsewhere.

---

Outside diff comments:
In `@src/utils/agents/query.py`:
- Around line 304-310: When persisting a moderation-blocked turn in the
condition checking if moderation_result.decision == "blocked", the
append_turn_items_to_conversation call currently passes responses_params.input
which may be a rewritten or synthetic version. Instead, use the original
pre-rewrite input when available. Check if _original_input is provided and pass
that to append_turn_items_to_conversation instead of responses_params.input to
ensure the user's actual query is persisted rather than any rewritten version.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a69df830-70ba-4415-b592-cb7459c4ef5e

📥 Commits

Reviewing files that changed from the base of the PR and between fbef706 and be9f7dd.

📒 Files selected for processing (7)

src/app/endpoints/query.py
src/utils/agents/query.py
tests/integration/conftest.py
tests/integration/endpoints/test_query_byok_integration.py
tests/integration/endpoints/test_query_integration.py
tests/unit/app/endpoints/test_query.py
tests/unit/utils/agents/test_tool_processor.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)

GitHub Check: E2E: server mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: integration_tests (3.13)
GitHub Check: integration_tests (3.12)
GitHub Check: unit_tests (3.12)
GitHub Check: E2E Tests for Lightspeed Evaluation job
GitHub Check: build-pr
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request

🧰 Additional context used

📓 Path-based instructions (4)

src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Llama Stack imports: Use from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use async def for I/O operations and external API calls
Use standard log levels with clear purposes: debug() for diagnostic info, info() for program execution, warning() for unexpected events, error() for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with @abstractmethod decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes

Files:

src/app/endpoints/query.py
src/utils/agents/query.py

src/app/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/app/**/*.py: FastAPI dependencies: Import from fastapi module for APIRouter, HTTPException, Request, status, Depends
Use FastAPI HTTPException with appropriate status codes for API endpoints and handle APIConnectionError from Llama Stack

Files:

src/app/endpoints/query.py

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

tests/unit/utils/agents/test_tool_processor.py
tests/unit/app/endpoints/test_query.py
tests/integration/conftest.py
tests/integration/endpoints/test_query_byok_integration.py
tests/integration/endpoints/test_query_integration.py

tests/**/conftest.py

📄 CodeRabbit inference engine (AGENTS.md)

Use conftest.py for shared pytest fixtures and pytest-mock for AsyncMock objects

Files:

tests/integration/conftest.py

🧠 Learnings (2)

📚 Learning: 2026-01-14T09:37:51.612Z

Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 988
File: src/app/endpoints/query.py:319-339
Timestamp: 2026-01-14T09:37:51.612Z
Learning: In the lightspeed-stack repository, when provider_id == "azure", the Azure provider with provider_type "remote::azure" is guaranteed to be present in the providers list. Therefore, avoid defensive StopIteration handling for next() when locating the Azure provider in providers within src/app/endpoints/query.py. This change applies specifically to this file (or nearby provider lookup code) and relies on the invariant that the Azure provider exists; if the invariant could be violated, keep the existing StopIteration handling.

Applied to files:

src/app/endpoints/query.py

📚 Learning: 2026-04-06T20:18:07.852Z

Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.

Applied to files:

src/app/endpoints/query.py

🔇 Additional comments (7)

src/utils/agents/query.py (1)

6-7: LGTM!

Also applies to: 36-37, 285-297

src/app/endpoints/query.py (1)

18-19: LGTM!

Also applies to: 46-47, 73-75, 231-262, 308-311

tests/unit/app/endpoints/test_query.py (1)

130-132: LGTM!

Also applies to: 156-164, 251-253, 323-325, 379-381, 400-408, 468-474, 489-489, 520-522, 554-562

tests/unit/utils/agents/test_tool_processor.py (1)

433-491: LGTM!

Also applies to: 588-589, 608-609

tests/integration/conftest.py (1)

11-26: LGTM!

Also applies to: 169-270, 276-348, 708-717

tests/integration/endpoints/test_query_byok_integration.py (1)

19-23: LGTM!

Also applies to: 94-97, 120-125, 138-152, 173-179, 196-216, 288-336, 404-419, 475-491, 748-804, 846-864, 933-950, 1038-1054, 1130-1146, 1240-1255

tests/integration/endpoints/test_query_integration.py (1)

27-31: LGTM!

Also applies to: 47-68, 94-122, 165-191, 234-258, 365-390, 439-480, 499-539, 558-582, 604-657, 663-719, 730-752, 782-831, 855-877, 909-940, 977-1000, 1030-1068, 1091-1134, 1140-1172, 1195-1230, 1262-1291, 1329-1353, 1421-1468

asimurka marked this pull request as draft June 8, 2026 06:55

asimurka force-pushed the query_agent_run branch 2 times, most recently from 7e4d7ef to ff3dc97 Compare June 9, 2026 07:07

asimurka force-pushed the query_agent_run branch 2 times, most recently from 0831831 to 5efdc0e Compare June 12, 2026 12:23

asimurka commented Jun 12, 2026

View reviewed changes

asimurka marked this pull request as ready for review June 15, 2026 11:55

asimurka force-pushed the query_agent_run branch from 5efdc0e to f2be235 Compare June 15, 2026 11:55

asimurka requested review from jrobertboos and tisnik June 17, 2026 07:47

asimurka force-pushed the query_agent_run branch from f2be235 to be9f7dd Compare June 17, 2026 14:38

coderabbitai Bot reviewed Jun 17, 2026

View reviewed changes

Comment thread tests/integration/conftest.py

Comment thread tests/unit/utils/agents/test_tool_processor.py

asimurka force-pushed the query_agent_run branch from be9f7dd to 4382014 Compare June 17, 2026 14:52

Wired query agent.run

70949de

asimurka force-pushed the query_agent_run branch from 4382014 to 70949de Compare June 17, 2026 14:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LCORE-2310: Query agent run#1865

LCORE-2310: Query agent run#1865
asimurka wants to merge 1 commit into
lightspeed-core:mainfrom
asimurka:query_agent_run

asimurka commented Jun 8, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 8, 2026 •

edited

Loading

Review limit reached

Uh oh!

asimurka commented Jun 9, 2026

Uh oh!

asimurka Jun 12, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

asimurka commented Jun 8, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Tools used to create PR

Related Tickets & Documents

Checklist before requesting a review

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Uh oh!

asimurka commented Jun 9, 2026

Uh oh!

asimurka Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

asimurka commented Jun 8, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 8, 2026 •

edited

Loading