Skip to content

Python: Fix Magentic manager duplicating conversation history#6297

Open
ghominejad wants to merge 1 commit into
microsoft:mainfrom
ghominejad:fix/python-magentic-manager-duplicate-history
Open

Python: Fix Magentic manager duplicating conversation history#6297
ghominejad wants to merge 1 commit into
microsoft:mainfrom
ghominejad:fix/python-magentic-manager-duplicate-history

Conversation

@ghominejad
Copy link
Copy Markdown
Contributor

@ghominejad ghominejad commented Jun 3, 2026

Fixes #6298

Motivation and Context

StandardMagenticManager._complete() ran every manager model call against a single persistent AgentSession (self._session). For local sessions the default InMemoryHistoryProvider reloads and prepends previously stored input/response messages on each call. But the manager already rebuilds the complete prompt it wants the model to see on every call ([*magentic_context.chat_history, ...] in plan, replan, and create_progress_ledger). The two history mechanisms stacked, so the task/facts/plan were re-sent and compounded every round — wasting tokens and degrading the manager's context with duplicate turns.

GroupChatOrchestrator does not have this problem: it deliberately keeps history in the persistent session and passes only the new delta per call, so it's left unchanged.

Description

  • StandardMagenticManager._complete() now creates a fresh session per call (session=self._agent.create_session()) instead of reusing self._session. This keeps each call stateless — matching the manager's design of supplying the full prompt itself — while still passing a non-None session so context providers configured on the manager agent are invoked (preserving the intent of Python: [Bug]: StandardMagenticManager does not propagate session to manager agent, silently breaking context providers (e.g. RedisHistoryProvider) #4371). self._session is retained solely for checkpoint save/restore continuity.
  • Tests (packages/orchestrations/tests/test_magentic.py):
    • Added test_standard_manager_does_not_duplicate_history, which drives a real Agent through the real session machinery with a recording chat client and asserts the facts pre-survey reaches the model exactly once per call. It fails on the previous code (2 == 1) and passes with the fix.
    • Updated test_standard_manager_propagates_session_to_agent: still asserts a session is propagated (not None) on each call, but now asserts each call uses a fresh session rather than one shared, accumulating instance.
  • Group-chat behavior verified unchanged; full test_magentic.py and test_group_chat.py suites pass.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

_complete() reused one persistent AgentSession, so the default history provider
re-injected prior turns on top of the full prompt the manager already rebuilds
each call — duplicating task/facts/plan and compounding every round. Use a fresh
session per call; keep self._session only for
checkpointing. GroupChatOrchestrator is unaffected. Add a regression test and
update the session-propagation test.
Copilot AI review requested due to automatic review settings June 3, 2026 07:49
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR updates StandardMagenticManager to avoid duplicated prompt/history injection by using a fresh AgentSession for each _complete() call, and adds regression coverage to ensure the manager doesn’t resend already-sent turns.

Changes:

  • Switch _complete() to create a new session per call (instead of reusing self._session) to prevent history duplication.
  • Update existing session assertions to require non-None sessions while enforcing “fresh session per call”.
  • Add a new regression test that inspects the exact message list sent to the chat client and asserts prompts aren’t duplicated.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
python/packages/orchestrations/tests/test_magentic.py Adds/adjusts regressions tests validating non-duplicated history and fresh-per-call sessions.
python/packages/orchestrations/agent_framework_orchestrations/_magentic.py Changes manager _complete() to use a new session each call; updates inline documentation around session usage.

Comment on lines +1213 to +1216
async def _get() -> ChatResponse:
return ChatResponse(messages=Message(role="assistant", contents=["recorded"]))

return _get()
Comment on lines +1198 to +1199
facts_marker = ORCHESTRATOR_TASK_LEDGER_FACTS_PROMPT.split("{")[0].strip()
plan_marker = ORCHESTRATOR_TASK_LEDGER_PLAN_PROMPT.split("{")[0].strip()
Comment on lines 571 to 575
self._agent: SupportsAgentRun = agent
# Retained only for checkpoint save/restore continuity. LLM calls in `_complete`
# use a fresh session each time so the agent's history provider cannot re-inject
# (and thereby duplicate) the conversation the manager already passes in full.
self._session: AgentSession = self._agent.create_session()
providers configured on the manager agent are still invoked (regression #4371).
"""
response: AgentResponse = await self._agent.run(messages, session=self._session)
response: AgentResponse = await self._agent.run(messages, session=self._agent.create_session())
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python: [Bug]: Magentic manager duplicates conversation history on every model call

3 participants