Skip to content

perf(web): isolate streaming text to fix main-thread jank#1111

Open
wbxl2000 wants to merge 5 commits into
mainfrom
perf/web-streaming-render
Open

perf(web): isolate streaming text to fix main-thread jank#1111
wbxl2000 wants to merge 5 commits into
mainfrom
perf/web-streaming-render

Conversation

@wbxl2000

Copy link
Copy Markdown
Collaborator

Related Issue

No linked issue — this came out of a profiling session on the web chat.

Problem

During a long streaming reply the web chat stalls the main thread for hundreds of milliseconds per frame, so the stream looks janky. A Chrome profile showed a single Animation frame taking ~15 s, dominated by App.vue re-rendering and the sidebar computeds (sessionsForView / workspaceGroups / mergedWorkspaces) re-evaluating.

Root cause: every assistantDelta was routed through the immutable reducer and the coarse rawState graph. cloneState also copied the sessions array on every event, so each token dirtied the sidebar computeds and re-rendered the whole App — even though a token only appends a few characters to one message.

What changed

  • Live streaming text is kept in a separate, fine-grained store outside rawState, and assistantDelta bypasses the reducer to append there in O(1). Only the actively streaming block re-renders on each token.
  • cloneState no longer copies the sessions array for events that do not touch it, so the sidebar computeds stay stable during streaming.
  • A small StreamingBlocks component subscribes to the store and renders the live text/thinking blocks; the committed content takes over at turn end via the existing messageUpdated.
  • Re-enabled markstream's frame-budget batch rendering during streaming.

This keeps the cost of a token proportional to the size of the change (a few characters) instead of the size of the whole UI.

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked a related issue, or explained the problem above.
  • I have added tests that prove my feature works.
  • Ran gen-changesets skill, or this PR needs no changeset.
  • Ran gen-docs skill, or this PR needs no doc update.

@changeset-bot

changeset-bot Bot commented Jun 25, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: 318e752

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@moonshot-ai/kimi-code Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d2c0d7f577

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

// StreamingBlocks component — instead of cloning all of `rawState` and
// re-rendering the whole App + sidebar on every token.
if (event.type === 'assistantDelta') {
appendStreamingDelta(sessionId, event.messageId, event.contentIndex, event.delta);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Clear the streaming store when applying snapshots

Because this writes deltas outside rawState, the snapshot path no longer overwrites them: syncSessionFromSnapshot flushes pending render events before setSessionMessages, so any queued assistantDelta lands in streamingBySession and then survives the authoritative snapshot/seed. On reconnect or delta-gap resync while a reply is streaming, the old live chunks are rendered in addition to the seeded snapshot (or can leak into the next assistant turn) until another clear event arrives. Please clear streamingBySession[sessionId] when installing a snapshot, before seedSnapshot.

Useful? React with 👍 / 👎.

Comment on lines +47 to +48
<div v-else-if="blk.kind === 'text' && blk.text" class="msg">
<Markdown :text="blk.text" :streaming="true" :open-file="(target) => emit('openFile', target)" />

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Merge live deltas with seeded markdown blocks

When a user refreshes or reconnects mid-reply, seedInFlight puts the already-generated assistantText/thinkingText into turn.blocks, and later deltas for the same contentIndex render here as a separate Markdown/ThinkingBlock appended after the seeded block. Markdown constructs spanning the snapshot boundary, such as an open code fence or list, are parsed as two documents and render incorrectly until messageUpdated commits. Seed the streaming store with the snapshot block or render same-index live text through the existing block instead of a separate Markdown instance.

Useful? React with 👍 / 👎.

wbxl2000 added a commit that referenced this pull request Jun 25, 2026
Since streaming deltas now live outside rawState, the snapshot install path
(which overwrites messagesBySession) no longer clears them. A queued delta
flushed right before a snapshot would survive the authoritative seed and render
on top of it (or leak into the next turn) after a reconnect or delta-gap
resync. Clear the per-session streaming store when installing a snapshot and
when tearing down a session.

Addresses review feedback on PR #1111.
wbxl2000 added 2 commits June 26, 2026 00:29
During a long streaming reply every assistantDelta was routed through the
immutable reducer and the coarse rawState graph, which made the whole App and
the sidebar computeds re-render on every token and stalled the main thread for
hundreds of milliseconds per frame.

Keep the live streaming text in a separate fine-grained store outside rawState
and let assistantDelta bypass the reducer, so only the actively streaming block
re-renders. Also stop cloneState from copying the sessions array for events that
do not touch it, so the sidebar computeds stay stable during streaming.
Since streaming deltas now live outside rawState, the snapshot install path
(which overwrites messagesBySession) no longer clears them. A queued delta
flushed right before a snapshot would survive the authoritative seed and render
on top of it (or leak into the next turn) after a reconnect or delta-gap
resync. Clear the per-session streaming store when installing a snapshot and
when tearing down a session.

Addresses review feedback on PR #1111.
@wbxl2000 wbxl2000 force-pushed the perf/web-streaming-render branch from 8fc414d to ab17799 Compare June 25, 2026 16:30
@pkg-pr-new

pkg-pr-new Bot commented Jun 25, 2026

Copy link
Copy Markdown
pnpm dlx https://pkg.pr.new/@moonshot-ai/kimi-code@318e752
npx https://pkg.pr.new/@moonshot-ai/kimi-code@318e752

commit: 318e752

@wbxl2000

Copy link
Copy Markdown
Collaborator Author

@codex

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ab177991a2

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

:text="blk.text"
:mobile="mobile"
:streaming="true"
@open="emit('openThinking', { turnId, blockIndex: blk.contentIndex })"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Resolve streaming thinking from live blocks

When a user clicks a still-streaming thinking block, this emits blockIndex: blk.contentIndex, but the live block is no longer present in client.turns because assistantDelta now bypasses messagesBySession; useDetailPanel still reads turn.blocks?.[blockIndex], so the side panel either stays closed or shows an older committed thinking block. This regresses the existing “click thinking to view full text” behavior for long live thoughts until the final messageUpdated arrives.

Useful? React with 👍 / 👎.

wbxl2000 added 2 commits June 26, 2026 00:59
Clicking a still-streaming thinking block emitted a blockIndex (contentIndex)
that did not match the turn.blocks index the detail panel reads, and the live
block is not in client.turns at all (deltas bypass messagesBySession). So the
panel either stayed closed or showed a stale committed block.

Tag live thinking targets with a `live` flag and read their text from the
streaming store in thinkingPanelText, so the panel shows the growing thinking
text while the reply is streaming (restoring the pre-existing "click to view
full thinking" behavior). Once the turn settles the store is cleared and the
live target returns null, closing the panel rather than indexing turn.blocks
with a mismatched contentIndex.
…aries

The previous fix read live thinking text from the streaming store, but the
store is cleared on every messageUpdated (tool slot / step end / turn end) so
the committed content can take over in the chat. The final deltas and that
clear land in the same tick and coalesce, so the panel would close *before*
rendering the last chunk, and at step boundaries it would stay closed for the
next step. That made the stream look unstable and drop the final content.

When the store is empty, fall back to the committed thinking block in the turn
(which already holds the full text via messageUpdated). The panel now shows the
live text while streaming and the complete committed text through each
boundary, instead of flickering closed.

Also add an integration test that drives the real projector -> store -> reducer
pipeline (thinking deltas, step/turn completion, and a mid-stream reconnect) to
lock in the live-store lifecycle.
@wbxl2000

Copy link
Copy Markdown
Collaborator Author

@codex

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 119eba8e49

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

// The committed prefix is NOT duplicated into the live store.
expect(live!.blocks[0]!.text).not.toContain('prefix');
});
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Remove the extra test-suite closer

With this extra closing });, the newly added test file has one more closing block than it opens; running the file fails with ERR_INVALID_TYPESCRIPT_SYNTAX before any tests execute. Any kimi-web test job that includes this suite will abort instead of running the regression coverage.

Useful? React with 👍 / 👎.

// holds the full text — so the panel keeps showing the complete content
// through the boundary instead of flickering closed.
const turn = client.turns.value.find((tn) => tn.id === target.turnId);
const committed = turn?.blocks?.find((b) => b.kind === 'thinking');

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve the selected live thinking block

When a user opens a later live thinking block, such as a turn that already had an earlier thinking segment before a tool and then starts thinking again, messageUpdated clears streamingBySession and this fallback ignores the selected block identity by returning the first committed thinking block in the merged turn. The right-side panel then switches to an unrelated earlier thought at the commit boundary instead of the block the user opened.

Useful? React with 👍 / 👎.

Two follow-ups to the streaming-store migration:

- "Copy conversation" and "copy final summary" serialized only props.turns,
  which during streaming stops at the last messageUpdated and drops the live
  tail. Merge the streaming store's live blocks into the streaming turn before
  serializing, so a copy mid-stream captures the full in-flight text.

- The WS sessionDeleted event flows through the reducer (not forgetSession),
  so it never cleared the streaming store and left an orphaned entry behind.
  Clear it in applyEvent, mirroring forgetSession.
@itxaiohanglover

Copy link
Copy Markdown

Nice approach isolating streaming deltas into a separate reactive store — that's a clean way to avoid dirtying the whole computed graph on every token. The withLiveBlocks merge for mid-stream copy is a thoughtful touch, and the sessions reference stability test is great. One thing: since streamingBySession is a module-level reactive, make sure there's no path where a session is reused rapidly (e.g. quick retry) that could leave stale blocks from a previous message before the first delta arrives — the messageId guard should cover it, but worth a quick double-check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants