Per-block littidx flush + single shard (gated on #3645) by Kbhat1 · Pull Request #3660 · sei-protocol/sei-chain

Kbhat1 · 2026-06-29T14:14:32Z

Description

NOTE: Merge only after #3645 lands. That change makes a per-block flush affordable, reclaiming near-per-block crash durability for free.
littFlushInterval 100ms -> 5ms (~one block at Giga throughput): bounds crash loss to roughly a single block (near-WAL durability for receipt bodies).
ShardingFactor 16 -> 1: flushing one segment file is cheaper; sharding mainly helps across multiple disks.

Testing

reads-off write ceiling 167k (vs 168k prior)
reads-on ~153–157k writes / 58k point-reads/s / 520 getLogs/s @ 7ms
3h GC soak flat (old −27% control-loop step-down gone), 0 panics

With litt's async keymap flush (#3645), flushing ~once per block is cheap, so:
- littFlushInterval 100ms -> 5ms (~one block at Giga throughput), bounding crash loss to roughly a single block (near-WAL durability)
- ShardingFactor 16 -> 1 (flushing a single segment file is cheaper; sharding mainly helps across multiple disks)

DO NOT MERGE until #3645 (litt async keymap) lands: without it, flushing this often regresses write throughput ~48%.

Validated via cryptosim (main littidx + #3645 + parallel getLogs):
- reads-off write ceiling 167k (vs 168k prior)
- reads-on ~153-157k writes / 58k point-reads/s / 520 getLogs/s @ 7ms
- 3h GC soak flat (old -27% control-loop step-down gone), 0 panics

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-06-29T14:16:09Z

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

Build	Format	Lint	Breaking	Updated (UTC)
`✅ passed`	`✅ passed`	`✅ passed`	`✅ passed`	Jun 29, 2026, 2:20 PM

codecov · 2026-06-29T14:18:03Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.18%. Comparing base (c713a03) to head (362413b).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3660      +/-   ##
==========================================
- Coverage   59.14%   58.18%   -0.97%     
==========================================
  Files        2262     2178      -84     
  Lines      187031   177408    -9623     
==========================================
- Hits       110625   103226    -7399     
+ Misses      66456    65023    -1433     
+ Partials     9950     9159     -791

Flag	Coverage Δ
sei-chain-pr	`56.13% <100.00%> (?)`
sei-db	`70.41% <ø> (-0.22%)`	⬇️
sei-db-state-db	`?`


Files with missing lines	Coverage Δ
sei-db/ledger_db/receipt/litt_receipt_store.go	`58.97% <100.00%> (ø)`

... and 85 files with indirect coverage changes


cursor · 2026-06-29T14:19:57Z

PR Summary

Medium Risk
Changes crash-durability bounds and write/sharding behavior for the auxiliary receipt store; consensus is unaffected but RPC may see not-found for very recent receipts after a hard crash.

Overview
Tightens littidx receipt-body durability by lowering littFlushInterval from 100ms to 5ms (~one block at Giga block times), so a hard crash should lose at most about one block of auxiliary RPC receipt data instead of a much longer window. Comments now tie that aggressiveness to async keymap flushing off the control loop (without it, per-block-style flushing was observed to hurt write throughput badly).

The receipts litt table ShardingFactor drops from 16 to 1, favoring cheaper single-segment flushes on one disk rather than spreading writes across many shards.

Reviewed by Cursor Bugbot for commit 362413b.

seidroid

A small, well-documented tuning PR that lowers the receipt-store litt flush interval (100ms → 5ms) and drops the sharding factor (16 → 1). The changes are correct and internally consistent; the only notes are process-related (merge ordering vs. #3645) and that the second-opinion reviews produced no findings.

Findings: 0 blocking | 4 non-blocking | 0 posted inline

Blockers

None at the file/PR level.

Non-blocking

Merge-ordering dependency: the PR is only safe to merge after #3645 (async keymap flush), as the author notes and the code comments state. Without that change, the 5ms per-block flush is described as regressing write throughput by ≈48%. Confirm #3645 has landed before merging this.
The 5ms ticker calls receipts.Flush() ~200x/sec continuously, including when there are no new writes. This is fine since Flush on a clean table is a cheap no-op and time.Ticker coalesces missed ticks (no goroutine pileup), but it is worth confirming Flush() is genuinely cheap on an empty/already-flushed table to avoid idle-CPU overhead.
Second-opinion reviews produced no actionable output: codex-review.md reports 'No material findings' and cursor-review.md is empty. REVIEW_GUIDELINES.md is also empty, so no repo-specific standards were applied.
ShardingFactor is changed to 1 with the rationale that sharding mainly helps across multiple disks. If a future deployment uses multiple disks for the receipt store, this tuning would need to be revisited — consider whether this should eventually be config-driven rather than a hardcoded constant.
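If Flush() turned out not to be cheap on an idle table, one possible mitigation for the idle-ticker note above is to skip ticks that have no pending writes. The following is a sketch under that assumption only: the `flusher` interface, `periodicFlusher`, and `MarkDirty` are hypothetical and do not reflect the actual litt_receipt_store.go code.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// flusher is a hypothetical stand-in for the receipts table; only a Flush
// method is assumed.
type flusher interface {
	Flush() error
}

// periodicFlusher flushes on a timer but skips ticks with no new writes.
type periodicFlusher struct {
	target flusher
	dirty  atomic.Bool // set by writers, cleared just before each flush
}

// MarkDirty records that at least one write happened since the last flush.
func (p *periodicFlusher) MarkDirty() { p.dirty.Store(true) }

// flushIfDirty flushes only when writes are pending and reports whether a
// flush was performed.
func (p *periodicFlusher) flushIfDirty() (bool, error) {
	if !p.dirty.Swap(false) {
		return false, nil
	}
	if err := p.target.Flush(); err != nil {
		p.dirty.Store(true) // keep the flag set so the next tick retries
		return false, err
	}
	return true, nil
}

// Run flushes every interval until ctx is cancelled. time.Ticker drops ticks
// for a slow receiver, so a long flush does not queue up a backlog.
func (p *periodicFlusher) Run(ctx context.Context, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			_, _ = p.flushIfDirty() // a real loop would log the error
		}
	}
}

// countingFlusher is a test double that counts Flush calls.
type countingFlusher struct{ n int }

func (c *countingFlusher) Flush() error { c.n++; return nil }

func main() {
	cf := &countingFlusher{}
	p := &periodicFlusher{target: cf}
	p.MarkDirty()
	_, _ = p.flushIfDirty() // flushes: a write is pending
	_, _ = p.flushIfDirty() // skipped: nothing written since
	fmt.Println("flush calls:", cf.n)
}
```

The atomic flag keeps the write path free of locks; the cost is that writers must remember to call MarkDirty, which is why measuring an idle Flush() first is the simpler check.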

cody-littley

Comments are true only once I merge my branch, but IMO not a big deal since that's merging soon.

My only concern is that timing might not provide an ironclad crash safety guarantee. But this change by itself doesn't make the problem any worse, so no need to block it while we have this discussion.

cody-littley · 2026-06-29T14:38:28Z

+	// littFlushInterval is roughly one flush per block at Giga throughput (a
+	// block is ~7ms), bounding crash loss to about a single block. Flushing this
+	// often is only cheap because litt flushes its keymap asynchronously, off the
+	// control loop; without that, a per-block flush regresses write throughput
+	// badly (≈-48% observed before the async keymap landed).
+	littFlushInterval = 5 * time.Millisecond


Are we requiring one flush per block for crash recovery safety? If so, doing it via wall clock time seems a bit fragile.
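For comparison, a flush driven by block commits rather than wall-clock time could look roughly like the sketch below. This is not what the PR implements, and nothing in the thread says it is the planned direction; `receiptStore`, `blockFlusher`, and `OnBlockCommitted` are hypothetical names used only to illustrate the alternative the question points at.

```go
package main

import (
	"errors"
	"fmt"
)

// receiptStore is a hypothetical interface; only a Flush method is assumed.
type receiptStore interface {
	Flush() error
}

// blockFlusher flushes once per committed block and remembers the highest
// block height known to be durable.
type blockFlusher struct {
	store         receiptStore
	durableHeight int64
}

// OnBlockCommitted flushes the store after a block's receipts are written.
// durableHeight advances only when the flush succeeds, so after a crash the
// caller knows exactly which heights may need to be re-derived.
func (b *blockFlusher) OnBlockCommitted(height int64) error {
	if height <= b.durableHeight {
		return fmt.Errorf("height %d already durable (at %d)", height, b.durableHeight)
	}
	if err := b.store.Flush(); err != nil {
		return fmt.Errorf("flush at height %d: %w", height, err)
	}
	b.durableHeight = height
	return nil
}

// fakeStore is a test double that can be told to fail.
type fakeStore struct {
	flushes int
	fail    bool
}

func (f *fakeStore) Flush() error {
	if f.fail {
		return errors.New("disk error")
	}
	f.flushes++
	return nil
}

func main() {
	fs := &fakeStore{}
	bf := &blockFlusher{store: fs}
	for h := int64(1); h <= 3; h++ {
		if err := bf.OnBlockCommitted(h); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println("durable through height", bf.durableHeight)
}
```

The trade-off is that the flush then sits on the block commit path, so its latency is paid per block, whereas the timer-based approach keeps flushing off that path at the cost of a time-based rather than height-based bound.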

Kbhat1 requested a review from cody-littley June 29, 2026 14:17

Kbhat1 added the non-app-hash-breaking label Jun 29, 2026

Kbhat1 changed the title ~~feat(receipt): per-block littidx flush + single shard (gated on #3645)~~ Per-block littidx flush + single shard (gated on #3645) Jun 29, 2026

Kbhat1 marked this pull request as ready for review June 29, 2026 14:19

seidroid Bot approved these changes Jun 29, 2026

cody-littley approved these changes Jun 29, 2026

Kbhat1 added this pull request to the merge queue Jun 29, 2026

Merged via the queue into main with commit bb69d38 Jun 29, 2026
76 of 79 checks passed

Kbhat1 deleted the littidx-flush-shard branch June 29, 2026 21:27

Kbhat1 merged 1 commit into main from littidx-flush-shard
