feat(seidb): dump-flatkv computes per-bucket/total LtHash#3665
feat(seidb): dump-flatkv computes per-bucket/total LtHash#3665blindchaser wants to merge 1 commit into
Conversation
…tle, metadata verify
PR SummaryLow Risk Overview New flags Tests update Reviewed by Cursor Bugbot for commit 8ba6f87. Bugbot is set up for automated code reviews on this repo. Configure here. |
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3665 +/- ##
==========================================
- Coverage 59.17% 58.21% -0.96%
==========================================
Files 2262 2178 -84
Lines 187053 177514 -9539
==========================================
- Hits 110682 103338 -7344
+ Misses 66433 65024 -1409
+ Partials 9938 9152 -786
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Want higher recall? High effort reviews run extra passes and find more bugs. A team admin can switch effort levels in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 8ba6f87. Configure here.
| } | ||
| if err := limiter.WaitN(ctx, n); err != nil { | ||
| return fmt.Errorf("read rate limiter: %w", err) | ||
| } |
There was a problem hiding this comment.
Read throttle undercharges large rows
Medium Severity
When a scanned row’s len(key)+len(val) exceeds the limiter burst, the scan calls WaitN once with n clamped to the burst instead of consuming tokens for the full row size. Large values (for example contract bytecode) can therefore be read with far less throttling than --read-limit-mb implies, weakening the live-node disk protection this flag is meant to provide.
Reviewed by Cursor Bugbot for commit 8ba6f87. Configure here.
There was a problem hiding this comment.
The PR cleanly extends seidb dump-flatkv with per-bucket/total LtHash computation, committed-root verification, and a read-throughput limiter; the logic (slice cloning, batch associativity, burst clamping, nil-map reads) is correct and well-tested. No blockers — only a couple of minor robustness/UX notes.
Findings: 0 blocking | 4 non-blocking | 1 posted inline
Blockers
- None at the file/PR level.
Non-blocking
- Cursor's second-opinion review (./cursor-review.md) is empty — that pass produced no output.
- Behavior change:
--lthashnow defaults totrueand--read-limit-mbdefaults to64, so existingdump-flatkvinvocations now do extra LtHash work and are throttled to 64 MiB/s by default. This is documented in the command help, but consider calling it out in release notes/changelog so offline users aren't surprised by the slower default. - Minor redundancy:
printFlatKVLtHashalready computes the total LtHash but discards it, andverifyFlatKVLtHashre-sums the per-bucket accumulators from scratch. Cheap (just MixIn of four vectors), but the total could be computed once and reused. - 1 suggestion(s)/nit(s) flagged inline on specific lines.
| // zero LtHash. Treat that as "nothing to verify against" rather than a | ||
| // spurious failure (e.g. a snapshot that predates LtHash metadata). | ||
| zero := lthash.New().Checksum() | ||
| if bytes.Equal(committedTotal, zero[:]) { |
There was a problem hiding this comment.
[suggestion] This conflates "no committed LtHash metadata" with "committed metadata present but zero/corrupt." The zero-LtHash checksum is also what an empty store and a store whose committed metadata was incorrectly zeroed both produce, so a non-empty store with zeroed/corrupt committed metadata is reported as skipped rather than failing verification — exactly the case verification exists to catch. Consider detecting metadata presence explicitly (e.g. an IsZero()/has-metadata signal from the store) instead of inferring absence from the checksum value. (Also raised by Codex.)
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8ba6f87ef0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // zero LtHash. Treat that as "nothing to verify against" rather than a | ||
| // spurious failure (e.g. a snapshot that predates LtHash metadata). | ||
| zero := lthash.New().Checksum() | ||
| if bytes.Equal(committedTotal, zero[:]) { |
There was a problem hiding this comment.
Skip verification when the source snapshot lacks LtHash
When the selected snapshot predates LtHash metadata but the cloned changelog replays later blocks, LoadVersion starts the LtHash accumulators at zero and catchup persists a non-zero hash for only the replayed WAL deltas. This check uses the post-open CommittedRootHash() and only skips the all-zero case, so it no longer recognizes that the baseline snapshot had no verifiable LtHash and compares a full scan of snapshot+WAL data against that partial hash, causing the default dump-flatkv --lthash path to fail on upgraded live stores until a new snapshot with LtHash metadata exists. Please track whether the selected snapshot had LtHash metadata before replay and skip verification in that case.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
LGTM — additive changes to the offline seidb dump-flatkv tool with no impact on commit/consensus paths.
Extended reasoning...
Overview
The PR extends an offline diagnostic CLI (seidb dump-flatkv) with two additions: (1) per-bucket and total LtHash computation, verified against snapshot metadata, and (2) byte-throughput read throttling so the scan can safely run against a live node. Changes are isolated to sei-db/tools/cmd/seidb/operations/dump_flatkv.go and its test file — no production commit/consensus code is touched.
Security risks
None substantive. The tool is read-only against a hard-linked snapshot clone. No new authentication, crypto, or RPC surfaces are introduced. The LtHash usage feeds raw iterator bytes into the same lthash package the FlatKV store itself uses for committed root hashing — no new cryptographic logic. A failed verification fails the command by design (intended exit-non-zero contract).
Level of scrutiny
Low. This is offline tooling whose worst-case failure is a wrong dump or a misleading PASS/FAIL on a CLI command — there is no on-chain or persisted-state effect. The default --read-limit-mb 64 is conservative and --lthash=true adds work that is purely diagnostic.
Other factors
The author handled the tricky edges explicitly: keys/values are cloned before retention (iterator slice-reuse), the limiter burst is floored at 4 MiB so an oversized row can't deadlock WaitN, and a zero committed-LtHash is treated as 'nothing to verify' rather than spuriously failing. A new test (TestBucketLtHasherMatchesSingleShot) exercises the batched MixIn path against a single-shot ComputeLtHash over the union, which is the right invariant to assert. Bug hunter found nothing and the cursor[bot] also flagged it low risk.


Summary
Extends the
seidb dump-flatkvcommand to compute and verify the lattice hash (LtHash) of scanned FlatKV state and to throttle its disk reads so it can safely run against a live, block-producing node. The core scan path (dumpFlatKVFromStore) gains two parameters —withLtHashandreadLimitMiBps— and two new abstractions: a per-bucket streamingbucketLtHasherand a byte-throughput read limiter.sei-db/tools/cmd/seidb/operations/dump_flatkv.go:--lthash(defaulttrue) and--read-limit-mb(default64) flags, plus extended command documentation covering live-node and Kubernetes usage.--read-limit-mbis validated to be>= 0(0means unlimited).newReadLimiterbuilds agolang.org/x/time/ratelimiter from a MiB/s ceiling. The burst is one second of budget but never less thanminReadBurstBytes(4 MiB), guaranteeing a single large row (e.g. contract bytecode) can never exceed the burst and deadlockWaitN. Returnsnil(unlimited) when the ceiling is<= 0. The scan callsWaitN(len(key)+len(val))per row so throughput is measured on actual key+value bytes read.bucketLtHasherincrementally accumulates a bucket's LtHash from a stream of raw physical key / serialized value pairs. Keys and values are cloned onaddbecause the iterator may reuse the backing slices onNext(). Pairs are buffered up tolthashBatchCap(8192) then folded into the running accumulator viaMixIn; batching is safe because the LtHash group is associative.--bucketwrite filter, so the printed TOTAL equals the node's committed global LtHash even when only one bucket is written to disk. Each bucket feeds exactly the bytes the FlatKV store hashes into its per-DB working LtHash, so per-bucket checksums match per-DB committed LtHashes and their MixIn matches the global root.verifyFlatKVLtHashcross-checks the re-scanned total againstCommittedRootHash()from snapshot metadata, printing a PASS/FAIL block and returning an error (non-zero exit) on mismatch. A store with no committed LtHash (zero-hash, e.g. a snapshot predating LtHash metadata) is treated as "nothing to verify" and skipped rather than failing spuriously.Test plan
sei-db/tools/cmd/seidb/operations/dump_flatkv_test.go:TestDumpFlatKVFromStoreAllBucketsandTestDumpFlatKVFromStoreSingleBucketcalls for the newwithLtHash/readLimitMiBpssignature (LtHash on, throttling off).TestBucketLtHasherMatchesSingleShot, which uses more than two full batches (lthashBatchCap*2 + 17pairs) to exercise the incrementalMixInpath. It asserts each batched per-bucket checksum equals a single-shotComputeLtHashover the same pairs (associativity), and that theMixInof the per-bucket accumulators equals oneComputeLtHashover the union — the exact per-bucket + total relationship the command reports.