Skip to content

[0.2] Initial batch of 0.2.3 backports#4683

Open
TheBlueMatt wants to merge 25 commits into
lightningdevkit:0.2from
TheBlueMatt:2026-06-0.2-backports-1
Open

[0.2] Initial batch of 0.2.3 backports#4683
TheBlueMatt wants to merge 25 commits into
lightningdevkit:0.2from
TheBlueMatt:2026-06-0.2-backports-1

Conversation

@TheBlueMatt

Copy link
Copy Markdown
Collaborator

tnull and others added 25 commits June 11, 2026 18:29
Previously, we logged "Persisting LiquidityManager..." on each
background processor wakeup, which can be very spammy, even on TRACE
level.

Here, we opt to only log if something actually needed to be repersisted
and we did so (in case of failure we're logging that anyways, too).

Backport of 369ea98
When we make an MPP claim we push RAA blockers for each chanel to
ensure we don't allow any single channel to make too much progress
until all channels have the preimage durably on disk. We don't have
to store those RAA blockers on disk in the ChannelManager as
there's no point - if the ChannelManager gets to disk with the RAA
blockers it also brought with it the pending ChannelMonitorUpdates
that contain the preimages and will now be replayed, ensuring the
preimage makes it to all ChannelMonitors.

However, just because those RAA blockers dissapear on reload
doesn't mean the implications of them does too - if a later
ChannelMonitorUpdate was blocked in the channel we don't have logic
to unblock it on startup.

Here we add such logic, simply attempting to unblock all blocked
`ChannelMonitorUpdate`s that existed on startup.

Code written by Claude.

Fixes lightningdevkit#4518

Backport of b0c312d

Conflicts resolved in:
 * lightning/src/ln/channelmanager.rs
Add a characterization test for a claimed MPP payment whose
preimage monitor updates are only partially persisted before restart.
The test drives both channels through a held fee-update commitment
dance, claims with async monitor persistence, reloads one fresh and
one stale monitor, and verifies that we don't leave a sender-side
HTLC stuck after reconnect.

Backport of 01d55dc

Conflicts due to different channel sorting and holding cell free
timing resolved in:
 * lightning/src/ln/reload_tests.rs
In a very specific case, forgetting to do so can lead to a debug
assertion failure when we see a double-claim of an HTLC (see the
included test).

Found by @joostjager's work on growing the chanmon_consistency
fuzzer.

Backport of f14b4b2

Trivial silent conflict resolved in:
 * lightning/src/ln/functional_tests.rs
It seems we forgot to ensure `OffersMessageHandler::best_block` is
consistently updated, leading to us building invalid blinded
payment paths for short-lived payment paths after two weeks without
restart.

Backport of 417b065

Trivial silent conflict resolved in:
 * lightning/src/offers/flow.rs
Backport of 6bf2352

Conflicts resolved in:
 * lightning/src/chain/chainmonitor.rs
Incomplete keysend MPPs skipped the receive timeout path,
allowing partial payments to hold HTLC slots until CLTV
expiry instead of failing after `MPP_TIMEOUT_TICKS`.
Apply the existing `total_mpp_amount_msat` completeness
check to all MPP receives and add a regression test
covering the keysend case.

The timeout logic was originally added only for
invoice-backed MPPs in 2022, and that invoice-only
guard remained when receive-side MPP keysend support
landed in 2023, leaving this gap latent until now.

Co-Authored-By: HAL 9000

Backport of fd8846b

Conflicts resolved in:
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/payment_tests.rs
`GetHistoryRes::height` from electrum-client is a *signed* integer. Here
we first check for `<= 0` *before* casting to `u32`.

Signed-off-by: Elias Rohrer <dev@tnull.de>

Backport of 8b383bb
`OMDomainResolver` rate-limits in-flight DNSSEC proof builds via a
`pending_query_count` counter capped at `MAX_PENDING_RESPONSES` (1024).
The counter was only released when the proof build succeeded, so any
failure mode -- NXDOMAIN, insecure zones, unreachable resolvers, I/O
timeouts, malformed names -- permanently consumed a slot.

Because the queried name is attacker-controlled (it travels in over a
`DNSSECQuery` onion message from any LN peer, given DNS resolution is
an opt-in network-advertised feature), an adversary could exhaust the
counter with ~1025 failing queries and persistently DoS the resolver
for any subsequent legitimate BIP-353 lookups, until the process is
restarted.

Always release the slot once the proof build completes, regardless of
outcome, and add a regression test which points the resolver at a
TCP-refusing local port and asserts the counter returns to zero.

Co-Authored-By: HAL 9000

Backport of fb4103d

Trivial conflicts resolved in:
 * lightning-dns-resolver/src/lib.rs
`can_support_additional_anchor_channel` decides whether the wallet has
enough on-chain reserve to back another anchor channel by counting the
node's existing anchor channels. The classification only checked the
`anchors_zero_fee_htlc_tx` feature, so channels negotiated with the
`anchor_zero_fee_commitments` (TRUC / 0FC, option 41) variant — which
require the same on-chain reserve to fund commitment / HTLC fee bumps
on force-close — were silently dropped from the count.

A node enabling `negotiate_anchor_zero_fee_commitments` would therefore
be green-lit to open more anchor channels than its wallet can actually
back, risking unfunded fee bumps and HTLC loss on simultaneous
force-closes.

Treat both feature flags as marking a channel as an anchor channel for
reserve-accounting purposes (factored into a small `is_anchor_channel_type`
helper, used in both the chain-monitor and channel-manager loops), and
add a regression test that opens a single 0FC channel with reserves
sized for exactly one channel and asserts the function refuses to
authorize a second.

Co-Authored-By: HAL 9000

Backport of 33987e8

Trivial conflicts resolved in:
 * lightning/src/util/anchor_channel_reserves.rs
`PrintableString` is the sanitiser LDK uses to render untrusted strings
(node aliases, BOLT-12 invoice / offer text, `UntrustedString`, LSPS
messages, `lightning-invoice` descriptions) to logs and UI. It only
replaced `char::is_control` matches (Unicode general category `Cc`)
with U+FFFD, leaving the entire `Cf` (Format) category untouched.

That is the exact category covering the bidirectional override /
isolate codepoints (U+202A..U+202E, U+2066..U+2069) and zero-width
characters (U+200B..U+200D, U+FEFF) behind the "Trojan Source" attack
family (CVE-2021-42574): a peer can set its alias / invoice description
/ offer fields to e.g. `safe\u{202E}cipsxe.exe`, which previously
passed through verbatim while a human reader sees `safeexe.cips` —
defeating the threat model `PrintableString` exists to defend against.

Replace `Cf` codepoints alongside `Cc` ones. The `Cf` ranges are
inlined as a `matches!` table sourced from Unicode 16.0 to keep the
change `no_std`-friendly with no new dependencies.

Co-Authored-By: HAL 9000
Signed-off-by: Elias Rohrer <dev@tnull.de>

Backport of 1a01b5a
The std-only `StaticInvoice::is_offer_expired` accessor delegated to
`InvoiceContents::is_expired`, which compares `created_at +
relative_expiry` against the current time — that is the *invoice*'s
expiry, not the offer's. The `_no_std` sibling and `flow.rs::
enqueue_static_invoice` already treat the two as distinct checks.

A payer or forwarder using the std API to decide whether to honor a
static invoice would therefore get the wrong answer in either
direction: forwarding offers the issuer has already retired (when the
invoice is still fresh), or refusing offers that are still valid (when
the invoice has aged past its `relative_expiry` but the offer itself
has no `absolute_expiry`).

Route the std accessor through `InvoiceContents::is_offer_expired` so
both the std and no-std paths consult the offer's expiry.

Co-Authored-By: HAL 9000

Backport of c005b11
`composite_custom_message_handler!` expanded `peer_connected` to call
every sub-handler and remember the last error, but never undo the
already-succeeded ones. The `CustomMessageHandler::peer_connected`
contract is that `PeerManager` will *not* invoke `peer_disconnected`
when `peer_connected` returns `Err` — so any per-peer state allocated
by an earlier sub-handler that returned `Ok` was leaked permanently
once a later sub-handler returned `Err`.

A peer who can elicit `Err` from any sub-handler in the composite
(feature-bit gate, banlist, etc.) could repeatedly reconnect to grow
that leaked state without bound (slow resource DoS), and "currently
connected" predicates in the leaking sub-handler would lie about
peers that were actually rejected.

Mirror the rollback pattern `PeerManager` already uses for the four
built-in handlers (`peer_handler.rs:2149-2188`): record each
sub-handler's `peer_connected` result, and if any returned `Err`,
call `peer_disconnected` on the ones that succeeded before
propagating the failure.

Co-Authored-By: HAL 9000
Signed-off-by: Elias Rohrer <dev@tnull.de>

Backport of 5455058
`EsploraSyncClient::get_confirmed_tx` parsed the SPV proof returned by
the Esplora server but threw away the security check: the merkle root
computed by `PartialMerkleTree::extract_matches` was discarded
(`let _ = …`), and only the leaf-equality check (`matches[0] == txid`)
remained. Anyone can construct a single-leaf partial tree advertising
an arbitrary txid via `PartialMerkleTree::from_txids(&[txid], &[true])`,
so this gate was vacuous.

A malicious or compromised Esplora server could therefore convince
`EsploraSyncClient` that any transaction was confirmed in any block by
returning `MerkleBlock { header: real_header, txn: forged_partial_tree }`,
causing LDK to feed a synthesized `ConfirmedTx` into `Confirm`
implementations such as `ChannelManager` / `ChainMonitor`. From there,
the channel-funding / closing / HTLC flows would treat the transaction
as confirmed at an attacker-chosen height, with consequences ranging
from premature state transitions to force-close races.

Capture the merkle root returned by `extract_matches` and require it
to equal `block_header.merkle_root`, matching the validation the
Electrum sibling already performs via `validate_merkle_proof`.

Co-Authored-By: HAL 9000

Backport of b64efcd
`LSPS5ServiceHandler::persist` incremented `persistence_in_flight` at
the top as a single-runner gate, but only decremented it on the
success path: each interior `?` on a `kv_store` future propagated the
error out of the function while leaving the counter at >= 1. After
one transient I/O failure (disk full, brief unavailability of a
remote `KVStore`, EPERM, etc.) every subsequent `persist()` call hit
the `fetch_add > 0` short-circuit and silently returned `Ok(false)`.

The in-memory `needs_persist` flags then continued to grow without
ever reaching disk, so webhook state, removals, and notification
cooldowns were lost on the next process restart — including the
spec-mandated webhook retention/pruning state — without any error
surfaced to the operator. The counter is monotonic, so recovery
required a process restart.

Adopt the LSPS1 / LSPS2 pattern: split the body into an inner
`do_persist` and an outer `persist` that unconditionally clears the
counter via `store(0)` after the call returns, regardless of
outcome. A failed write now still propagates `Err`, but the next
`persist()` attempt actually retries the write instead of no-op'ing.

Co-Authored-By: HAL 9000

Backport of b3544de

Silent conflicts resolved in:
 * lightning-liquidity/tests/lsps5_integration_tests.rs
`regenerate_and_broadcast_spend_if_necessary` used `pending_sweep:
AtomicBool` as a single-runner gate but only cleared the flag with an
unconditional `store(false)` *after* the inner future resolved. If the
caller's future was dropped while the inner await was `Pending` —
which `tokio::time::timeout`, `futures::select!`, manual
`JoinHandle::abort`, etc. all do — the reset never ran, leaving the
flag stuck `true` and every subsequent call to the function
short-circuiting with `Ok(())`.

Because `OutputSweeper` is what claims `SpendableOutputDescriptor`s
back to the user's wallet after channel closure (including HTLC
outputs with time-bounded recovery deadlines), a stuck flag turns
into fund-loss exposure: time-sensitive HTLC sweeps simply stop
happening, while every other code path keeps queueing new outputs to
sweep, until the process is restarted.

Replace the trailing `store(false)` with an RAII `PendingSweepGuard`
whose `Drop` impl always releases the flag — covering normal return,
error, and cancellation alike.

Co-Authored-By: HAL 9000

Backport of 6394d18

Silent conflicts due to API changes resolved in:
 * lightning/src/util/sweep.rs
1a01b5a added detection of unicode
format characters in `PrintableString`, but used a hard-coded table
which may eventually become out of date.

Here we switch to an auto-generated table, include all
`General_Category` `Other` characters, and also ban unallocated
code points.

Finally, CI validates that the file is kept up to date.

Written by Claude

Backport of 65e8cc8
The exhaustive filesystem store listing treated leftover temp and trash
files as namespace directories after identifying them as non-keys.
Skip those artifacts before recursing so migrations can ignore crash
leftovers.

Backport of 9246d86

Conflicts resolved due to moved/split file in:
 * lightning-persister/src/fs_store.rs
lnd is preparing to ship a release with opt-in onion messages
without support for forwarding onion messages from non-channel
peers. This breaks the common BOLT 12 OM flow today where we
direct-connect to the blinded path introduction point and send the
`invoice_request` without a channel. For CLN it turns out this is
fine as they never select a peer for their introduction point at
all. However, for LDK this would break existing nodes as nodes
might now pick an lnd peer as an introduction node but it won't
forward the onion message.

For now, we just drop the separate introduction point selection and
just always use ourselves as an introduction point (assuming we're
an announced node).

This should also have the side-effect of making offers marginally
more robust, which may be worth it, even if it sucks to drop any
pretense of privacy.

Backport of 8c08a30

Conflicts resolved in:
 * lightning/src/ln/offers_tests.rs
 * lightning/src/onion_message/messenger.rs
If an outbound payment was abandoned with htlcs in-flight and later claimed, we
would previously have the PaymentSent::fee_paid_msat be set to None. This
contradicted some docs on the event that stated the field would always be Some
after 0.0.103.

Backport of 3e9e6e9

Silent conflicts resolved in:
 * lightning/src/ln/payment_tests.rs
An offer advertising Quantity::Bounded expects at least one item, but
is_valid_quantity accepted a quantity of 0 since it only checked the
upper bound. Require the quantity to be greater than 0 so that an
invoice request for 0 items is rejected as an InvalidQuantity.

Co-Authored-By: Claude <noreply@anthropic.com>

Backport of 7620124
AnchorDescriptor::previous_utxo is used for coin selection and PSBT witness_utxo metadata. For keyed anchors it should describe the on-chain P2WSH anchor output instead of the witness script so wallets can validate and sign the package.

Co-Authored-By: HAL 9000

This finding was discovered by Project Loupe

Backport of ccf45e4

Trivial conflicts resolved in:
 * lightning/src/events/bump_transaction/mod.rs
get_supportable_anchor_channels estimates how much each reserve UTXO can contribute after spending fees. Include the base input weight in that fee so UTXOs just below the public per-channel reserve are not counted as supporting another anchor channel.

Co-Authored-By: HAL 9000

This finding was discovered by Project Loupe

Backport of 5e7b7d3

Trivial conflicts resolved in:
 * lightning/src/util/anchor_channel_reserves.rs
When a JIT channel open fails, release queued intercepted HTLCs through the intercept API so they are not held until expiry. Keep resetting the LSPS2 state if an intercept has already been released.

Co-Authored-By: HAL 9000

This finding was discovered by Project Loupe

Backport of 6b75e5a

Conflicts resolved in:
 * lightning-liquidity/tests/lsps2_integration_tests.rs
@ldk-reviews-bot

ldk-reviews-bot commented Jun 11, 2026

Copy link
Copy Markdown

I've assigned @tankyleo as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@@ -1816,6 +1815,7 @@ where
for counterparty_node_id in need_persist.into_iter() {
debug_assert!(!need_remove.contains(&counterparty_node_id));
self.persist_peer_state(counterparty_node_id).await?;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ? here (and at the future.await? / persist_peer_state(...).await? below) returns early on an IO error without resetting persistence_in_flight. After one failed persist the counter stays >= 1, so every subsequent persist() call hits the fetch_add(1) > 0 guard at line 1789 and returns Ok(false) immediately — silently dropping all future persistence forever.

This is the identical bug that the LSPS5 part of this PR fixes via do_persist() + persistence_in_flight.store(0, Release) on error (and covers with lsps5_service_persist_resets_in_flight_counter_on_io_error). The LSPS2 persist here has the same structure but was not given the same fix.

@ldk-claude-review-bot

Copy link
Copy Markdown
Collaborator

Posting the top-level summary.

Summary

This is a backports PR (#4683). Most hunks are correct backports (DNS resolver counter leak fix, electrum/esplora height as u32 <= 0 fix, composite handler rollback, fs_store artifact skipping, LSPS5 in-flight counter reset). One notable issue found:

  • lightning-liquidity/src/lsps2/service.rs:1817 — LSPS2 persist() propagates IO errors via ? (at persist_peer_state(...).await?, future.await?) without resetting persistence_in_flight. After one failed persist, the counter stays >= 1 and every later persist() returns Ok(false) early, silently disabling persistence forever. This is the same bug that the LSPS5 portion of this PR explicitly fixes (do_persist() + persistence_in_flight.store(0, Release) on error, with a regression test). LSPS2 has the identical structure but was not given the parallel fix.

Cross-cutting / minor

  • lightning-liquidity/src/manager.rs — typo in the two new doc comments: "Returns true if it persisted sevice handler data." (should be "service"). Cosmetic.
  • The LSPS5 error path resets the in-flight counter via store(0), which can discard a concurrent caller's "go around again" increment that occurred during the failed run. This is acceptable since an error is returned and the caller retries, but worth being aware of.

@ldk-reviews-bot ldk-reviews-bot requested a review from tankyleo June 11, 2026 20:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants