Add is_analytically_solvable() to NonRepairableRBD #28

Merged. derrynknife merged 9 commits into master from claude/rbd-functions-gaps-neve09 on Jun 17, 2026.

@derrynknife (Owner)

Adds a check for whether an RBD's system reliability can be solved
analytically / with the BDD, or whether it requires simulation because
one or more nodes are simulation-based (standby arrangements).

  • _node_is_analytic(): classifies a node model, recursing through
    RepeatedNode wrappers and nested NonRepairableRBDs.
  • get_non_analytic_nodes(): returns {node: model type} for the offending
    simulation-based nodes (StandbyModel, RepeatedStandbyNode).
  • is_analytically_solvable(): True iff there are no such nodes.
  • Records the result in structure_check as "is_analytically_solvable"
    and "non_analytic_nodes" at construction time.

Adds tests covering analytic fixtures, standby (non-analytic) fixtures,
repeated nodes/components, and the structure_check wiring.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01D3uxEk7qdLTcRhR8Avnk5M
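
As a rough illustration of the classification described above, the sketch below recurses through wrapper and nested-diagram nodes and reports the simulation-based ones. Every class here (`Leaf`, `Standby`, `Repeated`, `NestedRBD`) is a hypothetical stand-in rather than the library's real types, and the function bodies are a guess at the shape of the logic, not the merged implementation.

```python
# Hypothetical stand-in node models; the real library classes are not shown in this PR text.
class Leaf:
    """A node with a closed-form lifetime model."""


class Standby:
    """A simulation-based node (plays the role of StandbyModel)."""


class Repeated:
    """A wrapper that repeats an inner model (plays the role of RepeatedNode)."""

    def __init__(self, inner):
        self.inner = inner


class NestedRBD:
    """A diagram whose nodes map a name to a model (possibly another diagram)."""

    def __init__(self, nodes):
        self.nodes = dict(nodes)


def node_is_analytic(model):
    """Classify one node model, recursing through wrappers and nested diagrams."""
    if isinstance(model, Standby):
        return False
    if isinstance(model, Repeated):
        return node_is_analytic(model.inner)
    if isinstance(model, NestedRBD):
        return all(node_is_analytic(m) for m in model.nodes.values())
    return True


def get_non_analytic_nodes(rbd):
    """Return {node name: model type name} for the offending top-level nodes."""
    return {
        name: type(model).__name__
        for name, model in rbd.nodes.items()
        if not node_is_analytic(model)
    }


def is_analytically_solvable(rbd):
    """True iff no node requires simulation."""
    return not get_non_analytic_nodes(rbd)


analytic = NestedRBD({"pump": Leaf(), "sub": NestedRBD({"valve": Repeated(Leaf())})})
mixed = NestedRBD({"pump": Leaf(), "backup": Repeated(Standby())})
```

Here `is_analytically_solvable(analytic)` is true, while `get_non_analytic_nodes(mixed)` reports the `backup` node because the recursion reaches the standby model inside the wrapper.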

claude added 9 commits June 17, 2026 08:09
Adds a check for whether an RBD's system reliability can be solved
analytically / with the BDD, or whether it requires simulation because
one or more nodes are simulation-based (standby arrangements).

- _node_is_analytic(): classifies a node model, recursing through
  RepeatedNode wrappers and nested NonRepairableRBDs.
- get_non_analytic_nodes(): returns {node: model type} for the offending
  simulation-based nodes (StandbyModel, RepeatedStandbyNode).
- is_analytically_solvable(): True iff there are no such nodes.
- Records the result in structure_check as "is_analytically_solvable"
  and "non_analytic_nodes" at construction time.

Adds tests covering analytic fixtures, standby (non-analytic) fixtures,
repeated nodes/components, and the structure_check wiring.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01D3uxEk7qdLTcRhR8Avnk5M

Replaces the `dd` Binary Decision Diagram dependency with a pure-Python
exact reliability engine and a direct structure-function evaluation.

- rbd.py: add probability_any_set_satisfied(), an exact Shannon
  decomposition over the minimal path/cut sets with memoisation. It
  returns the same result as the previous inclusion-exclusion but avoids
  the 2^(#sets) blow-up by solving each shared sub-problem once.
- system_probability(): now uses the new engine. method="p" computes
  reliability from path sets; method="c" computes unreliability from cut
  sets. The approx=True path keeps the first-order rare-event cut-set
  approximation, and method="p" + approx=True still raises.
- is_system_working(): re-implemented without a BDD. The system works
  iff some minimal path set is fully up ("p") / every minimal cut set has
  one component up ("c"). Path/cut sets are cached on first use for the
  simulation hot loop.
- Drop compile_bdd() and the BDD bdd_to_string/bdd_to_file helpers from
  RBD and RepairableRBD; RepairableRBD now inherits is_system_working.
- Remove dd from requirements.txt and the mypy override list.

Adds test_rbd_exact_probability.py: cross-checks the path-set and cut-set
methods against a brute-force enumeration on a bridge network, and checks
the new is_system_working truth tables. Existing sf()/availability tests
(both methods + approx) continue to pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01D3uxEk7qdLTcRhR8Avnk5M
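
The Shannon decomposition described in the commit above can be sketched in a few lines. The name `probability_any_set_satisfied` comes from the commit message, but the body below is an independent sketch under the assumption of independent components, not the code that was merged. The bridge network, its component names and the brute-force helper are illustrative choices.

```python
from functools import lru_cache
from itertools import product


def probability_any_set_satisfied(sets, p):
    """P(at least one set has every member 'on'), for independent components.

    sets: iterable of collections of component names
    p:    dict mapping component name -> probability that it is 'on'
    """

    @lru_cache(maxsize=None)
    def solve(remaining):
        if not remaining:
            return 0.0  # every set has been ruled out
        if frozenset() in remaining:
            return 1.0  # some set has had all of its members confirmed on
        x = min(min(s) for s in remaining)  # deterministic pivot component
        x_on = frozenset(s - {x} for s in remaining)
        x_off = frozenset(s for s in remaining if x not in s)
        # Shannon expansion on x; identical sub-problems are solved once via the cache.
        return p[x] * solve(x_on) + (1.0 - p[x]) * solve(x_off)

    return solve(frozenset(frozenset(s) for s in sets))


def is_system_working(path_sets, up):
    """Structure function: the system works iff some minimal path set is fully up."""
    return any(set(path) <= up for path in path_sets)


def brute_force_reliability(path_sets, p):
    """Sum the probability of every component state in which the system works."""
    names = sorted(p)
    total = 0.0
    for state in product([True, False], repeat=len(names)):
        up = {n for n, s in zip(names, state) if s}
        if is_system_working(path_sets, up):
            weight = 1.0
            for n, s in zip(names, state):
                weight *= p[n] if s else 1.0 - p[n]
            total += weight
    return total


# Bridge network: A-B on top, C-D on the bottom, E as the bridge.
paths = [{"A", "B"}, {"C", "D"}, {"A", "E", "D"}, {"C", "E", "B"}]
cuts = [{"A", "C"}, {"B", "D"}, {"A", "E", "D"}, {"C", "E", "B"}]
p_up = {name: 0.9 for name in "ABCDE"}
p_down = {name: 1.0 - q for name, q in p_up.items()}

r_path = probability_any_set_satisfied(paths, p_up)          # method "p"
r_cut = 1.0 - probability_any_set_satisfied(cuts, p_down)    # method "c"
r_brute = brute_force_reliability(paths, p_up)
```

For the cut-set method the same routine is applied to the failure probabilities: the system fails iff some minimal cut set is entirely down, so reliability is one minus that probability. With every component at 0.9 all three values agree with the textbook bridge result 2p² + 2p³ − 5p⁴ + 2p⁵ = 0.97848.
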
Ignore __pycache__, compiled files, and pytest/coverage/mypy caches so
they are not reported as untracked.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01D3uxEk7qdLTcRhR8Avnk5M

- minimal_cut_sets_from_path_sets(): computes the minimal cut sets as the
  minimal transversals (hitting sets) of the minimal path sets using
  Berge's algorithm. It builds the transversals incrementally one path set
  at a time and prunes non-minimal candidates at each step, instead of
  materialising the full Cartesian product of the path sets and filtering
  once at the end. Works directly from the path sets, so it stays correct
  for k-out-of-n structures. get_min_cut_sets() now delegates to it; the
  now-unused itertools.product import is removed.

- Default the system reliability to the path-set method:
  - RBD.system_probability default method "c" -> "p".
  - NonRepairableRBD.sf default method is now resolved from None: path
    sets by default (avoids deriving the cut sets), falling back to the
    cut-set method when approx=True (the approximation only applies to cut
    sets). Explicit method="p" with approx=True still raises.

Adds test_rbd_berge_cut_sets.py: cross-checks Berge against the previous
Cartesian-product implementation across every fixture (including the
k-out-of-n ones), checks hand-computed examples, verifies the cut sets are
minimal transversals, and checks the new default sf behaviour.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01D3uxEk7qdLTcRhR8Avnk5M
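
To make the incremental construction concrete, here is a minimal sketch of Berge's algorithm for minimal transversals. The function name `minimal_transversals` and the example path sets are illustrative assumptions; the merged `minimal_cut_sets_from_path_sets()` may differ in details such as data types and pruning strategy.

```python
def minimal_transversals(path_sets):
    """Minimal hitting sets of `path_sets`, built one path set at a time."""
    transversals = {frozenset()}
    for path in path_sets:
        path = frozenset(path)
        candidates = set()
        for t in transversals:
            if t & path:
                # t already hits this path set, so it carries over unchanged.
                candidates.add(t)
            else:
                # Otherwise extend t by each component of the path set in turn.
                for x in path:
                    candidates.add(t | {x})
        # Prune at every step: drop any candidate that strictly contains another.
        transversals = {
            c for c in candidates if not any(other < c for other in candidates)
        }
    return transversals


# Bridge network path sets (A-B on top, C-D on the bottom, E as the bridge).
bridge_paths = [{"A", "B"}, {"C", "D"}, {"A", "E", "D"}, {"C", "E", "B"}]
bridge_cuts = minimal_transversals(bridge_paths)

# 2-out-of-3 structure: any two components working is enough.
koon_paths = [{"A", "B"}, {"A", "C"}, {"B", "C"}]
koon_cuts = minimal_transversals(koon_paths)
```

For the bridge this yields the four cut sets {A, C}, {B, D}, {A, D, E} and {B, C, E}. For the 2-out-of-3 case the cut sets are again all pairs, which shows that working from the path sets handles k-out-of-n structures without special treatment.
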
The approx flag was a first-order rare-event approximation that pre-dated
the exact engine. It was never actually wired through sf() (only its
error-check fired), and the exact path/cut-set computation is now cheap, so
the option added API surface without benefit.

- RBD.system_probability: drop the approx parameter and the first-order
  cut-set branch; both methods now always return the exact result.
- NonRepairableRBD.sf: drop approx; method now defaults straight to "p"
  (no None sentinel needed) and the method="p"+approx ValueError is gone.
- Remove the corresponding approx tests from test_rbd_sf.py and the
  approx-routing assertions from test_rbd_berge_cut_sets.py.

The separate Fussell-Vesely approx option (_fussel_vesely) is unrelated and
left untouched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01D3uxEk7qdLTcRhR8Avnk5M

Standby sf/ff were estimated from Monte-Carlo samples via a Kaplan-Meier
fit: stochastic, a step function bounded by the sampled support, and the
source of intermittent test failures. For cold standby the lifetime is the
sum of the components' lifetimes, so its survival function is their
convolution, which can be computed directly.

- numerical_convolution.py: ConvolvedSurvival computes the survival
  function of a sum of independent lifetimes by numerically convolving the
  component densities (FFT) on a fine grid bounded robustly by doubling on
  sf, integrating to the CDF with trapezoidal quadrature and normalising
  away discretisation drift. sf()/ff()/mean() then read the precomputed
  grid -- fast, deterministic and accurate (~1e-3 vs closed forms).
- StandbyModel: for k=1 (cold standby = sum) sf/ff now use the exact
  convolution instead of the KM fit; k>=2 (order-dependent) keeps the
  Monte-Carlo + KM approximation.
- RepeatedStandbyNode: sf/ff use the convolution of `repeats` copies.

random()/mean() are deliberately left unchanged (out of scope for this sf
work). Note RepeatedStandbyNode.random/mean have a separate pre-existing
bug (they sample the KM-of-the-sum and re-sum it, ~repeats x too large),
which is why test_rbd_repeated_standby::test_mean expects 9 rather than 3
and remains the only flaky standby test.

Adds test_rbd_numerical_convolution.py: checks the convolution against the
Erlang (Gamma) and hypoexponential closed forms, array input, the
single-model identity, and that StandbyModel(k=1)/RepeatedStandbyNode sf is
exact and reproducible.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01D3uxEk7qdLTcRhR8Avnk5M
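
The idea behind the convolution can be shown with a dependency-free sketch. Note the substitutions: it uses a direct O(n²) trapezoidal convolution instead of the FFT, and a fixed upper bound `t_max` instead of the doubling search, so it is an illustration of the technique rather than a copy of `ConvolvedSurvival`. The function name `convolved_sf` and the grid parameters are assumptions.

```python
import math


def convolved_sf(pdf_a, pdf_b, t_max, n):
    """Return (grid, sf) for T = A + B on [0, t_max] using n grid intervals."""
    dt = t_max / n
    grid = [i * dt for i in range(n + 1)]
    fa = [pdf_a(t) for t in grid]
    fb = [pdf_b(t) for t in grid]

    # Density of the sum: trapezoidal approximation of the convolution integral.
    density = [0.0] * (n + 1)
    for i in range(1, n + 1):
        s = 0.5 * (fa[0] * fb[i] + fa[i] * fb[0])
        for j in range(1, i):
            s += fa[j] * fb[i - j]
        density[i] = s * dt

    # Integrate the density to the CDF with trapezoidal quadrature.
    cdf = [0.0] * (n + 1)
    for i in range(1, n + 1):
        cdf[i] = cdf[i - 1] + 0.5 * (density[i - 1] + density[i]) * dt

    # Normalise away discretisation drift so that the CDF ends at exactly 1.
    total = cdf[-1]
    sf = [1.0 - c / total for c in cdf]
    return grid, sf


def exp_pdf(rate):
    return lambda t: rate * math.exp(-rate * t)


# Two Exp(1) lifetimes in cold standby: the sum is Erlang(2, 1).
grid, sf_erlang = convolved_sf(exp_pdf(1.0), exp_pdf(1.0), t_max=20.0, n=1000)
# Rates 1 and 2: the sum is hypoexponential.
_, sf_hypo = convolved_sf(exp_pdf(1.0), exp_pdf(2.0), t_max=20.0, n=1000)

i_one = 50  # grid index of t = 1.0 when dt = 0.02
erlang_exact = math.exp(-1.0) * (1.0 + 1.0)           # S(t) = e^(-t) (1 + t)
hypo_exact = 2.0 * math.exp(-1.0) - math.exp(-2.0)    # S(t) = 2 e^(-t) - e^(-2t)
```

Both grid values at t = 1 land well within the ~1e-3 tolerance quoted in the commit message when compared with the closed forms.
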
Models imperfect switching (failure on demand) for cold standby. Each switch
onto the next spare succeeds with a given probability, so the lifetime is a
mixture of partial sums: run the primary; if its switch works, run the next
component too; and so on. The survival function is the corresponding
weighted mixture of the partial-sum convolutions.

- ConvolvedSurvival: new switching_probability argument (a scalar applied to
  every switch, or a per-switch sequence of length n-1). It builds the
  partial-sum survival functions it already computes incrementally and
  returns their mixture; default 1.0 reduces exactly to the full
  convolution. Helpers switch_success_probs()/is_perfect_switching() and the
  mixture weights are exposed/computed for reuse.
- StandbyModel: switching_probability argument for k=1 (passed to the
  convolution, and random() now also reflects switching via a per-switch
  Bernoulli). For k>=2 a non-perfect switching probability raises
  NotImplementedError rather than being silently ignored.
- RepeatedStandbyNode: switching_probability argument applied to sf/ff (its
  random()/mean() still reflect perfect switching, as before).

Adds test_rbd_switching_probability.py, verifying against closed forms:
single-switch S(t) = e^(-t) * (1 + p*t), zero-switching = primary only, perfect =
full convolution, mean 1+p+p^2 for three components, a per-switch sequence,
weight normalisation, the k>=2 NotImplementedError, and input validation.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01D3uxEk7qdLTcRhR8Avnk5M
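
The mixture described above is easy to check by hand for identical Exp(rate) components, where each partial sum is Erlang. The sketch below is written under that assumption; `mixture_weights`, `erlang_sf` and `standby_sf` are hypothetical helper names and not the helpers exposed by `ConvolvedSurvival`.

```python
import math


def mixture_weights(switch_probs):
    """Weight on 'exactly j components get used', for j = 1 .. n.

    switch_probs holds the n-1 per-switch success probabilities.
    """
    weights = []
    reach = 1.0  # probability that every switch so far has succeeded
    for p in switch_probs:
        weights.append(reach * (1.0 - p))  # this switch fails: stop here
        reach *= p
    weights.append(reach)  # every switch succeeded: all n components used
    return weights


def erlang_sf(t, stages, rate=1.0):
    """Survival function of a sum of `stages` i.i.d. Exp(rate) lifetimes."""
    return math.exp(-rate * t) * sum(
        (rate * t) ** i / math.factorial(i) for i in range(stages)
    )


def standby_sf(t, switch_probs, rate=1.0):
    """Weighted mixture of the partial-sum survival functions."""
    weights = mixture_weights(switch_probs)
    return sum(w * erlang_sf(t, j + 1, rate) for j, w in enumerate(weights))


p, t = 0.8, 1.5
single_switch = standby_sf(t, [p])
three_weights = mixture_weights([p, p])
three_mean = sum((j + 1) * w for j, w in enumerate(three_weights))
```

With one switch the mixture collapses to e^(-t) * (1 + p*t), and for three Exp(1) components the mean is 1 + p + p², matching the closed forms listed in the tests. Setting every probability to 1.0 puts all the weight on the full sum, and 0.0 puts it all on the primary.
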
random() previously sampled from a Kaplan-Meier fit of the summed lifetime
and then re-summed `repeats` of those draws, overcounting by a factor of
~repeats (e.g. a 3-fold standby of Exp(1) reported a mean of ~9 instead of
3). It was also the source of intermittent -inf failures.

random() now sums `repeats` independent draws from the base model directly
(respecting switching_probability via a per-switch Bernoulli), and mean()
returns the exact value from the convolution (E[T] = integral of sf), so
both are correct and deterministic. The Kaplan-Meier fit is dropped; the N
and lower arguments are kept (unused) for backwards compatibility.

Updates test_rbd_repeated_standby::test_mean to assert the correct mean (3,
not the previous buggy 9). The standby tests are now deterministic (no more
flakiness) and run in a fraction of the time.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01D3uxEk7qdLTcRhR8Avnk5M
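
The factor-of-`repeats` over-count is simple to reproduce. The sketch below assumes an Exp(1) base model and draws directly from the summed lifetime in place of the Kaplan-Meier fit that the old code sampled from; `draw_sum` and `buggy_draw` are illustrative names, not library functions.

```python
import random


def draw_sum(rng, repeats):
    """Correct draw: sum `repeats` independent lifetimes from the base model."""
    return sum(rng.expovariate(1.0) for _ in range(repeats))


def buggy_draw(rng, repeats):
    """Old behaviour in spirit: sample the summed lifetime, then sum it again."""
    return sum(draw_sum(rng, repeats) for _ in range(repeats))


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 20000
correct_mean = sum(draw_sum(rng, 3) for _ in range(n)) / n
buggy_mean = sum(buggy_draw(rng, 3) for _ in range(n)) / n
```

The expected values are 3 for the correct draw and 3 × 3 = 9 for the re-summed draw, which is the discrepancy the updated `test_mean` assertion addresses.
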
When every standby component is an Exponential with the same rate (and
switching is perfect), the k-out-of-n cold standby lifetime is exactly
Erlang(N-k+1, k*rate): while k units operate the failure rate is k*rate, and
by memorylessness the inter-failure times are i.i.d. Exponential(k*rate).

- StandbyModel now detects identical exponential units and uses that closed
  form (via scipy.stats.gamma) for any k -- deterministic and exact,
  replacing the Monte-Carlo + Kaplan-Meier fit for k>=2 and the convolution
  for k=1 in this case. Non-exponential or mixed-rate inputs fall back to the
  existing convolution (k=1) / Monte-Carlo (k>=2) paths.
- mean() now returns the exact value from the analytic model when available
  (exponential closed form or convolution), only falling back to the
  Monte-Carlo estimate for the k>=2 general case.

Adds test_rbd_exponential_standby.py: checks the closed form against scipy's
gamma and surpyval's Gamma for k=1 and k=2, the k=N min-exponential limit,
determinism, agreement (in mean) with the priority-queue simulation, and
that mixed-rate inputs use the convolution path instead.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01D3uxEk7qdLTcRhR8Avnk5M
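
For reference, the Erlang closed form stated above can be evaluated with the standard library alone. The function names below are hypothetical; the commit itself uses `scipy.stats.gamma`, for which the equivalent call would be `gamma.sf(t, a=stages, scale=1/lam)`.

```python
import math


def exp_standby_sf(t, n_units, k, rate):
    """Survival of a k-out-of-N cold standby with identical Exp(rate) units.

    Assumes perfect switching, so the lifetime is Erlang(N - k + 1, k * rate).
    """
    stages = n_units - k + 1  # number of failures the system can absorb, plus one
    lam = k * rate            # failure rate while k units are operating
    return math.exp(-lam * t) * sum(
        (lam * t) ** i / math.factorial(i) for i in range(stages)
    )


def exp_standby_mean(n_units, k, rate):
    """Mean of Erlang(N - k + 1, k * rate)."""
    return (n_units - k + 1) / (k * rate)


cold_pair = exp_standby_sf(1.0, n_units=2, k=1, rate=1.0)   # e^(-1) * (1 + 1)
all_needed = exp_standby_sf(2.0, n_units=3, k=3, rate=0.5)  # min of 3 exponentials
```

When k = N there are no spares, so the formula reduces to a single stage with rate N × rate, which is the minimum-of-exponentials limit mentioned in the tests.
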
@derrynknife merged commit 28e7be8 into master on Jun 17, 2026
1 check failed