feat: infer R argument parity (med, prop_test options, weighted sampling, …) #19

Merged
ismayc merged 1 commit into main from feat/infer-arg-parity
Jun 20, 2026

@ismayc ismayc commented Jun 20, 2026

Member

Second of the two argument-parity PRs (companion to #18). It closes the remaining infer argument gaps. Every function touched here already existed, so the changes add arguments rather than new functions.

  • hypothesize(med=) / observe(med=) — median point null (bootstrap shift centered on the median).
  • prop_test(z=, correct=, conf_int=, conf_level=) — now mirrors R's prop.test: chi-square statistic by default (with a chisq_df column), z=True for the signed z, Yates continuity correction, and a confidence interval (Wilson score for one proportion; Wald + correction for a two-proportion difference). Validated numerically against R (X-squared, p-value, and CI all match).
  • rep_slice_sample(prop=, weight_by=) and rep_sample_n(prob=) — fractional and weighted sampling.
  • generate(variables=) — choose which column the permutation shuffles.
  • shade_p_value/shade_confidence_interval(fill=) and visualize(dens_color=); shade_p_value now also honors color (it was previously ignored — a latent bug, now fixed).
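
As a rough illustration of the one-proportion path described in the prop_test bullet above, the sketch below reproduces the arithmetic of R's prop.test (chi-square statistic, Yates correction, continuity-corrected Wilson interval) using only the standard library. It is not the moderndive implementation: the function name `one_prop_test`, its parameter names, and the `lower_ci`/`upper_ci` keys are assumptions made for the example.

```python
import math
from statistics import NormalDist


def one_prop_test(x, n, p0=0.5, correct=True, conf_level=0.95):
    """Hypothetical helper: one-proportion test in the style of R's prop.test."""
    p_hat = x / n
    # Yates correction, capped so it never overshoots the observed gap.
    yates = min(0.5, abs(x - n * p0)) if correct else 0.0
    chisq = (abs(x - n * p0) - yates) ** 2 / (n * p0 * (1 - p0))
    # Upper tail of a chi-square with 1 df, via the complementary error function.
    p_value = math.erfc(math.sqrt(chisq / 2))

    # Wilson score interval (continuity-corrected when correct=True).
    z = NormalDist().inv_cdf((1 + conf_level) / 2)
    z22n = z * z / (2 * n)

    def bound(sign):
        # sign=+1 gives the upper limit, sign=-1 the lower limit.
        pc = p_hat + sign * yates / n
        if sign > 0 and pc >= 1:
            return 1.0
        if sign < 0 and pc <= 0:
            return 0.0
        half = z * math.sqrt(pc * (1 - pc) / n + z22n / (2 * n))
        return (pc + z22n + sign * half) / (1 + 2 * z22n)

    return {
        "statistic": chisq,
        "chisq_df": 1,
        "p_value": p_value,
        "lower_ci": max(bound(-1), 0.0),
        "upper_ci": min(bound(+1), 1.0),
    }
```

With 15 successes in 50 trials and p0 = 0.5, the corrected statistic works out to (|15 - 25| - 0.5)² / 12.5 = 7.22, and dropping the correction gives 100 / 12.5 = 8.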

Tests cover every new path, with prop_test and the correlation methods asserted against R-computed reference values. 296 tests, 100% coverage, ruff clean.

Behavior change: prop_test now defaults to the chi-square statistic (R parity) rather than a z; pass z=True for the old z-statistic.
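
For anyone comparing old and new output, the following sketch shows how the two statistics relate in the two-proportion case: without a continuity correction, the chi-square statistic is the square of the pooled z, and the Yates correction only shrinks it. The helper names are hypothetical, and whether the library applies the correction when z=True is not stated in this PR, so the z here is uncorrected by assumption.

```python
import math


def two_prop_z(x1, n1, x2, n2):
    """Signed pooled z statistic for p1 - p2 (no continuity correction)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def two_prop_chisq(x1, n1, x2, n2, correct=True):
    """Pearson chi-square on the 2x2 table, optionally Yates-corrected."""
    pooled = (x1 + x2) / (n1 + n2)
    stat = 0.0
    for x, n in ((x1, n1), (x2, n2)):
        # Each group contributes a success cell and a failure cell.
        for observed, expected in ((x, n * pooled), (n - x, n * (1 - pooled))):
            gap = abs(observed - expected)
            yates = min(0.5, gap) if correct else 0.0
            stat += (gap - yates) ** 2 / expected
    return stat
```

The z keeps the direction of the difference (negative when the first proportion is smaller), which the chi-square statistic discards.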

Merge note: #18 and this PR both append to CHANGELOG.md; the second to merge will need a one-line conflict resolution (I'll handle it).

🤖 Generated with Claude Code

- hypothesize/observe(med=): median point null (bootstrap shift on the median).
- prop_test(z=, correct=, conf_int=, conf_level=): now mirrors R's prop.test —
  chi-square statistic by default (z= for the signed z), Yates continuity
  correction, and a confidence interval (Wilson score for one proportion;
  Wald + correction for a two-proportion difference). Validated against R.
- rep_slice_sample(prop=, weight_by=) and rep_sample_n(prob=): fractional and
  weighted sampling.
- generate(variables=): choose which column the permutation shuffles.
- shade_p_value/shade_confidence_interval(fill=) and visualize(dens_color=);
  shade_p_value now also honors color (previously ignored).
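
The "bootstrap shift on the median" idea from the first item can be sketched as follows: shift the sample so its median equals the hypothesized value, then resample with replacement and record each replicate's median. This is a minimal standard-library sketch under that reading, not the package's code; `bootstrap_median_null` and its parameters are hypothetical names.

```python
import random
import statistics


def bootstrap_median_null(sample, med, reps=1000, seed=0):
    """Hypothetical sketch: bootstrap medians under H0: median == med."""
    rng = random.Random(seed)
    # Shift the data so its sample median equals the hypothesized value.
    shift = med - statistics.median(sample)
    shifted = [value + shift for value in sample]
    # Resample with replacement and record each replicate's median.
    return [
        statistics.median(rng.choices(shifted, k=len(shifted)))
        for _ in range(reps)
    ]
```

The observed median can then be compared against this null distribution to obtain a p-value.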

Tests for every new path, with prop_test/correlation asserted against R values.
296 tests, 100% coverage, ruff clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_017CTL1QSTg1DmDUpqYuPEog

codecov Bot commented Jun 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (82c3afb) to head (a456fa2).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files

@@            Coverage Diff            @@
##              main       #19   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           22        22           
  Lines         1422      1492   +70     
=========================================
+ Hits          1422      1492   +70     
| Files with missing lines | Coverage Δ |
|---|---|
| moderndive/infer/core.py | 100.00% <100.00%> (ø) |
| moderndive/infer/viz/__init__.py | 100.00% <100.00%> (ø) |
| moderndive/infer/viz/_plotly.py | 100.00% <100.00%> (ø) |
| moderndive/infer/viz/_plotnine.py | 100.00% <100.00%> (ø) |
| moderndive/infer/wrappers.py | 100.00% <100.00%> (ø) |
| moderndive/sampling.py | 100.00% <100.00%> (ø) |

@ismayc ismayc merged commit b4944a0 into main Jun 20, 2026
8 checks passed
@ismayc ismayc deleted the feat/infer-arg-parity branch June 20, 2026 20:57
ismayc added a commit that referenced this pull request Jun 20, 2026
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_017CTL1QSTg1DmDUpqYuPEog
ismayc added a commit that referenced this pull request Jun 20, 2026
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_017CTL1QSTg1DmDUpqYuPEog
ismayc added a commit that referenced this pull request Jun 20, 2026
…18)

* feat: moderndive R argument parity (correlation methods, newdata, etc.)

- get_correlation(method="spearman"|"kendall", na_rm=): rank correlations via
  scipy; na_rm toggles per-pair null dropping.
- get_regression_points(newdata=, ID=): predict on a held-out frame (residual
  included when the outcome is present) and use a column as the identifier.
- get_regression_table(default_categorical_levels=True): keep raw statsmodels
  factor-level term names instead of the prettified "var: level" form.
- gg_parallel_slopes(alpha=): point transparency (both engines).
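
To make the rank-correlation item concrete, here is a small standard-library sketch of Spearman's coefficient: rank both variables (ties get the average rank), then take the Pearson correlation of the ranks. The commit states that the package uses scipy, so this is only an illustration of the technique; the function names are hypothetical, and returning NaN when na_rm is false and a value is missing is an assumption modeled on R's behavior.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y, na_rm=False):
    """Spearman correlation; na_rm drops any pair with a missing value."""
    pairs = list(zip(x, y))
    if na_rm:
        pairs = [(a, b) for a, b in pairs if a is not None and b is not None]
    elif any(a is None or b is None for a, b in pairs):
        return float("nan")  # assumption: missing data propagates, as in R
    xs, ys = zip(*pairs)
    return pearson(average_ranks(xs), average_ranks(ys))
```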

Docs (correlation methods + newdata prediction) and CHANGELOG updated. 291 tests,
100% coverage, ruff clean, docs build -W clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_017CTL1QSTg1DmDUpqYuPEog

* docs: rebuild HTML after merging main (#19)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_017CTL1QSTg1DmDUpqYuPEog

* docs: align _build with main to avoid generated-file merge conflicts

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_017CTL1QSTg1DmDUpqYuPEog

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
ismayc added a commit that referenced this pull request Jun 20, 2026
#20)

* docs+test: NEWS.md, R-compatibility notes, and R-reference value tests

- NEWS.md summarizing the current state under 0.1.0.
- doc/r-compatibility.md documenting the deliberate choices made to match R
  results that differ from "typical" Python (pop_sd ddof=0, regression-summary
  mse denominator, prop_test chi-square+Yates+Wilson CI, get_p_value two-sided
  convention, UTC datetimes, derived early_january_2023_weather, plotly default,
  correlation na_rm default, etc.).
- tests/test_r_reference.py: cross-package checks asserting the Python results
  equal R's exact values on a fixed dataset (get_regression_table/summaries,
  get_correlation, pop_sd, t_test). New-argument R validations live with their
  feature tests (#18/#19).
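
The pop_sd note above comes down to the divisor used for the variance. A minimal standard-library illustration of the two conventions (the data are made up for the example):

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]

# Population SD: divide the summed squared deviations by n (ddof=0).
pop_sd = statistics.pstdev(values)

# Sample SD: divide by n - 1 (ddof=1), which is what pandas uses by default.
sample_sd = statistics.stdev(values)
```

Here the squared deviations from the mean of 5 sum to 32, so the population SD is sqrt(32 / 8) = 2, while the sample SD is sqrt(32 / 7), which is slightly larger.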

279 tests, 100% coverage, ruff clean, docs build -W clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_017CTL1QSTg1DmDUpqYuPEog

* docs: rebuild HTML after merging main (#19)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_017CTL1QSTg1DmDUpqYuPEog

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>