Ngmix v2.0 (CI mirror of #740)#741
Conversation
…to ngmix_update
- bin/ scripts were untracked, causing Docker build to fail - Fix license field to use SPDX string format (MIT) to resolve SetuptoolsDeprecationWarning Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The column now stores ngmix 2.x's nfev (solver function-evaluation count, ~tens-hundreds, -1 on some failures), not the v1 1-5 retry count; the old name misrepresented the value. No downstream consumer reads it: make_cat's _save_ngmix_data never touches it, and the only ntry matches in sp_validation are base64 image blobs in notebook outputs.
copyfile was orphaned by the resume-path removal; Tile_cat's size/e/ theta attributes were read from the catalog but never consumed anywhere in src/ or scripts/. get_noise stays: scripts/jupyter/ test_centroid_shift.py imports and calls it.
Review — part 3 (fresh pass)Provenance: same convention as parts 1–2 — this is a fresh full-diff pass by Claude working on candide, against head b2dcd79. Every finding below was empirically demonstrated before being fixed, each fix carries a regression test confirmed red on the unfixed code, and the chain is now pushed to this branch ( Blockers (2)Both live in the v2.0 rewrite's tile flow, and both survive a green suite and an easy-object smoke run — which is exactly why they hid: one sits in a path no test exercised, the other only fires when a fit fails. The fixes restore what we read as the v1 contract, but they deserve your sanity-check of intent, Martin.
Should-fixes (pushed; please confirm intent)
Noted, not changed
Branch state, empiricallyWith the chain applied the container suite is 270 passed / 1 failed — the one failure is the known — Claude on behalf of Cail |
|
My response to the review: Blockers:
Should-fixes
Noted
|
uv.lock conflict resolved by regenerating from the merged pyproject (uv lock): branch's ngmix>=2.4 / esheldon v2.4.0 pin preserved, dependabot lockfile-group bumps from #742 absorbed. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Same latent FILE_PATTERN/FILE_EXT length mismatch 3eb6a66 fixed elsewhere: the 3-entry FILE_PATTERN override fell back on the decorator's new 4-entry FILE_EXT default, tripping the length check at startup. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
Replies point by point: Blockers — thanks for the empirical confirmation on the TILE_ID truncation; matches what we traced this afternoon (the cap is mcal_flags — intent confirmed, nothing further to do. Config files —
Incoming on this branch: a GalSim noise-injection validation of the background-RMS inverse-variance weights (per this afternoon's discussion) — known profile + known per-pixel noise, accuracy and pull-calibration checks, with a binary-weights control under spatially-varying noise so the test fails if the weight wiring were wrong. Numbers will be posted here when it lands. — Claude (Fable), on behalf of Cail |
…ducer run_sp_MiViSmVi was deleted with its config (ff612b7); the surviving background_rms vignet producers run as run_sp_tile_PiViVi (matching config_tile_Ng_batch_psfex_uc.ini). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The rms branch drew the metacal noise image at a scalar median(rms) while the weights claim per-pixel variance; under a strong RMS gradient (factor 8 across a 48px stamp) fixnoise then floods the low-noise half and the ivar weights measure WORSE than binary weights through metacal (noshear shape scatter 0.116 vs 0.078 over 50 paired realisations). Drawing the noise image from the rms map itself restores the advantage (0.053 vs 0.078); constant maps are bit-identical to the old behaviour. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Known Gaussian galaxies with noise of known per-pixel variance run through the module's own observation/weight building and fitter stack (do_ngmix_metacal's runner construction is factored into make_runners so the test fits with literally the module's configuration). Teeth: mean reduced chi2 = 1.00 only if the weights are the true inverse variance (probed breakages land at 3.0-1.5e5; ngmix's chi2-rescaled covariance makes pulls blind to this), and under a factor-8 RMS gradient the ivar weights must beat MegaPipe-style binary weights on paired realisations (scatter ratio 0.59 direct, 0.68 through metacal; inverted or transposed maps push it past 1.6). Accuracy and pull calibration asserted per case. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…pe, pull-tol derivation) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
The GalSim noise-injection validation has landed ( The finding. In What the test pins (N=50 seeded realizations per case, true g=(+0.03,−0.02), Gaussian galaxy × Gaussian PSF, observations built by the module's own
The teeth — injected wiring bugs, all caught: ×4 normalization → χ²/dof 4.0; 1/rms instead of 1/rms² → 13.2; inverted weights → 1.5×10⁵ (and scatter ratio > 1.6); transposed rms map → 13.9. One subtlety worth knowing: pull distributions alone can be laundered (ngmix rescales the parameter covariance by reduced χ²), so the absolute-normalization assertion rides on χ²/dof — the one statistic that can't hide a mis-scaled weight map. Honest boundaries: the test enters at 11 tests, ~35 s; full suite 274 passed, no regressions. This file is also the natural home for the r50/T semantics regression checks going forward. — Claude (Fable), on behalf of Cail |
|
One more finding from tonight's star-validation investigation (prep for tomorrow's session with Fabian), orthogonal to the weight work but real: the module's PSF-model fit is prior-dominated. PSF observations are built with unit-flux stamps and no weight map, so the likelihood is much weaker than the prior — a PSF drawn with g1 = 0.025 fits to g1 = 9×10⁻⁵ (HSM confirms the stamp itself carries g1 = 0.0241; removing the prior recovers g = (0.0241, −0.0144)). Consequences: the Possibly also relevant to Monday's m-bias discussion: the 2023 star-test failure (#656, still open) links to esheldon/ngmix#72 — metacal m ≈ −0.2 when the jacobian departs from a simple pixel scale. The WCS-passing question and the m-bias question may be the same thread; we're checking the WCS-delivery hypothesis against Fabian's run artifacts tomorrow. — Claude (Fable), on behalf of Cail |
Importing mpi4py initializes MPI at import time, which aborts the whole process when Open MPI detects a SLURM step environment but no PMI server — i.e. inside any srun-launched shell on a cluster whose container OMPI lacks SLURM PMI support. Even shapepipe_run -h dies before printing (#744; empirically bisected to SLURM_STEP_ID being set). Gate the import on launcher-set env vars (OMPI_COMM_WORLD_SIZE for mpirun, PMI_RANK for srun --mpi=pmi2, PMIX_RANK for srun --mpi=pmix). A bare shapepipe_run never touches MPI and runs SMP as before; mpirun launches are unchanged. Verified on candide: bare run under srun now exits 0 (was OPAL abort), mpirun -n 1 and -n 2 paths intact. Behavior note: a config with MODE = MPI launched without any MPI launcher now falls back to SMP instead of running single-rank MPI. Closes #744 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The else-branch of the mpi4py dependency line appended the whole list to itself (harmless only because DependencyHandler dedups); with the launcher gate this became the common path. And the regression test now surfaces the subprocess stderr on failure instead of swallowing it in a bare CalledProcessError. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Gate mpi4py import on MPI launcher environment so bare shapepipe_run inside an srun shell no longer aborts in MPI_Init. Same commits as PR #747; folded here so the fix ships with the ngmix v2.0 line.
The exclusive input-ID flag and the NUMBER_LIST config option converged in FileHandler._format_process_list and did the same thing for a single ID; NUMBER_LIST is now the one mechanism. Pipeline: - remove -e/--exclusive from args.py and its plumbing through run.py, FileHandler, and JobHandler (where it was stored but never used) - NUMBER_LIST entries are now validated against the input file numbers found on disk, preserving -e's early failure on a wrong ID: the run aborts at start-up instead of when a module first opens files - unit tests for the validation (subset passes, typo raises, no-list scan path unchanged) Canfar chain (script-level -e options are unchanged; one ID per headless job remains the interface): - job_sp_canfar.bash, job_sp_canfar_v2.0.bash, and init_run_exclusive_canfar.sh write NUMBER_LIST into a per-job config copy (set_config_number_list: insert-or-replace under [FILE], ID in numbering-scheme form: leading dash, dots->dashes) instead of passing -e to shapepipe_run. Side benefit: the processed ID is recorded in the config copied to the run's log directory. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Review hardening: set_config_number_list now verifies the key landed in the config copy and aborts otherwise — a config with no NUMBER_LIST line and no bare [FILE] header would previously install an unmodified copy and silently process every image. Also fix init_run_exclusive_canfar.sh's missing-ID guard, which tested an unset variable and never fired. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…exclusive Replace -e/--exclusive flag with NUMBER_LIST (#746)
This could be a feature - with |
Consolidate the test suite around one discovery root with three suite-level
tiers under tests/, keeping module-unit tests next to the code in
src/shapepipe/tests/. The root conftest.py is the single source of markers,
candide detection, and the off-cluster skip policy, so the same suite is green
on a laptop, in CI, and on the cluster.
Markers (declared in pyproject.toml, --strict-markers on):
slow heavy compute; excluded from the fast inner loop
candide needs the cluster and/or its real data; auto-skipped elsewhere
Guardrails:
- tests/science/test_mbias.py (fast, local): inject g1=0.02 through
make_ngmix_observation / do_ngmix_metacal, build the per-object metacal
response R11, assert |m| < 5e-3 in the ideal limit. Recovers g1=0.019964
(m=-1.8e-3).
- tests/cluster/test_star_shear_response.py (slow + candide): faithful
reproduction of Fabian's R-function. Reads ngmix metacal catalogs
(HDU [2]=1p [1]=1m [4]=2p [3]=2m), mask 20<mag<26,
R1=(g1_1p-g1_1m)/0.02, R2=(g2_2p-g2_2m)/0.02, 100-group jackknife;
asserts <R1>,<R2> within 0 +/- 0.03. Parametrized outputs base_dir;
the SLURM regeneration chain is wired (tests/helpers/cluster.py) but
opt-in only, never launched by the suite.
Helpers: tests/helpers/{star_response,cluster,artifacts}.py.
Artifacts seam: the star-response test emits a plot + status JSON + markdown
to tests/_artifacts/ (on pass and fail) for a later GitHub Pages step. Emit
only; the deploy is intentionally not built here.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The suite lived in two trees — top-level `tests/` (unit/science/cluster/ helpers) and `src/shapepipe/tests/` (the per-module tests). Fold the module tests into `tests/module/` so there is a single canonical location, discovered from one `testpaths` root. - Relocate the 12 `src/shapepipe/tests/*.py` modules to `tests/module/` (pure moves — their imports are already absolute `from shapepipe...`, so no source edits and no packaging change were needed; nothing referenced `shapepipe.tests`). - Make `tests/` a package (`tests/__init__.py`, `tests/module/__init__.py`). The cluster guardrail imports shared code as `tests.helpers.*`; that only resolves once `tests` itself is importable, which also removes a latent collection error in the harness branch. - `pyproject.toml`: `testpaths = ["tests"]` (was `["tests", "src/shapepipe/tests"]`). - Document the single layout in `tests/README.md`; fix the stale two-tree description in `docs/source/testing.md` and the project `CLAUDE.md`. Full suite collects from one root with no errors; the fast tier (`-m "not slow and not candide"`) is green (276 passed, 5 skipped). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The galaxy Jacobian origin sets where the centroid prior sits, so it must land on the object. Two ways to place it, now selectable per run: - `hsm` (DEFAULT, unchanged behavior): re-center on the HSM adaptive-moment centroid measured from the stamp. Robust for galaxies — it follows the light, so the centroid prior does not bias an object offset from the stamp center. - `wcs` (Fabian's): place the origin at the catalog sky position projected through the WCS to a sub-pixel offset, with no shape measurement. Better for stars, whose HSM moments are too noisy to re-center on. Wiring: - `make_ngmix_observation` gains `centroid_source` (+ `wcs_full`/`ra`/`dec` for the wcs branch); the hsm branch is the prior code verbatim. - `do_ngmix_metacal`, `Ngmix.__init__`, and `Ngmix.process` thread it through; `Postage_stamp` carries the per-epoch WCS + ra/dec, filled in `prepare_postage_stamps` (used only by the wcs branch). - `ngmix_runner` reads the optional `CENTROID_SOURCE` ini key (default `hsm`); documented in `example/cfis/config_tile_Ng_template.ini`. The star-response guardrail declares `CENTROID_SOURCE = wcs` and threads it into the regeneration chain (`SP_CENTROID_SOURCE`), so the star grid is built with the star-appropriate centroid. Default path verified unchanged: 26 ngmix/science tests green; the wcs branch reproduces the WCS-projected sub-pixel offset on a synthetic exposure. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add `.github/workflows/fast-tests.yml`: the quick inner-loop gate that runs the `tests/` suite minus the heavy and cluster tiers — `pytest -m "not slow and not candide"` — on every push and pull request. It complements `deploy-image.yml` (which builds the dev image and runs the *full* suite inside it) by skipping the Docker build, so it returns in minutes. Matches the repo's conventions: SHA-pinned actions (covered by the existing github-actions dependabot policy), Python 3.12, and the environment reproduced from `uv.lock` via `uv sync --frozen --extra test` — the same pinned set a developer's `.venv` carries, including the pinned ngmix git source. Tests that need the astromatic binaries self-skip via `shutil.which`, so they don't run here (the dev-image CI exercises them). Concurrency-cancels superseded runs. Verified the exact invocation locally: 276 passed, 5 skipped, 2 deselected. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
What this is
A same-repo mirror of #740 (@martinkilbinger's "Ngmix v2.0"), pushed to a branch on
CosmoStat/shapepipeso that CI actually runs. All 57 commits are authored by Martin (and carry Lucy Baumont's and Axel Guinot's work) — pushing the branch preserves that authorship unchanged; the only thing this PR adds is a same-repo head so GitHub Actions fires thepull_requestworkflow without the fork-PR approval gate. #740 received no CI runs at all for this reason.Substance is identical to #740 — see that PR for the full description. In short: upgrade ngmix to 2.4.0 and adopt Lucy's new ngmix classes/interface; overhaul the shape-measurement module; centroid-bias fix + validation; v2.0 production-run plumbing.
Going forward, opening PRs directly on
CosmoStat/shapepipe(rather than from a fork) avoids this — fork PRs don't trigger our Docker-image CI without a maintainer approval that wasn't happening.Closes/supersedes #740 once CI is green (leaving that call to Martin).
Review
A detailed review is on its way (read against Martin's checklist plus a science-quality pass). Headline from exercising the new fitter against real ngmix 2.4.0 on candide: the metacal path runs end-to-end and shear recovery is unbiased at the few×10⁻⁴ level in m. Full notes to follow.
— Claude on behalf of Cail