Fix spurious negative auto_loan_interest from SCF -1 sentinels #1171

Open
MaxGhenis wants to merge 1 commit into main from fix-scf-auto-loan-interest-sentinel
@MaxGhenis

Contributor

Problem

add_auto_loan_interest cleans the SCF "not applicable" sentinels with:

auto_df[AUTO_LOAN_COLUMNS].replace(-1, 0, inplace=True)

That runs on a column-slice copy, so it is a silent no-op: the -1 sentinels survive into the per-car balance * rate product. A real loan balance times a -1 rate sentinel (after the /10,000 scaling) yields a small negative interest. For example, a $90k loan with an unreported rate becomes 90,000 * (-1 / 10,000) = -$9 of interest.
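The no-op can be reproduced in a few lines. This is a minimal sketch, not the repository's code: the column names `balance_1` and `rate_1` are hypothetical stand-ins for whatever `AUTO_LOAN_COLUMNS` actually contains, and the warning filter is only there because some pandas versions warn about in-place calls on a temporary object.

```python
import warnings

import pandas as pd

# Hypothetical column names; the real AUTO_LOAN_COLUMNS list is not shown here.
AUTO_LOAN_COLUMNS = ["balance_1", "rate_1"]

# One household: a real $90k balance with an unreported (-1) rate.
auto_df = pd.DataFrame({"balance_1": [90_000], "rate_1": [-1]})

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    # Selecting with a list of columns builds a new DataFrame, so the
    # in-place replace edits that temporary object and leaves auto_df alone.
    auto_df[AUTO_LOAN_COLUMNS].replace(-1, 0, inplace=True)

# The sentinel is still present in the original frame.
sentinels_remaining = int((auto_df[AUTO_LOAN_COLUMNS] == -1).sum().sum())

# Balance times the surviving sentinel, with the /10,000 rate scaling.
interest = auto_df["balance_1"] * auto_df["rate_1"] / 10_000

print(sentinels_remaining, interest.iloc[0])
```

Assigning the result back (for example `auto_df[AUTO_LOAN_COLUMNS] = auto_df[AUTO_LOAN_COLUMNS].replace(-1, 0)`) would have made the original cleanup take effect for the -1 code specifically.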

This is invisible inside policyengine-us-data but propagates: populace-build's SCF wealth stage imputes SCF_2022().load_dataset() onto households, so 844 of 75,112 populace_us_2024 households carry impossible negative auto_loan_interest. Surfaced by the PolicyBench populace refresh, whose prompts then showed models facts like auto loan interest: -$4.50.

Fix

Extract the balance/interest computation into _summarize_auto_loans and floor the raw balance and rate columns at zero with clip(lower=0) before combining. clip returns a new object whose result is assigned back, so the cleaned values are the ones actually used (unlike the old in-place call), and the floor is robust to any negative sentinel code, not just -1.
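The flooring approach can be sketched as follows. This is an illustration under stated assumptions, not the actual `_summarize_auto_loans`: the function name `summarize_auto_loans_sketch`, its signature, the column names, and the output column `auto_loan_balance` are all hypothetical. Only the `clip(lower=0)` floor and the /10,000 rate scaling come from the description above.

```python
import pandas as pd


def summarize_auto_loans_sketch(auto_df, balance_cols, rate_cols):
    """Hypothetical stand-in for _summarize_auto_loans (signature assumed)."""
    # Floor at zero: any negative sentinel code becomes 0, and the result
    # is bound to a new name rather than mutated on a temporary copy.
    balances = auto_df[balance_cols].clip(lower=0)
    rates = auto_df[rate_cols].clip(lower=0) / 10_000  # scaling per the PR

    # Per-car balance * rate, paired by position, summed per household.
    interest = (balances.to_numpy() * rates.to_numpy()).sum(axis=1)

    return pd.DataFrame(
        {
            "auto_loan_balance": balances.sum(axis=1),
            "auto_loan_interest": interest,
        },
        index=auto_df.index,
    )


# Illustrative rows: unreported rate, no loans at all, and one real loan
# at a raw rate of 400 (4% after scaling) plus one loan with no rate.
sample = pd.DataFrame(
    {
        "bal_1": [90_000, -1, 20_000],
        "rate_1": [-1, -1, 400],
        "bal_2": [-1, -1, 10_000],
        "rate_2": [-1, -1, -1],
    }
)
summary = summarize_auto_loans_sketch(
    sample, ["bal_1", "bal_2"], ["rate_1", "rate_2"]
)
print(summary)
```

In this sketch an unreported rate contributes zero interest rather than a negative amount, while the reported 4% loan still produces its full positive interest.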

Positive magnitudes are unchanged: the /10,000 rate scaling is preserved and the real median interest/balance ratio stays ~0.04 (a sane ~4% APR). Adds a focused unit test covering both the sentinel-balance and unreported-rate cases.

🤖 Generated with Claude Code

add_auto_loan_interest cleaned the SCF 'not applicable' sentinels with
auto_df[AUTO_LOAN_COLUMNS].replace(-1, 0, inplace=True), which runs on a
column-slice copy and is a silent no-op. The -1 sentinels therefore survived
into the per-car balance*rate product: a real loan balance times a -1 rate
sentinel (after the /10,000 scaling) yields a small negative interest, so a
$90k loan with an unreported rate became -$9 of interest.

Extract the balance/interest computation into _summarize_auto_loans and floor
the raw balance and rate columns at zero (clip(lower=0)) before combining —
which also assigns back, unlike the old in-place call, and is robust to any
negative sentinel code. Positive magnitudes are unchanged (rate /10,000 kept;
median interest/balance stays ~0.04). Adds a unit test.

Surfaced by the PolicyBench populace refresh: 844 of 75,112 populace_us_2024
households carried impossible negative auto_loan_interest traced to this.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>