
Revert US certified dataset to Enhanced CPS #392

Draft
MaxGhenis wants to merge 1 commit into main from claude/revert-mp-ecps-restore-ecps


@MaxGhenis
Contributor

Fixes #391

Summary

Reverts the certified US dataset back to production Enhanced CPS. v4.14.1 (#390) promoted the microplex MP 2cdd45d artifact to the certified US default, but on a correctly-scored target surface eCPS decisively beats MP — and MP is materially worse on the income components that drive a tax model.

| metric (lower is better) | eCPS | MP 2cdd45d |
| --- | --- | --- |
| full PE-native loss | 0.02662 | 0.04275 |
| holdout loss | 0.00769 | 0.00938 |
| unweighted MSRE | 0.174 | 0.290 |
| target wins (of 2809) | 2334 | 267 (208 ties) |
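
For readers unfamiliar with the last two rows, here is a minimal sketch of how an unweighted MSRE and a per-target win count could be computed. It assumes MSRE means the mean of squared relative errors and that a "win" is a strictly smaller absolute relative error on a target; the actual scoring code may define both differently. The function names, the tie tolerance, and the toy numbers are all hypothetical.

```python
from typing import Sequence, Tuple


def msre(estimates: Sequence[float], targets: Sequence[float]) -> float:
    """Unweighted mean squared relative error (assumed definition)."""
    errors = [((e - t) / t) ** 2 for e, t in zip(estimates, targets)]
    return sum(errors) / len(errors)


def target_wins(
    a: Sequence[float],
    b: Sequence[float],
    targets: Sequence[float],
    tol: float = 1e-12,  # hypothetical tie tolerance
) -> Tuple[int, int, int]:
    """Count targets where `a` is closer, `b` is closer, or the two tie."""
    wins_a = wins_b = ties = 0
    for est_a, est_b, t in zip(a, b, targets):
        err_a = abs(est_a - t) / abs(t)
        err_b = abs(est_b - t) / abs(t)
        if abs(err_a - err_b) <= tol:
            ties += 1
        elif err_a < err_b:
            wins_a += 1
        else:
            wins_b += 1
    return wins_a, wins_b, ties


# Toy numbers for illustration only, not taken from either dataset.
targets = [100.0, 200.0, 50.0]
dataset_a = [110.0, 190.0, 50.0]
dataset_b = [150.0, 190.0, 40.0]

print(msre(dataset_a, targets))
print(target_wins(dataset_a, dataset_b, targets))
```

Wins and ties partition the target set, which is why the last row of the table sums to 2809 (2334 + 267 + 208).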

MP is 14–49× worse on interest, retirement, dividends, capital gains, wages, and self-employment, and only better on SNAP. Verified by two independent symmetric-refit comparisons agreeing to the digit. The promotion cleared a content baseline gate; the numeric baseline-sanity gate would have failed it because eCPS was mis-scored at MSRE 3.464 on the promotion surface (e.g. unemployment +615%).
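
To make the gate argument concrete, here is a rough sketch of a numeric baseline-sanity check. It is not the project's gate: the function name, both thresholds, and the input shape are assumptions chosen for illustration. The idea is only that an incumbent scoring this badly should stop the comparison before any promotion decision.

```python
from typing import Dict, List


def baseline_sanity_failures(
    baseline_msre: float,
    relative_errors: Dict[str, float],
    max_msre: float = 1.0,           # hypothetical threshold
    max_abs_rel_error: float = 2.0,  # hypothetical threshold (200%)
) -> List[str]:
    """Return reasons the baseline score looks implausible (empty list = pass)."""
    failures = []
    if baseline_msre > max_msre:
        failures.append(f"baseline MSRE {baseline_msre} exceeds {max_msre}")
    for name, err in relative_errors.items():
        if abs(err) > max_abs_rel_error:
            failures.append(f"{name} is off by {err:+.0%}")
    return failures


# Promotion surface: eCPS at MSRE 3.464, unemployment at +615%.
promotion_surface = baseline_sanity_failures(3.464, {"unemployment": 6.15})
# Corrected surface: eCPS at MSRE 0.174 (no per-target errors supplied here).
corrected_surface = baseline_sanity_failures(0.174, {})

print(promotion_surface)
print(corrected_surface)
```

Under these assumed thresholds the promotion-surface inputs fail on both checks, while the corrected-surface MSRE passes.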

Change

Produced with the supported refresh path, not by hand-editing the manifest:

scripts/refresh_release_bundle.py --country us --model-version 1.715.2 \
  --data-version 1.115.5 --release-manifest-path release_manifest.json \
  --release-manifest-revision d47fb5475144260a75467d2f2e22b2d5d53d4d57

Restores:

  • release_manifests/us.json + us.trace.tro.jsonld → enhanced_cps_2024 (policyengine-us-data 1.115.5, rev d47fb547, dataset sha 0a6b961a), model 1.715.2, legacy_compatible_model_package
  • pyproject.toml policyengine-us pins → 1.715.2
  • tests/test_models.py + tests/test_release_manifests.py → reverts the MP expectations from #390 (Update US release manifest to MP eCPS artifact)
  • uv.lock

us.json is byte-identical to the pre-#390 eCPS manifest except the policyengine_version/bundle_id stamp, which CI restamps on release.
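
A reviewer who wants to check this claim could compare the two manifests with the stamp fields removed. The sketch below assumes the stamp lives in keys named `policyengine_version` and `bundle_id` (taken from the sentence above) and makes no assumption about where they sit in the file. It compares parsed JSON, which is weaker than byte-identity because it ignores whitespace and key order. The toy manifests and their other fields are hypothetical.

```python
import json

# Keys treated as the release stamp; names taken from the PR text.
STAMP_KEYS = {"policyengine_version", "bundle_id"}


def strip_stamps(node):
    """Recursively drop stamp keys from a parsed JSON structure."""
    if isinstance(node, dict):
        return {k: strip_stamps(v) for k, v in node.items() if k not in STAMP_KEYS}
    if isinstance(node, list):
        return [strip_stamps(v) for v in node]
    return node


def manifests_match(old_text: str, new_text: str) -> bool:
    """True when two manifests are equal after removing stamp keys."""
    return strip_stamps(json.loads(old_text)) == strip_stamps(json.loads(new_text))


# Hypothetical toy manifests; the real us.json has a different shape.
old = '{"bundle_id": "a", "policyengine_version": "1", "dataset": {"name": "enhanced_cps_2024"}}'
new = '{"bundle_id": "b", "policyengine_version": "2", "dataset": {"name": "enhanced_cps_2024"}}'
changed = '{"bundle_id": "b", "policyengine_version": "2", "dataset": {"name": "other"}}'

print(manifests_match(old, new))
print(manifests_match(old, changed))
```

In practice the two inputs would be the pre-#390 `us.json` from git history and the file on this branch.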

Testing

  • test_release_manifests, test_trace_tro, test_bundle_refresh, test_import_policyengine_bundle: 98 passed
  • test_models: 35 passed
  • test_household_calculator_snapshot: 10 passed — no drift from the 1.715.3→1.715.2 model revert
  • make format / make lint: clean

MP remains a research track, to be re-promoted only once it wins on a correctly-scored surface (after the CPS-passthrough income-component fix). HF revision a091769a is left in place; nothing points at it once this merges.

🤖 Generated with Claude Code

The microplex MP 2cdd45d artifact promoted as the certified US dataset in
v4.14.1 (#390) loses to production eCPS on a correctly-scored target surface:
full PE-native loss 0.0428 vs 0.0266, holdout 0.0094 vs 0.0077, eCPS wins
2334 of 2809 targets, and MP is 14-49x worse on every income component
(interest, retirement, dividends, capital gains, wages). The promotion was
decided on a surface that mis-scored eCPS (unweighted MSRE 3.464).

Restore enhanced_cps_2024 (policyengine-us-data 1.115.5, policyengine-us
1.715.2) via the supported scripts/refresh_release_bundle.py path.

Fixes #391

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

Revert US certified dataset to Enhanced CPS (MP 2cdd45d loses clean-surface comparison)