docs: add utility doctest examples #804
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main     #804      +/-   ##
==========================================
+ Coverage   88.79%   88.91%   +0.11%
==========================================
  Files          89       89
  Lines        5060     5060
  Branches      646      646
==========================================
+ Hits         4493     4499       +6
+ Misses        423      417       -6
  Partials      144      144
Please pull main and incorporate the recent changes.
0a7c2f9 to 9175ae7
    .. testcode::

    clrd = cl.load_sample("clrd").groupby("LOB").sum().iloc[:2]

This test demonstrates that concatenating identical columns does nothing, which doesn't match the example text.
    def minimum(x1, x2):
        """Element-wise minimum of two triangles (delegates to ``Triangle.minimum``).

        Examples

We need a more basic docstring before the doctest. What is x1? What is x2?
        Examples
        --------
        Cap a triangle cell-by-cell by comparing it with another triangle of limits.

Are we certain this is true? Can x2 be a scalar?
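To make the scalar question concrete, here is a minimal pure-Python sketch of an element-wise minimum that accepts either a same-length sequence or a single number for the second argument. `elementwise_minimum` is a hypothetical helper written for illustration; it is not `cl.minimum` or `Triangle.minimum`, and it says nothing about how chainladder handles scalars internally.

```python
from numbers import Real


def elementwise_minimum(x1, x2):
    """Hypothetical stand-in: cap each cell of x1 by x2 (scalar or same-length list)."""
    if isinstance(x2, Real):
        # Broadcast the scalar so every cell of x1 is compared with the same limit.
        x2 = [x2] * len(x1)
    if len(x1) != len(x2):
        raise ValueError("x1 and x2 must have the same length")
    return [min(a, b) for a, b in zip(x1, x2)]


losses = [120.0, 80.0, 200.0]
limits = [100.0, 100.0, 150.0]

capped_by_list = elementwise_minimum(losses, limits)   # per-cell limits
capped_by_scalar = elementwise_minimum(losses, 100.0)  # one limit for all cells
```

A docstring that states both accepted forms for x2 up front would let the doctest show either case without surprising the reader.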
    def read_json(json_str, array_backend=None):
        """Deserialize JSON produced by ``to_json`` (triangle, estimator, or pipeline).

        Examples

This example feels empty without showing the actual JSON string. Please follow the example from pandas.
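As a rough illustration of the "show the string first" style requested here, the sketch below prints a serialized JSON string before reading it back. It uses only the standard-library `json` module, and the `spec` dictionary and its keys are assumptions made up for the example; the real `to_json` output of a chainladder object will look different.

```python
import json

# Hypothetical estimator description; the keys are illustrative only.
spec = {"name": "Development", "params": {"average": "simple", "n_periods": 4}}

# Serialize with sorted keys so the printed string is deterministic.
json_str = json.dumps(spec, sort_keys=True)
print(json_str)
# {"name": "Development", "params": {"average": "simple", "n_periods": 4}}

# Deserialize and confirm that nothing was lost in the round trip.
restored = json.loads(json_str)
print(restored == spec)
# True
```

Seeing the string makes it obvious what the reader is expected to pass to the deserializer.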
    print(round(float(by_dev.ldf_.values[0, 0, 0, 0]), 6))
    print(round(float(by_both.ldf_.values[0, 0, 0, 0]), 6))

    .. testoutput::

Should we be showing all the numbers?
…henrydingliu
- read_pickle: show a fitted Development estimator round-trip via pickle and verify that transform works after restore
- read_json: show a full Pipeline serialization round-trip with step names and params
- concat: show a paid+incurred column join that enables MunichAdjustment directly
- minimum: compare volume vs simple CL ultimates and pick the element-wise lower value for a low-side scenario
- maximum: same comparison, picking the element-wise higher value for a high-side scenario
- PatsyFormula: clarify when to use a custom DevelopmentML pipeline vs TweedieGLM; show ldf_ output instead of the coefficient count
    import chainladder as cl

    tri = cl.load_sample("raa")
    dev = cl.Development(average="volume").fit(tri)

To demonstrate that to_pickle actually does something, we should use non-default parameters, something like average="simple" and n_periods=4.
    dev.to_pickle(p)
    restored = cl.read_pickle(p)
    os.remove(p)
    print(restored.transform(tri).ldf_.values[0, 0, 0, :4].round(4))

Can we print the full ldf_ from both the original and the restored estimators?
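The two pickle comments above boil down to one idea: a round trip is only convincing if the restored object is shown next to the original and differs from a default-configured one. A minimal sketch of that pattern with the standard library follows. The `SimpleNamespace` objects are hypothetical stand-ins for estimators, and the "default" values are illustrative; nothing here calls chainladder.

```python
import os
import pickle
import tempfile
from types import SimpleNamespace

# Hypothetical stand-ins for estimators; these are not chainladder objects.
default = SimpleNamespace(average="volume", n_periods=-1)
original = SimpleNamespace(average="simple", n_periods=4)  # non-default settings

# Write to a temporary directory so the file is cleaned up automatically.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "estimator.pkl")
    with open(path, "wb") as f:
        pickle.dump(original, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# Print the full state of both objects side by side, not a slice of it.
print(vars(original))
print(vars(restored))
```

Because the settings differ from the defaults, equality between `restored` and `original` cannot be explained by a freshly constructed default object.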
    combined = cl.concat([paid, incurred], axis=1)
    adj = cl.MunichAdjustment(paid_to_incurred=("CumPaidLoss", "IncurLoss"))
    result = adj.fit_transform(combined)
    print(result.ldf_["CumPaidLoss"].values[0, 0, 0, :4].round(4))

Good use case for concat. Can we focus the test output on concat only?
@EKtheSage are you interested in finishing up this PR?
- read_pickle: use non-default params (average=simple, n_periods=4), print ldf_ from both the original and restored estimators, and call .transform() on the restored estimator to prove it is still functional
- read_json: show the full serialized JSON string before round-tripping, following pandas docstring style
- concat: remove the MunichAdjustment output; focus on the concat result only by printing combined.columns
- minimum/maximum: add prose descriptions for the x1 and x2 parameters, confirming x2 can be a scalar
- maximum: trim testoutput to show only the high_side result
@henrydingliu thanks for the detailed review. All comments have been addressed in the latest commit. Summary below:
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 13c3d48.
@EKtheSage I don't think all the edits came through on this one either.
…um test
- read_pickle: print the full ldf_ from the original, restored, and transformed estimators instead of the first four factors
- PatsyFormula: print the full ldf_ patterns for both GLM formulations and for the custom DevelopmentML pipeline
- test_maximum: assert the element-wise maximum directly instead of the NaN-tolerant inequalities flagged in review
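The exact assertions flagged in review are not shown in this thread, so the following is only a sketch of why inequality-style checks can be weak when cells are NaN. Every comparison against NaN is false, so a check phrased as "not less than" passes even for a result that is entirely NaN, while a direct comparison against stated expected values does not. The helper names and the `expected` values are assumptions for illustration.

```python
import math

a = [1.0, math.nan, 3.0]
b = [2.0, 5.0, math.nan]

# A deliberately wrong "maximum" that returns NaN for every cell.
wrong = [math.nan, math.nan, math.nan]


def passes_inequalities(result, x1, x2):
    """Inequality-style check; NaN cells can never fail it."""
    return all(not (r < p) and not (r < q) for r, p, q in zip(result, x1, x2))


def matches_expected(result, expected):
    """Direct check; a NaN cell only matches another NaN cell."""
    return all(
        (math.isnan(r) and math.isnan(e)) or r == e
        for r, e in zip(result, expected)
    )


# Assumed expected values for this sketch: NaN wherever either input is NaN.
expected = [2.0, math.nan, math.nan]

print(passes_inequalities(wrong, a, b))   # True: the weak check is fooled
print(matches_expected(wrong, expected))  # False: the direct check is not
```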
@henrydingliu thanks for checking. The earlier commit did land, but two of your comments were implemented incorrectly in it: the read_pickle example printed only the first four factors instead of the full |
    assert not missing, f"sdist is missing sample CSVs: {sorted(missing)}"


    def test_load_sample_uspp() -> None:

We did a sizable refactor of load_sample a couple of weeks ago. Can you please review the current load_sample test(s) here and check whether this new test is redundant?
Everything looks great. One small nitpick on a potentially redundant test.

Summary: Add Sphinx doctest examples for the PatsyFormula utility docs. Split from the larger #792 work; intentionally excludes .github/workflows/sync-main-to-docs.yml. Refs #704
Note
Low Risk
Documentation and test-only changes; utility implementations are unchanged aside from expanded docstrings.
Overview
Adds Sphinx doctest examples to utility API docstrings in utility_functions.py: read_pickle, read_json, concat, minimum, maximum, and PatsyFormula (including TweedieGLM / DevelopmentML workflows). Each block uses .. testsetup::, .. testcode::, and .. testoutput:: so docs can execute and verify the snippets.

Tests in test_utilities.py back related behavior: pickle round-trip for a fitted Development (test_to_pickle_read_pickle), cl.maximum vs NumPy (test_maximum), and Friedland USPP sample keys loading with the expected three value columns (test_load_sample_uspp).

Reviewed by Cursor Bugbot for commit f23fe84. Bugbot is set up for automated code reviews on this repo.
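For readers unfamiliar with the three-directive layout named in the Overview, a docstring using it might look roughly like the sketch below. The `double` function and the `mymodule` import are hypothetical and exist only to show the shape; the directives themselves come from the `sphinx.ext.doctest` extension, which has to be enabled in the Sphinx configuration for the block to be executed.

```python
def double(x):
    """Return ``x`` multiplied by two.

    Examples
    --------
    .. testsetup::

        from mymodule import double  # hypothetical module name

    .. testcode::

        print(double(21))

    .. testoutput::

        42
    """
    return 2 * x
```

The setup block runs but is hidden in the rendered docs, the code block is both shown and executed, and the output block is compared against what the code actually prints.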