
docs: add workflow doctest examples #803

Open
EKtheSage wants to merge 4 commits into casact:main from EKtheSage:docs/704-workflow-examples

@EKtheSage commented May 16, 2026

Contributor

Summary: Add Sphinx doctest examples for GridSearch, Pipeline, and VotingChainladder workflow docs. This PR is split from the larger #792 work and intentionally excludes .github/workflows/sync-main-to-docs.yml. Refs #704


Note

Low Risk
Documentation and doctest-only changes with no runtime or API behavior modifications.

Overview
Adds Sphinx doctest Examples sections to workflow API docs so narrative and executable snippets appear in the built reference.

GridSearch gets a med-mal example that grids dev__average (simple vs volume) and uses a custom scoring callable to compare sigma_ by development age, plus prose on scale and interpretability.

Pipeline gets a CLRD example contrasting Development(groupby="LOB") vs standalone development and printing IBNR by line, with a short note on unstable standalone othliab IBNR. A small Attributes docstring formatting fix closes the named_steps description properly.

VotingChainladder extends the existing RAA example with actuarial framing for accident-year-specific weights, and adds a second doctest that omits weights and uses default_weighting=(2, 1, 1) to blend all three estimators every year.

No changes to fit/predict behavior—only docstrings and doctest blocks (refs #704, split from #792).
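
To make the `default_weighting=(2, 1, 1)` idea above concrete, here is a minimal pure-Python sketch of blending three estimators' ultimates with fixed weights. It is an illustration only: the ultimates are made-up numbers, `blend` is a hypothetical helper, and normalizing by the sum of the weights is an assumption about how the blend works rather than something this thread confirms about VotingChainladder.

```python
# Hypothetical ultimates from three estimators for two accident years.
# These values are illustrative and are not output from chainladder.
ultimates = {
    1989: [20000.0, 21000.0, 19000.0],
    1990: [22000.0, 20000.0, 24000.0],
}

# Same weights for every accident year: the first estimator counts double.
weights = (2, 1, 1)


def blend(values, weights):
    """Weighted average of the estimator ultimates (assumed normalization)."""
    total_weight = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total_weight


blended = {year: blend(values, weights) for year, values in ultimates.items()}

for year, value in blended.items():
    print(year, value)
```

With per-year weights, the same helper would simply be called with a different `weights` tuple for each accident year.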

Reviewed by Cursor Bugbot for commit b98388e.

codecov Bot commented May 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.79%. Comparing base (b24f209) to head (b98388e).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #803   +/-   ##
=======================================
  Coverage   88.79%   88.79%           
=======================================
  Files          89       89           
  Lines        5060     5060           
  Branches      646      646           
=======================================
  Hits         4493     4493           
  Misses        423      423           
  Partials      144      144           
Flag Coverage Δ
unittests 88.79% <ø> (ø)


Comment thread chainladder/workflow/gridsearch.py Outdated
)
param_grid = {"benk__n_iters": [1, 4]}
scoring = {
"IBNR": lambda m: float(np.nansum(m.named_steps.benk.ibnr_.values))

Collaborator

could you please change this to sigma? we should look like we understand the basics of machine learning in the docstring.

Comment thread chainladder/workflow/gridsearch.py Outdated
)
pipe.set_params(dev__average="volume")
ib_volume = int(
round(float(np.nansum(pipe.fit_predict(tri).ibnr_.values)), 0)

Collaborator

do we need np here?

@EKtheSage force-pushed the docs/704-workflow-examples branch from 28e6914 to b3dd40f on May 17, 2026 04:20
Comment thread chainladder/workflow/voting.py Outdated

import numpy as np

raa = cl.load_sample("raa")

Collaborator

a lot of duplicate code from previous example. hide under testsetup?

Comment thread chainladder/workflow/voting.py Outdated

.. testoutput::

19694.23

Collaborator

confusing to the user to see one example with full vector of ultimates, followed by another example that only shows a couple

Comment thread chainladder/workflow/voting.py Outdated
1989 20004.502125
1990 21605.832631

``weights`` and ``default_weighting`` change how sub-model ultimates are

Collaborator

narratively this doesn't build on the first example

Comment thread chainladder/workflow/gridsearch.py Outdated
.. testoutput::

2
1.422

Collaborator

@kennethshsu does this feel like a bug to you? going from simple avg to volume weighted somehow introduced such a gargantuan increase in sigma.

Collaborator

I looked into this, and I don’t think this is a bug. Most of the sigma_ difference appears to be driven by the 12-24 factor.

Since the LDFs are shifting over time as we move down the origin years, the volume-weighted factors are being pulled toward the more recent origin years. That, in turn, is driving the larger sigma_ values relative to the older origin years.

That said, I’m not entirely sure what this example is intended to demonstrate. Also, summing the sigma_ values does not really seem meaningful here. I think it would make more sense to compare the underlying arrays directly.
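
As a rough illustration of the two points above (volume weighting being pulled toward the larger, more recent origin years, and comparing arrays rather than a single sum), here is a small pure-Python sketch. The triangle values are hypothetical, and the per-origin deviations are only a stand-in for intuition: this does not reproduce chainladder's `sigma_` calculation.

```python
# Hypothetical cumulative losses at 12 and 24 months, oldest origin year first.
# Volume grows over time and the 12-24 link ratio drifts upward.
at_12 = [100.0, 200.0, 400.0, 800.0]
at_24 = [150.0, 320.0, 720.0, 1600.0]

# Individual 12-24 link ratios, one per origin year.
link_ratios = [later / earlier for earlier, later in zip(at_12, at_24)]

# Simple average: every origin year counts equally.
simple_factor = sum(link_ratios) / len(link_ratios)

# Volume-weighted average: larger origin years dominate.
volume_factor = sum(at_24) / sum(at_12)

# Compare the per-origin deviations as arrays instead of summing them.
simple_dev = [round(lr - simple_factor, 3) for lr in link_ratios]
volume_dev = [round(lr - volume_factor, 3) for lr in link_ratios]

print("link ratios:", [round(lr, 3) for lr in link_ratios])
print("simple:", round(simple_factor, 3), simple_dev)
print("volume:", round(volume_factor, 3), volume_dev)
```

In this toy data the volume-weighted factor sits closer to the most recent origin year's ratio, so the older origin years show the larger deviations from it.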

Comment thread chainladder/workflow/gridsearch.py Outdated

Examples
--------
Use ``Pipeline`` when the same triangle should pass through several

Collaborator

this example doesn't motivate why pipeline is useful. said in another way, pipeline is overkill for a straightforward chainladder(development) on a single triangle.

one instructive narrative line would be to actually compare the groupby pipeline from the user guide to a pipeline without groupby

@kennethshsu

Collaborator

@EKtheSage are you interested in finishing up this PR?

Ethan Kang added 2 commits June 11, 2026 14:49
GridSearch: score the full per-age sigma_ vector instead of a summed
sigma so candidates are compared on the underlying arrays, and explain
why the simple and volume rows sit on different scales.

Pipeline: motivate the estimator by comparing the user guide groupby
pipeline against the same pipeline without groupby, showing pooled
line-of-business patterns versus unstable standalone company patterns.

VotingChainladder: give the first example an actuary-facing narrative,
make the second example build on the first by reusing its estimators
and apriori, and print the full ultimate vector instead of two scalars.
@EKtheSage

Contributor Author

@henrydingliu updated per your comments and @kennethshsu's. GridSearch now scores the full per-age sigma_ vector so candidates are compared on the underlying arrays, with an explanation of why the two averaging methods sit on different scales. The Pipeline example now motivates the estimator by comparing the user guide groupby pipeline against the same pipeline without groupby. The voting examples now build on each other and print full ultimate vectors. Main is merged in as well.


.. testoutput::

simple [1.163, 0.102, 0.057, 0.038, 0.026, 0.016, 0.007, 0.01, 0.003]

Collaborator

the difference in magnitude of these sigmas is not intuitive to me. i've opened a separate issue #952 to address it (or for someone to talk some sense into me). if you are interested/have time, please help me figure out whether there's an actual bug or not. otherwise, would you mind changing this example to showcase gridsearch through a different statistic?

the same pipeline without ``groupby`` shows why that pooling matters:
standalone patterns are fit to each company's own thin data.

.. testsetup::

Collaborator

this is a powerful example. but is it actually showing people how to use pipeline? i wonder if it makes more sense to be included instead under Development to showcase groupby.

For pipeline, i think it would be more instructive to demonstrate a workflow where a development, a tail, and an ibnr method are chained together

3 participants