fix: discard cached gram matrix when refitting CostRbf/CostCosine #373

Open
gaoflow wants to merge 2 commits into deepcharles:master from gaoflow:fix/372-refit-stale-gram

@gaoflow gaoflow commented Jun 9, 2026

Closes #372.

CostRbf and CostCosine compute their Gram matrix lazily and cache it in self._gram, but fit() never invalidates the cache. Detection estimators create the cost object once in __init__ and call cost.fit(signal) on every fit, so refitting the same estimator on a different signal silently reuses the previous signal's Gram matrix:

import ruptures as rpt

# s1 and s2 are two different signals
m = rpt.Pelt(model="rbf")
m.fit(s1).predict(pen=3)  # [50, 100]
m.fit(s2).predict(pen=3)  # [50, 100]  <- stale; a fresh instance gives [30, 100]
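
The mechanism can be reproduced without ruptures. The following is a minimal sketch, not the library code: `StaleGramCost` is a hypothetical stand-in that uses a plain product kernel on 1-D lists, and only the lazy `_gram` cache mirrors what is described above.

```python
class StaleGramCost:
    """Toy stand-in for a kernel cost; not the ruptures implementation."""

    def __init__(self):
        self.signal = None
        self._gram = None  # lazily computed cache

    def fit(self, signal):
        # Bug being illustrated: the cache is not reset here.
        self.signal = list(signal)
        return self

    @property
    def gram(self):
        # Computed on first access, then reused on every later access.
        if self._gram is None:
            self._gram = [[x * y for y in self.signal] for x in self.signal]
        return self._gram


cost = StaleGramCost()
first = cost.fit([1.0, 2.0]).gram
second = cost.fit([3.0, 4.0]).gram  # still the matrix built from [1.0, 2.0]
fresh = StaleGramCost().fit([3.0, 4.0]).gram  # what the second fit should give
```

Here `second` is the very same object as `first`, while `fresh` differs, which is the refit-versus-fresh-instance discrepancy shown in the Pelt example.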

For CostRbf there is a second layer: when gamma=None, the median-heuristic gamma is derived inside the gram property and stays pinned to the first signal, so resetting the cache alone is not enough.

fit() now discards _gram and re-derives the heuristic gamma per signal, keeping a user-supplied gamma — the same pattern CostMl already uses with has_custom_metric.
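
As a rough illustration of that pattern (a sketch under assumptions, not the actual diff): `ToyRbfCost` and its `has_custom_gamma` flag are hypothetical names, the signal is a 1-D list, and the median heuristic is assumed to be the reciprocal of the median pairwise squared distance.

```python
import math
from statistics import median


class ToyRbfCost:
    """Toy RBF cost showing the reset-on-fit pattern; not the ruptures code."""

    def __init__(self, gamma=None):
        self.gamma = gamma
        # Hypothetical flag: remember whether the caller chose gamma.
        self.has_custom_gamma = gamma is not None
        self.signal = None
        self._gram = None

    def fit(self, signal):
        self.signal = list(signal)
        self._gram = None  # discard the matrix cached for a previous signal
        if not self.has_custom_gamma:
            self.gamma = None  # force the heuristic to be re-derived
        return self

    def _sq_dists(self):
        return [[(x - y) ** 2 for y in self.signal] for x in self.signal]

    @property
    def gram(self):
        if self._gram is None:
            dists = self._sq_dists()
            if self.gamma is None:
                # Assumed heuristic: 1 / median of pairwise squared distances.
                n = len(self.signal)
                pairs = [dists[i][j] for i in range(n) for j in range(i + 1, n)]
                med = median(pairs)
                self.gamma = 1.0 / med if med != 0 else 1.0
            self._gram = [[math.exp(-self.gamma * d) for d in row] for row in dists]
        return self._gram


cost = ToyRbfCost()
cost.fit([0.0, 1.0, 2.0]).gram
gamma_first = cost.gamma
cost.fit([0.0, 10.0, 20.0]).gram
gamma_second = cost.gamma  # re-derived from the second signal

pinned = ToyRbfCost(gamma=0.5)
pinned.fit([0.0, 1.0, 2.0]).gram
pinned.fit([0.0, 10.0, 20.0]).gram  # pinned.gamma is still 0.5
```

Resetting `gamma` in `fit()` rather than in the `gram` property keeps the lazy computation intact: nothing is computed until the matrix is first requested, but each fit starts from a clean state.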

Tests: a parametrized refit-equals-fresh-fit check over all cost models, a gamma re-derivation/persistence test, and an estimator-level check for rbf/cosine (5 of these fail without the fix). Note: #369 touches costrbf.py for memory optimization but does not address this bug; its fit() also keeps the cached state.
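
The shape of the refit-equals-fresh-fit check is roughly the following. This is a sketch only: `refit_matches_fresh`, `CachedSum`, and `run` are hypothetical names used for illustration and are not the tests added in this PR.

```python
def refit_matches_fresh(make_model, run, s1, s2):
    """Return True if fitting on s1 then s2 matches a fresh fit on s2."""
    reused = make_model()
    run(reused, s1)  # the first fit may populate internal caches
    refit_result = run(reused, s2)
    fresh_result = run(make_model(), s2)
    return refit_result == fresh_result


class CachedSum:
    """Toy model with a lazy cache; reset_on_fit toggles the fix."""

    def __init__(self, reset_on_fit):
        self.reset_on_fit = reset_on_fit
        self.signal = None
        self._total = None

    def fit(self, signal):
        self.signal = list(signal)
        if self.reset_on_fit:
            self._total = None  # invalidate the cache for the new signal
        return self

    def predict(self):
        if self._total is None:
            self._total = sum(self.signal)
        return self._total


def run(model, signal):
    return model.fit(signal).predict()


buggy_ok = refit_matches_fresh(lambda: CachedSum(False), run, [1, 2], [3, 4])
fixed_ok = refit_matches_fresh(lambda: CachedSum(True), run, [1, 2], [3, 4])
```

With the toy model, `buggy_ok` is `False` and `fixed_ok` is `True`. Against the estimators, the same helper could presumably be driven with `lambda: rpt.Pelt(model="rbf")` and a `run` that calls `m.fit(s).predict(pen=3)`, as in the example above.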

CostRbf and CostCosine compute their gram matrix lazily and cache it in
self._gram, but fit() never invalidated the cache. Since detection
estimators create the cost object once and call cost.fit(signal) on
every fit, refitting the same estimator on a different signal silently
reused the previous signal's gram matrix and returned its breakpoints.
For CostRbf the median-heuristic gamma stayed pinned to the first
signal as well.

Reset the cache in fit() and re-derive the heuristic gamma per signal,
keeping a user-supplied gamma (same pattern as CostMl.has_custom_metric).

Closes deepcharles#372

github-actions bot added the Type: Fix Bug or Bug fixes label Jun 9, 2026
Linked issue that merging this pull request may close:

Instanciating once a Pelt model and fitting it with different signals yields same result but not when instanciating the model for each fit