fix: discard cached gram matrix when refitting CostRbf/CostCosine#373
Open
gaoflow wants to merge 2 commits into
Open
fix: discard cached gram matrix when refitting CostRbf/CostCosine#373gaoflow wants to merge 2 commits into
gaoflow wants to merge 2 commits into
Conversation
CostRbf and CostCosine compute their gram matrix lazily and cache it in self._gram, but fit() never invalidated the cache. Since detection estimators create the cost object once and call cost.fit(signal) on every fit, refitting the same estimator on a different signal silently reused the previous signal's gram matrix and returned its breakpoints. For CostRbf the median-heuristic gamma stayed pinned to the first signal as well. Reset the cache in fit() and re-derive the heuristic gamma per signal, keeping a user-supplied gamma (same pattern as CostMl.has_custom_metric). Closes deepcharles#372
for more information, see https://pre-commit.ci
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #372.
CostRbfandCostCosinecompute their Gram matrix lazily and cache it inself._gram, butfit()never invalidates the cache. Detection estimators create the cost object once in__init__and callcost.fit(signal)on every fit, so refitting the same estimator on a different signal silently reuses the previous signal's Gram matrix:For
CostRbfthere is a second layer: whengamma=None, the median-heuristic gamma is derived inside thegramproperty and stays pinned to the first signal, so resetting the cache alone is not enough.fit()now discards_gramand re-derives the heuristic gamma per signal, keeping a user-supplied gamma — the same patternCostMlalready uses withhas_custom_metric.Tests: a parametrized refit-equals-fresh-fit check over all cost models, a gamma re-derivation/persistence test, and an estimator-level check for
rbf/cosine(5 of these fail without the fix). Note: #369 touchescostrbf.pyfor memory optimization but does not address this — itsfitalso keeps the cached state.