feat(metric): add Symmetric Relevance Gain (SRG)#374
Draft
adrhill wants to merge 5 commits into
Implements the SRG faithfulness metric (Blücher et al., TMLR 2024): the area between the LIF and MIF pixel-flipping curves computed from a shared feature ordering. The random-ordering baseline cancels in the difference, making rankings robust to the occlusion strategy.

- `quantus.SymmetricRelevanceGain` evaluates both curves with one concatenated forward pass per occlusion step, plus a torch-resident fast path that keeps perturbed inputs on-device
- `n_steps` parameter for coarse stepping; the baseline imputer is computed once per batch from the unperturbed input
- registered in `AVAILABLE_METRICS`, docs page added, fixture-based tests and invariant tests (sign-flip antisymmetry, shared curve endpoints, torch path vs. numpy path equivalence)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
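The area-between-curves definition in this commit can be sketched for a single flattened sample as follows. This is a minimal illustration, not the Quantus implementation: `flipping_curve`, `auc`, and `symmetric_relevance_gain` are hypothetical names, the model is a plain callable, and batching, the concatenated forward pass, and the torch fast path are all left out.

```python
import numpy as np


def flipping_curve(model, x, order, baseline, features_in_step=1):
    """Model output after occluding features in `order`, one step at a time."""
    x_perturbed = x.copy()
    scores = [model(x_perturbed)]  # unperturbed starting point
    for start in range(0, len(order), features_in_step):
        idx = order[start:start + features_in_step]
        x_perturbed[idx] = baseline[idx]  # copy values from the baseline
        scores.append(model(x_perturbed))
    return np.asarray(scores, dtype=float)


def auc(curve):
    """Trapezoidal area under the curve over a unit-length occlusion axis."""
    dx = 1.0 / (len(curve) - 1)
    return float(np.sum((curve[1:] + curve[:-1]) * 0.5 * dx))


def symmetric_relevance_gain(model, x, attribution, baseline, features_in_step=1):
    """AUC(LIF) - AUC(MIF), with both curves derived from one shared ordering."""
    mif_order = np.argsort(-attribution, kind="stable")  # highest attribution first
    lif_order = mif_order[::-1]                          # same ordering, reversed
    mif = flipping_curve(model, x, mif_order, baseline, features_in_step)
    lif = flipping_curve(model, x, lif_order, baseline, features_in_step)
    return auc(lif) - auc(mif)


# Toy example: a linear model whose exact attribution is w * x.
w = np.ones(4)
x = np.array([1.0, 2.0, 3.0, 4.0])
model = lambda v: float(w @ v)
print(symmetric_relevance_gain(model, x, w * x, np.zeros(4)))     # 2.5
print(symmetric_relevance_gain(model, x, -(w * x), np.zeros(4)))  # -2.5
```

Negating the attribution swaps the two orderings, which is why the score changes sign. Both curves also start at the unperturbed output and end at the fully occluded one, which is the shared-endpoint invariant the tests check.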
Review follow-ups for the SRG metric:

- declare TensorFlow support in `model_applicability`: the numpy path only uses the framework-agnostic `ModelInterface` API
- make the constant imputer the single perturbation contract: `perturb_func` is applied once per batch to the unperturbed input and all occlusion steps copy from that snapshot, so passing the default function explicitly now matches `perturb_func=None` and the torch fast path works with any perturbation function (regression test added)
- replace `n_steps` with the conventional `features_in_step` knob and assert against `x_batch.shape[2:]` like `PixelFlipping`
- drop the `last_mif_curves`/`last_lif_curves` accessors and per-call curve storage so `get_params()` only reports configuration
- add the `warn_perturbation_caused_no_change` check, document the "random" baseline option, drop the redundant `int` from the `perturb_baseline` annotation

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
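The snapshot contract described in this commit could look roughly like the sketch below. It assumes inputs already flattened to shape `(batch, features)`. `occlusion_steps` and `noisy_baseline` are hypothetical helpers for illustration, not Quantus functions.

```python
import numpy as np


def noisy_baseline(arr, rng):
    """Hypothetical stochastic perturbation: replace every value with noise."""
    return rng.normal(size=arr.shape)


def occlusion_steps(x_batch, orders, perturb, features_in_step):
    """Yield perturbed copies of the batch, one per occlusion step.

    `perturb` is called exactly once, on the unperturbed input. Every step
    then copies values from that snapshot instead of perturbing again.
    """
    snapshot = perturb(x_batch)  # single draw per batch
    x_perturbed = x_batch.copy()
    rows = np.arange(x_batch.shape[0])[:, None]
    n_features = x_batch.shape[1]
    for start in range(0, n_features, features_in_step):
        idx = orders[:, start:start + features_in_step]
        x_perturbed[rows, idx] = snapshot[rows, idx]
        yield x_perturbed.copy()


rng = np.random.default_rng(0)
x_batch = np.arange(12.0).reshape(2, 6)
orders = np.tile(np.arange(6), (2, 1))  # occlude features left to right
steps = list(occlusion_steps(x_batch, orders, lambda a: noisy_baseline(a, rng), 2))
# Values written in the first step are not redrawn in later steps.
assert np.array_equal(steps[0][:, :2], steps[-1][:, :2])
```

Because the stochastic function runs once, a feature occluded early keeps the same replacement value for the rest of the curve.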
For inputs normalized to zero channel mean (standard ImageNet preprocessing), imputing zeros exactly reproduces the paper's channel-wise data set mean imputer, whereas the previous default `perturb_baseline="mean"` (per-sample mean over flattened features) only approximated it. Documented the normalization assumption and the alternatives for unnormalized inputs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
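The equivalence between zero imputation and the channel-wise data set mean can be checked numerically. The sketch below uses a small random array as a stand-in for a real data set and normalizes with the data's own channel statistics. That is an assumption made for illustration, since fixed preprocessing constants only approximate those statistics.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical raw data set: 8 images, 3 channels, 4x4 pixels.
raw = rng.uniform(0.0, 1.0, size=(8, 3, 4, 4))

channel_mean = raw.mean(axis=(0, 2, 3), keepdims=True)
channel_std = raw.std(axis=(0, 2, 3), keepdims=True)
normalized = (raw - channel_mean) / channel_std

# Imputing 0.0 in normalized space ...
imputed_normalized = np.zeros_like(normalized[0])
# ... maps back to the channel-wise data set mean in raw space.
imputed_raw = imputed_normalized * channel_std[0] + channel_mean[0]
assert np.allclose(imputed_raw, np.broadcast_to(channel_mean[0], imputed_raw.shape))

# The per-sample mean over all flattened features is one scalar per image.
# It is generally nonzero, so it only approximates the imputer above.
per_sample_mean = normalized.reshape(len(normalized), -1).mean(axis=1)
print(per_sample_mean)
```

For unnormalized inputs the zero baseline no longer corresponds to the channel means, which is why the docstring spells out the normalization assumption.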
- Drop "random" from the documented `perturb_baseline` options: it is deprecated and `get_baseline_value` rejects it with a `ValueError`.
- Replace the "array of channel means" advice: with the all-indices call pattern of `batch_baseline_replacement_by_indices`, an `np.ndarray` baseline must be 0-dimensional; also clarify that "mean" is the per-sample mean over all features, not the paper's channel-wise mean.
- Make the `abs` guidance method-conditional in the docstring and the parameterisation warning: signed ranking presumes the attribution's sign encodes evidence for/against the class (LRP, Shapley, IG); for sensitivity maps whose sign is a direction in color space (raw gradients), `abs=True` or channel aggregation is appropriate.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
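The effect of `abs` on the feature ordering can be seen in a small example. The attribution values are made up for illustration, and the sorting convention (descending, stable) is an assumption rather than the metric's documented behaviour.

```python
import numpy as np

# Hypothetical attribution with one strongly negative feature (index 1).
attribution = np.array([0.9, -0.8, 0.1, -0.05])

# Signed ranking: negative evidence ranks as least important.
signed_order = np.argsort(-attribution, kind="stable")
print(signed_order)  # [0 2 3 1]

# Magnitude ranking (abs=True): the same feature ranks second.
abs_order = np.argsort(-np.abs(attribution), kind="stable")
print(abs_order)  # [0 1 2 3]
```

Under the signed ordering, feature 1 is occluded last on the MIF curve and first on the LIF curve, which is only meaningful when a negative sign really denotes evidence against the class.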
The `warn_parameterisation` template hardcodes "is likely to be
sensitive to the choice of {sensitive_params}", so listing
`perturb_baseline` and `features_in_step` followed by "(SRG rankings
are designed to be robust to both)" rendered as a sentence that negated
itself. Headline `abs` as the genuinely sensitive parameter instead,
and state the robustness to occlusion-strategy choices — SRG's main
selling point — in its own sentence, qualified by the paper's finding
that absolute scores still vary across setups.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Description
SRG = AUC(LIF) − AUC(MIF). The random-ordering baseline cancels in the difference, which makes attribution rankings largely insensitive to the occlusion strategy (baseline value, step size).

Implemented changes

- `quantus.SymmetricRelevanceGain` in `quantus/metrics/faithfulness/symmetric_relevance_gain.py`, registered in `AVAILABLE_METRICS` and the faithfulness `__init__.py`
- Conventional parameters (`features_in_step`, `perturb_func`, `perturb_baseline`, ...); supports both PyTorch and TensorFlow models
- `perturb_func` is applied once per batch to the unperturbed input and every occlusion step copies values from that snapshot, so stochastic baselines are drawn once and results are independent of how the default perturbation function is passed
- `perturb_baseline=0.0` reproduces the paper's channel-wise data set mean imputer for inputs normalized to zero channel mean (e.g. standard ImageNet preprocessing); the docstring documents this assumption and the alternatives for unnormalized inputs
- `abs` docstring and parameterisation warning give method-conditional guidance: keep `abs=False` where the attribution's sign encodes evidence for/against the class (e.g. LRP, Shapley, IG); use `abs=True` for sensitivity maps whose sign reflects a direction in color space (e.g. raw gradients)
- Tests in `tests/metrics/test_faithfulness_metrics.py` (sign-flip antisymmetry, shared curve endpoints, torch-path vs. numpy-path equivalence, explicit default `perturb_func` equals `perturb_func=None`)
- Docs page under `docs/source/docs_api/`

CC @bluecher31 (paper author)
🤖 Generated with Claude Code