Skip to content

[REVIEW] secrets-management: fixture and classification conflict with placeholder filtering #3013

Description

@0xca1x

Skill Being Reviewed

Skill name: secrets-management
Skill path: skills/devsecops/secrets-management/

False Positive Analysis

The skill has a contradiction between its false-positive guidance and its current fixture expectations.

Benign/placeholder value that the skill says should not be flagged:

PAYMENT_API_KEY=sk_test_FAKE_DO_NOT_USE_1234567890
PAYMENT_API_URL=https://payments.example.test

This value appears in tests/fixtures/secrets-management/hardcoded-test-secret-vulnerable/settings.env and is expected as a critical CWE-798 finding in that fixture's manifest.yaml.

Why this is a false positive:
skills/devsecops/secrets-management/SKILL.md explicitly says placeholder/example values such as test, dummy, fake, and example should not be flagged as leaked secrets. The fixture value contains both sk_test and FAKE_DO_NOT_USE, so it is intentionally non-production-looking test data. Keeping it as the primary vulnerable fixture teaches the harness and contributors that placeholder Stripe-style test keys are critical findings, which conflicts with the skill's own filtering rule.

Coverage Gaps

Missed variant 1: Real-looking provider token without explicit fake/test markers

PAYMENT_API_KEY=sk_live_51N1xYzAbCdEfGhIjKlMnOpQrStUvWxYz1234567890abcdef

Why it should be caught:
A vulnerable fixture should use a high-entropy, non-placeholder provider-shaped value so the fixture exercises the intended detection behavior without contradicting the false-positive guidance.

Missed variant 2: Tooling-status finding classification ambiguity

Repository has no .gitleaks.toml, .secrets.baseline, TruffleHog config, or CI secret scan.

Why it should be caught or not caught consistently:
The skill currently says absence of detection tooling is not a numbered secrets finding and should be noted in the tooling table/recommendations, but later says "No secret detection tooling deployed is Critical." This makes the expected output inconsistent: one reviewer may include a critical finding while another may only add a recommendation.

Edge Cases

  • Stripe-style sk_test_* values can be real test-environment credentials in some repositories, but this fixture explicitly includes FAKE_DO_NOT_USE. The skill should distinguish provider test-mode secrets from obvious placeholders and document when test-mode keys are still sensitive.
  • Example files such as .env.example often include variable references like ${PAYMENT_API_KEY}; the existing benign fixture correctly covers this, but the vulnerable fixture should not undermine the placeholder filtering rule.

Remediation Quality

  • Fix resolves the vulnerability
  • Fix doesn't introduce new security issues
  • Fix doesn't break functionality
  • Issues found: The current remediation fixture removes an obvious fake placeholder, which is not a strong regression test for real secret handling.

Comparison to Other Tools

Tool Catches this? Notes
Gitleaks Partial Provider-shaped rules may match test-looking values, but allowlists commonly suppress obvious fake/example tokens.
TruffleHog Partial Verification/entropy and detector-specific validation reduce some placeholder findings.
detect-secrets Partial Baselines and allowlists are commonly used to separate known examples from true leaks.

Overall Assessment

Strengths:
The skill is well-structured and has good safety language: do not print secret values, separate architectural observations from actual secret findings, and require evidence-backed findings.

Needs improvement:
The regression fixture and the findings classification section conflict with the false-positive guidance, which can cause noisy reviews and inconsistent scoring.

Priority recommendations:

  1. Replace the vulnerable fixture value with a realistic high-entropy provider-shaped secret that does not include test, FAKE, DO_NOT_USE, example, or similar placeholder markers.
  2. Add a separate benign fixture for an obvious fake provider-shaped token, e.g. sk_test_FAKE_DO_NOT_USE_..., to lock in the false-positive guidance.
  3. Resolve the classification conflict by deciding whether "no detection tooling" is a numbered finding or a tooling-status/recommendation item, then update the Findings Classification section accordingly.

Bounty Info

  • I have read the CONTRIBUTING.md terms and noticed the paid bounty program is currently marked as paused.
  • Preferred payment method: Can provide if this review is accepted when bounties resume.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions