Add Prompt Shield to Code section #32

Open
mthamil107 wants to merge 1 commit into DeepSpaceHarbor:master from mthamil107:add-prompt-shield

@mthamil107

Adds Prompt Shield to the Code section.

Open-source prompt injection detection engine with novel cross-domain techniques (Smith-Waterman sequence alignment, stylometric discontinuity, adversarial fatigue tracking). 27 input detectors, 6 output scanners, 10 languages. Apache-2.0.
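
For readers unfamiliar with Smith-Waterman, here is a minimal sketch of local sequence alignment applied to token lists. It is a generic textbook version, not prompt-shield's implementation; the function name, the scoring parameters (`match`, `mismatch`, `gap`), and the example phrases are all assumptions made for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score between sequences a and b.

    Generic Smith-Waterman with a linear gap penalty. Scores are clamped
    at zero, so an alignment may start and end anywhere in either input.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # restart the alignment here
                prev[j - 1] + step,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # skip a token of a
                curr[j - 1] + gap,   # skip a token of b
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical example: a known attack phrase with filler words inserted.
known = "ignore all previous instructions".split()
text = "please ignore all of the previous instructions now".split()
score = smith_waterman(known, text)
```

Because gaps cost less than the matches earn, the padded phrase still scores well against the known pattern, which is the property that makes local alignment attractive for catching paraphrased or padded injections.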

Research paper: arXiv:2604.18248

donna-matt pushed a commit to donna-matt/Awesome-AI-Security that referenced this pull request May 17, 2026
…canner

Add Prism Scanner - AI agent security scanner
@mthamil107

Author

Checking in on this PR.

Since this PR was opened, prompt-shield has shipped two releases that may make it worth re-evaluating against the list's bar:

  • v0.5.0 (June 2026): 4 new input detectors (custom YAML rules, language enforcement, denied topics, multi-turn topic drift) and 3 new output scanners (sentiment, bias/fairness, hallucination/grounding) — bringing the total to 33 input detectors + 9 output scanners with 1040 tests passing. Also adds a Prometheus /metrics endpoint, a sliding-window rate limiter, an NFKC + homoglyph normalization pipeline, and a multi-encoding preprocessor (base64/hex/URL/HTML/ROT13). Full notes in the CHANGELOG.
  • v0.5.1 (June 2026): patch release for README/PyPI rendering.
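
To make the normalization and multi-encoding items above concrete, here is a rough sketch of what such a preprocessing step could look like using only the Python standard library. This is not prompt-shield's code: the function names, the returned dictionary layout, and the tiny `HOMOGLYPHS` table are assumptions for illustration, and a real homoglyph table would be much larger.

```python
import base64
import codecs
import html
import unicodedata
from urllib.parse import unquote

# Illustrative only: a handful of Cyrillic letters that look like Latin ones.
HOMOGLYPHS = str.maketrans({"а": "a", "е": "e", "о": "o", "р": "p", "с": "c"})


def normalize(text):
    """Apply NFKC compatibility normalization, then fold known homoglyphs."""
    return unicodedata.normalize("NFKC", text).translate(HOMOGLYPHS)


def candidate_decodings(text):
    """Return normalized views of text under several common encodings.

    Decodings that fail (invalid base64 or hex, non-UTF-8 bytes) are omitted.
    """
    views = {
        "raw": normalize(text),
        "url": normalize(unquote(text)),
        "html": normalize(html.unescape(text)),
        "rot13": normalize(codecs.decode(text, "rot13")),
    }
    stripped = text.strip()
    try:
        decoded = base64.b64decode(stripped, validate=True).decode("utf-8")
        views["base64"] = normalize(decoded)
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        pass
    try:
        views["hex"] = normalize(bytes.fromhex(stripped).decode("utf-8"))
    except ValueError:
        pass
    return views
```

Each view could then be passed to the same detectors, so a payload hidden behind an encoding is scanned in decoded form as well as in its raw form.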

The novel techniques are now formally documented as prior art with a Zenodo DOI: 10.5281/zenodo.20809165. Main paper remains at arXiv:2604.18248.

Happy to adjust the Code-section listing if there's anything specific you'd like changed before merging — just let me know. Thanks for maintaining this list.

@mthamil107

Author

Quick follow-up: prompt-shield just shipped v0.6.0 with what may be the angle that most strengthens this listing — the first open-source federated threat-intel feed for prompt-injection defense.

Companion repo: prompt-shield-signatures

  • 56 ed25519-signed attack signatures across 7 attack classes (direct + indirect injection, social engineering, multi-turn, multilingual, encoded, adversarial suffix)
  • CC0-licensed data, Apache 2.0 code
  • Pure-Python ed25519 verification (no minisign binary required at runtime)
  • Maintainer's signing key offline; never in CI
  • Client auto-fetches + verifies + falls back to local cache on outage
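
The fetch, verify, and cache-fallback flow in the last bullet can be sketched as follows. This is a hedged illustration rather than the actual client: `load_signatures`, `fetch`, `verify`, and the JSON payload format are hypothetical, and the ed25519 check is injected as a callable so the sketch stays independent of any particular crypto library.

```python
import json
from pathlib import Path


def load_signatures(fetch, verify, cache_path):
    """Load a signature feed, preferring a fresh verified copy.

    fetch()  -> (payload_bytes, signature_bytes); may raise OSError on outage.
    verify(payload, signature) -> bool; stands in for an ed25519 check.
    Both callables are hypothetical placeholders supplied by the caller.
    """
    cache = Path(cache_path)
    try:
        payload, signature = fetch()
        if not verify(payload, signature):
            raise ValueError("feed signature did not verify")
        cache.write_bytes(payload)  # only verified data ever reaches the cache
    except (OSError, ValueError):
        if not cache.exists():
            raise  # nothing trustworthy to fall back to
        payload = cache.read_bytes()
    return json.loads(payload)
```

One design choice in this sketch is that a failed signature check is treated like an outage: the client keeps using the last verified copy instead of loading unverified data.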

This is structurally different from competitors: Lakera, ProtectAI, and Cisco AI Defense keep their attack-pattern catalogs proprietary because the catalog is their business model. prompt-shield doesn't depend on selling intel, so we can ship it for free under CC0.

Also worth noting since the prior bump: prompt-shield now publishes coverage scores on 9 datasets / 9,150+ samples (8 public — Garak, InjecAgent, HarmBench, Liu/USENIX, deepset, NotInject, ablation set, PINT example) and just landed a MITRE ATLAS mapping (9/9 techniques covered).

The federated-feed angle alone is unique in this space — happy to add a line to the listing about it if helpful. Let me know if there's anything you'd like rephrased.
