WebExploitBench is a benchmark for evaluating real-world web exploitation against isolated targets. The complete dataset contains 15 self-contained web applications with 110 documented vulnerabilities. Each application includes the target environment, exploits, reports, and verifiers. WebExploitBench contains 0-day, 1-day, and synthetic vulnerabilities covering common web vulnerability classes such as SQL injection, server-side template injection (SSTI), and authorization bypass.
Important: This GitHub repository releases only a subset of WebExploitBench. The complete dataset is available on Hugging Face: https://huggingface.co/datasets/AgentCyberRange/WebExploitBench
WebExploitBench is distributed in two layers:
- GitHub (this repository): the tooling (scripts/targetctl, scripts/fetch), this README, and a few targets you can run right after cloning (comfyui, prestashop).
- Hugging Face: the full set of 15 targets.
Prerequisites: Docker Engine with Docker Compose v2, Git LFS, and
huggingface_hub (for scripts/fetch; pip install -U huggingface_hub
provides the hf CLI).
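The prerequisites above can be verified before fetching anything. The following is a minimal sketch, not part of the repository's tooling: the function name missing_prereqs is hypothetical, and it only checks that each tool responds, not that its version is recent enough.

```shell
#!/bin/sh
# Hypothetical helper: print the name of each missing prerequisite, one per line.
missing_prereqs() {
    command -v docker >/dev/null 2>&1 || echo "docker"
    # Compose v2 is a Docker CLI plugin, so probe the subcommand, not a binary.
    docker compose version >/dev/null 2>&1 || echo "docker compose"
    # Git LFS is exposed as a git subcommand once installed.
    git lfs version >/dev/null 2>&1 || echo "git-lfs"
    # The hf CLI is installed by: pip install -U huggingface_hub
    command -v hf >/dev/null 2>&1 || echo "hf (huggingface_hub)"
}

missing=$(missing_prereqs)
if [ -n "$missing" ]; then
    printf 'Missing prerequisites:\n%s\n' "$missing" >&2
else
    echo "All prerequisites found."
fi
```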
Pull the full dataset in place from the repository root:
scripts/fetch
scripts/fetch downloads the remaining target directories on top of this
checkout. It only adds data — the repository's own README.md, LICENSE, and
scripts/ are preserved — and it is resumable and safe to re-run. The dataset
may be gated, so run hf auth login first if the download is refused.
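Because the fetch is safe to re-run, the login step can be folded into a single retry. The sketch below is an illustration only: fetch_with_login, FETCH_CMD, and LOGIN_CMD are hypothetical names, and it assumes scripts/fetch exits non-zero when the download is refused. A failure for any other reason (for example, no network) would also trigger the login prompt.

```shell
#!/bin/sh
# Hypothetical wrapper: run the fetch; on failure, log in once and retry.
# FETCH_CMD and LOGIN_CMD can be overridden to dry-run the logic.
fetch_with_login() {
    fetch_cmd="${FETCH_CMD:-scripts/fetch}"
    login_cmd="${LOGIN_CMD:-hf auth login}"
    if $fetch_cmd; then
        return 0
    fi
    echo "fetch failed; trying login followed by one retry" >&2
    # Retry only if the login itself succeeded.
    $login_cmd && $fetch_cmd
}

# Usage, from the repository root:
#   fetch_with_login
```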
To pull just one target instead of the full set, call the Hugging Face CLI
directly with an --include filter (e.g. siyucms):
hf download AgentCyberRange/WebExploitBench --repo-type dataset \
--local-dir . --include 'siyucms/*'
You can stand up targets locally with the bundled tools, independent of any evaluation harness.
List the available targets:
scripts/targetctl list
Prepare all images:
scripts/targetctl build all
Start a target for manual testing:
scripts/targetctl up comfyui
The command publishes the application service on localhost and prints the accessible URL:
Accessible URLs for comfyui:
http://127.0.0.1:<port>
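For scripted use, the printed URL can be captured and polled until the application answers. This is a rough sketch under two assumptions: the output keeps the format shown above, and scripts/targetctl up returns once the containers are started. The helper names extract_urls and wait_for_url are hypothetical and not part of targetctl.

```shell
#!/bin/sh
# Hypothetical helper: print every http://127.0.0.1:<port> URL found on stdin.
extract_urls() {
    grep -Eo 'http://127\.0\.0\.1:[0-9]+'
}

# Hypothetical helper: poll a URL until the server answers or attempts run out.
# Usage: wait_for_url URL [ATTEMPTS]   (default: 30 attempts, 1 second apart)
wait_for_url() {
    url=$1
    attempts=${2:-30}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Without -f, curl exits 0 on any HTTP response, including error statuses.
        if curl -s -o /dev/null --max-time 2 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Usage, from the repository root:
#   url=$(scripts/targetctl up comfyui | extract_urls | head -n 1)
#   wait_for_url "$url" && echo "comfyui is reachable at $url"
```

Some applications take a while to initialize after their container starts, so polling is more reliable than a fixed sleep.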
Inspect active targetctl-managed instances:
scripts/targetctl ps
And finally shut down the target:
scripts/targetctl down comfyui
This dataset is the target set for the WebExploitBench benchmark in
CAGE. CAGE initializes it as a git
submodule at examples/agent_pentest_bench/datasets/web_exploit_bench and reads
the sample index
examples/agent_pentest_bench/datasets/agent_pentest_bench.json, so targets
become available to the benchmark as you fetch them. CAGE drives AI coding
agents against these targets and handles prompt levels, scoring, proxy tracing,
and the run inspector. See the CAGE README and the examples/agent_pentest_bench/
guide for the full installation and evaluation workflow.
Original content in this repository is licensed under the Apache License 2.0. Third-party source code, container images, and dependencies remain subject to their respective licenses.