WebExploitBench

Overview

WebExploitBench is a benchmark for evaluating real-world web exploitation against isolated targets. The complete dataset contains 15 self-contained web applications with 110 documented vulnerabilities. Each application includes the target environment, exploits, reports, and verifiers. The vulnerabilities are a mix of 0-day, 1-day, and synthetic cases and cover common web vulnerability classes such as SQL injection, server-side template injection (SSTI), and authorization bypass.

Important

This GitHub repository releases only a subset of WebExploitBench. The complete dataset is available on Hugging Face: https://huggingface.co/datasets/AgentCyberRange/WebExploitBench

Dataset

WebExploitBench is distributed in two layers:

GitHub (this repository) — the tooling (scripts/targetctl, scripts/fetch), this README, and a few targets you can run right after cloning (comfyui, prestashop).
Hugging Face — the full set of 15 targets.

Prerequisites: Docker Engine with Docker Compose v2, Git LFS, and huggingface_hub (for scripts/fetch; pip install -U huggingface_hub provides the hf CLI).
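Before fetching anything, it can help to confirm that the prerequisites are actually on PATH. The following is a minimal sketch, not part of the repository: missing_tools is a hypothetical helper name, and the checks only verify that each command responds, not that its version is recent enough.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with WebExploitBench):
# print the names of any given commands that are not found on PATH.
missing_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="${missing:+$missing }$tool"
  done
  printf '%s\n' "$missing"
}

# Check the executables named in the prerequisites (hf is installed by huggingface_hub).
missing_now=$(missing_tools docker git hf)
if [ -n "$missing_now" ]; then
  echo "Missing from PATH: $missing_now" >&2
fi

# Docker Compose v2 and Git LFS are subcommands, so probe them separately.
docker compose version >/dev/null 2>&1 || echo "Docker Compose v2 not available" >&2
git lfs version >/dev/null 2>&1 || echo "Git LFS not available" >&2
```

The script only reports problems on stderr and never exits early, so it can be pasted into an interactive shell without closing it.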

Pull the full dataset in place from the repository root:

scripts/fetch

scripts/fetch downloads the remaining target directories on top of this checkout. It only adds data — the repository's own README.md, LICENSE, and scripts/ are preserved — and it is resumable and safe to re-run. The dataset may be gated, so run hf auth login first if the download is refused.

To pull just one target instead of the full set, call the Hugging Face CLI directly with an --include filter (e.g. siyucms):

hf download AgentCyberRange/WebExploitBench --repo-type dataset \
  --local-dir . --include 'siyucms/*'
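If you want several individual targets rather than the full set, one option is to wrap the command above in a small function and call it once per target. This is a sketch under assumptions: download_target and the DRY_RUN variable are illustrative names, not part of the repository or of the hf CLI, and siyucms is the only target name taken from this README.

```shell
#!/bin/sh
# Hypothetical wrapper around the hf command shown above.
# With DRY_RUN=1 it prints the command instead of running it.
download_target() {
  target="$1"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "hf download AgentCyberRange/WebExploitBench --repo-type dataset --local-dir . --include '${target}/*'"
  else
    hf download AgentCyberRange/WebExploitBench --repo-type dataset \
      --local-dir . --include "${target}/*"
  fi
}

# Preview the command for siyucms without downloading anything.
(DRY_RUN=1; download_target siyucms)
```

To fetch for real, call the function without DRY_RUN, for example in a loop over the target names you need: for t in siyucms; do download_target "$t"; done.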

Local Target Management

You can stand up targets locally with the bundled tools, independent of any evaluation harness.

List the available targets:

scripts/targetctl list

Prepare all images:

scripts/targetctl build all

Start a target for manual testing:

scripts/targetctl up comfyui

The command publishes the application service on localhost and prints the accessible URL:

Accessible URLs for comfyui:
http://127.0.0.1:<port>
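Because the port is assigned at startup, scripts that need the URL have to read it from this output. A minimal sketch, assuming the output keeps the shape shown above and is written to stdout; extract_urls is a hypothetical helper, not a targetctl feature.

```shell
#!/bin/sh
# Hypothetical helper: read text on stdin and print every http(s) URL it contains,
# one per line. Lines without a URL, such as the "Accessible URLs" header, are dropped.
extract_urls() {
  grep -Eo 'https?://[^[:space:]]+'
}

# Possible usage (assumes targetctl prints the URLs on stdout):
#   scripts/targetctl up comfyui | extract_urls
# A quick reachability probe could then pass each URL to curl:
#   scripts/targetctl up comfyui | extract_urls | xargs -n1 curl -sS -o /dev/null -w '%{http_code}\n'
```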

Inspect active targetctl-managed instances:

scripts/targetctl ps

Shut down the target when you are done:

scripts/targetctl down comfyui
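When a target is started from a script, it is easy to leave it running if a later step fails. The sketch below pairs up and down around an arbitrary command. It is an illustration only: with_target and the TARGETCTL override are hypothetical names, and the function relies solely on the up and down subcommands shown above.

```shell
#!/bin/sh
# Path to the bundled tool; overridable so the wrapper can be tried with a stand-in.
TARGETCTL="${TARGETCTL:-scripts/targetctl}"

# Hypothetical wrapper: start a target, run the given command, then always stop
# the target and return the command's exit status.
with_target() {
  wt_target="$1"
  shift
  "$TARGETCTL" up "$wt_target" || return 1
  "$@"
  wt_status=$?
  "$TARGETCTL" down "$wt_target"
  return "$wt_status"
}

# Possible usage, run from the repository root:
#   with_target comfyui scripts/targetctl ps
```

The wrapper does not handle signals such as Ctrl-C; a trap would be needed for that.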

Evaluating with CAGE

This dataset is the target set for the WebExploitBench benchmark in CAGE. CAGE initializes it as a git submodule at examples/agent_pentest_bench/datasets/web_exploit_bench and reads the sample index examples/agent_pentest_bench/datasets/agent_pentest_bench.json, so targets become available to the benchmark as you fetch them. CAGE drives AI coding agents against these targets and handles prompt levels, scoring, proxy tracing, and the run inspector. See the CAGE README and the examples/agent_pentest_bench/ guide for the full installation and evaluation workflow.

License

Original content in this repository is licensed under the Apache License 2.0. Third-party source code, container images, and dependencies remain subject to their respective licenses.

