
Invariant-based test input generator for deep learning library APIs, namely PyTorch and TensorFlow. The goal is to find bugs in these APIs.

The tool is structured in two parts:

Offline: Inferring invariants from a list of candidate rules generated by an LLM, then deriving models (abstract inputs) from them to build a corpus
Online: Concretizing the abstract inputs from the corpus and running the oracles (crash, differential) to detect bugs
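
This README does not describe how the two oracles are implemented, so the following is only a generic sketch of what crash and differential oracles commonly look like. The function names, the signal-based crash check, and the scalar comparison are all assumptions made for illustration; they are not taken from the code under eval.

```python
import math
import subprocess
import sys


def crash_oracle(snippet, timeout=30.0):
    """Return True if running `snippet` in a fresh interpreter dies from a signal.

    Hypothetical helper: the snippet is isolated in a child process so that
    a hard crash does not take down the fuzzer itself.
    """
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        timeout=timeout,
    )
    # On POSIX, a negative return code means the child was killed by a signal
    # (for example SIGSEGV or SIGABRT). An ordinary Python exception exits
    # with a positive code and is not counted as a crash here.
    return proc.returncode < 0


def differential_oracle(impl_a, impl_b, args, rel_tol=1e-5):
    """Return True if two implementations disagree on the same scalar input.

    Assumes both callables return a single float; real tensor outputs would
    need an element-wise comparison instead.
    """
    out_a = impl_a(*args)
    out_b = impl_b(*args)
    return not math.isclose(out_a, out_b, rel_tol=rel_tol)
```

A crash oracle flags inputs that kill the process, while a differential oracle flags inputs on which two executions that should agree produce different results.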

The code is organized as follows:

- 📁 bug_reports       # code snippets containing the bugs reported, as well as lists of reported bugs
- 📁 eval              # code related to evaluation (e.g. oracle, coverage)
- 📁 generator         # invariant-based input (abstract and concrete) generator
- 📁 invariants_{lib}  # mined invariants from APIs (libraries: torch and tf)
- 📁 learner           # invariant learner
- 📁 llm               # code to generate valid inputs (for invariant inference) and signatures using Gemini
- 📁 plots             # figures and data presented in the paper
- 📁 rulegen           # code for generating candidate rules using LLM
- 📁 rules-{lib}       # candidate rules per api
- 📁 scripts           # scripts (e.g., fuzz, infer_invariants etc.)
- 📁 utils             # utility functions
- 📄 pipeline.sh       # script that will run the tool end-to-end (param: `tf` or `torch`)
- 📄 requirements.txt  # dependencies of this project
- 📄 {lib}_apis.txt    # list of supported PyTorch and TensorFlow APIs
- 📄 signatures.json   # signatures for the supported APIs

Folders generated during execution:

- 📁 .tmp              # contains all outputs (fuzzing, inference, coverage etc.)
- 📁 logs              # contains all logs
- 📁 corpus_{lib}      # corpus of abstract inputs/models
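
The `{lib}` placeholder in the names above expands to `torch` or `tf`. As a small illustration, the helper below resolves the per-library paths listed in this README. The function `repo_paths` is hypothetical and not part of the repository. Note that the rules folder uses a hyphen while the other names use an underscore.

```python
from pathlib import Path

# The two library identifiers this README uses as the `lib` parameter.
SUPPORTED_LIBS = ("torch", "tf")


def repo_paths(lib, root="."):
    """Map a library identifier to the per-library paths named in the README."""
    if lib not in SUPPORTED_LIBS:
        raise ValueError(f"lib must be one of {SUPPORTED_LIBS}, got {lib!r}")
    root = Path(root)
    return {
        "invariants": root / f"invariants_{lib}",   # mined invariants
        "rules": root / f"rules-{lib}",             # candidate rules (hyphen)
        "corpus": root / f"corpus_{lib}",           # generated during execution
        "apis": root / f"{lib}_apis.txt",           # supported APIs
        "variations": root / f"{lib}_variations.txt",  # API variants
    }
```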

Prerequisites

python: The tool uses Python 3.12
venv: sudo apt install python3.12-venv
libopenmp: sudo apt-get install libomp-dev
libopenblas: sudo apt-get install libopenblas-dev
clang-15:

wget https://github.com/llvm/llvm-project/releases/download/llvmorg-15.0.0/clang+llvm-15.0.0-x86_64-linux-gnu-rhel-8.4.tar.xz
tar xf clang*
cd clang*
sudo cp -R * /usr/local/

Steps to run

1. Learn invariants (offline)

Slurm (all variants)

To run invariant inference for all variants (the API variations listed in torch_variations.txt for PyTorch and tf_variations.txt for TensorFlow), run the following. Be sure to install and configure Slurm before running this.

(venv) ~/dll-fuzzing-with-input-invariants$ bash scripts/infer_invariants_with_slurm.sh <duration> <regen> <lib> <reduce> <seed>

Example:

(venv) ~/dll-fuzzing-with-input-invariants$ bash scripts/infer_invariants_with_slurm.sh 300 1 torch 1 42

This generates the invariants for all torch API variations with a time budget of 300 seconds per variation. Because regen is 1, invariants that already exist are regenerated. Rule reduction is enabled (reduce is 1) and the seed is 42.

duration: Maximum time budget, in seconds, per variation to learn invariants
regen: 1 to regenerate invariants, 0 to learn invariants only if they do not exist
lib: torch or tf
reduce: 1 to perform rule reduction, 0 otherwise
seed: Random seed (default: 42)

Without slurm (one variant)

To run invariant inference for a single variant, run the following *(under the venv)*:

(venv) ~/dll-fuzzing-with-input-invariants$ python -m learner.invariant_inference <variant> <time budget> <1 to regenerate invariants 0 otherwise> <lib: torch/tf> <1 to enable rule-reduction 0 otherwise> <seed>
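
Without Slurm, several variants can still be processed one after another by looping over the single-variant command. The sketch below assembles the arguments in the order shown above. It assumes that the variations file lists one variant per line, which this README does not state, and the helper names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def read_variants(path):
    """Read variant names, assuming one per line; blank lines are skipped."""
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def build_inference_command(variant, duration, regen, lib, reduce, seed=42):
    """Assemble the single-variant command in the README's argument order."""
    return [
        sys.executable, "-m", "learner.invariant_inference",
        variant, str(duration), str(regen), lib, str(reduce), str(seed),
    ]


def run_all(variations_file, duration, regen, lib, reduce, seed=42):
    """Run inference for each variant in turn; return {variant: exit code}."""
    results = {}
    for variant in read_variants(variations_file):
        cmd = build_inference_command(variant, duration, regen, lib, reduce, seed)
        # Sequential execution: each variant gets its own process.
        results[variant] = subprocess.run(cmd).returncode
    return results
```

Calling `run_all("torch_variations.txt", 300, 1, "torch", 1)` from the repository root, with the venv active, would mirror the Slurm example above but run sequentially, so the total time grows with the number of variants.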

2. Generate models (offline)

Slurm (all apis/variants)

To generate models by solving the constraints, use the script scripts/generate_models_with_slurm.sh. Be sure to install and configure Slurm before running this. The script runs model generation for all API variations listed in torch_variations.txt for PyTorch and tf_variations.txt for TensorFlow. Because this step is offline, running it once is enough to support any number of online fuzzing campaigns.

(venv) ~/dll-fuzzing-with-input-invariants$ bash scripts/generate_models_with_slurm.sh <duration> <n_max> <lib> <seed> <regen>

duration: Time budget, in seconds, for generating models for each API variation.
n_max: Maximum number of models. 0 (default) means no limit. Any value > 0 stops generation at that many models, provided the count is reached before the time budget runs out.
lib: torch for PyTorch, tf for TensorFlow
seed: Seed for the generator (default: 200).
regen: Pass 1 to regenerate models that already exist (default: 0).

Example:

(venv) ~/dll-fuzzing-with-input-invariants$ bash scripts/generate_models_with_slurm.sh 3600 1000 torch 42 1

This generates models for each variation of the torch APIs until one hour has passed or 1000 models have been generated, whichever comes first, even if models already exist. The seed is 42.
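
The interaction between duration and n_max can be sketched as a loop with two stopping conditions. This is an illustration of the semantics described above, not the generator's actual code: `next_model` is a hypothetical stand-in for whatever produces one model, and the injectable `clock` exists only to make the sketch easy to test.

```python
import time


def generate_until(next_model, duration, n_max=0, clock=time.monotonic):
    """Collect models until `duration` seconds pass or `n_max` is reached.

    n_max == 0 means there is no cap on the number of models, so only the
    time budget stops the loop.
    """
    deadline = clock() + duration
    models = []
    while clock() < deadline:
        if n_max > 0 and len(models) >= n_max:
            break  # cap reached before the time budget ran out
        models.append(next_model())
    return models
```

With n_max set to 0 the loop always runs for the full time budget, so the budget alone determines the size of the corpus.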

Without slurm (one variant)

To run model generation for a single variant (a unique signature of an API; the full list is in <lib>_variations.txt), run the following (under the venv):

(venv) ~/dll-fuzzing-with-input-invariants$ python -m generator.z3 <variant> <duration> <n_max> <lib> <seed> <regen>

3. Fuzzing (online)

Slurm (all apis)

To run fuzzing campaigns, use scripts/fuzz_with_slurm.sh. Be sure to install and configure Slurm before running this. The script runs the fuzzing campaign in parallel on the APIs listed in torch_apis.txt.

(venv) ~/dll-fuzzing-with-input-invariants$ bash scripts/fuzz_with_slurm.sh <duration> <n_max> <lib> <seed>

duration: Time, in seconds, to fuzz each API.
n_max: Maximum number of inputs. 0 (default) means no limit. Any value > 0 stops fuzzing at that many inputs, provided the count is reached before the time budget runs out.
lib: torch for PyTorch, tf for TensorFlow
seed: Seed for the generator (default: 200).

Example:

(venv) ~/dll-fuzzing-with-input-invariants$ bash scripts/fuzz_with_slurm.sh 3600 0 torch 42

This runs the z3-based generator in parallel on all APIs in torch_apis.txt with seed 42, each with a time budget of one hour and no limit on the number of inputs or models generated.

Without slurm (one api)

To fuzz a single API (under the venv):

(venv) ~/dll-fuzzing-with-input-invariants$ python -m generator.harness_z3 <api> <duration> <n_max> <lib> <seed>

4-1. Compute Coverage: PyTorch (evaluation)

Slurm (all apis)

To compute coverage for all apis, run the following. Be sure to install and configure slurm before running this.

(venv) ~/dll-fuzzing-with-input-invariants$ bash scripts/coverage_with_slurm.sh <n_inputs> torch

n_inputs: Number of inputs per API used to compute coverage. Passing 0 uses all inputs.

Without slurm (one api)

To compute coverage for a single API (under the venv), first download the instrumented library if needed, then patch and compute coverage.

Downloading the instrumented library (the Slurm script above downloads it automatically; if that script was never run, download it with these commands):

(venv) ~/dll-fuzzing-with-input-invariants$ pip install gdown
(venv) ~/dll-fuzzing-with-input-invariants$ gdown --fuzzy <link> -O instrumented_torch/

link: https://drive.google.com/file/d/1z6ijvUGN-EhsluMHksod7jH9B1CO--9_/view?usp=sharing

Patching:

(venv) ~/dll-fuzzing-with-input-invariants$ python -m eval.patching <api> <n_inputs> <lib>

Coverage:

(venv) ~/dll-fuzzing-with-input-invariants$ pip install instrumented_<lib>/<lib>*
(venv) ~/dll-fuzzing-with-input-invariants$ python -m eval.coverage <api> <lib>

4-2. Compute Coverage: TensorFlow (evaluation)

Docker (all apis)

To build and run the Docker container with the instrumented TensorFlow:

(venv) ~/dll-fuzzing-with-input-invariants$ bash scripts/build_docker_tf_cov.sh

Once inside the container:

/workspace/repo$ bash scripts/coverage_parallel.sh <n_inputs> tf <n_proc>

n_proc: Number of parallel processes (default: 100)

5. Run Oracle (bug detection)

Slurm (all apis)

To run the oracle on all APIs, run the following. Be sure to install and configure Slurm before running this.

(venv) ~/dll-fuzzing-with-input-invariants$ bash scripts/run_oracle_with_slurm.sh <lib> <low> <high>

low: Index of the input to start running the oracle from (default: -1). Passing -1 starts from the beginning.
high: Index of the input to run the oracle until (default: -1). Passing -1 goes through all inputs.
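
The -1 sentinels for low and high can be read as an open-ended slice over the generated inputs. The sketch below illustrates that reading. It assumes that high is exclusive, which this README does not specify, and `select_inputs` is a hypothetical name.

```python
def select_inputs(inputs, low=-1, high=-1):
    """Return the inputs in [low, high), treating -1 as "no bound".

    Assumption: `high` is exclusive; the actual script may include it.
    """
    start = 0 if low == -1 else low
    stop = len(inputs) if high == -1 else high
    return inputs[start:stop]
```

Splitting a large set of inputs into consecutive index ranges this way allows separate runs to cover disjoint parts of the same set.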

Without slurm (one api)

To run the oracle for a single API *(under the venv)*:

(venv) ~/dll-fuzzing-with-input-invariants$ python -m eval.oracle <api> <lib>

Random Generation

To use random generation instead of the invariant-based approach, run:

(venv) ~/dll-fuzzing-with-input-invariants$ python -m generator.random_generation <api> <duration> <n_max> <lib>

It runs for duration seconds unless n_max is specified. With n_max, it stops at whichever comes first: duration seconds elapsing or n_max inputs being generated.
