SING: Improving the efficiency of Secure Multi-Party Computation Protocol Assignment using Neural Networks
This repository contains the full code and a small excerpt of our dataset for reproducing our training and evaluation results.
Initial Setup
python -m venv venv
source venv/bin/activate
# may differ depending on your platform
#
# see https://pytorch.org/get-started/locally/
pip install torch torchvision torchaudio
pip install -r requirements.txt
# unpack dataset
tar xf dataset-excerpt.tar.gz
tar xf dataset-excerpt-c.tar.gz
Optional: Compiling ABY
This step is not necessary for training or evaluation of our models. ABY is required for benchmarking MPC performance.
cd ABY-vendor
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DABY_BUILD_EXE=On
make
Optional: Compiling Silph
This step is not necessary for training, evaluation, or benchmarking. Compiling Silph is only required for comparing SING and Silph performance. We have applied minor fixes to the build system of the original Silph repository; apart from these fixes, the build system is identical to the original release. We thus refer to the Silph documentation and paper for more detailed instructions.
Silph is written in Rust and thus requires a stable Rust toolchain which is commonly installed with rustup.
Furthermore, an installation of coinor-cbc is required for ILP
solving. This library can be found in some distribution repositories
or built from source. Other
libraries that Silph depends on (e.g., KaHyPar) will be built from
source automatically.
cd silph
python driver.py --features aby bench c lp r1cs smt
python driver.py --install
python driver.py --build --mode release
Note that this build may take multiple hours because tests are run as part of it.
Due to the large size of the dataset, we only provide a small excerpt in this repository.
- dataset-excerpt-c contains the original C source files for the circuits.
- dataset-excerpt contains a processed excerpt of the dataset consisting of compiled circuits, alternative share assignments, and benchmarked metrics.
Training scripts will process the dataset further into the PyTorch Geometric format. This happens automatically. We refer to the following sections on training for more details.
Note that training and evaluation results obtained on this dataset excerpt will differ from the results in our paper.
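To give a rough idea of what a graph in PyTorch Geometric format looks like, the sketch below encodes a toy circuit as a list of node types plus an `edge_index` in COO layout (two parallel lists of source and target node indices), which is how PyTorch Geometric stores connectivity. The gate vocabulary, the `Gate` class, and the `circuit_to_graph` helper are hypothetical and not taken from this repository; the features produced by the actual training scripts may differ.

```python
from dataclasses import dataclass, field

# Hypothetical gate vocabulary; the real feature encoding may differ.
GATE_TYPES = {"IN": 0, "ADD": 1, "MUL": 2, "GT": 3, "MUX": 4, "OUT": 5}


@dataclass
class Gate:
    op: str
    inputs: list = field(default_factory=list)  # indices of producer gates


def circuit_to_graph(gates):
    """Return (node_types, edge_index) with edge_index = [sources, targets]."""
    node_types = [GATE_TYPES[g.op] for g in gates]
    sources, targets = [], []
    for target, gate in enumerate(gates):
        for source in gate.inputs:
            # one directed edge per wire, from producer to consumer
            sources.append(source)
            targets.append(target)
    return node_types, [sources, targets]


# Toy circuit: out = (a + b) * a
toy = [Gate("IN"), Gate("IN"), Gate("ADD", [0, 1]), Gate("MUL", [2, 0]), Gate("OUT", [3])]
node_types, edge_index = circuit_to_graph(toy)
```

With PyTorch installed, `torch.tensor(edge_index, dtype=torch.long)` yields the `[2, num_edges]` tensor that `torch_geometric.data.Data` expects as `edge_index`.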
Optional: Bootstrapping the dataset from C source files
The dataset processing starts with a directory containing C source
files (dataset-excerpt-c).
Compilation requires a compiled version of Silph in
$CARGO_MANIFEST_DIR (cf. Initial Setup).
scripts/generate_dataset.sh
pushd dataset-compiled
scripts/find_duplicate_circuits.sh failed.log
popd
This is a necessary step for training our cost prediction model.
python generate_alternative_share_assignments.py
This step is necessary for training our cost prediction model on real-world benchmarks (SING 3).
Benchmarking requires ABY (cf. Initial Setup).
python benchmark_mpc.py --mode dataset --network-setting LAN --cost-name runtime-neon-lan --metric runtime
Training and evaluation require splitting the dataset into training, validation, and test sets. This is done as follows:
python generate_dataset_split.py
generate_dataset_split.py includes many configuration options that
influence the resulting split (e.g., setting a maximum size threshold
for circuits included in the dataset). Run
python generate_dataset_split.py --help
for a detailed overview of all command-line options.
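As an illustration of the idea, the following sketch filters circuits by a maximum size threshold and then splits the remaining IDs into training, validation, and test sets with a seeded shuffle. It is not the implementation in generate_dataset_split.py; the function name `split_dataset`, its parameters, the split fractions, and the circuit IDs are assumptions made for this example.

```python
import random


def split_dataset(circuit_sizes, max_size=None, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split circuit IDs into train/validation/test lists.

    circuit_sizes maps a circuit ID to its size (e.g., number of gates).
    Circuits larger than max_size are dropped before splitting.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    ids = sorted(cid for cid, size in circuit_sizes.items()
                 if max_size is None or size <= max_size)
    random.Random(seed).shuffle(ids)  # seeded shuffle keeps the split reproducible
    n_train = int(len(ids) * fractions[0])
    n_val = int(len(ids) * fractions[1])
    # the test set receives whatever remains after train and validation
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


sizes = {f"circuit-{i}": 10 * i for i in range(1, 21)}  # made-up IDs and sizes
train, val, test = split_dataset(sizes, max_size=150)
```

Sorting the IDs before shuffling makes the result independent of dictionary insertion order, so the same seed always produces the same split.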
Optional: Generating circuits with LLMs
We use Ollama as a local LLM inference engine. Instructions on installing Ollama can be found on the official website.
Once Ollama is set up, various open-source LLMs can be downloaded, e.g.,
ollama pull gemma3:4b
The following script queries every locally downloaded model with every available prompt.
cd llm-generate
bash generate_missing_combinations.sh
Optional: Generating random circuits
cd grammar-generate
python generate.py
Several parameters of the generation process (e.g., the operation budget) can be configured via command-line options. Run
python generate.py --help
for a full list of options.
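To illustrate what an operation budget means for random generation, the sketch below draws a random expression from a tiny grammar while never using more binary operations than the budget allows. This is not the grammar or algorithm used by generate.py; the operator set, the variable names, and the `random_expr` function are hypothetical.

```python
import random

OPERATORS = ["+", "-", "*", "<", "=="]   # hypothetical operator set
VARIABLES = ["a", "b", "c"]              # hypothetical input names


def random_expr(rng, budget):
    """Return (expression, operations_used) using at most `budget` operators."""
    if budget == 0 or rng.random() < 0.2:
        return rng.choice(VARIABLES), 0  # leaf: a plain variable costs nothing
    left_budget = rng.randint(0, budget - 1)   # one operation is spent on this node
    left, used_left = random_expr(rng, left_budget)
    right, used_right = random_expr(rng, budget - 1 - left_budget)
    op = rng.choice(OPERATORS)
    return f"({left} {op} {right})", used_left + used_right + 1


rng = random.Random(42)  # fixed seed for reproducible output
expr, used = random_expr(rng, budget=8)
```

Splitting the remaining budget between the left and right subtree guarantees the bound by construction, so no generated expression has to be rejected afterwards.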
Our cost prediction model
We provide model checkpoints used in our evaluation in the
pretrained directory.
Depending on whether the model predicts Silph costs or benchmarked runtimes, the training process uses a different directory to store the dataset in PyTorch Geometric format.
- Silph costs: dataset-cost-prediction
- Benchmarked runtimes: dataset-cost-prediction-measured
# train on Silph costs
mkdir -p dataset-cost-prediction
# a relative link target is resolved from the link's directory, hence ../
ln -s ../dataset-excerpt dataset-cost-prediction/raw
python train_cost_prediction.py --lr 0.001 --cost-name silph
# train on benchmarked costs (e.g., runtime-neon-lan)
mkdir -p dataset-cost-prediction-measured
# a relative link target is resolved from the link's directory, hence ../
ln -s ../dataset-excerpt dataset-cost-prediction-measured/raw
python train_cost_prediction.py --lr 0.001 --cost-name runtime-neon-lan
Trained models will be saved in the checkpoints directory.
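The directory setup from the commands above can also be scripted. The sketch below is a minimal Python equivalent of the mkdir and ln steps, assuming it is run from the repository root; the helper name `link_raw_dataset` is hypothetical. It resolves the source to an absolute path so that the link does not depend on the directory it is created in. The layout follows PyTorch Geometric's convention of reading unprocessed inputs from `<root>/raw` and writing processed data to `<root>/processed`.

```python
from pathlib import Path


def link_raw_dataset(source: Path, dataset_root: Path) -> Path:
    """Create dataset_root/raw as a symlink to source (like mkdir -p + ln -s)."""
    dataset_root.mkdir(parents=True, exist_ok=True)
    raw = dataset_root / "raw"
    if raw.is_symlink() or raw.exists():
        return raw  # leave an existing raw directory or link untouched
    raw.symlink_to(source.resolve(), target_is_directory=True)
    return raw


# Example (paths as used in this README):
# link_raw_dataset(Path("dataset-excerpt"), Path("dataset-cost-prediction"))
# link_raw_dataset(Path("dataset-excerpt"), Path("dataset-cost-prediction-measured"))
```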
Use eval_cost_prediction.py to calculate metrics on how the share
assignment of SING differs from that of Silph (e.g., MSE, R2-score).
python eval_cost_prediction.py
Using the --checkpoint <PATH> flag, the evaluation can be performed
on a specific model checkpoint.
This script supports multiple command-line options to load a specific model checkpoint, filter the dataset, or configure visualization. Run
python eval_cost_prediction.py --help
for a detailed overview of all command-line options.
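For readers unfamiliar with the two regression metrics named above, the following sketch computes the mean squared error and the R2-score (coefficient of determination) in plain Python. It only illustrates the standard definitions and is not taken from eval_cost_prediction.py; the cost values are made up.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between reference and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) when all reference values are equal.
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


reference = [1.0, 2.0, 3.0, 4.0]   # made-up reference costs
predicted = [1.0, 2.0, 3.0, 6.0]   # made-up model predictions
mse = mean_squared_error(reference, predicted)
r2 = r2_score(reference, predicted)
```

An R2-score of 1 means the predictions match the reference exactly, while a score of 0 means they are no better than always predicting the mean reference cost.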
Our share assignment model
We provide model checkpoints used in our evaluation in the
pretrained directory.
The dataset in PyTorch Geometric format will be stored in dataset.
mkdir -p dataset
# a relative link target is resolved from the link's directory, hence ../
ln -s ../dataset-excerpt dataset/raw
# supervised (SING 1)
python train.py --lr 0.01 --alpha 0.5
# semi-supervised (SING 2, SING 3)
python train.py --lr 0.01 --alpha 0.1 --predicted-cost
Trained models will be saved in the checkpoints directory.
Use eval.py to calculate metrics on how the share assignment of SING
differs from that of Silph (e.g., accuracy, confusion matrix).
python eval.py
Use benchmark_share_assignment.py to compare the SING and Silph
runtimes of generating share assignments for circuits. This step
requires a compiled version of Silph (cf. Initial Setup).
python benchmark_share_assignment.py --circuits-file paper_benchmark_c.txt
Use benchmark_mpc.py to benchmark runtimes and communication amounts
of SING and Silph share assignments. The result will be written to a
CSV file. This step requires a compiled version of ABY (cf. Initial
Setup).
python benchmark_mpc.py --mode benchmark --hashes paper_benchmark_hashes.txt
For setting up network simulations, benchmark_mpc.py needs to be run
as a user with sudo privileges, e.g., a member of the wheel group
(the group name varies by distribution).
Using the --checkpoint <PATH> flag, the evaluation can be performed
on a specific model checkpoint.
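The classification metrics that eval.py is described as reporting (accuracy, confusion matrix) can be illustrated with a short sketch. The code below is not taken from eval.py; it assumes that each gate is labeled with one of the three ABY sharing types (arithmetic, Boolean, Yao), and the two label sequences are made up.

```python
from collections import Counter

SHARE_TYPES = ["A", "B", "Y"]  # arithmetic, Boolean, Yao (ABY sharing types)


def accuracy(reference, predicted):
    """Fraction of positions where both assignments agree."""
    matches = sum(r == p for r, p in zip(reference, predicted))
    return matches / len(reference)


def confusion_matrix(reference, predicted, labels=SHARE_TYPES):
    """Rows are reference labels, columns are predicted labels."""
    counts = Counter(zip(reference, predicted))
    return [[counts[(r, p)] for p in labels] for r in labels]


silph = ["A", "A", "B", "Y", "Y", "Y"]   # made-up reference assignment
sing = ["A", "B", "B", "Y", "Y", "A"]    # made-up predicted assignment
acc = accuracy(silph, sing)
matrix = confusion_matrix(silph, sing)
```

The diagonal of the matrix counts gates where both assignments agree, so the accuracy equals the sum of the diagonal divided by the total number of gates.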
All plots and tables in our paper can be reproduced from measured data
using the benchmark-results/plot.py script.