Towards More Complete Constraints for Deep Learning Library Testing via Complementary Set Guided Refinement

Welcome to the artifact repository of the DeepConstr paper, which was accepted at ISSTA 2024.

Source Code Structure

|-- build        # Directory for compiling PyTorch and TensorFlow
|-- data         # Data directory, contains records of constraints and intersected operator names
|-- deepconstr   # Main implementation of DeepConstr
|   |-- error.py     # Error handling module for DeepConstr
|   |-- gen          # Implementation for test case generation from SMT-expression
|   |-- grammar      # Implementation for SMT-expression grammar to convert natural language into SMT-expression
|   |-- train        # Implementation for constraint extraction and refinement
|   |-- logger.py    # Logging module for DeepConstr
|   `-- utils.py     # Utility functions for DeepConstr
|-- docs         # Documentation for the project
|-- experiments  # Scripts for conducting experiments
|-- nnsmith      # Main implementation of NNSmith
|-- outputs      # Directory for output files generated by the project
|-- tests        # Test scripts for verifying the functionality of the project
|-- collect_cov.sh
|-- collect_env.py
|-- fuzz.sh
|-- LICENSE
|-- README.md
`-- requirements.txt # List of Python dependencies

Bug Finding Evidence (RQ3)

You can find the bug finding evidence here.

Get Ready

Before you start, please make sure you have Docker installed. To check the installation:

docker --version # Test docker availability

Get Docker image from Docker Hub

docker pull gwihwan/artifact-issta24:latest

Navigate to the DeepConstr project directory.

cd ../DeepConstr

Start fuzzing

You can start fuzzing with the fuzz.py script.

Note

Command usage of: python nnsmith/cli/fuzz.py

Arguments:

mgen.max_nodes: the number of operators in each generated graph.
mgen.method: the approach used to generate constraints, choose from ["deepconstr", "neuri", "symbolic-cinit"].
model.type: the type of the generated model, choose from ["tensorflow", "torch"].
backend.type: the backend type, choose from ["xla", "torchjit"].
fuzz.time: the fuzzing time, in formats such as 4h, 1m, 30s.
mgen.record_path: the directory where constraints are saved, such as $(pwd)/data/records/torch.
fuzz.save_test: the directory where generated test cases are saved, such as $(pwd)/bugs/${model.type}-${mgen.method}-n${mgen.max_nodes}.
fuzz.root: the directory where buggy test cases are saved, such as $(pwd)/bugs/${model.type}-${mgen.method}-n${mgen.max_nodes}-buggy.
mgen.test_pool (optional): the specific APIs to fuzz. If not specified, fuzzing is conducted across all prepared APIs.

Outputs: The buggy test cases will be saved in the directory specified by fuzz.root, while every generated test case will be saved in the directory specified by fuzz.save_test.
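The fuzz.time formats listed above (4h, 1m, 30s) can be converted to seconds with a few lines of Python, which is handy when budgeting a run. This is a minimal sketch, not the parser that NNSmith or DeepConstr actually uses; the function name parse_duration is hypothetical, and it assumes a single integer followed by one unit letter.

```python
import re

# Seconds per unit for the suffixes shown in the argument list (h, m, s).
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text: str) -> int:
    """Convert a string such as '4h', '15m', or '30s' to seconds.

    Hypothetical helper for illustration only; it is not part of DeepConstr.
    """
    match = re.fullmatch(r"(\d+)([hms])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


if __name__ == "__main__":
    for sample in ("4h", "15m", "30s"):
        print(sample, "->", parse_duration(sample), "seconds")
```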

Quick start for PyTorch

First, activate the conda environment created for this project.

conda activate std

For PyTorch, you can specify the APIs to be tested by setting the mgen.test_pool argument, such as [torch.abs,torch.add]. For example, the following command fuzzes torch.abs and torch.add for 15 minutes.

PYTHONPATH=$(pwd):$(pwd)/deepconstr:$(pwd)/nnsmith python nnsmith/cli/fuzz.py fuzz.time=15m \
mgen.record_path=/DeepConstr/data/records/torch fuzz.root=/DeepConstr/outputs/torch-deepconstr-n5-torch.abs-torch.add \
fuzz.save_test=/DeepConstr/outputs/torch-deepconstr-n5-torch.abs-torch.add.models \
model.type=torch backend.type=torchcomp filter.type=[nan,dup,inf] \
debug.viz=true hydra.verbose=['fuzz'] fuzz.resume=true \
mgen.method=deepconstr mgen.max_nodes=5 mgen.test_pool=[torch.abs,torch.add] mgen.pass_rate=10
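When launching several runs, it can help to assemble the overrides programmatically. The sketch below only builds the argument list for the command shown above; the helper name build_fuzz_command is hypothetical, the paths are the ones from this quick start, and nothing is executed until you pass the list to subprocess.run yourself.

```python
import shlex


def build_fuzz_command(apis, minutes=15, max_nodes=5, out_root="/DeepConstr/outputs"):
    """Return the argv list for a PyTorch fuzzing run over the given APIs.

    Hypothetical helper: it mirrors the quick-start command and is not part
    of DeepConstr itself.
    """
    # Output directories follow the naming used in the quick-start command.
    tag = f"torch-deepconstr-n{max_nodes}-" + "-".join(apis)
    return [
        "python", "nnsmith/cli/fuzz.py",
        f"fuzz.time={minutes}m",
        "mgen.record_path=/DeepConstr/data/records/torch",
        f"fuzz.root={out_root}/{tag}",
        f"fuzz.save_test={out_root}/{tag}.models",
        "model.type=torch", "backend.type=torchcomp",
        "filter.type=[nan,dup,inf]",
        # The shell strips the quotes in hydra.verbose=['fuzz'], so none are needed here.
        "debug.viz=true", "hydra.verbose=[fuzz]", "fuzz.resume=true",
        "mgen.method=deepconstr", f"mgen.max_nodes={max_nodes}",
        "mgen.test_pool=[" + ",".join(apis) + "]",
        "mgen.pass_rate=10",
    ]


if __name__ == "__main__":
    argv = build_fuzz_command(["torch.abs", "torch.add"])
    # Print a shell-quoted line; PYTHONPATH still has to be set as shown above.
    print(shlex.join(argv))
```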

If mgen.test_pool is not specified, the program fuzzes all APIs that DeepConstr supports. The following command fuzzes all supported APIs for 4 hours.

PYTHONPATH=$(pwd):$(pwd)/deepconstr:$(pwd)/nnsmith python nnsmith/cli/fuzz.py fuzz.time=4h \
mgen.record_path=/DeepConstr/data/records/torch \
fuzz.root=/DeepConstr/outputs/torch-deepconstr-n5-torch.abs-torch.add \
fuzz.save_test=/DeepConstr/outputs/torch-deepconstr-n5-torch.abs-torch.add.models \
model.type=torch backend.type=torchcomp filter.type=[nan,dup,inf] \
debug.viz=true hydra.verbose=['fuzz'] fuzz.resume=true \
mgen.method=deepconstr mgen.max_nodes=5 mgen.pass_rate=10

Quick start for TensorFlow

First, activate the conda environment created for this project.

conda activate std

Then, execute the following command to start fuzzing. It fuzzes all APIs that DeepConstr supports for 4 hours.

PYTHONPATH=$(pwd):$(pwd)/deepconstr:$(pwd)/nnsmith python nnsmith/cli/fuzz.py fuzz.time=4h \
mgen.record_path=/DeepConstr/data/records/tf \
fuzz.root=/DeepConstr/outputs/tensorflow-deepconstr-n5- fuzz.save_test=/DeepConstr/outputs/tensorflow-deepconstr-n5-.models \
model.type=tensorflow backend.type=xla filter.type=[nan,dup,inf] \
debug.viz=true hydra.verbose=['fuzz'] \
fuzz.resume=true mgen.method=deepconstr mgen.max_nodes=5 mgen.pass_rate=10

Generate python code

DeepConstr saves each test case in the gir.pkl format. To convert a gir.pkl file into Python code, use the command below. The first argument is the directory where the gir.pkl files are saved (code_saved_dir), and the second argument selects the compiler. For now, only the "torchcomp" compiler with PyTorch is supported. If you followed the quick start above, the following command converts the generated test cases.

PYTHONPATH=$(pwd):$(pwd)/deepconstr:$(pwd)/nnsmith python nnsmith/materialize/torch/program.py /DeepConstr/outputs/torch-deepconstr-n5-torch.abs-torch.add torchcomp
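Before converting, it can be useful to see which test cases a run produced. The following sketch simply searches an output directory for files named gir.pkl using the standard library; the function name find_gir_files is hypothetical, and the assumption that the files sit somewhere below the directory passed to the converter is ours rather than something stated above.

```python
from pathlib import Path


def find_gir_files(root):
    """Return all files named gir.pkl below root, sorted by path."""
    return sorted(Path(root).rglob("gir.pkl"))


if __name__ == "__main__":
    # Directory taken from the quick-start command; adjust to your own run.
    root = "/DeepConstr/outputs/torch-deepconstr-n5-torch.abs-torch.add"
    for path in find_gir_files(root):
        print(path)
```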

Extract Constraints

Setup Instructions

(Optional) If you are not using Docker, install the required packages:

pip install -r requirements.txt

Generate a .env file in your workspace directory $(pwd)/.env and populate it with your specific values:

OpenAI API Key: OPENAI_API_KEY1 ='sk-********'
Proxy Setting (Optional): MYPROXY ='166.***.***.***:****'
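The two entries above are plain KEY='value' lines. As a rough illustration of the file's shape, the sketch below parses such lines with the standard library only; parse_env_text is a hypothetical helper, and DeepConstr itself may load the file differently (for example through a dotenv library).

```python
def parse_env_text(text):
    """Parse KEY='value' lines into a dict, ignoring blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Tolerate spaces around '=' and strip one layer of matching quotes.
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


if __name__ == "__main__":
    # Placeholder values copied from the instructions above, not real secrets.
    sample = "OPENAI_API_KEY1 ='sk-********'\nMYPROXY ='166.***.***.***:****'\n"
    config = parse_env_text(sample)
    print(sorted(config))
```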

Testing Your Configuration: After setting your environment variables, you can verify your configuration by running:

PYTHONPATH=$(pwd):$(pwd)/deepconstr:$(pwd)/nnsmith python tests/proxy.py
# INFO    llm    - Output(Ptk12-OtkPtk9) : 
# Hello! How can I assist you today? 
# Time cost : 1.366152286529541 seconds

If configured correctly, you will receive a response from the OpenAI API, such as: "Hello! How can I assist you today?"

Start Extraction

You can extract constraints by running the deepconstr/train/run.py script.

Note

Command usage of: python deepconstr/train/run.py

Important Arguments:

train.target: Specifies the API name or path to extract. This can be a single API name (e.g., "torch.add"), a list containing multiple API names (e.g., ["torch.add", "torch.abs"]), or a JSON file path containing the list.
train.retrain: A boolean value that determines whether to reconduct constraint extraction. If set to false, the tool will only collect APIs that haven't been extracted. If set to true, the tool collects all APIs except those where the pass rate exceeds the preset target pass rate (train.pass_rate).
train.pass_rate: The target pass rate to filter out APIs that have a pass rate higher than this target.
train.parallel: The number of parallel processes used to validate the constraints. We do not recommend setting this argument to 1.
train.record_path: The path where the extracted constraints are saved. This directory should be the same as mgen.record_path in the fuzzing step.
hydra.verbose: Set the logging level of Hydra for specific modules ("smt", "train", "convert", "constr", "llm", etc). If you want to see all the log messages, you can set it to True.
train.num_eval: The number of evaluations performed to validate the constraints (default: 500).
model.type: Choose from ["tensorflow", "torch"].
backend.type: Choose from ["xla", "torchjit"].
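train.target accepts three shapes: one API name, a list of names, or a path to a JSON file holding such a list. The sketch below shows one way those shapes could be normalized to a plain list; normalize_target is a hypothetical name, and the actual logic in deepconstr/train may differ.

```python
import json
from pathlib import Path


def normalize_target(target):
    """Return a list of API names from a name, a list, or a JSON file path."""
    if isinstance(target, (list, tuple)):
        return list(target)
    if isinstance(target, str) and target.endswith(".json"):
        # Assumption: the file contains a JSON array of API-name strings.
        return list(json.loads(Path(target).read_text()))
    return [target]


if __name__ == "__main__":
    print(normalize_target("torch.add"))
    print(normalize_target(["torch.add", "torch.abs"]))
```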

Other Arguments: For additional details, refer to the values under train at /DeepConstr/nnsmith/config/main.yaml.

Outputs:

$(pwd)/${train.record_path}/torch if model.type is torch
$(pwd)/${train.record_path}/tf if model.type is tensorflow

Quick Start

Set train.record_path to the location where you want the extracted constraints to be stored, for instance $(pwd)/repro/records/torch.

For PyTorch

The command below extracts constraints from "torch.add" and "torch.abs". The extracted constraints are stored in $(pwd)/repro/records/torch. We recommend setting train.parallel to a value larger than 1. The approximate cost is $0.40 for extracting constraints from 2 operators.

PYTHONPATH=/DeepConstr/:/DeepConstr/nnsmith/:/DeepConstr/deepconstr/:$PYTHONPATH \
python deepconstr/train/run.py train.record_path=repro/records/torch backend.type=torchcomp \
model.type=torch hydra.verbose=train train.parallel=1 train.num_eval=500 \
train.pass_rate=95 hydra.verbose=['train'] \
train.retrain=false train.target='["torch.add","torch.abs"]'

By specifying the path to a JSON file, you can target a specific set of APIs for processing. This JSON file should contain a list of API names.

PYTHONPATH=/DeepConstr/:/DeepConstr/nnsmith/:/DeepConstr/deepconstr/:$PYTHONPATH \
python deepconstr/train/run.py train.record_path=repro/records/torch backend.type=torchcomp \
model.type=torch hydra.verbose=train train.parallel=1 train.num_eval=500 \
train.pass_rate=95 hydra.verbose=['train'] \
train.retrain=false train.target='/your/json/path'
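Such a file can be written by hand or generated. A minimal sketch using the standard json module follows; the helper name write_target_file and the file name targets.json are arbitrary choices for illustration.

```python
import json
import tempfile
from pathlib import Path


def write_target_file(path, api_names):
    """Write a JSON array of API names and return the path as a string."""
    Path(path).write_text(json.dumps(list(api_names), indent=2))
    return str(path)


if __name__ == "__main__":
    # Written to the system temp directory here; any writable location works.
    target = Path(tempfile.gettempdir()) / "targets.json"
    saved = write_target_file(target, ["torch.add", "torch.abs"])
    # Pass this path as the train.target value in the command above.
    print(saved)
```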

For TensorFlow

The command below extracts constraints from "tf.add" and "tf.abs". The extracted constraints are stored in $(pwd)/repro/records/tf. The approximate cost is less than $0.01, since these operators do not have complicated constraints.

PYTHONPATH=/DeepConstr/:/DeepConstr/nnsmith/:/DeepConstr/deepconstr/:$PYTHONPATH \
python deepconstr/train/run.py train.record_path=repro/records/tf backend.type=xla \
model.type=tensorflow hydra.verbose=train train.parallel=1 train.num_eval=300 train.pass_rate=95 hydra.verbose=['train'] \
train.retrain=false train.target='["tf.add","tf.abs"]'

For NumPy

The command below extracts constraints from "numpy.add"; the extracted constraints are stored in $(pwd)/repro/records/numpy. The approximate cost of running this command is less than $0.01.

PYTHONPATH=/DeepConstr/:/DeepConstr/nnsmith/:/DeepConstr/deepconstr/:$PYTHONPATH \
python deepconstr/train/run.py train.record_path=repro/records/numpy backend.type=numpy \
model.type=numpy hydra.verbose=train train.parallel=1 train.num_eval=300 train.pass_rate=95 hydra.verbose=['train'] \
train.retrain=false train.target='["numpy.add"]'

Reproduce Experiments

Comparative Experiment (RQ1)

Check trained operators (Table 1)

You can inspect the number of trained APIs by executing the following commands:

python experiments/apis_overview.py /DeepConstr/data/records
# Number of trained tf apis:  258
# Number of trained torch apis:  843

Coverage Comparison Experiment

Note

To reproduce this experiment, please pull our Docker image.

We have four baselines for conducting experiments. Additionally, approximately 700 operators (programs) require testing for PyTorch and 150 operators for TensorFlow. Given that each operator needs to be tested for 15 minutes, completing the experiment is time-intensive. To expedite the process, we recommend using the exp.parallel argument to enable multiple threads during the experiment (we set this to 16 when running the experiment). The experiment results are saved in the folder specified by exp.save_dir.
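To make "time-intensive" concrete, here is a back-of-the-envelope estimate based on the figures above. It assumes that each of the four baselines runs every operator for the full 15 minutes and that exp.parallel scales perfectly; both are simplifying assumptions, and estimate_hours is a hypothetical helper.

```python
def estimate_hours(num_operators, baselines=4, minutes_each=15, parallel=1):
    """Wall-clock hours for num_operators x baselines runs of minutes_each."""
    total_minutes = num_operators * baselines * minutes_each
    return total_minutes / parallel / 60


if __name__ == "__main__":
    for name, ops in (("PyTorch", 700), ("TensorFlow", 150)):
        serial = estimate_hours(ops)
        parallel16 = estimate_hours(ops, parallel=16)
        print(f"{name}: {serial} h serial, {parallel16} h with exp.parallel=16")
```

Under these assumptions, PyTorch needs about 700 hours serially and 43.75 hours with exp.parallel=16; TensorFlow needs about 150 hours serially and 9.375 hours with exp.parallel=16.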

For PyTorch

First, switch to the conda environment created for this experiment. We strongly recommend setting exp.parallel to a value larger than 1.

conda activate cov

PYTHONPATH=/DeepConstr/:/DeepConstr/nnsmith/:/DeepConstr/deepconstr/:$PYTHONPATH \
python experiments/evaluate_apis.py \
exp.save_dir=exp/torch mgen.record_path=$(pwd)/data/records/torch/ \
mgen.pass_rate=0.05 model.type=torch backend.type=torchjit \
fuzz.time=15m exp.parallel=16 mgen.noise=0.8 \
exp.targets=/DeepConstr/data/torch_dc_neuri.json \
exp.baselines="['deepconstr', 'neuri', 'symbolic-cinit', 'deepconstr_2']"

For TensorFlow

PYTHONPATH=/DeepConstr/:/DeepConstr/nnsmith/:/DeepConstr/deepconstr/:$PYTHONPATH \
python experiments/evaluate_apis.py \
exp.save_dir=exp/tf mgen.record_path=$(pwd)/data/records/tf/ \
mgen.pass_rate=0.05 model.type=tensorflow backend.type=xla \
fuzz.time=15m exp.parallel=16 mgen.noise=0.8 \
exp.targets=/DeepConstr/data/tf_dc_neuri.json \
exp.baselines="['deepconstr', 'neuri', 'symbolic-cinit', 'deepconstr_2']"

Summarize the results

Specify the folder that you used in the previous experiment with -f, and use the -o option to name the output file. The final experiment results are saved under the name given to -o.

For example, to summarize the results in exp/torch and exp/tf and save them as torch_exp.csv and tf_exp.csv, use the following commands:

python experiments/summarize_merged_cov.py -f exp/torch -o torch_exp -p deepconstr -k torch
# Result will be saved at /DeepConstr/results/torch_exp.csv
python experiments/summarize_merged_cov.py -f exp/tf -o tf_exp -p deepconstr -k tf
# Result will be saved at /DeepConstr/results/tf_exp.csv

When you encounter abnormal values

Occasionally, you may encounter abnormal coverage values, such as 0. In such cases, please refer to the list of abnormal values saved at $(pwd)/results/unnormal_val*. To address these issues, re-run the experiment with the following adjustments to your arguments: mode=fix exp.targets=$(pwd)/results/unnormal_val*.

PYTHONPATH=/DeepConstr/:/DeepConstr/nnsmith/:/DeepConstr/deepconstr/:$PYTHONPATH \
python experiments/evaluate_apis.py \
exp.save_dir=pt_gen mgen.record_path=$(pwd)/data/records/torch/ \
mgen.pass_rate=0 model.type=torch backend.type=torchjit \
fuzz.time=15m exp.parallel=1 mgen.noise=0.8 \
exp.targets=/DeepConstr/results/unnormal_val_deepconstr_torch.json

Constraint Assessment (RQ2)

You can review the overall scores of the constraints by executing the following script:

python experiments/eval_constr.py
#######  torch  ####### 
#DeepConstr
## Num of Sub Constraints :  7072 from 929 number of operators
## Mean :  7.61248654467169  Median  6
#DeepConstr^s
## Num of Sub Constraints :  7540 from 855 number of operators
## Mean :  8.818713450292398  Median  7
#...

This script automatically gathers the constraints from the default location (/DeepConstr/data/records/). The resulting plots are saved at /DeepConstr/results/5_dist_tf.png for TensorFlow and /DeepConstr/results/5_dist_torch.png for PyTorch.
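The means in the sample output are consistent with dividing the number of sub-constraints by the number of operators. The short check below reproduces that arithmetic from the printed figures; it is only a sanity check on the numbers shown above, not a re-implementation of eval_constr.py, and mean_per_operator is a hypothetical helper.

```python
def mean_per_operator(num_sub_constraints, num_operators):
    """Average number of sub-constraints per operator."""
    return num_sub_constraints / num_operators


if __name__ == "__main__":
    # Figures copied from the sample output above (torch section).
    print("DeepConstr  :", mean_per_operator(7072, 929))
    print("DeepConstr^s:", mean_per_operator(7540, 855))
```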
