Skip to content

Latest commit

 

History

History
194 lines (133 loc) · 5.3 KB

README.md

File metadata and controls

194 lines (133 loc) · 5.3 KB

PPR: Pairwise Program Reduction

In many compiler testing scenarios, the bug-triggering programs (referred to as variants) are generated by randomly mutating existing programs (referred to as seeds) from a pool. Traditional program reducers only aim to reduce the bug-triggering variants, but neglect the potential information from seeds, even though they are available.

PPR is a pairwise program reducer built upon Perses. Given a seed program $P_{s}$, a variant $P_{v}$ derived from $P_{s}$, and the properties $P_{s}$ and $P_{v}$ exhibit separately (e.g., $P_{v}$ crashes a compiler whereas $P_{s}$ does not), PPR not only reduces the sizes of $P_{s}$ and $P_{v}$, but also minimizes the differences between $P_{s}$ and $P_{v}$.

The final result of PPR is a pair of minimized programs $p_{s}$ and $p_{v}$. $p_{v}$ still crashes a compiler whereas $p_{s}$ still does not. The minimized differences between $p_{s}$ and $p_{v}$ reveal the program elements that are more relevant to the bug.

PPR are designed for several purposes:

  • To help compiler developers identify bug-triggering changes on program inputs
  • To help compiler fuzzer developers identify effective changes/mutations

PPR requires the following conditions to be satisfied:

  • The seed program is available

  • The seed and variant programs share some commonalities. In the implementation, PPR can tolerate that at most 90% of the seed and variant programs are different. We will make this threshold value toggleable in future updates.

  • The seed and variant programs exhibit different properties, e.g., the variant program triggers a compiler bug while the seed program does not.

Important Flags

  • --test-script <test-script.sh>:

    The script encodes the constraints for both seed program and variant program. It checks whether both programs satisfy their corresponding constraints. The reduced version of seed program and variant program should still satisfy the constraints.

    The script should return 0 if the constraints are satisfied.

    Required.

  • --input-file <seed-program-file>:

    The seed program needs to be reduced.

    Required.

  • --variant-file <variant-program-file>:

    The variant program needs to be reduced.

    Required.

  • --output-dir <directory-for-results>:

    The directory to save the results.

    Required.

  • --min-tdiff <bool>:

    Whether to enable tree-based diff reducer.

    Optional. Default: true

  • --min-ldiff <bool>:

    Whether to enable list-based diff reducer.

    Optional. Default: true

  • --min-commonality <bool>:

    Whether to enable commonality reducer.

    Optional. Default: true

Check all available command line arguments

bazelisk run //ppr/src/org/perses/ppr:main -- --help

Build and Run

To build and run PPR

# build
bazelisk build //ppr/src/org/perses/ppr:main
# run
# bazel creates a sandbox for execution, so make sure to pass absolute paths
bazelisk run //ppr/src/org/perses/ppr:main -- --input-file /absolute/path/to/seed.c --variant-file /absolute/path/to/variant.c --test-script /absolute/path/to/r.sh --output-dir /absolute/path/to/output-dir

Running example

seed.c

// original seed.c
#include <stdio.h>
int main(void)
{
    int a = 1;
    int b = 2;
    printf("%d", a);
    return 0;
}

variant.c

// original variant.c
#include <stdio.h>
int main(void)
{
    int a = 1;
    int b = 2;
    a++;
    b--;
    printf("%d", a);
    return 0;
}

r.sh

#! /bin/bash
# r.sh check the property of both seed.c and variant.c

# the compiled seed.c should print "1"
gcc seed.c -o seed
./seed > seed.txt
if ! grep -e "^1$" seed.txt ; then
    exit 1
fi

# the compiled variant.c should print "2"
gcc variant.c -o variant
./variant > variant.txt
if ! grep -e "^2$" variant.txt ; then
    exit 1
fi
exit 0

After running PPR, the final results are given in the output-dir:

Minimized seed.c

// minimized seed.c
main() {
  int a = 1;
  printf("%d", a);
}

Minimized variant.c

// minimized variant.c
main() {
  int a = 1;
  // the only critical difference
  a++;
  printf("%d", a);
}

PPR removed all code elements that are irrelevant to the property (r.sh). The only critical difference is a++;, making the variant.c print "2", different from "1" of the seed.c.

In real-world compiler testing scenarios, r.sh may contain a broader range of checking processes, including checks for undefined behavior (UB).

Supported Languages

PPR is built upon Perses, so it supports any language supported by Perses. If you want to add a new language grammar, please follow install_new_languages.md.

References

PPR was published at ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) 2023 as a full research paper.

Link to the pdf.

Link to the benchmarks.

@inproceedings{ppr,
  title={PPR: Pairwise Program Reduction},
  author={Zhang, Mengxiao and Xu, Zhenyang and Tian, Yongqiang and Jiang, Yu and Sun, Chengnian},
  booktitle={Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
  pages={338--349},
  year={2023}
}