Parallelized selected benchmarks using GNU Parallel #49

Closed
wants to merge 34 commits into from
34 commits
8de402e
Complete vps-audit and makeself benchmarks for CI integration + minor…
Geoka1 Dec 23, 2024
4f64a85
set-up-to-date
Geoka1 Dec 27, 2024
78a0ede
deps fix
Geoka1 Dec 27, 2024
1e864e6
fix deps
Geoka1 Dec 27, 2024
ef73a25
Merge branch 'complete-vps-makeself'
Geoka1 Dec 27, 2024
e5d652c
fix deps v2
Geoka1 Dec 27, 2024
d7be5ce
Merge branch 'binpash:main' into main
Geoka1 Dec 27, 2024
952a8e2
fixes
Geoka1 Dec 27, 2024
0fa7a0f
fixes
Geoka1 Dec 27, 2024
1ae7d6d
remove extra files
Geoka1 Dec 27, 2024
d685eec
vps-audit verification changes
Geoka1 Dec 30, 2024
e20dcc1
fetched changes
Geoka1 Dec 30, 2024
235baf4
Added makeself, vps-audit and vps-audit with negation
Geoka1 Jan 2, 2025
7308b9d
change verification for vps, added vps-negate to ci
Geoka1 Jan 2, 2025
51765ff
Make vps-audit work with existing Docker image
Geoka1 Jan 3, 2025
f9461b0
Changed README.md for iptables to work
Geoka1 Jan 3, 2025
70a04e2
vps verification changes
Geoka1 Jan 3, 2025
8926a2a
fix infotest because of varying bin sizes
Geoka1 Jan 4, 2025
9243c16
Make verify.sh fit common format
vagos Jan 4, 2025
3d60532
Bring up-to-date with upstream
Geoka1 Jan 4, 2025
45bc648
vps-audit-negate verification fixes
Geoka1 Jan 5, 2025
0ccc54d
Minor fixes on makeself
Geoka1 Jan 5, 2025
cd30d0e
Added new benchmarks on tests.yml
Geoka1 Jan 5, 2025
855f633
Merge branch 'binpash:main' into main
Geoka1 Jan 6, 2025
1860099
Merge remote-tracking branch 'upstream/main'
Geoka1 Jan 6, 2025
4d737da
Parallelizing benchmarks using GNU-Parallel
Geoka1 Jan 6, 2025
fa2f37d
Introduced GNU Parallel to more benchmarks
Geoka1 Jan 8, 2025
70778dc
Added parallelized VPS audit
Geoka1 Jan 9, 2025
5eb8c40
Changes in GNU parallel benchmarks and added Shark
Geoka1 Jan 10, 2025
b5dacee
Merge branch 'binpash:main' into adding-systems
Geoka1 Jan 11, 2025
e7dfaee
multiple changes in systems
Geoka1 Jan 11, 2025
edd30b2
Changed transformations
Geoka1 Jan 11, 2025
6b9f8e2
Removed time command
Geoka1 Jan 11, 2025
d734976
fixes
Geoka1 Jan 11, 2025
2 changes: 2 additions & 0 deletions infrastructure/systems/GNU-Parallel/aurpkg/.gitignore
@@ -0,0 +1,2 @@
input
outputs
6 changes: 6 additions & 0 deletions infrastructure/systems/GNU-Parallel/aurpkg/cleanup.sh
@@ -0,0 +1,6 @@
IN=input/
OUT=outputs/

rm -rf ${IN}/packages
rm -rf ${OUT}
exit
16 changes: 16 additions & 0 deletions infrastructure/systems/GNU-Parallel/aurpkg/deps.sh
@@ -0,0 +1,16 @@
REPO_TOP=$(git rev-parse --show-toplevel)
IN=$REPO_TOP/aurpkg/input

mkdir -p ${IN}/deps/
pkgs='ffmpeg unrtf imagemagick libarchive-tools libncurses5-dev libncursesw5-dev zstd liblzma-dev libbz2-dev zip unzip nodejs tcpdump makedeb'

# Add the makedeb repository
sudo apt-get install -y gpg
wget -qO - 'https://proget.makedeb.org/debian-feeds/makedeb.pub' | gpg --dearmor | sudo tee /usr/share/keyrings/makedeb-archive-keyring.gpg 1> /dev/null
echo 'deb [signed-by=/usr/share/keyrings/makedeb-archive-keyring.gpg arch=all] https://proget.makedeb.org/ makedeb main' | sudo tee /etc/apt/sources.list.d/makedeb.list
sudo apt update

if ! dpkg -s $pkgs >/dev/null 2>&1 ; then
sudo apt-get install $pkgs -y
echo 'Packages Installed'
fi
19 changes: 19 additions & 0 deletions infrastructure/systems/GNU-Parallel/aurpkg/input.sh
@@ -0,0 +1,19 @@
#!/bin/bash

REPO_TOP=$(git rev-parse --show-toplevel)
IN=$REPO_TOP/aurpkg/input

cd $REPO_TOP || exit 1

mkdir -p ${IN}

# download the packages for the package building
if [ ! -f ${IN}/packages ]; then
wget https://atlas.cs.brown.edu/data/packages --no-check-certificate -O ${IN}/packages
echo "Package dataset downloaded"
fi

if [[ "$@" == *"--small"* ]]; then
head -n 10 ${IN}/packages > ${IN}/packages_small
mv ${IN}/packages_small ${IN}/packages
fi
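Note on the `--small` check: it is a substring match on the joined argument list, so it would also fire for an argument such as `--smaller`. A minimal sketch of an exact-match alternative, assuming a hypothetical helper named `has_flag` that is not part of this PR:

```shell
#!/bin/bash
# has_flag is a hypothetical helper, not part of this PR.
# Usage: has_flag FLAG ARGS...
# Returns 0 if any argument equals FLAG exactly, 1 otherwise.
has_flag() {
    local wanted="$1"
    shift
    local arg
    for arg in "$@"; do
        if [ "$arg" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Example: only an exact "--small" argument selects the small input.
if has_flag --small "$@"; then
    echo "using small input"
else
    echo "using full input"
fi
```

The substring form is shorter and may be good enough here; the loop only matters if other flags sharing the `--small` prefix are ever introduced.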
14 changes: 14 additions & 0 deletions infrastructure/systems/GNU-Parallel/aurpkg/run.sh
@@ -0,0 +1,14 @@
#!/bin/bash
REPO_TOP=$(git rev-parse --show-toplevel)
IN=$REPO_TOP/aurpkg/input/packages
OUT=${OUT:-$REPO_TOP/aurpkg/outputs}
BENCHMARK_SHELL=${BENCHMARK_SHELL:-bash}
mkdir -p ${OUT}

script="./scripts/pacaur.sh"

# Switch to user "user" to avoid permission issues

echo "$script"
$BENCHMARK_SHELL "$script" "$IN" "$OUT"
echo "$?"
61 changes: 61 additions & 0 deletions infrastructure/systems/GNU-Parallel/aurpkg/scripts/pacaur.sh
@@ -0,0 +1,61 @@
#!/bin/bash

# IN="$1"
# OUT="$2"

# mkcd() { mkdir -p "$1" && cd "$1"; }

# # check if not running as root
# # test "$UID" -gt 0 || { info "don't run this as root!"; exit; }

# # set link to plaintext PKGBUILDs
# pkgbuild="https://aur.archlinux.org/cgit/aur.git/plain/PKGBUILD?h"

# run_tests() {
# pkg=$1
# mkcd "${OUT}/$pkg"

# curl --insecure -o PKGBUILD "$pkgbuild=$pkg" 2> /dev/null || echo ' '

# #info "fetch required pgp keys from PKGBUILD"
# #gpg --recv-keys $(sed -n "s:^validpgpkeys=('\([0-9A-Fa-fx]\+\)').*$:\1:p" PKGBUILD)
# # Some failure is expected here, so we ignore the return code
# makedeb -d >> ../$pkg.txt 2>&1
# cd -
# }
# export -f run_tests

# # loop over required packages
# for pkg in $(cat ${IN} | tr '\n' ' ' );
# do
# echo "$pkg"
# run_tests $pkg
# done

# Using GNU Parallel:

IN="$1"
OUT="$2"

mkcd() { mkdir -p "$1" && cd "$1"; }

# Set link to plaintext PKGBUILDs
pkgbuild="https://aur.archlinux.org/cgit/aur.git/plain/PKGBUILD?h"

run_tests() {
pkg="$1"
mkcd "${OUT}/$pkg" || exit 1

curl --insecure -o PKGBUILD "$pkgbuild=$pkg" 2>/dev/null || echo ' '

# Fetch required pgp keys from PKGBUILD (optional)
# gpg --recv-keys $(sed -n "s:^validpgpkeys=('\([0-9A-Fa-fx]\+\)').*$:\1:p" PKGBUILD)
# Some failure is expected here, so we ignore the return code
makedeb -d >> "../$pkg.txt" 2>&1
cd - > /dev/null || exit 1
}
export -f run_tests mkcd
export pkgbuild OUT

# Read package names from the input file and process them in parallel
parallel run_tests :::: "$IN"
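Note on the exports: `run_tests` is exported with `export -f` and executed by `parallel` in a new shell, so it can only read functions and variables that are present in the environment. That applies to `pkgbuild` and `OUT` alike. A minimal sketch of the rule, using plain `bash -c` in place of GNU Parallel and hypothetical names (`show_vars`, `EXPORTED_VAR`, `LOCAL_VAR`):

```shell
#!/bin/bash
# Sketch only: show_vars, EXPORTED_VAR and LOCAL_VAR are hypothetical names.
# A child bash sees exported functions and exported variables, nothing else.
unset EXPORTED_VAR LOCAL_VAR

show_vars() {
    echo "exported=${EXPORTED_VAR:-unset} local=${LOCAL_VAR:-unset}"
}
export -f show_vars

EXPORTED_VAR="visible"
LOCAL_VAR="hidden"
export EXPORTED_VAR   # LOCAL_VAR is deliberately left unexported

# In the current shell both variables are readable.
in_parent=$(show_vars)

# In a child bash (standing in for a job shell) only the exported one survives.
in_child=$(bash -c 'show_vars')

echo "parent: $in_parent"
echo "child:  $in_child"
```

If a variable cannot be exported, passing it to the function as an extra argument is another option.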
64 changes: 64 additions & 0 deletions infrastructure/systems/GNU-Parallel/aurpkg/verify.sh
@@ -0,0 +1,64 @@
#!/bin/bash

# Exit immediately if a command exits with a non-zero status
# set -e

cd "$(realpath $(dirname "$0"))"
mkdir -p hashes/small

[ ! -d "outputs" ] && echo "Directory 'outputs' does not exist" && exit 1

if [[ "$@" == *"--small"* ]]; then
hash_folder="hashes/small"
else
hash_folder="hashes"
fi

directory="outputs"

if [[ "$@" == *"--generate"* ]]; then
# Loop over the per-package .txt logs directly under the outputs directory
find "$directory" -maxdepth 1 -type f -name "*.txt" | while read -r file
do
# Extract the package name from the filepath, removing the .txt extension
package_name=$(basename "$file" .txt)

# Generate SHA-256 hash
hash=$(shasum -a 256 "$file" | awk '{ print $1 }')

# Save the hash to a file with the package name
echo "$hash" > "$hash_folder/$package_name.hash"

# Print the filename and hash
echo "$hash_folder/$package_name.hash $hash"
done

exit 0
fi

# Loop over the per-package .txt logs directly under the outputs directory
find "$directory" -maxdepth 1 -type f -name "*.txt" | while read -r file
do
package_name=$(basename "$file" .txt)

if [ ! -f "$hash_folder/$package_name.hash" ]; then
echo "Hash file for $package_name does not exist."
continue
fi

# Generate SHA-256 hash
hash=$(shasum -a 256 "$file" | awk '{ print $1 }')

# Read the stored hash
stored_hash=$(cat "$hash_folder/$package_name.hash")

diff <(echo "$hash") <(echo "$stored_hash") > /dev/null
match=$?

# echo "$package_name $match"
# Because of fluctuations in the makepkg output, we will ignore the hash mismatch
echo "$package_name 0"

done
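Note on the verification loop: `match` is computed but the script always prints `0`, which the comment explains is intentional because the build output fluctuates. For reference, a minimal sketch of what a strict comparison could look like, assuming hypothetical helpers `sha256_of` and `check_file` and throwaway files in a temporary directory. This is not the behavior of this PR:

```shell
#!/bin/bash
# Hypothetical helpers, not part of this PR.

# Print the SHA-256 of a file, using whichever common tool is available.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{ print $1 }'
    else
        shasum -a 256 "$1" | awk '{ print $1 }'
    fi
}

# Print "<name> 0" if the file matches its stored hash, "<name> 1" otherwise.
check_file() {
    local file="$1" hash_file="$2"
    local name
    name=$(basename "$file" .txt)
    if [ "$(sha256_of "$file")" = "$(cat "$hash_file")" ]; then
        echo "$name 0"
    else
        echo "$name 1"
    fi
}

# Usage on throwaway files: record a hash, verify, then change the file.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/pkg.txt"
sha256_of "$tmp/pkg.txt" > "$tmp/pkg.hash"
check_file "$tmp/pkg.txt" "$tmp/pkg.hash"
printf 'changed\n' > "$tmp/pkg.txt"
check_file "$tmp/pkg.txt" "$tmp/pkg.hash"
rm -rf "$tmp"
```

A strict check like this only makes sense once the logged output is deterministic, for example after filtering out timestamps or paths.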
2 changes: 2 additions & 0 deletions infrastructure/systems/GNU-Parallel/bio/.gitignore
@@ -0,0 +1,2 @@
inputs
outputs
16 changes: 16 additions & 0 deletions infrastructure/systems/GNU-Parallel/bio/Gene_locs.txt
@@ -0,0 +1,16 @@
COL1A1 17 48260650 48291501 COL1A1_HUMAN.fa
COL1A2 7 94011365 94060544 COL1A2_HUMAN.fa
ALB 4 74262831 74287129 ALB_HUMAN.fa
AMBN 4 71457973 71473005 AMBN_HUMAN.fa
AMELY Y 6733959 6742068 AMELY_HUMAN.fa
AMELX X 11311533 11318881 AMELX_HUMAN.fa
ENAM 4 71494461 71552533 ENAM_HUMAN.fa
TUFT1 1 151512781 151556059 TUFT1_HUMAN.fa
KLK4 19 51409608 51413994 KLK4_HUMAN.fa
MMP20 11 102447566 102496063 MMP20_HUMAN.fa
AMTN 4 71384257 71398459 AMTN_HUMAN.fa
ODAM 4 71062213 71070293 ODAM_HUMAN.fa
COL17A1 10 105791044 105845760 COHA1_HUMAN.fa
WAS X 48535995 48550826 WASP_HUMAN.fa
XIRP2 2 167744997 168116263 XIRP2_HUMAN.fa
SCAF1 19 50145382 50161899 SFR19_HUMAN.fa
4 changes: 4 additions & 0 deletions infrastructure/systems/GNU-Parallel/bio/README
@@ -0,0 +1,4 @@
Papers
1 https://www.nature.com/articles/s41586-019-1728-8
2 https://www.nature.com/articles/s41586-019-1555-y
3 https://www.nature.com/articles/s41586-020-2153-8
1 change: 1 addition & 0 deletions infrastructure/systems/GNU-Parallel/bio/cleanup.sh
@@ -0,0 +1 @@
rm -rf inputs outputs
46 changes: 46 additions & 0 deletions infrastructure/systems/GNU-Parallel/bio/deps.sh
@@ -0,0 +1,46 @@
# install dependencies
required_version="1.7"

# Check if Samtools is already installed and matches the required version
if command -v samtools &>/dev/null; then
installed_version=$(samtools --version | head -n 1 | awk '{print $2}')
if [[ "$installed_version" == "$required_version" ]]; then
echo "Samtools version $required_version is already installed."
else
echo "A different version of Samtools is installed: $installed_version."
echo "Warning: version $required_version is required, but this script does not replace an existing installation."
fi
else
echo "Samtools is not installed. Proceeding with the installation."
# Update and install prerequisites
echo "Installing prerequisites..."
sudo apt update
sudo apt install -y build-essential libncurses5-dev libncursesw5-dev libbz2-dev liblzma-dev libcurl4-openssl-dev libssl-dev wget zlib1g-dev

# Download Samtools version 1.7
echo "Downloading Samtools version $required_version..."
wget https://github.com/samtools/samtools/releases/download/$required_version/samtools-$required_version.tar.bz2

# Extract the downloaded file
echo "Extracting Samtools..."
tar -xvjf samtools-$required_version.tar.bz2
cd samtools-$required_version

# Compile and install
echo "Compiling and installing Samtools..."
./configure
make
sudo make install

sudo ln -s /usr/local/bin/samtools /usr/bin/samtools

# Verify the installation
echo "Verifying the installation..."
installed_version=$(samtools --version | head -n 1 | awk '{print $2}')
if [[ "$installed_version" == "$required_version" ]]; then
echo "Samtools version $required_version has been successfully installed."
else
echo "Failed to install the correct version of Samtools."
exit 1
fi
fi
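Note on the version check: it relies on the first line of `samtools --version` having the form `samtools <version>`. A minimal sketch of that parsing step on canned text, assuming a hypothetical helper `first_line_version` and sample output written here for illustration rather than captured from a real run:

```shell
#!/bin/bash
# first_line_version is a hypothetical helper: it reads version output on
# stdin and prints the second whitespace-separated field of the first line.
first_line_version() {
    head -n 1 | awk '{ print $2 }'
}

required_version="1.7"

# Illustrative text only; not captured from a real samtools run.
installed_version=$(printf 'samtools 1.7\nUsing htslib 1.7\n' | first_line_version)

# Plain string comparison, as in deps.sh: "1.7" and "1.7.0" would not match.
if [ "$installed_version" = "$required_version" ]; then
    echo "match"
else
    echo "mismatch: $installed_version"
fi
```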
@@ -0,0 +1 @@
c2676d3400c9fa55f4e68992a5089a25224193397c14b4da88f5e49b46ffe47b
@@ -0,0 +1 @@
0acea3d03b7c4a98f8e636822be8e9e1b041abb329a5ff9d54ee0a52d77b2fea
@@ -0,0 +1 @@
f5675a92a97e3374af03a24d2361796ec8a845e26f8f1ef0b6d5d6895da552da
@@ -0,0 +1 @@
673f7af882f20312db2fed7df1c01f9d6f086c1044fc5cdda7348c0c8ef68c81
@@ -0,0 +1 @@
cb69075e00315c3139486502bb3fc6d2787daf3fee48889f490185d3232b2a72
@@ -0,0 +1 @@
aecef6ebb48b8f4b04ba633bc4c8230d9788e9d1cd7542f366921a087c19c445
@@ -0,0 +1 @@
f0e99d7edf2676cfe7ab5b564b5f796fa9324ca70c89b784bb7c2a5c1055e6d5
@@ -0,0 +1 @@
79a95f667457e379fb71310cf00fa643608f3d7f6dfb7421f4f28cf76669505d
@@ -0,0 +1 @@
a90edec10a5c9f1c42dd744359475712ebd9759a12dc43f6a10ed9340cbb21f8
@@ -0,0 +1 @@
55090abdc3c9d37f51a22e47460162a1ba2b45078b944e060a3ba4dc28a68a5d
@@ -0,0 +1 @@
b231993a86d988bc2f24e342e6ac32c58651e9cc121d5c6961ec24c3a12a3624
@@ -0,0 +1 @@
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
@@ -0,0 +1 @@
fec413b3018fec127ff33d93219582195fd5286c9e6f01f1d58f365a62fbaecf
@@ -0,0 +1 @@
fec413b3018fec127ff33d93219582195fd5286c9e6f01f1d58f365a62fbaecf
@@ -0,0 +1 @@
fec413b3018fec127ff33d93219582195fd5286c9e6f01f1d58f365a62fbaecf
@@ -0,0 +1 @@
fec413b3018fec127ff33d93219582195fd5286c9e6f01f1d58f365a62fbaecf
@@ -0,0 +1 @@
fec413b3018fec127ff33d93219582195fd5286c9e6f01f1d58f365a62fbaecf
@@ -0,0 +1 @@
fec413b3018fec127ff33d93219582195fd5286c9e6f01f1d58f365a62fbaecf
@@ -0,0 +1 @@
fec413b3018fec127ff33d93219582195fd5286c9e6f01f1d58f365a62fbaecf
@@ -0,0 +1 @@
fec413b3018fec127ff33d93219582195fd5286c9e6f01f1d58f365a62fbaecf
@@ -0,0 +1 @@
fec413b3018fec127ff33d93219582195fd5286c9e6f01f1d58f365a62fbaecf
@@ -0,0 +1 @@
fec413b3018fec127ff33d93219582195fd5286c9e6f01f1d58f365a62fbaecf
28 changes: 28 additions & 0 deletions infrastructure/systems/GNU-Parallel/bio/input.sh
@@ -0,0 +1,28 @@
IN=inputs
IN_NAME=input.txt

if [[ "$@" == *"--small"* ]]; then
IN_NAME=input_small.txt
fi

if [[ $1 == "-c" ]]; then
rm -rf *.bam
rm -rf *.sam
rm -rf ../output
exit
fi

cd "$(realpath $(dirname "$0"))"

mkdir -p inputs
mkdir -p outputs

# Each input line holds: population, sample name, download link.
# Only the sample name and the link are used; "_" absorbs any extra fields.
while read -r pop sample link _; do
    if [[ ! -f "${IN}/${sample}.bam" ]]; then
        wget -O "${IN}/${sample}.bam" "$link"
    fi
done < "${IN_NAME}"