PanVC 3

PanVC 3 is a set of tools to be used as part of a variant calling workflow that uses short reads as its input. The reads are aligned to an index generated from a multiple sequence alignment. A suitable index may be built from founder sequences.

A variant calling workflow that uses PanVC 3 may consist of, for example, the following phases:

Generating founder sequences from known variants
Indexing the founder sequences
Running the read alignment and variant calling workflow

The founder sequences may be generated with vcf2multialign.

Academic Use

If you use the software in an academic setting, we kindly ask you to cite "Tackling reference bias in genotyping by using founder sequences with PanVC 3".

@article{Norri2024TacklingReferenceBias,
  author = {Norri, Tuukka and Mäkinen, Veli},
  title = {Tackling reference bias in genotyping by using founder sequences with PanVC 3},
  journal = {Bioinformatics Advances},
  volume = {4},
  number = {1},
  pages = {vbae027},
  year = {2024},
  month = {03},
  issn = {2635-0041},
  doi = {10.1093/bioadv/vbae027},
  url = {https://doi.org/10.1093/bioadv/vbae027},
  eprint = {https://academic.oup.com/bioinformaticsadvances/article-pdf/4/1/vbae027/56912765/vbae027.pdf},
}

Running

A simple example workflow and test data are provided in the test-workflow subdirectory. The workflow downloads PanVC 3 as well as other required software automatically from Anaconda. Please see README.md in the subdirectory.

A more complex workflow that uses Bowtie 2 and loads the settings using Snakemake’s configuration (e.g. a YAML file) is in the bowtie2-workflow subdirectory. Please see README.md in the subdirectory.

Contents

index_msa builds a co-ordinate transformation data structure from a multiple sequence alignment and also outputs the sequences as unaligned FASTA for use as input for a read aligner.
project_alignments uses the co-ordinate transformation data structure to project alignments in BAM or SAM format to well-known co-ordinates, rewrites the CIGAR strings to match the new reference sequence, and realigns parts of the reads if needed.
recalculate_mapq recalculates the mapping qualities of the alignments given as input, taking into account the projected co-ordinate of each alignment.
subset_alignments subsets the given alignments by some criteria, e.g. selecting the (paired) alignment with the best mapping quality for each read.
count_supporting_reads counts the number of aligned reads that support some known variants. From the output, reference bias can be calculated with calculate_reference_bias.py.
rewrite_cigar replaces sequence match operations in CIGAR strings (= and X) with alignment match operations (M) and vice versa.
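
To make the difference between the two kinds of CIGAR operations concrete, here is a minimal Python sketch of the conversion described for rewrite_cigar. It is an illustration only and not the tool's implementation; the function names are hypothetical, and merging adjacent runs of the same operation is an assumption made here for tidiness.

```python
import re

# A CIGAR string is a sequence of <run length><operation> pairs.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
VALID_CIGAR = re.compile(r"(\d+[MIDNSHP=X])+")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    if not VALID_CIGAR.fullmatch(cigar):
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def _append(ops, length, op):
    """Append a run, merging it with the previous run if the operation matches."""
    if ops and ops[-1][1] == op:
        ops[-1][0] += length
    else:
        ops.append([length, op])


def _format(ops):
    return "".join(f"{n}{op}" for n, op in ops)


def to_alignment_match(cigar):
    """Replace = and X with M."""
    ops = []
    for length, op in parse_cigar(cigar):
        _append(ops, length, "M" if op in "=X" else op)
    return _format(ops)


def to_sequence_match(cigar, read, ref):
    """Replace M with = or X by comparing the read with the reference.

    `ref` is the reference segment that starts at the alignment position.
    """
    ops = []
    i = j = 0  # current offsets in the read and in the reference segment
    for length, op in parse_cigar(cigar):
        if op == "M":
            for k in range(length):
                _append(ops, 1, "=" if read[i + k] == ref[j + k] else "X")
        else:
            _append(ops, length, op)
        if op in "MIS=X":  # operations that consume read bases
            i += length
        if op in "MDN=X":  # operations that consume reference bases
            j += length
    return _format(ops)
```

Going from = and X to M only needs the CIGAR string itself, whereas the opposite direction also needs the read and the reference bases, since M does not say whether the aligned bases agree.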

Please use the --help option with each of the tools for usage. See also the workflow written for the test data.
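
The idea behind the co-ordinate transformation used by index_msa and project_alignments can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not the data structure PanVC 3 builds, which is not described here and is presumably far more compact. The function names, the use of - as the gap character, and the convention for positions that fall inside a gap of the destination row are all assumptions.

```python
def ungapped_to_column(row, gap="-"):
    """For each non-gap character of an aligned row, record its MSA column."""
    return [col for col, ch in enumerate(row) if ch != gap]


def column_to_ungapped(row, gap="-"):
    """For each MSA column, count the non-gap characters strictly before it.

    For a column that holds a gap, this yields the position of the next
    non-gap character in the row (an assumed convention).
    """
    positions = []
    count = 0
    for ch in row:
        positions.append(count)
        if ch != gap:
            count += 1
    return positions


def project(pos, src_row, dst_row):
    """Map a 0-based position in the unaligned source sequence to the
    corresponding 0-based position in the unaligned destination sequence."""
    col = ungapped_to_column(src_row)[pos]
    return column_to_ungapped(dst_row)[col]


# Two rows of a toy multiple sequence alignment.
reference = "ACG--TAC"
founder = "ACGTTTA-"

# Position 6 of the unaligned founder (the final A) sits in column 6,
# which corresponds to position 4 of the unaligned reference.
print(project(6, founder, reference))
```

A read aligned to the unaligned founder sequence reports positions in founder co-ordinates; a mapping of this kind is what lets them be expressed in the co-ordinates of a well-known reference.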

Installing

Binaries for Linux on x86-64 are available on Anaconda. PanVC 3 may be installed with conda install -c tsnorri -c conda-forge panvc3=v1.0. glibc 2.28 or newer is required. (ldd --version may be used to check the version installed with your operating system.)
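
As an alternative to reading the output of ldd --version by eye, the glibc requirement can be checked with a short Python script. This is a small sketch that relies on platform.libc_ver() from the Python standard library, which reports the C library the Python interpreter itself is linked against; the helper names are hypothetical.

```python
import platform

REQUIRED = (2, 28)  # minimum glibc version stated above


def parse_version(text):
    """Turn a dotted version such as '2.31' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def glibc_is_new_enough(lib, version, required=REQUIRED):
    """Return True only if the C library is glibc at `required` or newer."""
    if lib != "glibc" or not version:
        return False
    # Compare numerically: as strings, "2.5" would sort after "2.28".
    return parse_version(version) >= required


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if glibc_is_new_enough(lib, version):
        print(f"{lib} {version} satisfies the requirement")
    else:
        print(f"found {lib or 'unknown libc'} {version}; glibc 2.28 or newer is required")
```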

Building

To clone the repository with submodules, please use git clone --recursive https://github.com/tsnorri/panvc3.git.

With conda-build

A conda package can be built with conda-build as follows. The build script has been tested with conda-build 3.25.0. glibc 2.28 or newer is required.

cd conda
./conda-build.sh

Conda-build will then report the location of the built package, from which the binaries may be extracted.

By Hand

The following software and libraries are required to build PanVC 3. The tested versions are also listed.

The following are needed to build libdispatch (provided as a Git submodule) on Linux:

After installing the prerequisites, please do the following:

Create a file called local.mk in the root of the cloned repository to specify build variables. One of the files linux-static.local.mk and conda/local.mk.m4 may be used as a starting point.
Run Make with e.g. make -j16.
