- This repository contains two Bash pipelines: gatk_cohort.sh, which trains the GATK cohort model, and gatk_case.sh, which discovers germline CNVs in case samples.
- These scripts were developed at the Institute of Biological Chemistry of the School of Exact and Natural Sciences (IQUIBICEN), University of Buenos Aires.
gatk_cohort.sh generates the cohort model for exome germline CNV discovery with GATK, according to the GATK gCNV guidelines.
- Requirements:
- Tools:
- GATK 4.1.0.0
- gCNV computational Python module gcnvkernel, installed in a user conda environment (for details, see https://gatk.broadinstitute.org/hc/en-us/articles/360035889851)
- Input:
- Reference genome used to generate sample BAMs
- Exome capture kit used for library preparation
- COHORT sample BAMs
- Output:
- A COHORT model to be used for CASE CNV discovery
- VCFs of CNV calls for the COHORT samples.
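
The cohort steps can be sketched as a dry run that only prints GATK commands. This is a minimal illustration, not the contents of gatk_cohort.sh: the file names (`ref.hg38.fasta`, `capture_kit.bed`, the `counts/`, `ploidy/` and `gcnv/` directories) and the padding value are assumptions, and contig ploidy determination, interval filtering and post-processing are left out for brevity.

```shell
#!/usr/bin/env bash
# Dry-run sketch of cohort mode: prints commands instead of running them.
# All paths are hypothetical placeholders, not taken from gatk_cohort.sh.
set -euo pipefail

REF="ref.hg38.fasta"                # assumed reference FASTA name
KIT="capture_kit.bed"               # assumed capture kit target file
INTERVALS="ref.hg38.interval_list"  # interval list later reused in case mode

# Print the command that turns the capture kit targets into padded intervals.
preprocess_cmd() {
  echo "gatk PreprocessIntervals -R $REF -L $KIT --bin-length 0 --padding 250 -imr OVERLAPPING_ONLY -O $INTERVALS"
}

# Print the read-count collection command for one BAM.
count_cmd() {
  local bam="$1"
  local sample
  sample="$(basename "$bam" .bam)"
  echo "gatk CollectReadCounts -R $REF -L $INTERVALS -imr OVERLAPPING_ONLY -I $bam -O counts/${sample}.tsv --format TSV"
}

# Print the cohort-mode calling command for any number of count files.
cohort_call_cmd() {
  local inputs=""
  local f
  for f in "$@"; do
    inputs+=" -I $f"
  done
  echo "gatk GermlineCNVCaller --run-mode COHORT -L $INTERVALS -imr OVERLAPPING_ONLY${inputs} --contig-ploidy-calls ploidy/cohort-calls --output gcnv --output-prefix cohort"
}

preprocess_cmd
count_cmd "bams/sampleA.bam"
cohort_call_cmd counts/sampleA.tsv counts/sampleB.tsv
```

Printing the commands first makes it easy to review paths and flags before committing compute time to the real run.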
gatk_case.sh identifies CNVs in case samples using the cohort model generated by gatk_cohort.sh. Case samples should be sequenced under the same conditions as the cohort.
- Requirements:
- Tools:
- GATK 4.1.0.0
- gCNV computational Python module gcnvkernel, installed in a user conda environment (for details, see https://gatk.broadinstitute.org/hc/en-us/articles/360035889851)
- Input:
- Reference genome used to generate sample BAMs
- ref.hg38.interval_list, generated in the COHORT mode, by gatk_cohort.sh
- CASE sample BAMs
- COHORT model
- Output:
- VCF of CNV calls for CASE samples.
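
Case mode can be sketched the same way, as a dry run that prints the GATK commands for one sample. This is an illustration under assumptions, not the contents of gatk_case.sh: the model directories (`ploidy/cohort-model`, `gcnv/cohort-model`), the output paths and the hg38-style sex chromosome names `chrX`/`chrY` are hypothetical, and read counts are assumed to have been collected already over ref.hg38.interval_list.

```shell
#!/usr/bin/env bash
# Dry-run sketch of case mode: prints commands instead of running them.
# Directory names are hypothetical placeholders, not taken from gatk_case.sh.
set -euo pipefail

PLOIDY_MODEL="ploidy/cohort-model"  # assumed ploidy model location from cohort mode
GCNV_MODEL="gcnv/cohort-model"      # assumed gCNV model location from cohort mode

# Print the case-mode ploidy command for one count file.
case_ploidy_cmd() {
  local counts="$1"
  echo "gatk DetermineGermlineContigPloidy --model $PLOIDY_MODEL -I $counts --output ploidy --output-prefix case"
}

# Print the case-mode CNV calling command for one count file.
case_call_cmd() {
  local counts="$1"
  echo "gatk GermlineCNVCaller --run-mode CASE --model $GCNV_MODEL -I $counts --contig-ploidy-calls ploidy/case-calls --output gcnv --output-prefix case"
}

# Print the post-processing command that writes the VCFs for one sample index.
postprocess_cmd() {
  local index="$1"
  local sample="$2"
  echo "gatk PostprocessGermlineCNVCalls --model-shard-path $GCNV_MODEL --calls-shard-path gcnv/case-calls --contig-ploidy-calls ploidy/case-calls --sample-index $index --allosomal-contig chrX --allosomal-contig chrY --output-genotyped-intervals vcf/${sample}.intervals.vcf.gz --output-genotyped-segments vcf/${sample}.segments.vcf.gz"
}

case_ploidy_cmd counts/case1.tsv
case_call_cmd counts/case1.tsv
postprocess_cmd 0 case1
```

The sample index passed to the post-processing step refers to the position of the sample within the case calls, so it starts at 0 for a single-sample run.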