Skip to content

sgul-genetics-centre-bioinformatics/SGUL-Exome-Pipeline-on-HPC

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 

Repository files navigation

SGUL-Exome-Pipeline-on-HPC

Scripts for QC, Alignemnt, Variant Calling and Joint Genotyping of Exome Sequence Data. Following the GATK Best Practices guidelines.

Sample fastq data (paired end R1 and R2 files) is stored in individual sample folders in Unaligned as input for pipeline

e.g.

  • ./Unaligned

    • ./MyProject1

      • ./MYsample1
      • ./MYsample2
      • ./MYsample3
    • Directory/file structure:

      • /Exomes

        • ./Aligned
        • ./Unaligned
        • ./VCF
        • ./Annotated
        • ./gVCF
        • ./Jobs
    • Needs the following resources to run: For more information check dependencies.py

      • FastQC
      • MultiQC
      • python2.7.5 These scripts have been developed and tested using python2.7. Feel free to update them.
      • bwa
      • human_g1k_v37.fasta
      • samtools
      • java
      • picard-2.815
      • gatk-4.0.4.0
      • Genome_reference_files/common_all.vcf
      • Mills_and_1000G_gold_standard.indels.hg19_modified.vcf
      • 1000G_phase1.snps.high_confidence.hg19.sites.vcf
      • BroadExACExomeIntervlas.bed (Exome Target, the this case the Broad definition)

      Usage

      in the Exome directory:

      python pipeline_Exome_SGUL_HPC_v1.1.py MyProject1

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%