Expanding the transcriptomic toolbox in prokaryotes by Nanopore sequencing of RNA and cDNA molecules

Felix Grünberger, Sébastien Ferreira-Cerca2, and Dina Grohmann

1 Department of Biochemistry, Genetics and Microbiology, Institute of Microbiology, Single-Molecule Biochemistry Lab & Biochemistry Centre Regensburg, University of Regensburg, Universitätsstraße 31, 93053 Regensburg, Germany

2 Biochemistry III – Institute for Biochemistry, Genetics and Microbiology, University of Regensburg, Universitätsstraße 31, 93053 Regensburg, Germany.

° Corresponding authors


About this repository

This is the repository for the manuscript “Expanding the transcriptomic toolbox in prokaryotes by Nanopore sequencing of RNA and cDNA molecules”. In this study, we applied and benchmarked all currently available RNA-seq kits from Oxford Nanopore Technologies to analyse RNAs in the prokaryotic model organism Escherichia coli K-12. These include:

  • Direct sequencing of native RNAs (DRS) using RNA001 & RNA002 chemistry
  • Native cDNA sequencing (cDNA) using DCS109 chemistry
  • PCR-cDNA sequencing (PCR-cDNA) using PCB109 chemistry

Full documentation here

https://felixgrunberger.github.io/microbepore/

Preprint

The preprint will soon be available on bioRxiv.
In the meantime, our previous work may be of interest: Exploring prokaryotic transcription, operon structures, rRNA maturation and modifications using Nanopore-based native RNA sequencing.

What can you find here

The pipeline section describes the workflow, built from publicly available tools, that is used to basecall, demultiplex, trim, map and count the data. Downstream analyses, including quality control, annotation of transcript boundaries, gene body coverage analysis and transcriptional unit annotation, are based on custom R scripts.
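As an illustration of the mapping step only, the following minimal sketch assembles a minimap2 command line in Python without running it. The file names, the `map-ont` preset and the thread count are placeholder assumptions, not the settings used in the study; the actual parameters are documented in the pipeline section.

```python
import shlex
from typing import List


def minimap2_command(reference: str, reads: str, output: str,
                     preset: str = "map-ont", threads: int = 4) -> List[str]:
    """Build (but do not run) a minimap2 call that writes SAM output.

    The preset and thread count are illustrative defaults (assumptions),
    not the parameters used in the manuscript.
    """
    return [
        "minimap2",
        "-a",                 # emit alignments in SAM format
        "-x", preset,         # alignment preset
        "-t", str(threads),   # number of threads
        "-o", output,         # output file
        reference,
        reads,
    ]


# Hypothetical file names, used only for demonstration.
cmd = minimap2_command("genome.fasta", "reads.fastq", "mapped.sam")
print(shlex.join(cmd))
```

Keeping the command as a list makes it safe to pass to `subprocess.run(cmd, check=True)` once minimap2 is installed and the paths point to real files.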

Figures

Here you can find links to the scripts used to make all of the figures based on numeric data.

| Figure | Panel | Script |
| --- | --- | --- |
| Main 1 | A | NA |
| Main 1 | B | NA |
| Main 1 | C | NA |
| Main 1 | D | NA |
| Main 2 | A | Rscripts/salmon_analysis.R |
| Main 2 | B | Rscripts/salmon_analysis.R |
| Main 2 | C | Rscripts/salmon_analysis.R |
| Main 3 | A | Rscripts/example_coverage_plots.R |
| Main 3 | B | Rscripts/end5_detection.R |
| Main 3 | C | Rscripts/end3_detection.R |
| Main 4 | A | Rscripts/gene_body_coverage.R |
| Main 4 | B | NA |
| Main 4 | C | Rscripts/gene_body_coverage.R |
| Main 4 | D | Rscripts/gene_body_coverage.R |
| Main 4 | E | Rscripts/gene_body_coverage.R |
| Main 4 | F | Rscripts/gene_body_coverage.R |
| Main 5 | A | Rscripts/example_coverage_plots.R |
| Main 5 | B | Rscripts/operon_analysis.R |
| Main 5 | C | Rscripts/operon_analysis.R |
| Supplementary 1 | NA | NA |
| Supplementary 2 | A | Rscripts/bioanalyzer_analysis.R |
| Supplementary 2 | B | Rscripts/bioanalyzer_analysis.R |
| Supplementary 3 | A | Rscripts/raw_read_analysis.R |
| Supplementary 3 | B | Rscripts/raw_read_analysis.R |
| Supplementary 4 | A | Rscripts/raw_read_analysis.R |
| Supplementary 4 | B | Rscripts/raw_read_analysis.R |
| Supplementary 4 | C | Rscripts/raw_read_analysis.R |
| Supplementary 5 | A | Rscripts/mapped_read_analysis.R |
| Supplementary 5 | B | Rscripts/mapped_read_analysis.R |
| Supplementary 6 | A | Rscripts/mapped_read_analysis.R |
| Supplementary 6 | B | Rscripts/mapped_read_analysis.R |
| Supplementary 6 | C | Rscripts/mapped_read_analysis.R |
| Supplementary 6 | D | Rscripts/mapped_read_analysis.R |
| Supplementary 7 | A | Rscripts/mapped_read_analysis2.R |
| Supplementary 7 | B | Rscripts/mapped_read_analysis2.R |
| Supplementary 8 | A | Rscripts/mapped_read_analysis2.R |
| Supplementary 8 | B | Rscripts/mapped_read_analysis2.R |
| Supplementary 9 | A | Rscripts/mapped_read_analysis2.R |
| Supplementary 9 | B | Rscripts/mapped_read_analysis2.R |
| Supplementary 9 | C | Rscripts/mapped_read_analysis2.R |
| Supplementary 9 | D | Rscripts/mapped_read_analysis2.R |
| Supplementary 9 | E | Rscripts/mapped_read_analysis2.R |
| Supplementary 10 | A | Rscripts/seq_depth.R |
| Supplementary 10 | B | Rscripts/seq_depth.R |
| Supplementary 10 | C | Rscripts/seq_depth.R |
| Supplementary 11 | A | Rscripts/salmon_analysis.R |
| Supplementary 11 | B | Rscripts/salmon_analysis.R |
| Supplementary 12 | A | NA |
| Supplementary 12 | B | Rscripts/pychopper_trimming.R |
| Supplementary 12 | C | Rscripts/pychopper_trimming.R |
| Supplementary 13 | A | Rscripts/read_end_identities.R |
| Supplementary 13 | B | Rscripts/read_end_identities.R |
| Supplementary 14 | A | Rscripts/end5_detection.R |
| Supplementary 14 | B | Rscripts/end5_detection.R |
| Supplementary 15 | A | Rscripts/end5_detection.R |
| Supplementary 15 | B | Rscripts/end5_detection.R |
| Supplementary 15 | C | Rscripts/end5_detection.R |
| Supplementary 16 | A | Rscripts/end5_detection.R |
| Supplementary 16 | B | Rscripts/end5_detection.R |
| Supplementary 17 | A | Rscripts/end5_detection.R |
| Supplementary 17 | B | Rscripts/end5_detection.R |
| Supplementary 18 | A | Rscripts/end3_detection.R |
| Supplementary 18 | B | Rscripts/end3_detection.R |
| Supplementary 19 | | Rscripts/end3_detection.R |
| Supplementary 20 | A | Rscripts/end3_detection.R |
| Supplementary 20 | B | Rscripts/end3_detection.R |
| Supplementary 21 | A | Rscripts/operon_analysis.R |
| Supplementary 21 | B | Rscripts/operon_analysis.R |
| Supplementary 22 | A | NA |
| Supplementary 22 | B | Rscripts/example_coverage_plots.R |
| Supplementary 22 | C | Rscripts/example_coverage_plots.R |
| Supplementary 22 | D | Rscripts/example_coverage_plots.R |
| Supplementary 23 | A | NA |
| Supplementary 23 | B | Rscripts/example_coverage_plots.R |
| Supplementary 23 | C | Rscripts/example_coverage_plots.R |
| Supplementary 23 | D | Rscripts/example_coverage_plots.R |
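
The figure list above can also be queried programmatically. The following is a minimal sketch, assuming a few rows have been copied into a plain-text string with whitespace-separated fields (figure type, figure number, optional panel, script path). The `ROWS` string and the `parse_rows` helper are hypothetical and not part of this repository.

```python
from typing import Dict, Optional, Tuple

# A few rows copied from the figure list for demonstration (not the full list).
ROWS = """\
Main 2 A Rscripts/salmon_analysis.R
Main 3 B Rscripts/end5_detection.R
Main 4 B NA
Supplementary 1 NA NA
Supplementary 19 Rscripts/end3_detection.R
"""

FigureKey = Tuple[str, str, str]  # (figure type, figure number, panel)


def parse_rows(text: str) -> Dict[FigureKey, Optional[str]]:
    """Map (type, number, panel) to a script path, or None where the list says NA."""
    mapping: Dict[FigureKey, Optional[str]] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        kind, number, script = fields[0], fields[1], fields[-1]
        # Rows with only three fields have no panel letter.
        panel = fields[2] if len(fields) == 4 else ""
        if panel == "NA":
            panel = ""
        mapping[(kind, number, panel)] = None if script == "NA" else script
    return mapping


figure_scripts = parse_rows(ROWS)
print(figure_scripts[("Main", "3", "B")])
```

Entries that map to `None` are listed as NA, that is, no script is given for that panel.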

Data availability

Sequencing files in original FAST5 format

Sequencing files in original FAST5 format are publicly available in the Sequence Read Archive (SRA) under the accessions PRJNA632538 (RNA001) and PRJNA731531 (all other datasets).

Basecalled and demultiplexed FASTQ files

For easier access, basecalled & demultiplexed FASTQ files are available in a Google Drive Folder and on Zenodo.
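
After downloading a FASTQ file, a quick sanity check is to summarise its read lengths. The sketch below is a generic illustration using only the Python standard library, not one of the repository's R scripts. It assumes plain four-line FASTQ records, and the in-memory example reads are invented for demonstration.

```python
import io
from typing import Iterator, List, TextIO


def read_lengths(handle: TextIO) -> Iterator[int]:
    """Yield the sequence length of each four-line FASTQ record."""
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line
        if header.startswith("@"):
            yield len(sequence)


def n50(lengths: List[int]) -> int:
    """Return the length L such that reads of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


# Invented example records; real files would be opened with open(path) or,
# if gzip-compressed, with gzip.open(path, "rt").
example = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nACGT\n+\nIIII\n"
    "@read3\nAC\n+\nII\n"
)
lengths = list(read_lengths(example))
print(len(lengths), sum(lengths), n50(lengths))
```

The generator reads one record at a time, so memory use stays low until the lengths are collected into a list.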

Mapped BAM files

Minimap2-mapped untrimmed reads are also available in the Google Drive Folder and on Zenodo.


License

This project is licensed under the MIT License; see the LICENSE file for details.
