All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- merge_gff_compare failing with empty GFF files.
- v1.5.0 bug: access to undefined channel output when using precomputed transcriptome.
- Bug where incorrect gene_id assigned in the DE tables.
- Workflow report updated to use `ezcharts`.
- Exons per isoforms histogram reporting incorrect numbers.
- Output the `results_dexseq.tsv` file when `--de_analysis` is enabled.
- per-class gffcompare tracking files as there exists a combine tracking file.
- `--igv` parameter (default: false) for outputting IGV config allowing visualisation of read alignments in the EPI2ME App.
    - If required for IGV, reference indexes are output in to an `igv_reference` directory.
    - BAMS are output in to a BAMS directory.
- Reconcile with template 5.2.6.
- Fusion detection subworkflow, as the functionality is not robust enough for general use at this time.
- Updated pychopper to 2.7.10
- New `cdna_kit` options: PCS114 and PCB111/114
- Increase some memory and CPU allocations.
- Workflow now accepts BAM or FASTQ files as input (using the --bam or --fastq parameters, respectively).
- MA plot in the `results_dge.pdf` has been updated to match the MA plot in the report.
- Error message when running in `de_analysis` mode and the `ref_annotation` input file contains unstranded annotations.
- Improved handling of different annotation file types (e.g. `.gtf/.gff/.gff3`) in `de_analysis` mode.
- Improved handling of annotation files that do not contain version numbers in transcript_id (such as GTFs from Ensembl).
- Differential expression failing with 10 or more samples.
- Regression causing the DE analysis numeric parameters to not be evaluated correctly.
- Improve documentation around filtering of transcripts done before DTU analysis.
- Renamed files:
    - `de_analysis/all_counts_filtered.tsv` to `de_analysis/filtered_transcript_counts_with_genes.tsv`
    - `de_analysis/de_tpm_transcript_counts.tsv` to `de_analysis/unfiltered_tpm_transcript_counts.tsv`
- Minimum memory requirements to 32 GB.
- Published isoforms table to output directory.
- Output additional `de_analysis/cpm_gene_counts.tsv` with counts per million gene counts.
- Output additional `de_analysis/unfiltered_transcript_counts_with_genes.tsv` with unfiltered transcript counts with associated gene IDs.
- Add gene name column to the de_analysis counts TSV files.
- Mapping stage using a single thread only.
- More memory assigned to the fusion detection process.
- When no `--ref_annotation` is provided, the workflow will still run but the output transcripts will not be annotated. However, `--de_analysis` mode still requires a `--ref_annotation`.
- Published minimap2 and pychopper results to output directory.
- Two extra pychopper parameters `--cdna_kit` and `--pychopper_backend`. `--pychopper_options` is still available to define any other options.
- Memory requirements for each process.
- Documentation.
- When Jaffa is run only output one report.
- Sample sheet must include a `control` type to indicate which samples are the reference for the differential expression pipeline.
- Default local executor CPU and RAM limits.
- Updated docker container with Pychopper to support LSK114.
- Remove dead links from README
- Denovo `--transcriptome_source` option.
- Handling for input reference transcriptome headers that contain `|`
- Improve differential expression outputs.
- Include transcript and gene count tables in DE_final folder.
- If differential expression subworkflow is used a non redundant transcriptome will be output which includes novel transcripts.
- Added wording to the report about how to identify novel transcripts in the DE tables.
- Nextflow minimum required version to 23.04.2
- `--minimap_index_opts` parameter has been changed to `minimap2_index_opts` for consistency.
- An additional gene name column to the differential gene expression results. This is especially handy for transcriptomes where the gene ID is not the same as gene name (e.g. Ensembl).
- Wording to the report about how to identify novel transcripts in the DE tables.
- Any sample aliases that contain spaces will be replaced with underscores.
- Updated documentation to explain we only support Ensembl, NCBI and ENCODE annotation file types.
- Documentation parameter examples corrected.
- Handling for annotation files that use gene as gene_id attribute.
- Handling for Ensembl annotation files.
- GitHub issue templates
- Condition sheet is no longer required. The sample sheet is now used to indicate condition instead.
- For differential expression, the sample sheet must have a `condition` column to indicate which condition group each sample in the sample sheet belongs to.
- Values for the condition may be any two distinct strings, for example: treated/untreated; sample/control etc.
- Remove default of null for `--ref_transcriptome`.
- Read mapping summary table in the report has correct sample_ids.
- Handling for GFF3 reference_annotation file type.
- Warning for the `--transcriptome_source` denovo pipeline option.
- Enum choices are enumerated in the `--help` output
- Enum choices are enumerated as part of the error message when a user has selected an invalid choice
- Bumped minimum required Nextflow version to 22.10.8
- Replaced `--threads` option in fastqingress with hardcoded values to remove warning about undefined `param.threads`
- Fix for the `--transcriptome_source` denovo pipeline option.
- Handling for GFF3 reference_annotation file type.
- Handling gzip input reference and annotation parameters.
- Handling for NCBI gtfs that contain some empty transcript ID fields.
- LICENSE to Oxford Nanopore Technologies PLC. Public License Version 1.0.
- Configuration for running demo data in AWS
- Condition sheet parameter description fixed to CSV
- Update fastqingress
- Simplify JAFFAL docs
- Description in manifest
- `-profile conda` is no longer supported; users should use `-profile standard` (Docker) or `-profile singularity` instead
- `nextflow run epi2me-labs/wf-transcriptomes --version` will now print the workflow version number and exit
- Use parameter `--transcriptome-source` to define precalculated, reference-based or denovo
- Removed sanitize option
- Reduce size of differential expression data.
- Improved DE explanation in docs
- Option to turn off transcript assembly steps with param transcript_assembly
- Fix JAFFAL terminating workflow when no fusions found.
- Error if condition sheet and sample sheet don't match.
- Failed to plot DE graphs when one of data sets is 0 length.
- Differential transcript and gene expression subworkflow
- JAFFAL fusion detection subworkflow
- Args parser for fastqingress
- Set out_dir option type to ensure output is written to correct directory on Windows
- Skip unnecessary conversion to fasta from fastq
- Fastqingress metadata map
- Changed workflow name to wf-transcriptomes
- Better help text on cli
- Use EPI2ME Labs-maintained version of pychopper
- direct_rna option
- Some extra error handling
- Minor report display improvements
- Incorrect numbers of transcripts caused by merging GFF files with the same gene and transcript IDs
- Error handling in de novo pipeline. Skip clusters in build_backbones that cause an isONclust2 error
- Several small fixes in report plotting
- Added the denovo pipeline
- Updates to the report plots
- First release
- Initial port of Snakemake WF from https://github.com/nanoporetech/pipeline-nanopore-ref-isoforms