From 27612056023990187584db4102337151ddc8f586 Mon Sep 17 00:00:00 2001
From: JeanMainguy
Date: Thu, 14 Mar 2024 10:56:24 +0100
Subject: [PATCH 1/2] add missing fasta cmd documentation

---
 docs/index.md           |  1 +
 docs/user/writeFasta.md | 76 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 77 insertions(+)
 create mode 100644 docs/user/writeFasta.md

diff --git a/docs/index.md b/docs/index.md
index 2ffd38c2..84ced103 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -73,6 +73,7 @@ user/PangenomeAnalyses/pangenomeAnalyses
 user/RGP/rgpAnalyses
 user/Modules/moduleAnalyses
 user/writeGenomes
+user/writeFasta
 user/align
 user/projection
 user/genomicContext
diff --git a/docs/user/writeFasta.md b/docs/user/writeFasta.md
new file mode 100644
index 00000000..ba7db52d
--- /dev/null
+++ b/docs/user/writeFasta.md
@@ -0,0 +1,76 @@
+
+# Fasta
+
+This command can be used to write fasta sequences of the pangenome or specific parts of the pangenome.
+
+Most options require a partition.
+
+Available partitions are:
+* 'all' for the entire pangenome.
+* 'Persistent' for persistent families
+* 'Shell' for shell genes or families
+* 'Cloud' for cloud genes or families
+* 'rgp' for genes or families found in RGPs
+* 'core' for core genes or families
+* 'softcore' for softcore genes or families
+
+When using the 'softcore' filter, the '--soft_core' option can be used to modify the threshold used to determine what is part of the softcore. It is set to 0.95 by default.
+
+## Genes
+
+This option can be used to write the nucleotide CDS sequences. It can be used as such, to write all of the genes of the pangenome for example:
+
+```bash
+ppanggolin fasta -p pangenome.h5 --output MY_GENES --genes all
+```
+
+Or to write only the persistent genes:
+
+```bash
+ppanggolin fasta -p pangenome.h5 --output MY_GENES --genes persistent
+```
+
+
+## Protein families
+
+This option can be used to write the protein sequences of the representative sequences for each family. It can be used as such for all families:
+
+```bash
+ppanggolin fasta -p pangenome.h5 --output MY_PROT --prot_families all
+```
+
+or for all of the shell families for example:
+
+```bash
+ppanggolin fasta -p pangenome.h5 --output MY_PROT --prot_families shell
+```
+
+
+## Gene families
+
+This option can be used to write the gene sequences of the representative sequences for each family. It can be used as such:
+
+```bash
+ppanggolin fasta -p pangenome.h5 --output MY_GENES_FAMILIES --gene_families all
+```
+
+or for the cloud families for example:
+
+```bash
+ppanggolin fasta -p pangenome.h5 --output MY_GENES_FAMILIES --gene_families cloud
+```
+
+## Regions
+
+This option can be used to write the nucleotide sequences of the detected RGPs.
+It requires the fasta sequences used to compute the pangenome, as originally provided when you computed your pangenome.
+
+This command has only two filters:
+* all, for all regions
+* complete, for only the 'complete' regions which are not on a contig border
+
+It can be used as such:
+
+```bash
+ppanggolin fasta -p pangenome.h5 --output MY_REGIONS --regions all --fasta genomes.fasta.list
+```

From fcb189c8f39a66b6864a30153bd1391e4ebea58b Mon Sep 17 00:00:00 2001
From: JeanMainguy
Date: Thu, 14 Mar 2024 11:06:46 +0100
Subject: [PATCH 2/2] improve fasta command title

---
 docs/user/writeFasta.md | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/docs/user/writeFasta.md b/docs/user/writeFasta.md
index ba7db52d..ac1ece26 100644
--- a/docs/user/writeFasta.md
+++ b/docs/user/writeFasta.md
@@ -1,20 +1,20 @@
 
-# Fasta
+# Write pangenome sequences
 
-This command can be used to write fasta sequences of the pangenome or specific parts of the pangenome.
+The `fasta` command can be used to write sequences of the pangenome or specific parts of the pangenome in FASTA format.
 
 Most options require a partition.
 
 Available partitions are:
-* 'all' for the entire pangenome.
-* 'Persistent' for persistent families
-* 'Shell' for shell genes or families
-* 'Cloud' for cloud genes or families
-* 'rgp' for genes or families found in RGPs
-* 'core' for core genes or families
-* 'softcore' for softcore genes or families
-
-When using the 'softcore' filter, the '--soft_core' option can be used to modify the threshold used to determine what is part of the softcore. It is set to 0.95 by default.
+* `all` for the entire pangenome.
+* `Persistent` for persistent families
+* `Shell` for shell genes or families
+* `Cloud` for cloud genes or families
+* `rgp` for genes or families found in RGPs
+* `core` for core genes or families
+* `softcore` for softcore genes or families
+
+When using the `softcore` filter, the `--soft_core` option can be used to modify the threshold used to determine what is part of the softcore. It is set to 0.95 by default.
 
 ## Genes
 