Merge pull request #257 from fchapoton/fix_typos_in_docs

fix some typos in docs
labgem · Aug 2, 2024 · 8651498
2 parents b5ab07a + 52e1456
Showing 11 changed files with 15 additions and 16 deletions.
diff --git a/docs/user/MSA.md b/docs/user/MSA.md
@@ -1,6 +1,6 @@
 # Multiple Sequence Alignment
 
-The commande `msa` compute multiple sequence alignement of any partition of the pangenome. The command uses [mafft](https://mafft.cbrc.jp/alignment/software/) with default options to perform the alignment. Using multiple cpus with the `--cpu` argument is recommended as multiple alignment can be quite demanding in computational resources.
+The commande `msa` compute multiple sequence alignment of any partition of the pangenome. The command uses [mafft](https://mafft.cbrc.jp/alignment/software/) with default options to perform the alignment. Using multiple cpus with the `--cpu` argument is recommended as multiple alignment can be quite demanding in computational resources.
 
 This command can be used as follow:
 
@@ -34,10 +34,10 @@ ppanggolin msa -p pangenome.h5 --source dna
 
 ### Write a single whole MSA file with `--phylo` 
 
-It is also possible to write a single whole genome MSA file, which many phylogenetic softwares accept as input, by using the `--phylo` option as such:
+It is also possible to write a single whole genome MSA file, which many phylogenetic software accept as input, by using the `--phylo` option as such:
 
 ```bash
 ppanggolin msa -p pangenome.h5 --phylo
 ```
 
-This will contatenate all of the family MSA into a single MSA, with one sequence for each genome.
+This will concatenate all of the family MSA into a single MSA, with one sequence for each genome.
diff --git a/docs/user/Modules/moduleOutputs.md b/docs/user/Modules/moduleOutputs.md
@@ -112,7 +112,7 @@ Modules:
 	Number_of_modules: 380
 	Families_in_Modules: 2242
 	Partition_composition:
-		Persitent: 0.27
+		Persistent: 0.27
 		Shell: 37.69
 		Cloud: 62.04
 	Number_of_Families_per_Modules:
@@ -122,4 +122,3 @@ Modules:
 		mean: 5.9
 
 ```
-
diff --git a/docs/user/Modules/modulePrediction.md b/docs/user/Modules/modulePrediction.md
@@ -59,12 +59,12 @@ ppanggolin panmodule --fasta GENOME_LIST_FILE
 ```
 Replace `GENOME_LIST_FILE` with a tab-separated file listing the genome names, and the fasta file path of their genomic sequences as described [here](../PangenomeAnalyses/pangenomeAnnotation.md#annotate-from-fasta-files). Alternatively, you can provide a list of GFF/GBFF files as input by using the `--anno` parameter, similar to how it is used in the workflow and annotate commands.
 
-The panmodule workflow predicts modules using default parameters. To fine-tune the detection, you can use the `module` command on a partioned pangenome acquired through the `workflow` for example or use a configuration file, as described [here](../practicalInformation.md#configuration-file). 
+The panmodule workflow predicts modules using default parameters. To fine-tune the detection, you can use the `module` command on a partitioned pangenome acquired through the `workflow` for example or use a configuration file, as described [here](../practicalInformation.md#configuration-file). 
 
 
 ## Predict conserved module
 
-The `module` command predicts conserved modules on an partioned pangenome. The command has several options for tuning the prediction. Details about each parameter are available in the related [preprint](https://www.biorxiv.org/content/10.1101/2021.12.06.471380v1).
+The `module` command predicts conserved modules on an partitioned pangenome. The command has several options for tuning the prediction. Details about each parameter are available in the related [preprint](https://www.biorxiv.org/content/10.1101/2021.12.06.471380v1).
 
 The command can be used simply as such:
 

diff --git a/docs/user/PangenomeAnalyses/pangenomeCluster.md b/docs/user/PangenomeAnalyses/pangenomeCluster.md
@@ -141,7 +141,7 @@ Family_C    Gene_6  Gene_6
 ```{mermaid}
 
 ---
-title: "Pangenome gene families when specifing representative gene"
+title: "Pangenome gene families when specifying representative gene"
 align: center
 ---
 

diff --git a/docs/user/PangenomeAnalyses/pangenomeGraphOut.md b/docs/user/PangenomeAnalyses/pangenomeGraphOut.md
@@ -3,7 +3,7 @@
 The pangneome graph can be given through the `.gexf` and through the `_light.gexf` files. The `_light.gexf` file will contain the gene families as nodes and the edges between gene families describing their relationship, and the `.gexf` file will contain the same things but also include more details about each gene and each relation between gene families. 
 We have made two different files representing the same graph because, while the non-light file is exhaustive, it can be very heavy to manipulate and most of its content is not of interest to everyone. The `_light.gexf` file should be the one you use to manipulate the pangenome graph most of the time.
 
-These files can be manipulated and visualized for example through a software called [Gephi](https://gephi.org/), with which we have made extensive testings, or potentially any other softwares or libraries able to read gexf files such as [networkx](https://networkx.github.io/documentation/stable/index.html) or [gexf-js](https://github.com/raphv/gexf-js) among others. Gephi also have a web version able to open small pangenome graphs [gephi-lite](https://gephi.org/gephi-lite/).
+These files can be manipulated and visualized for example through a software called [Gephi](https://gephi.org/), with which we have made extensive testings, or potentially any other software or libraries able to read gexf files such as [networkx](https://networkx.github.io/documentation/stable/index.html) or [gexf-js](https://github.com/raphv/gexf-js) among others. Gephi also have a web version able to open small pangenome graphs [gephi-lite](https://gephi.org/gephi-lite/).
 
 Using Gephi, the layout can be tuned as illustrated below:
 

diff --git a/docs/user/QuickUsage/quickWorkflow.md b/docs/user/QuickUsage/quickWorkflow.md
@@ -101,7 +101,7 @@ genome_updater.sh -d "refseq"  -o "B_japonicum_genomes" -M "gtdb" -T "s__Bradyrh
 ```
 
 
-After the completion of the `all` command, all of your genomes have had their genes predicted, the genes have been clustered into gene families, a pangenome graph has been successfully constructed and partitioned into three distinct paritions: **persistent**, **shell**, and **cloud**. Additionally, **RGP, spots, and modules** have been detected within your pangenome.
+After the completion of the `all` command, all of your genomes have had their genes predicted, the genes have been clustered into gene families, a pangenome graph has been successfully constructed and partitioned into three distinct partitions: **persistent**, **shell**, and **cloud**. Additionally, **RGP, spots, and modules** have been detected within your pangenome.
 
 The results of the workflow is saved in the  **pangenome.h5** file, which is in the HDF-5 file format.
 When you run an analysis using this file as input, the results of that analysis will be added to the file to supplement the data that are already stored in it. 

diff --git a/docs/user/RGP/rgpClustering.md b/docs/user/RGP/rgpClustering.md
@@ -14,7 +14,7 @@ There are three modes available for calculating the GRR value: `min_grr`, `max_g
 - `incomplete_aware_grr` (default) mode: If at least one RGP is considered incomplete, which typically happens when it is located at the border of a contig, the `min_grr` mode is used. Otherwise, the `max_grr` mode is applied. This mode is useful to correctly cluster incomplete RGP.
 
 
-The resulting RGP clusters are stored in a tsv file with the folowing columns:
+The resulting RGP clusters are stored in a tsv file with the following columns:
 
 | column  | description                  |
 |---------|------------------------------|

diff --git a/docs/user/RGP/rgpPrediction.md b/docs/user/RGP/rgpPrediction.md
@@ -68,7 +68,7 @@ ppanggolin panrgp --fasta genomes.fasta.list
 ```
 
 Just like [workflow](../PangenomeAnalyses/pangenomeAnalyses.md#workflow), this command will deal with the [annotation](../PangenomeAnalyses/pangenomeAnalyses.md#annotation), [clustering](../PangenomeAnalyses/pangenomeAnalyses.md#compute-pangenome-gene-families), [graph](../PangenomeAnalyses/pangenomeAnalyses.md#graph) and [partition](../PangenomeAnalyses/pangenomeAnalyses.md#partition) commands by itself.
-Then, the RGP detection is ran using [rgp](#rgp-detection) after the pangenome partitionning. Once all RGP have been computed, those found in similar genomic contexts in the genomes are gathered into spots of insertion using [spot](#spot-prediction).
+Then, the RGP detection is ran using [rgp](#rgp-detection) after the pangenome partitioning. Once all RGP have been computed, those found in similar genomic contexts in the genomes are gathered into spots of insertion using [spot](#spot-prediction).
 
 If you want to tune the rgp detection, you can use the `rgp` command after the `workflow` command. If you wish to tune the spot detection, you can use the `spot` command after the `rgp` command. Additionally, you have the option to utilize a configuration file to customize each detection within the `panrgp` command.
 

diff --git a/docs/user/align.md b/docs/user/align.md
@@ -24,7 +24,7 @@ By default the command creates two output files:
 
 ### 2. 'input_to_pangenome_associations.blast-tab'
 
-'input_to_pangenome_associations.blast-tab' is a .tsv file that follows the tabular blast format which many alignment softwares (such as blast, diamond, mmseqs etc.) use, with two additional columns: the length of query sequence which was aligned, and the length of the subject sequence which was aligned (provided with qlen and slen with the softwares I previously named). You can find a detailed description of the format in [this blog post](https://www.metagenomics.wiki/tools/blast/blastn-output-format-6) for example (and there are many other descriptions of this format on internet, if you search for 'tabular blast format'). The query are the provided sequences, and the subjet are the pangenome gene families.
+'input_to_pangenome_associations.blast-tab' is a .tsv file that follows the tabular blast format which many alignment software (such as blast, diamond, mmseqs etc.) use, with two additional columns: the length of query sequence which was aligned, and the length of the subject sequence which was aligned (provided with qlen and slen with the software I previously named). You can find a detailed description of the format in [this blog post](https://www.metagenomics.wiki/tools/blast/blastn-output-format-6) for example (and there are many other descriptions of this format on internet, if you search for 'tabular blast format'). The query are the provided sequences, and the subject are the pangenome gene families.
 
 
 ### 3. Optional outputs 

diff --git a/docs/user/practicalInformation.md b/docs/user/practicalInformation.md
@@ -52,7 +52,7 @@ If you want, verbosity can be reduced in several ways.
 First, you can specify the verbosity level with the `--verbose` option. 
 With `0` will show only warnings and errors, `1` will add the information (default value), and if you encounter any problem you can use the debug level with value `2`.
 Then you can also remove the progress bars with the option `--disable_prog_bar`
-Finaly, you can also save PPanGGOLiN logs in a file by indicating its path with the option `--log`.
+Finally, you can also save PPanGGOLiN logs in a file by indicating its path with the option `--log`.
 
 ## Configuration file
 

diff --git a/docs/user/projection.md b/docs/user/projection.md
@@ -58,13 +58,13 @@ For Gene Family and Partition of Input Genes:
 For RGPs and Spots:
 
 - `plastic_regions.tsv`: This file contains information about RGPs within the input genome. Its format follows [this output](RGP/rgpOutputs.md#rgp-outputs).
-- `input_genome_rgp_to_spot.tsv`: It provides information about the association between RGPs and insertion spots in the input genome. Its format follows [this ouput](RGP/rgpOutputs.md#summarize-spots).
+- `input_genome_rgp_to_spot.tsv`: It provides information about the association between RGPs and insertion spots in the input genome. Its format follows [this output](RGP/rgpOutputs.md#summarize-spots).
 
 Optionally, you can generate a graph of the spots using the `--spot_graph` option. This graph resembles the one produced by the `ppanggolin draw --spots` command, which is detailed [here](RGP/rgpOutputs.md#draw-spots).
 
 For Modules:
 
-- `modules_in_input_genome.tsv`: This file lists the modules that have been found in the input genome. Its format follows [this ouput](Modules/moduleOutputs.md#module-outputs).
+- `modules_in_input_genome.tsv`: This file lists the modules that have been found in the input genome. Its format follows [this output](Modules/moduleOutputs.md#module-outputs).