Updated get_started docs

cgat-developers · Jan 1, 2025 · 76807f9
1 parent 145bf4e
commit 76807f9
Showing 4 changed files with 115 additions and 10 deletions.
diff --git a/docs/getting_started/examples.md b/docs/getting_started/examples.md
@@ -222,3 +222,10 @@ P.run(statement)
 
 When running the pipeline, make sure to specify `--no-cluster` as a command line option.
 
+### Troubleshooting
+
+- **Common Issues**: If you encounter errors during pipeline execution, ensure that all dependencies are installed and paths are correctly set.
+- **Logs**: Check the log files generated during the pipeline run for detailed error messages.
+- **Support**: For further assistance, refer to the [CGAT-core documentation](https://cgat-developers.github.io/cgat-core/) or raise an issue on our [GitHub repository](https://github.com/cgat-developers/cgat-core/issues).
+
+
diff --git a/docs/getting_started/installation.md b/docs/getting_started/installation.md
@@ -12,6 +12,35 @@ The preferred method of installation is using Conda. If you do not have Conda in
 conda install -c conda-forge -c bioconda cgatcore
 ```
 
+### Prerequisites
+
+Before installing `cgatcore`, ensure that you have the following prerequisites:
+
+- **Operating System**: Linux or macOS
+- **Python**: Version 3.6 or higher
+- **Conda**: Recommended for dependency management
+
+### Troubleshooting
+
+- **Conda Issues**: If you encounter issues with Conda, ensure that the Bioconda and Conda-Forge channels are added and prioritized correctly.
+- **Pip Dependencies**: When using pip, manually install any missing dependencies listed in the error messages.
+- **Script Errors**: If the installation script fails, check the script's output for error messages and ensure all prerequisites are met.
+
+### Verification
+
+To verify the installation, start a Python interpreter and import the package:
+
+```bash
+python
+```
+
+```python
+import cgatcore
+print(cgatcore.__version__)
+```
+
+This should display the installed version of `cgatcore`.
+
 ## Pip installation
 
 We recommend installation through Conda because it manages dependencies automatically. However, `cgatcore` is generally lightweight and can also be installed using the `pip` package manager. Note that you may need to manually install other dependencies as needed:
@@ -86,5 +115,4 @@ To set this variable permanently, add the following line to your `.bashrc` file
 export DRMAA_LIBRARY_PATH=/usr/lib/libdrmaa.so.1.0
 ```
 
-[Conda documentation](https://conda.io)
-
+[Conda documentation](https://conda.io)
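
Note on the Verification section added to `installation.md` above: the interactive check can also be scripted. The following is a minimal sketch, not part of this commit, assuming Python 3.8 or later (for `importlib.metadata`) and assuming the distribution name matches the import name `cgatcore`. The helper name `installed_version` is hypothetical.

```python
import importlib.util
from importlib import metadata


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is not found.

    Hypothetical helper for illustration; it is not part of cgatcore.
    """
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(package) is None:
        return None
    try:
        # Looks up the version recorded in the installed distribution metadata.
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable module without matching distribution metadata.
        return None


if __name__ == "__main__":
    version = installed_version("cgatcore")
    if version is None:
        print("cgatcore does not appear to be installed in this environment")
    else:
        print(f"cgatcore {version}")
```

Unlike a bare `import cgatcore`, this reports a missing package as a message rather than a traceback, which may be easier to use in a setup script.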
diff --git a/docs/getting_started/run_parameters.md b/docs/getting_started/run_parameters.md
@@ -6,18 +6,77 @@ This configuration file allows you to override the default settings. To view the
 
 For an example of configuring a PBSPro workload manager, see the provided [config example](https://github.com/AntonioJBT/pipeline_example/blob/master/Docker_and_config_file_examples/cgat.yml).
 
-The `.cgat.yml` file in your home directory will take precedence over the default cgatcore settings. For instance, adding the following configuration to `.cgat.yml` will implement cluster settings for PBSPro:
+The `.cgat.yml` file in your home directory will take precedence over the default cgatcore settings. For instance, adding the following configuration to `.cgat.yml` will implement cluster settings for SLURM:
 
 ```yaml
 memory_resource: mem
 
-options: -l walltime=00:10:00 -l select=1:ncpus=8:mem=1gb
+options: --time=00:10:00 --cpus-per-task=8 --mem=1G
 
-queue_manager: pbspro
+queue_manager: slurm
 
 queue: NONE
 
 parallel_environment: "dedicated"
 ```
 
-This setup specifies memory resource allocation (`mem`), runtime limits (`walltime`), selection of CPU and memory resources, and the use of the PBSPro queue manager, among other settings. Make sure to adjust the parameters according to your cluster environment to optimise the workload manager for your pipeline runs.
+This setup specifies the memory resource name (`mem`), a runtime limit (`--time`), CPU and memory requests (`--cpus-per-task`, `--mem`), and the use of the SLURM queue manager, among other settings. Make sure to adjust the parameters according to your cluster environment to optimise the workload manager for your pipeline runs.
+
+## Default Parameters
+
+The following are some of the default parameters in `cgatcore` that can be overridden in your `.cgat.yml` file:
+
+- **memory_resource**: Defines the memory resource name (e.g., `mem` for PBSPro).
+- **options**: Specifies additional options for job submission (e.g., `-l walltime=00:10:00`).
+- **queue_manager**: The queue manager to be used (e.g., `pbspro`, `slurm`).
+- **queue**: The default queue for job submission.
+- **parallel_environment**: Specifies the parallel environment settings.
+
+## Additional Parameters
+
+The following additional parameters can also be configured in your `.cgat.yml` file:
+
+- **cluster_queue**: Specifies the cluster queue to use (default: `all.q`).
+- **cluster_priority**: Sets the priority of jobs in the cluster queue (default: `-10`).
+- **cluster_num_jobs**: Limits the number of jobs to submit to the cluster queue (default: `100`).
+- **cluster_memory_resource**: Name of the consumable resource to request memory (default: `mem_free`).
+- **cluster_memory_default**: Default amount of memory allocated for each job (default: `4G`).
+- **cluster_memory_ulimit**: Ensures requested memory is not exceeded via ulimit (default: `False`).
+- **cluster_options**: General cluster options for job submission.
+- **cluster_parallel_environment**: Parallel environment for multi-threaded jobs (default: `dedicated`).
+- **cluster_queue_manager**: Specifies the cluster queue manager (default: `sge`).
+- **cluster_tmpdir**: Directory specification for temporary files on cluster nodes. If set to `False`, the general `tmpdir` parameter is used.
+
+These parameters allow you to customize the cluster environment to better suit your pipeline's needs.
+
+## Example Configurations
+
+### SLURM Configuration
+
+```yaml
+memory_resource: mem
+
+options: --time=00:10:00 --cpus-per-task=8 --mem=1G
+
+queue_manager: slurm
+
+queue: NONE
+
+parallel_environment: "dedicated"
+```
+
+### Torque Configuration
+
+```yaml
+memory_resource: mem
+
+options: -l walltime=00:10:00 -l nodes=1:ppn=8
+
+queue_manager: torque
+
+queue: NONE
+
+parallel_environment: "dedicated"
+```
+
+These configurations specify memory allocation, runtime limits, and other settings specific to each workload manager. Adjust these parameters to suit your cluster environment.
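
Note on the SLURM example in `run_parameters.md` above: to make explicit what the `options` string requests, the sketch below splits it into key/value pairs. It is an illustration only and not part of this commit; the helper `parse_long_options` is hypothetical and does not reflect how cgatcore itself handles the string.

```python
import shlex


def parse_long_options(options: str) -> dict:
    """Split a string of `--key=value` flags into a dict.

    Hypothetical helper for illustration; it is not part of cgatcore.
    """
    parsed = {}
    for token in shlex.split(options):
        if not token.startswith("--"):
            raise ValueError(f"expected a --key=value flag, got {token!r}")
        # partition keeps everything after the first '=' as the value.
        key, _, value = token[2:].partition("=")
        parsed[key] = value
    return parsed


# The `options` value from the SLURM configuration shown in the diff.
slurm_options = "--time=00:10:00 --cpus-per-task=8 --mem=1G"
requested = parse_long_options(slurm_options)
print(requested)
```

Here `time` is the runtime limit, `cpus-per-task` the CPU request, and `mem` the memory request. The Torque example uses `-l` resource lists instead, so this helper would reject it by design.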
diff --git a/docs/getting_started/tutorial.md b/docs/getting_started/tutorial.md
@@ -40,7 +40,7 @@ This will generate a `pipeline.yml` file containing configuration parameters tha
 
 ### Step 3: Run the pipeline
 
-Run the pipeline using the following command:
+To run the pipeline, execute the following command in the directory containing the `pipeline.yml` file:
 
 ```bash
 cgatshowcase transdiffexpres make full -v5 --no-cluster
@@ -54,7 +54,19 @@ The `--no-cluster` flag will run the pipeline locally if you do not have access
 cgatshowcase --help
 ```
 
-### Step 4: Generate a report
+Running the `make full` command starts the pipeline. Monitor the output for any errors or warnings.
+
+### Step 4: Review Results
+
+Once the pipeline completes, review the output files generated in the `showcase_test_data` directory. These files contain the results of the pseudoalignment.
+
+### Troubleshooting
+
+- **Common Issues**: If you encounter errors during execution, ensure that all dependencies are installed and paths are correctly set.
+- **Logs**: Check the log files generated during the pipeline run for detailed error messages.
+- **Support**: For further assistance, refer to the [CGAT-core documentation](https://cgat-core.readthedocs.io/en/latest/) or raise an issue on our [GitHub repository](https://github.com/cgat-developers/cgat-core/issues).
+
+### Step 5: Generate a report
 
 The final step is to generate a report to display the output of the pipeline. We recommend using `MultiQC` for generating reports from commonly used bioinformatics tools (such as mappers and pseudoaligners) and `Rmarkdown` for generating custom reports.
 
@@ -68,5 +80,4 @@ This will generate a `MultiQC` report in the folder `MultiQC_report.dir/` and an
 
 ## Conclusion
 
-This completes the tutorial for running the `transdiffexprs` pipeline for `cgat-showcase`. We hope you find it as useful as we do for writing workflows in Python.
-
+This completes the tutorial for running the `transdiffexprs` pipeline for `cgat-showcase`. We hope you find it as useful as we do for writing workflows in Python.