QIIME 2 plugin for microbiome count data normalization by scaling with ranked subsampling (SRS).
Read more about this normalization method in the SRS paper (Beule and Karlovsky, PeerJ 2020).
For more details on the usage of SRS, take a look at our software paper (Heidrich et al., Appl. Sci. 2021).
Activate your qiime2>=2021.4 environment by running (or equivalent):
conda activate qiime2-2021.4
- Option 1 - To install from conda, run:
conda install -c vitorheidrich q2_srs
- Option 2 - To install from this repository, run:
pip install git+https://github.com/vitorheidrich/q2-srs.git
Check for successful installation by running qiime srs. A description of the q2-srs plugin will show up.
q2-srs features two qiime commands:
- qiime srs SRS: performs SRS normalization at a user-defined number of reads per sample
- qiime srs SRScurve: draws alpha diversity rarefaction curves for SRS-normalized data (instead of rarefied data)
To see the full options of each command, run qiime srs SRS --help or qiime srs SRScurve --help.
We strongly encourage you to explore the SRS Shiny app, which is specifically designed for q2-srs users.
In order to normalize your samples to the same number of reads using SRS, we recommend running SRScurve first so you can determine a good normalization cut-off for your data. This normalization cut-off is called Cmin (see the SRS paper for details).
As an alternative or complement to SRScurve, we strongly advise using the SRS Shiny app to determine Cmin (see the SRS software paper for details).
Upload your ASV/OTU table (.qza) and the app will provide:
- A rug plot of the read counts per sample
- A simpler SRScurve plot (use the q2-srs SRScurve for the full experience)
- Summary statistics on trade-offs between Cmin and the number of retained samples
- Summary statistics on trade-offs between Cmin and the diversity retained per sample
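The trade-off between Cmin and the number of retained samples can be illustrated with a minimal Python sketch. It assumes that samples with fewer reads than Cmin are discarded and counts how many samples survive each candidate cut-off. The function name and the per-sample read totals are hypothetical; they are not part of q2-srs or the Shiny app.

```python
def retained_per_cmin(read_totals, candidates):
    """Return {Cmin: number of samples with at least Cmin reads}."""
    return {
        c_min: sum(1 for n in read_totals.values() if n >= c_min)
        for c_min in candidates
    }


# Hypothetical read counts per sample (not the Moving Pictures data).
read_totals = {"S1": 8500, "S2": 3100, "S3": 2900, "S4": 950}

for c_min, kept in retained_per_cmin(read_totals, [1000, 3000, 5000]).items():
    print(f"Cmin={c_min}: {kept} of {len(read_totals)} samples retained")
```

A higher Cmin keeps more reads per sample but drops more samples, so the choice depends on how many samples you can afford to lose.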
Once you have chosen an adequate Cmin, run SRS with that value.
The output of SRS will be an OTU/ASV table SRS-normalized at Cmin reads per sample that is ready for the next steps of your pipeline.
Be aware: the first time you run any q2-srs command, it will take a little longer than usual because the necessary R dependencies are installed. Use the --verbose flag (as in the following examples) to follow the installation progress. After this first use, q2-srs will run much faster.
In the following examples we are going to use the ASV table (DADA2 output) from the Moving Pictures tutorial. The table is summarized below:
To run SRScurve, the only required input is the OTU/ASV table. However, SRScurve is highly customizable, allowing different alpha diversity indices, a comparison with repeated rarefying and many other analytical/aesthetic options. Please run qiime srs SRScurve --help to see the full options.
SRScurve usage example with the table.qza from the example_data folder (see table.qzv for details):
qiime srs SRScurve \
--i-table example_data/table.qza \
--p-step 100 \
--p-max-sample-size 3500 \
--p-rarefy-comparison \
--p-rarefy-comparison-legend \
--p-rarefy-repeats 100 \
--p-srs-color 'blue' \
--p-rarefy-color '#333333' \
--o-visualization example_data/SRScurve-plot.qzv \
--verbose
You can see the plot output by running:
qiime tools view example_data/SRScurve-plot.qzv
Depending on the data properties (balance between rare and abundant OTUs/ASVs), you may observe a minor zigzag behaviour of SRS curves. This is due to the picking of the ranked fractional values (Cfrac): depending on the scaling factor, an OTU/ASV with an integer value (Cint) of zero may or may not be picked by ranked subsampling due to its Cfrac (see the SRS paper for details). This causes the reproducible zigzag behaviour in the observed number of ASVs (richness) in this example.
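To make the role of Cint and Cfrac concrete, here is a simplified Python sketch of SRS for a single sample. It is written from the description in the SRS paper and is not taken from the q2-srs or SRS R package code. Counts are scaled so that they sum to Cmin, split into an integer part (Cint) and a fractional part (Cfrac), and the reads still missing are given to the features with the largest Cfrac. As a simplifying assumption, ties in Cfrac are broken by Cint and then by feature order, which keeps the sketch fully deterministic. The function names and the three-feature sample are hypothetical.

```python
from fractions import Fraction


def srs_sample(counts, c_min):
    """Normalize one sample's feature counts to exactly c_min reads."""
    total = sum(counts)
    if c_min <= 0 or c_min > total:
        raise ValueError("c_min must be between 1 and the sample's read count")
    # Step 1: scale counts so they sum to c_min (exact arithmetic, no rounding error).
    scaled = [Fraction(c * c_min, total) for c in counts]
    # Step 2: split each scaled count into integer (Cint) and fractional (Cfrac) parts.
    c_int = [s.numerator // s.denominator for s in scaled]
    c_frac = [s - i for s, i in zip(scaled, c_int)]
    # Reads still missing after keeping only the integer parts.
    missing = c_min - sum(c_int)
    # Rank features by Cfrac (descending); ties by Cint, then by position (assumption).
    ranked = sorted(range(len(counts)), key=lambda i: (-c_frac[i], -c_int[i], i))
    result = list(c_int)
    for i in ranked[:missing]:
        result[i] += 1
    return result


def richness(counts):
    """Number of features with at least one read."""
    return sum(1 for c in counts if c > 0)


sample = [60, 37, 3]  # hypothetical toy sample with one rare feature
for c_min in range(16, 21):
    normalized = srs_sample(sample, c_min)
    print(c_min, normalized, richness(normalized))
```

With this toy sample the richness is 2, 3, 2, 3, 3 for Cmin = 16 to 20. The rare feature has a Cint of zero throughout and is only picked when its Cfrac ranks high enough, which produces the zigzag.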
Notice that we are comparing SRS normalization with repeated rarefying by using --p-rarefy-comparison.
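For contrast, rarefying draws reads at random without replacement, so each run can give a different table and results are usually averaged over repeats. The following Python sketch, with hypothetical function names and toy data, rarefies one sample several times and averages the observed richness. It only illustrates the idea and is not how q2-srs implements the comparison.

```python
import random


def rarefy(counts, depth, rng):
    """Draw `depth` reads without replacement from one sample's counts."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the sample's total read count")
    # Expand the sample into one feature index per read, then subsample.
    reads = [i for i, c in enumerate(counts) for _ in range(c)]
    picked = rng.sample(reads, depth)
    out = [0] * len(counts)
    for i in picked:
        out[i] += 1
    return out


def mean_richness(counts, depth, repeats, seed=0):
    """Average number of observed features over repeated rarefying."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    total = 0
    for _ in range(repeats):
        total += sum(1 for c in rarefy(counts, depth, rng) if c > 0)
    return total / repeats


sample = [60, 37, 3]  # hypothetical toy sample
print(mean_richness(sample, depth=17, repeats=100))
```

Unlike SRS, the result of a single rarefying run depends on the random draw, so the averaged richness is generally not an integer.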
Based on the SRScurve output, a Cmin of 3000 will be used below.
To run SRS, the only required inputs are the OTU/ASV table and the chosen Cmin. Run qiime srs SRS --help to see the full options.
SRS usage example with the table.qza from the example_data folder and the Cmin as determined above:
qiime srs SRS \
--i-table example_data/table.qza \
--p-c-min 3000 \
--o-normalized-table example_data/norm-table.qza \
--verbose
Be aware: after running SRS, the samples with fewer sequence counts than the chosen Cmin will have been discarded (use --verbose to see the list of discarded samples).
Finally, we can confirm that all samples ended up with the same number of reads in the SRS-normalized artifact by running:
qiime feature-table summarize \
--i-table example_data/norm-table.qza \
--o-visualization example_data/norm-table.qzv
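The same check can be scripted. The Python sketch below assumes the normalized table has already been exported into a plain dictionary mapping sample IDs to per-feature counts, and reports any sample whose total differs from Cmin. The dictionary and the function name are hypothetical.

```python
def samples_off_depth(table, c_min):
    """Return {sample ID: total reads} for samples whose total is not c_min."""
    return {
        sample: sum(counts)
        for sample, counts in table.items()
        if sum(counts) != c_min
    }


# Hypothetical normalized table with Cmin = 10 (toy values).
norm_table = {"S1": [5, 3, 2, 0], "S2": [1, 1, 4, 4]}

# An empty dictionary means every sample has exactly Cmin reads.
print(samples_off_depth(norm_table, 10))
```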
If you use this plugin in your research paper, please cite as:
Heidrich V, Karlovsky P, Beule L. 2021. ‘SRS’ R package and ‘q2-srs’ QIIME 2 plugin: Normalization of Microbiome Data Using Scaling with Ranked Subsampling (SRS). Appl. Sci. 11(23), 11473.
When referencing the SRS algorithm itself, please cite:
Beule L, Karlovsky P. 2020. Improved normalization of species count data in ecology by scaling with ranked subsampling (SRS): application to microbial communities. PeerJ 8:e9593.
We would like to thank Claire Duvallet (@cduvallet) for her great tutorials (I; II) on how to build a QIIME 2 plugin.