From dc9ae94a7d07f6f2de6e7a6e5b7485466254b2b5 Mon Sep 17 00:00:00 2001
From: Philippine Louail <127301965+philouail@users.noreply.github.com>
Date: Mon, 28 Oct 2024 14:36:04 +0100
Subject: [PATCH] Update xcms.Rmd

---
 vignettes/xcms.Rmd | 110 +++++++++++++++++++++------------------------
 1 file changed, 51 insertions(+), 59 deletions(-)

diff --git a/vignettes/xcms.Rmd b/vignettes/xcms.Rmd
index 3f59a2e9..e2f6c5b7 100644
--- a/vignettes/xcms.Rmd
+++ b/vignettes/xcms.Rmd
@@ -62,7 +62,10 @@ be applied to the older *MSnbase*-based workflows (xcms version 3). Additional
 documents and tutorials covering also other topics of untargeted metabolomics
 analysis are listed at the end of this document. There is also a [xcms
 tutorial](https://jorainer.github.io/xcmsTutorials) available with more examples
-and details.
+and details. To get a complete overview of LC-MS/MS analysis, an end-to-end
+workflow is available on the
+[Metabonaut website](https://rformassspectrometry.github.io/metabonaut/), which
+integrates the *xcms* preprocessing steps with the downstream analysis.
 
 # Preprocessing of LC-MS data
 
@@ -1180,55 +1183,6 @@ defined above. The `filter` argument can accommodate various types of input,
 each determining the specific type of quality assessment and filtering to be
 performed.
 
-The `RsdFilter` enable users to filter features based on their relative
-standard deviation (coefficient of variation) for a specified `threshold`. It
-is recommended to base the computation on quality control (QC) samples,
-as demonstrated below:
-
-```{r}
-# Set up parameters for RsdFilter
-rsd_filter <- RsdFilter(threshold = 0.3,
-                        qcIndex = sampleData(faahko)$sample_type == "QC")
-
-# Apply the filter to faakho object
-filtered_faahko <- filterFeatures(object = faahko, filter = rsd_filter)
-
-# Now apply the same strategy to the res object
-rsd_filter <- RsdFilter(threshold = 0.3, qcIndex = res$sample_type == "QC")
-filtered_res <- filterFeatures(object = res, filter = rsd_filter, assay = "raw")
-```
-
-All features with an RSD (CV) strictly larger than 0.3 in QC samples were thus
-removed from the data set.
-
-The `DratioFilter` can be used to filter features based on the D-ratio or
-*dispersion ratio*, which compares the standard deviation in QC samples to that
-in study samples.
-
-```{r}
-# Set up parameters for DratioFilter
-dratio_filter <- DratioFilter(
-    threshold = 0.5,
-    qcIndex = sampleData(filtered_faahko)$sample_type == "QC",
-    studyIndex = sampleData(filtered_faahko)$sample_type == "study")
-
-# Apply the filter to faahko object
-filtered_faakho <- filterFeatures(object = filtered_faahko,
-                                  filter = dratio_filter)
-
-# Now same but for the res object
-dratio_filter <- DratioFilter(
-    threshold = 0.5,
-    qcIndex = filtered_res$sample_type == "QC",
-    studyIndex = filtered_res$sample_type == "study")
-
-filtered_res <- filterFeatures(object = filtered_res,
-                               filter = dratio_filter)
-```
-
-All features with an D-ratio strictly larger than 0.5 were thus removed from
-the data set.
-
 The `PercentMissingFilter` allows to filter features based on the percentage of
 missing values for each feature. This function takes as an input the parameter
 `f` which is supposed to be a vector of length equal to the length of the object
@@ -1276,16 +1230,54 @@ samples. More information can be found in the documentation of the filter:
 ?BlankFlag
 ```
 
-## Normalization
+The `RsdFilter` enables users to filter features based on their relative
+standard deviation (coefficient of variation) for a specified `threshold`. It
+is recommended to base the computation on quality control (QC) samples,
+as demonstrated below:
+
+```{r}
+# Set up parameters for RsdFilter
+rsd_filter <- RsdFilter(threshold = 0.3,
+                        qcIndex = sampleData(faahko)$sample_type == "QC")
+
+# Apply the filter to the faahko object
+filtered_faahko <- filterFeatures(object = faahko, filter = rsd_filter)
+
+# Now apply the same strategy to the res object
+rsd_filter <- RsdFilter(threshold = 0.3, qcIndex = res$sample_type == "QC")
+filtered_res <- filterFeatures(object = res, filter = rsd_filter, assay = "raw")
+```
+
+All features with an RSD (CV) strictly larger than 0.3 in QC samples were thus
+removed from the data set.
+
+The `DratioFilter` can be used to filter features based on the D-ratio or
+*dispersion ratio*, which compares the standard deviation in QC samples to that
+in study samples.
+
+```{r}
+# Set up parameters for DratioFilter
+dratio_filter <- DratioFilter(
+    threshold = 0.5,
+    qcIndex = sampleData(filtered_faahko)$sample_type == "QC",
+    studyIndex = sampleData(filtered_faahko)$sample_type == "study")
+
+# Apply the filter to the faahko object
+filtered_faakho <- filterFeatures(object = filtered_faahko,
+                                  filter = dratio_filter)
+
+# Now do the same for the res object
+dratio_filter <- DratioFilter(
+    threshold = 0.5,
+    qcIndex = filtered_res$sample_type == "QC",
+    studyIndex = filtered_res$sample_type == "study")
+
+filtered_res <- filterFeatures(object = filtered_res,
+                               filter = dratio_filter)
+```
 
-Normalizing features' signal intensities is required, but at present not (yet)
-supported in `xcms` (some methods might be added in near future). It is advised
-to use the `SummarizedExperiment` returned by the `quantify()` method for any
-further data processing, as this type of object stores feature definitions,
-sample annotations as well as feature abundances in the same object. For the
-identification of e.g. features with significant different
-intensities/abundances it is suggested to use functionality provided in other R
-packages, such as Bioconductor's excellent *limma* package.
+All features with a D-ratio strictly larger than 0.5 were thus removed from
+the data set.
 
 ## Alignment to an external reference dataset
 