Commit

Merge pull request #753 from sneumann/import_methods_jo
refactor: updates of documentation and extension of unit tests
philouail authored Jun 6, 2024
2 parents 0408144 + 99291cb commit 9d27594
Showing 4 changed files with 252 additions and 164 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -63,7 +63,7 @@ Imports:
MsCoreUtils (>= 1.15.7),
MsFeatures,
MsExperiment (>= 1.5.4),
Spectra (>= 1.13.7),
Spectra (>= 1.15.3),
progress,
jsonlite,
RColorBrewer,
194 changes: 101 additions & 93 deletions R/PlainTextParam.R
@@ -1,4 +1,5 @@
#' @include XcmsExperiment.R
#'
#' @title Store contents of `MsExperiment` and `XcmsExperiment` objects as
#' plain text files
#'
@@ -9,76 +10,79 @@
#' @family xcms result export and import formats.
#'
#' @description
#' The `PlainTextParam` class and method enable users to save/load an
#' `MsExperiment` or `XcmsExperiment` object as a collection of plain text
#' files in/from a specified folder.
#' Note that, while all xcms results within the `XcmsExperiment` can
#' and will be exported, the full raw MS data (of the object's `Spectra` object)
#' will currently not be exported in plain text format. For `Spectra` using the
#' [MsBackendMzR()] backend, the names of the raw data files will however be
#' exported (which enables to *load* the full `Spectra` respectively
#' `MsExperiment` objects).
#'
#' For an `MsExperiment` object, the exported files include:
#'
#' - The [sampleData()] stored as a text file named *sample_data.txt*.
#' The `storeResults()` and `loadResults()` methods with the `PlainTextParam`
#' option enable users to save/load an `MsExperiment` or `XcmsExperiment`
#' object as a collection of plain text files in/from a specified folder. This
#' folder, defined with the `path` parameter, will be created by the
#' `storeResults()` function. Any previous exports already present in that
#' folder will be overwritten.
#'
#' For a `Spectra` object, the exported files include:
#' For an `MsExperiment` object, the exported files, stored into the directory
#' specified with the `path` parameter, include:
#'
#' - The [spectraData()] stored in a tabular format in a text file named
#' *backend_data.txt*.
#'
#' - The `processingQueueVariables`, `processing`, [processingChunkSize()] and
#' `backend` class information of the object stored in a text file named
#' *spectra_slots.txt*.
#'
#' - Processing queue of the `Spectra` object, ensuring that any spectra data
#' modifications are retained. It is stored in a `json` file named
#' *spectra_processing_queue.json*.
#'
#' Note : The Spectra object will only be exported if it uses a
#' [MsBackendMzR()] backend. Other backends are not supported as of now.
#' - The [sampleData()] stored as a text file named *sample_data.txt*.
#'
#' For an `XcmsExperiment` object, the exported files are the same as those
#' for an `MsExperiment` object, with the addition of the following:
#'
#' - The [processHistory()] information of the object, stored in a `json` file
#' named *process_history.json*.
#' named *process_history.json*.
#'
#' - The chromatographic peak information obtained with [chromPeaks()] and
#' [chromPeaksData()], stored in tabular format in the text files
#' *chrom_peaks.txt* and *chrom_peak_data.txt* respectively.
#' [chromPeaksData()], stored in tabular format in the text files
#' *chrom_peaks.txt* and *chrom_peak_data.txt* respectively.
#'
#' - The retention time information obtained with [adjustedRtime()] stored
#' in the text file named *rtime_adjusted.txt*.
#' in a text file named *rtime_adjusted.txt*.
#'
#' - The [featureDefinitions()] stored in a text file named
#' *feature_definitions.txt*. Additionally, a second file named
#' *feature_peak_index.txt* is generated to connect the features' definitions
#' with their names.
#' *feature_definitions.txt*. Additionally, a second file named
#' *feature_peak_index.txt* is generated to connect the features' definitions
#' with their names.
#'
#' This `param` class and method are part of the possible dispatch of the
#' generic functions `storeResults()` and `loadResults()`.
#' The folder defined in the `path` parameter will be created by calling
#' `storeResults`. If the folder already exists, previous exports in that
#' folder might get overwritten.
#' For a `Spectra` object, the exported files include:
#'
#' @param path for `PlainTextParam` `character(1)`, defining where the files
#' are going to be stored/ should be loaded from. The default will be
#' `tempdir()`.
#' - The `processingQueueVariables`, `processing`, [processingChunkSize()] and
#' `backend` class information of the object stored in a text file named
#' *spectra_slots.txt*.
#'
#' - The processing queue of the `Spectra` object, ensuring that any spectra
#' data modifications are retained. It is stored in a `json` file named
#' *spectra_processing_queue.json*.
#'
#' Import/export of the MS data depends on the implementation of
#' the respective `MsBackend` object. For `MsBackendMzR`, the exported data
#' and related text files are:
#'
#' - The backend's [spectraData()] stored in a tabular format in a text file
#' named *backend_data.txt*.
#'
#' @param spectraPath for `loadResults` `character(1)`, defining the
#' absolute path where the spectra files should be imported from when loading
#' the object. The default will be set using the common file path of all the
#' spectra files when exporting.
#' @note
#'
#' The function relies on the `storeResults()` and `loadResults()` methods of
#' the [Spectra()] object and the used [MsBackend()] to store and restore the
#' MS data. These methods might not be available for all `MsBackend`
#' implementations. Also, it might be required to specify the path containing
#' the MS data files using the `spectraPath` parameter.
#'
#' @param path For `PlainTextParam()`: `character(1)`, defining where the files
#' are going to be stored/ should be loaded from. The default is
#' `path = tempdir()`.
#'
#' @param spectraPath For `loadResults()`: optional `character(1)` defining
#' the (absolute) path where the spectra files (*data storage files*)
#' can be found. This parameter is passed to the `loadResults()` method of
#' the [MsBackend()].
#'
#' @inheritParams storeResults
#'
#' @return for `PlainTextParam`: a `PlainTextParam` class. `storeResults` does
#' @return For `PlainTextParam`: a `PlainTextParam` class. `storeResults` does
#' not return anything but saves the object to collections of different plain
#' text files to a folder.
#' text files to a folder. The `loadResults()` method returns the restored
#' data as an instance of the class specified with parameter `object`.
#'
#' @author Philippine Louail, Johannes Rainer.
#' @author Philippine Louail
#'
#' @importFrom jsonlite serializeJSON write_json unserializeJSON read_json
#'
@@ -93,6 +97,7 @@
#' @importFrom stats setNames
#'
#' @examples
#'
#' ## Load test data set of class `MsExperiment`
#' library(MsExperiment)
#' fls <- dir(system.file("sciex", package = "msdata"), full.names = TRUE)
@@ -153,8 +158,8 @@ setMethod("storeResults",
function(object, param){
dir.create(path = param@path,
recursive = TRUE,
showWarnings = TRUE)
write.table(as.data.frame(sampleData(object)),
showWarnings = FALSE)
write.table(as.data.frame(sampleData(object)), sep = "\t",
file = file.path(param@path,
"sample_data.txt"))
## call export of individual other objects (not MsExperiment data)
@@ -167,7 +172,7 @@ setMethod("storeResults",
setMethod("storeResults",
signature(object = "XcmsExperiment",
param = "PlainTextParam"),
function(object, param){
function(object, param) {
callNextMethod()
.store_xcmsexperiment(x = object, path = param@path)
}
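
A minimal sketch of the tab-separated round trip that the `sep = "\t"` changes set up for *sample_data.txt*. It uses base R only; the data frame `sample_df` and its columns are made up for illustration and merely stand in for `as.data.frame(sampleData(object))`.

```r
## Hypothetical sample annotation, standing in for
## as.data.frame(sampleData(object)).
sample_df <- data.frame(sample_name = c("sample A", "sample B"),
                        injection_idx = 1:2)

## Write with the same arguments as in the diff: tab as field separator,
## row and column names written (the write.table() defaults).
fl <- file.path(tempdir(), "sample_data.txt")
write.table(sample_df, sep = "\t", file = fl)

## Read back with the matching separator. read.table() detects the header
## because the first line has one field fewer than the data lines.
restored <- read.table(fl, sep = "\t")
rownames(restored) <- NULL   # drop the row names taken from the first column

all.equal(sample_df, restored)
```

The separator passed to `read.table()` has to match the one used by `write.table()`, which is why the commit adds `sep = "\t"` on both the export and the import side.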
@@ -177,38 +182,36 @@ setMethod("storeResults",
setMethod("loadResults",
signature(object = "MsExperiment",
param = "PlainTextParam"),
function(object, param, spectraPath = character()){
function(object, param, spectraPath = character()) {
fl <- file.path(param@path, "sample_data.txt")
if (file.exists(fl)){ # should i have a error if does not exist ?
sd <- read.table(fl)
rownames(sd) <- NULL #read.table force numbering of rownames
}
s <- loadResults(Spectra(), param)
if (!file.exists(fl))
stop("No 'sample_data.txt' file found in the provided path.")
sd <- read.table(fl, sep = "\t")
rownames(sd) <- NULL #read.table force numbering of rownames
s <- loadResults(Spectra(), param, spectraPath = spectraPath)
res <- MsExperiment(sampleData = sd, spectra = s)
validObject(res)
res
}
)
})

#' @rdname PlainTextParam
setMethod("loadResults",
signature(object = "XcmsExperiment",
param = "PlainTextParam"),
function(object, param, spectraPath){
res <- callNextMethod() #check if need to add spectraPath = spectraPath
function(object, param, spectraPath = character()) {
res <- callNextMethod()
res <- .load_xcmsexperiment(res, path = param@path)
validObject(res)
res
}
)
})

#' @rdname PlainTextParam
setMethod("storeResults", signature(object = "Spectra",
param = "PlainTextParam"),
function(object, param) {
dir.create(path = param@path,
recursive = TRUE,
showWarnings = TRUE)
showWarnings = FALSE)
if (!existsMethod("storeResults", c(class(object@backend)[1L],
"PlainTextParam")))
stop("Can not store a 'Spectra' object with backend '",
@@ -234,9 +237,9 @@ setMethod("loadResults", signature(object = "Spectra",
"PlainTextParam")))
stop("Can not store a 'Spectra' object with backend '",
variables["backend"], "'")
b <- loadResults(object= do.call(what = variables[["backend"]],
args = list()),
param = param, spectraPath = spectraPath) ##better way to do this ?
b <- loadResults(object = do.call(what = variables[["backend"]],
args = list()),
param = param, spectraPath = spectraPath)
s <- Spectra(b)
s@processingQueueVariables <- unlist(strsplit(variables[["processingQueueVariables"]],
"|", fixed = TRUE))
@@ -249,37 +252,40 @@ setMethod("loadResults", signature(object = "Spectra",
s
})



# Notes: This and the Spectra method will be moved to their respective packages
#' @rdname PlainTextParam
setMethod("storeResults", signature(object = "MsBackendMzR",
param = "PlainTextParam"),
function(object, param) {
dir.create(path = param@path,
recursive = TRUE,
showWarnings = TRUE)
showWarnings = FALSE)
object <- dropNaSpectraVariables(object)
write.table(object@spectraData,
file = file.path(param@path, "backend_data.txt"),
sep = "\t", quote = FALSE)
fl <- file.path(param@path, "backend_data.txt")
if (file.exists(fl))
warning("Overwriting already present 'backend_data.txt' file")
writeLines(paste0("# ", class(object)[1L]), con = fl)
suppressWarnings(
write.table(object@spectraData,
file = fl, sep = "\t", quote = FALSE,
append = TRUE, row.names = FALSE))
})
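
A small base R sketch of the file layout produced by the new export code: a first line starting with `#` that records the backend class, followed by a tab-separated table without row names. The data frame `spd` is an invented stand-in for `object@spectraData`; the sketch relies on `read.table()` skipping lines that start with its default comment character `#`.

```r
## Hypothetical spectra data, standing in for object@spectraData.
spd <- data.frame(msLevel = c(1L, 2L),
                  dataStorage = c("/data/run 1.mzML", "/data/run 1.mzML"))

fl <- tempfile(fileext = ".txt")

## First line: the backend class as a comment.
writeLines("# MsBackendMzR", con = fl)

## Append the table. write.table() warns when column names are appended
## to an existing file, hence the suppressWarnings() as in the diff.
suppressWarnings(
    write.table(spd, file = fl, sep = "\t", quote = FALSE,
                append = TRUE, row.names = FALSE))

## The comment line is skipped on import, the next line is the header.
first_line <- readLines(fl, n = 1L)
restored <- read.table(fl, sep = "\t", header = TRUE)

all.equal(spd, restored)
```

Because the values are written with `quote = FALSE`, the tab separator is what keeps a file name containing a space in a single field.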

#' @rdname PlainTextParam
setMethod("loadResults", signature(object = "MsBackendMzR",
param = "PlainTextParam"),
function(object, param, spectraPath = character()) {
b <- MsBackendMzR()
data <- read.table(file = file.path(param@path,
"backend_data.txt"),
sep = "\t", header = TRUE)
fl <- file.path(param@path, "backend_data.txt")
if (!file.exists(fl))
stop("No 'backend_data.txt' file found in the provided path.")
data <- read.table(file = fl, sep = "\t", header = TRUE)
rownames(data) <- NULL
data <- DataFrame(data)
b@spectraData <- data
if (length(spectraPath) > 0){
old <- MsCoreUtils::common_path(dataStorage(b))
if (nchar(old) > 0)
old <- paste0(old, "/")
if (length(spectraPath) > 0) {
old <- common_path(dataStorage(b))
## if (nchar(old) > 0)
## old <- paste0(old, "/")
dataStorage(b) <- sub(old, spectraPath, dataStorage(b))
}
b
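
The relocation of the data storage files via `spectraPath` can be sketched in base R as follows. This is an illustration under assumptions, not the package code: `common_dir()` is a hypothetical, simplified stand-in for `common_path()` from MsCoreUtils and may behave differently in edge cases (for example when all paths are identical), and `fixed = TRUE` is added here so that the old path is not interpreted as a regular expression.

```r
## Hypothetical helper: longest common leading directory of a set of paths.
common_dir <- function(paths) {
    parts <- strsplit(paths, "/", fixed = TRUE)
    keep <- 0L
    for (i in seq_len(min(lengths(parts)))) {
        ## stop at the first path element that differs between files
        if (length(unique(vapply(parts, `[`, character(1), i))) > 1L)
            break
        keep <- i
    }
    paste(parts[[1L]][seq_len(keep)], collapse = "/")
}

## Invented file locations as they would have been stored at export time.
data_storage <- c("/old/data/a.mzML", "/old/data/b.mzML")
spectra_path <- "/new/location"

old <- common_dir(data_storage)
relocated <- sub(old, spectra_path, data_storage, fixed = TRUE)
relocated
```

Only the shared leading part of the paths is replaced, so any sub-folder structure below the common directory is kept.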
@@ -296,13 +302,12 @@ setMethod("loadResults", signature(object = "MsBackendMzR",
writeLines(paste0("processingQueueVariables = ", paste(pq, collapse = "|")),
con = con)
p <- x@processing
writeLines(paste0("processing = ", paste(p,collapse = "|")), con = con)
writeLines(paste0("processing = ", paste(p, collapse = "|")), con = con)
writeLines(paste0("processingChunkSize = ", processingChunkSize(x)),
con = con)
writeLines(paste0("backend = ", class(x@backend)[1L]), con = con)
}
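
The `key = value` layout written to *spectra_slots.txt* can be illustrated with a base R round trip. The writing part mirrors the `writeLines()` calls in the diff; the parsing part is an assumption made for this sketch, since the code that builds `variables` is not part of the shown hunks, and the values are invented.

```r
fl <- tempfile(fileext = ".txt")
pq <- c("mz", "intensity")          # invented processing queue variables

## Write one "key = value" line per slot; vectors are joined with "|".
con <- file(fl, open = "wt")
writeLines(paste0("processingQueueVariables = ", paste(pq, collapse = "|")),
           con = con)
writeLines(paste0("processingChunkSize = ", Inf), con = con)
writeLines(paste0("backend = ", "MsBackendMzR"), con = con)
close(con)

## Assumed parsing: split each line at " = " into name and value.
kv <- strsplit(readLines(fl), " = ", fixed = TRUE)
variables <- setNames(vapply(kv, `[`, character(1), 2L),
                      vapply(kv, `[`, character(1), 1L))

## Restore the individual values, as done for the queue variables in the diff.
pq_restored <- unlist(strsplit(variables[["processingQueueVariables"]],
                               "|", fixed = TRUE))
chunk_size <- as.numeric(variables[["processingChunkSize"]])
```

Splitting with `fixed = TRUE` matters here because `|` would otherwise be treated as the regular expression alternation operator.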


#' Processing queue
#' @param x `Spectra`
#'
@@ -367,19 +372,20 @@ setMethod("loadResults", signature(object = "MsBackendMzR",
#' Chromatographic peaks
#' @noRd
.export_chrom_peaks <- function(x, path = character()) {
write.table(chromPeaks(x), file = file.path(path, "chrom_peaks.txt"))
write.table(as.data.frame(chromPeakData(x)),
write.table(chromPeaks(x), file = file.path(path, "chrom_peaks.txt"),
sep = "\t")
write.table(as.data.frame(chromPeakData(x)), sep = "\t",
file = file.path(path, "chrom_peak_data.txt"))
}

#' @noRd
.import_chrom_peaks <- function(x, path = character()) {
f <- file.path(path, "chrom_peaks.txt")
pk <- as.matrix(read.table(f))
pk <- as.matrix(read.table(f, sep = "\t"))
f <- file.path(path, "chrom_peak_data.txt")
if (!file.exists(f))
stop("No \"chrom_peak_data.txt\" file found in ", path)
pkd <- read.table(f)
pkd <- read.table(f, sep = "\t")
x@chromPeaks <- pk
x@chromPeakData <- pkd
x
@@ -389,12 +395,12 @@ setMethod("loadResults", signature(object = "MsBackendMzR",
#' @noRd
.export_adjusted_rtime <- function(x, path = character()) {
write.table(adjustedRtime(x), file = file.path(path, "rtime_adjusted.txt"),
row.names = FALSE, col.names = FALSE)
row.names = FALSE, col.names = FALSE, sep = "\t")
}

#' @noRd
.import_adjusted_rtime <- function(x, file = character()) {
rts <- read.table(file)[, 1L]
rts <- read.table(file, sep = "\t")[, 1L]
x@spectra$rtime_adjusted <- as.numeric(rts)
x
}
@@ -407,18 +413,20 @@ setMethod("loadResults", signature(object = "MsBackendMzR",
feature_index = rep(seq_len(nrow(fts)), lengths(fts$peakidx)),
peak_index = unlist(fts$peakidx, use.names = FALSE))
fts$peakidx <- NA
write.table(fts, file = file.path(path, "feature_definitions.txt"))
write.table(pkidx, file = file.path(path, "feature_peak_index.txt"))
write.table(fts, file = file.path(path, "feature_definitions.txt"),
sep = "\t")
write.table(pkidx, file = file.path(path, "feature_peak_index.txt"),
sep = "\t")
}

#' @noRd
.import_features <- function(x, path = character()) {
f <- file.path(path, "feature_definitions.txt")
fts <- read.table(f)
fts <- read.table(f, sep = "\t")
f <- file.path(path, "feature_peak_index.txt")
if (!file.exists(f))
stop("No \"feature_peak_index.txt\" file found in ", path)
pkidx <- read.table(f)
pkidx <- read.table(f, sep = "\t")
fts$peakidx <- unname(split(pkidx$peak_index, pkidx$feature_index))
x@featureDefinitions <- fts
x
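
The export and import of the feature-to-peak assignment shown above flattens the list column `peakidx` into a two-column table and rebuilds it with `split()`. A minimal in-memory sketch with invented feature data (file I/O omitted, and assuming every feature has at least one peak):

```r
## Invented feature definitions with a list column of peak indices.
fts <- data.frame(mzmed = c(100.1, 200.2, 300.3))
fts$peakidx <- list(c(1L, 4L), 2L, c(3L, 5L, 6L))

## Export side: one row per (feature, peak) pair.
pkidx <- data.frame(
    feature_index = rep(seq_len(nrow(fts)), lengths(fts$peakidx)),
    peak_index = unlist(fts$peakidx, use.names = FALSE))

## Import side: group the peak indices by feature again.
restored <- unname(split(pkidx$peak_index, pkidx$feature_index))

identical(restored, fts$peakidx)
```

The long format keeps *feature_peak_index.txt* a plain rectangular table, which is why the list column is replaced by `NA` in *feature_definitions.txt* and stored separately.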