refactor: updates of documentation and extension of unit tests #753

Merged
merged 1 commit into from
Jun 6, 2024
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -63,7 +63,7 @@ Imports:
MsCoreUtils (>= 1.15.7),
MsFeatures,
MsExperiment (>= 1.5.4),
Spectra (>= 1.13.7),
Spectra (>= 1.15.3),
progress,
jsonlite,
RColorBrewer,
194 changes: 101 additions & 93 deletions R/PlainTextParam.R
@@ -1,4 +1,5 @@
#' @include XcmsExperiment.R
#'
#' @title Store contents of `MsExperiment` and `XcmsExperiment` objects as
#' plain text files
#'
@@ -9,76 +10,79 @@
#' @family xcms result export and import formats.
#'
#' @description
#' The `PlainTextParam` class and method enable users to save/load an
#' `MsExperiment` or `XcmsExperiment` object as a collections of plain text
#' files in/from a specified folder.
#' Note that, while for all xcms results within the `XcmsExperiment` can
#' and will be exported, the full raw MS data (of the object's `Spectra` object)
#' will currently not be exported in plain text format. For `Spectra` using the
#' [MsBackendMzR()] backend, the names of the raw data files will however be
#' exported (which enables to *load* the full `Spectra` respectively
#' `MsExperiment` objects).
#'
#' For an `MsExperiment` object, the exported files include:
#'
#' - The [sampleData()] stored as a text file named *sample_data.txt*.
#' The `storeResults()` and `loadResults()` methods with the `PlainTextParam`
#' option enable users to save/load an `MsExperiment` or `XcmsExperiment`
#' object as a collection of plain text files in/from a specified folder. This
#' folder, defined with the `path` parameter, will be created by the
#' `storeResults()` function. Any previous exports already present in that
#' folder will be overwritten.
#'
#' For a `Spectra` object, the exported files include:
#' For an `MsExperiment` object, the exported files, stored in the directory
#' specified with the `path` parameter, include:
#'
#' - The [spectraData()] stored in a tabular format in a text file named
#' *backend_data.txt*.
#'
#' - The `processingQueueVariables`, `processing`, [processingChunkSize()] and
#' `backend` class information of the object stored in a text file named
#' *spectra_slots.txt*.
#'
#' - Processing queue of the `Spectra` object, ensuring that any spectra data
#' modifications are retained. It is stored in a `json` file named
#' *spectra_processing_queue.json*.
#'
#' Note : The Spectra object will only be exported if it uses a
#' [MsBackendMzR()] backend. Other backends are no supported as of now.
#' - The [sampleData()] stored as a text file named *sample_data.txt*.
#'
#' For an `XcmsExperiment` object, the exported files are the same as those
#' for an `MsExperiment` object, with the addition of the following:
#'
#' - The [processHistory()] information of the object, stored in a `json` file
#' named *process_history.json*.
#' named *process_history.json*.
#'
#' - The chromatographic peak information obtained with [chromPeaks()] and
#' [chromPeaksData()], stored in tabular format in the text files
#' *chrom_peaks.txt* and *chrom_peak_data.txt* respectively.
#' [chromPeaksData()], stored in tabular format in the text files
#' *chrom_peaks.txt* and *chrom_peak_data.txt* respectively.
#'
#' - The retention time information obtained with [adjustedRtime()] stored
#' in the text file named *rtime_adjusted.txt*.
#' in a text file named *rtime_adjusted.txt*.
#'
#' - The [featureDefinitions()] stored in a text file named
#' *feature_definitions.txt*. Additionally, a second file named
#' *feature_peak_index.txt* is generated to connect the features' definitions
#' with their names.
#' *feature_definitions.txt*. Additionally, a second file named
#' *feature_peak_index.txt* is generated to connect the features' definitions
#' with their names.
#'
#' This `param` class and method are part of the possible dispatch of the
#' generic functions `storeResults()` and `loadResults()`.
#' The folder defined in the `path` parameter will be created by calling
#' `storeResults`. If the folder already exists, previous exports in that
#' folder might get overwritten.
#' For a `Spectra` object, the exported files include:
#'
#' @param path for `PlainTextParam` `character(1)`, defining where the files
#' are going to be stored/ should be loaded from. The default will be
#' `tempdir()`.
#' - The `processingQueueVariables`, `processing`, [processingChunkSize()] and
#' `backend` class information of the object stored in a text file named
#' *spectra_slots.txt*.
#'
#' - The processing queue of the `Spectra` object, ensuring that any spectra
#' data modifications are retained. It is stored in a `json` file named
#' *spectra_processing_queue.json*.
#'
#' Import/export of the MS data depends on the implementation of the
#' respective `MsBackend` object. For `MsBackendMzR`, the exported data
#' and related text files are:
#'
#' - The backend's [spectraData()] stored in a tabular format in a text file
#' named *backend_data.txt*.
#'
#' @param spectraPath for `loadResults` `character(1)`, defining the
#' absolute path where the spectra files should be imported from when loading
#' the object. The default will be set using the common file path of all the
#' spectra files when exporting.
#' @note
#'
#' The function relies on the `storeResults()` and `loadResults()` methods of
#' the [Spectra()] object and the used [MsBackend()] to store and restore the
#' MS data. These methods might not be available for all `MsBackend`
#' implementations. Also, it might be required to specify the path containing
#' the MS data files using the `spectraPath` parameter.
#'
#' @param path For `PlainTextParam()`: `character(1)`, defining where the files
#' are going to be stored/ should be loaded from. The default is
#' `path = tempdir()`.
#'
#' @param spectraPath For `loadResults()`: `character(1)` optionally allowing to
#' define the (absolute) path where the spectra files (*data storage files*)
#' can be found. This parameter is passed to the `loadResults()` method of
#' the [MsBackend()].
#'
#' @inheritParams storeResults
#'
#' @return for `PlainTextParam`: a `PlainTextParam` class. `storeResults` does
#' @return For `PlainTextParam`: a `PlainTextParam` class. `storeResults` does
#' not return anything but saves the object to collections of different plain
#' text files to a folder.
#' text files to a folder. The `loadResults()` method returns the restored
#' data as an instance of the class specified with parameter `object`.
#'
#' @author Philippine Louail, Johannes Rainer.
#' @author Philippine Louail
#'
#' @importFrom jsonlite serializeJSON write_json unserializeJSON read_json
#'
@@ -93,6 +97,7 @@
#' @importFrom stats setNames
#'
#' @examples
#'
#' ## Load test data set of class `MsExperiment`
#' library(MsExperiment)
#' fls <- dir(system.file("sciex", package = "msdata"), full.names = TRUE)
@@ -153,8 +158,8 @@ setMethod("storeResults",
function(object, param){
dir.create(path = param@path,
recursive = TRUE,
showWarnings = TRUE)
write.table(as.data.frame(sampleData(object)),
showWarnings = FALSE)
write.table(as.data.frame(sampleData(object)), sep = "\t",
file = file.path(param@path,
"sample_data.txt"))
## call export of individual other objects (not MsExperiment data)
@@ -167,7 +172,7 @@ setMethod("storeResults",
setMethod("storeResults",
signature(object = "XcmsExperiment",
param = "PlainTextParam"),
function(object, param){
function(object, param) {
callNextMethod()
.store_xcmsexperiment(x = object, path = param@path)
}
@@ -177,38 +182,36 @@ setMethod("storeResults",
setMethod("loadResults",
signature(object = "MsExperiment",
param = "PlainTextParam"),
function(object, param, spectraPath = character()){
function(object, param, spectraPath = character()) {
fl <- file.path(param@path, "sample_data.txt")
if (file.exists(fl)){ # should i have a error if does not exist ?
sd <- read.table(fl)
rownames(sd) <- NULL #read.table force numbering of rownames
}
s <- loadResults(Spectra(), param)
if (!file.exists(fl))
stop("No 'sample_data.txt' file found in the provided path.")
sd <- read.table(fl, sep = "\t")
rownames(sd) <- NULL #read.table force numbering of rownames
s <- loadResults(Spectra(), param, spectraPath = spectraPath)
res <- MsExperiment(sampleData = sd, spectra = s)
validObject(res)
res
}
)
})

#' @rdname PlainTextParam
setMethod("loadResults",
signature(object = "XcmsExperiment",
param = "PlainTextParam"),
function(object, param, spectraPath){
res <- callNextMethod() #check if need to add spectraPath = spectraPath
function(object, param, spectraPath = character()) {
res <- callNextMethod()
res <- .load_xcmsexperiment(res, path = param@path)
validObject(res)
res
}
)
})

#' @rdname PlainTextParam
setMethod("storeResults", signature(object = "Spectra",
param = "PlainTextParam"),
function(object, param) {
dir.create(path = param@path,
recursive = TRUE,
showWarnings = TRUE)
showWarnings = FALSE)
if (!existsMethod("storeResults", c(class(object@backend)[1L],
"PlainTextParam")))
stop("Can not store a 'Spectra' object with backend '",
@@ -234,9 +237,9 @@ setMethod("loadResults", signature(object = "Spectra",
"PlainTextParam")))
stop("Can not store a 'Spectra' object with backend '",
variables["backend"], "'")
b <- loadResults(object= do.call(what = variables[["backend"]],
args = list()),
param = param, spectraPath = spectraPath) ##better way to do this ?
b <- loadResults(object = do.call(what = variables[["backend"]],
args = list()),
param = param, spectraPath = spectraPath)
s <- Spectra(b)
s@processingQueueVariables <- unlist(strsplit(variables[["processingQueueVariables"]],
"|", fixed = TRUE))
@@ -249,37 +252,40 @@ setMethod("loadResults", signature(object = "Spectra",
s
})



# Note: This and the Spectra method will be moved to their respective packages
#' @rdname PlainTextParam
setMethod("storeResults", signature(object = "MsBackendMzR",
param = "PlainTextParam"),
function(object, param) {
dir.create(path = param@path,
recursive = TRUE,
showWarnings = TRUE)
showWarnings = FALSE)
object <- dropNaSpectraVariables(object)
write.table(object@spectraData,
file = file.path(param@path, "backend_data.txt"),
sep = "\t", quote = FALSE)
fl <- file.path(param@path, "backend_data.txt")
if (file.exists(fl))
warning("Overwriting already present 'backend_data.txt' file")
writeLines(paste0("# ", class(object)[1L]), con = fl)
suppressWarnings(
write.table(object@spectraData,
file = fl, sep = "\t", quote = FALSE,
append = TRUE, row.names = FALSE))
})

#' @rdname PlainTextParam
setMethod("loadResults", signature(object = "MsBackendMzR",
param = "PlainTextParam"),
function(object, param, spectraPath = character()) {
b <- MsBackendMzR()
data <- read.table(file = file.path(param@path,
"backend_data.txt"),
sep = "\t", header = TRUE)
fl <- file.path(param@path, "backend_data.txt")
if (!file.exists(fl))
stop("No 'backend_data.txt' file found in the provided path.")
data <- read.table(file = fl, sep = "\t", header = TRUE)
rownames(data) <- NULL
data <- DataFrame(data)
b@spectraData <- data
if (length(spectraPath) > 0){
old <- MsCoreUtils::common_path(dataStorage(b))
if (nchar(old) > 0)
old <- paste0(old, "/")
if (length(spectraPath) > 0) {
old <- common_path(dataStorage(b))
## if (nchar(old) > 0)
Collaborator Author

Not sure if we need this. I would skip it, since "/" and "//" both work, so it is better to make sure that there is a "/" before the file name.

Collaborator

Hmm, I need to test it, because I thought the result of common_path() never ends with a "/", so I thought that was necessary.

Collaborator

After checking: isn't that how the function behaves? And do we definitely need to add this "/"?

Collaborator Author

That is how I understood it (I can be wrong), but yes, the result from `common_path()` never ends in a "/". So, assume we have "/some/path/some/file.txt": the common path would then be "/some/path/some". Assume also that the user defines `spectraPath = "/new/path"`; we would like the replacement to give "/new/path/file.txt". If we always add a "/" to `old`, `old` would in this case be "/some/path/some/", the `sub()` call would replace "/some/path/some/" with "/new/path", and we would get "/new/pathfile.txt". That's why I would not append the trailing "/" to the pattern we want to replace.

Also, "/new/path/file.txt" and "/new/path//file.txt" both work - so, if the user would provide spectraPath = "/new/path/" (i.e. with a trailing "/") it would/should still work.

Collaborator

Oh, of course! That makes complete sense. And yes, I think the fewer `if` statements we have the better, so I will remove it.
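
The replacement behaviour discussed in this thread can be sketched in base R. This is only an illustration under assumptions: the paths are the hypothetical ones from the comment above, `old` is set by hand to the value a common-path helper would return for them, and `fixed = TRUE` is added here so the paths are matched literally (the method in the diff calls `sub()` without it).

```r
## Hypothetical stored file paths, used only for illustration.
stored <- c("/some/path/some/file.txt", "/some/path/some/other.txt")

## Value a common-path helper would return for the paths above
## (set by hand here; no trailing "/").
old <- "/some/path/some"
spectraPath <- "/new/path"

## Pattern without trailing "/": the separator before the file name is kept.
without_slash <- sub(old, spectraPath, stored, fixed = TRUE)

## Pattern with trailing "/": the separator is consumed by the replacement.
with_slash <- sub(paste0(old, "/"), spectraPath, stored, fixed = TRUE)

## User-supplied replacement with trailing "/": a doubled separator results.
double_slash <- sub(old, "/new/path/", stored, fixed = TRUE)

print(without_slash)
print(with_slash)
print(double_slash)
```

The first call gives the intended paths, the second fuses the directory and file name, and the third produces a doubled separator, which the thread above reports as still working.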

## old <- paste0(old, "/")
dataStorage(b) <- sub(old, spectraPath, dataStorage(b))
}
b
@@ -296,13 +302,12 @@ setMethod("loadResults", signature(object = "MsBackendMzR",
writeLines(paste0("processingQueueVariables = ", paste(pq, collapse = "|")),
con = con)
p <- x@processing
writeLines(paste0("processing = ", paste(p,collapse = "|")), con = con)
writeLines(paste0("processing = ", paste(p, collapse = "|")), con = con)
writeLines(paste0("processingChunkSize = ", processingChunkSize(x)),
con = con)
writeLines(paste0("backend = ", class(x@backend)[1L]), con = con)
}


#' Processing queue
#' @param x `Spectra`
#'
@@ -367,19 +372,20 @@ setMethod("loadResults", signature(object = "MsBackendMzR",
#' Chromatographic peaks
#' @noRd
.export_chrom_peaks <- function(x, path = character()) {
write.table(chromPeaks(x), file = file.path(path, "chrom_peaks.txt"))
write.table(as.data.frame(chromPeakData(x)),
write.table(chromPeaks(x), file = file.path(path, "chrom_peaks.txt"),
sep = "\t")
write.table(as.data.frame(chromPeakData(x)), sep = "\t",
file = file.path(path, "chrom_peak_data.txt"))
}

#' @noRd
.import_chrom_peaks <- function(x, path = character()) {
f <- file.path(path, "chrom_peaks.txt")
pk <- as.matrix(read.table(f))
pk <- as.matrix(read.table(f, sep = "\t"))
f <- file.path(path, "chrom_peak_data.txt")
if (!file.exists(f))
stop("No \"chrom_peak_data.txt\" file found in ", path)
pkd <- read.table(f)
pkd <- read.table(f, sep = "\t")
x@chromPeaks <- pk
x@chromPeakData <- pkd
x
@@ -389,12 +395,12 @@ setMethod("loadResults", signature(object = "MsBackendMzR",
#' @noRd
.export_adjusted_rtime <- function(x, path = character()) {
write.table(adjustedRtime(x), file = file.path(path, "rtime_adjusted.txt"),
row.names = FALSE, col.names = FALSE)
row.names = FALSE, col.names = FALSE, sep = "\t")
}

#' @noRd
.import_adjusted_rtime <- function(x, file = character()) {
rts <- read.table(file)[, 1L]
rts <- read.table(file, sep = "\t")[, 1L]
x@spectra$rtime_adjusted <- as.numeric(rts)
x
}
@@ -407,18 +413,20 @@ setMethod("loadResults", signature(object = "MsBackendMzR",
feature_index = rep(seq_len(nrow(fts)), lengths(fts$peakidx)),
peak_index = unlist(fts$peakidx, use.names = FALSE))
fts$peakidx <- NA
write.table(fts, file = file.path(path, "feature_definitions.txt"))
write.table(pkidx, file = file.path(path, "feature_peak_index.txt"))
write.table(fts, file = file.path(path, "feature_definitions.txt"),
sep = "\t")
write.table(pkidx, file = file.path(path, "feature_peak_index.txt"),
sep = "\t")
}

#' @noRd
.import_features <- function(x, path = character()) {
f <- file.path(path, "feature_definitions.txt")
fts <- read.table(f)
fts <- read.table(f, sep = "\t")
f <- file.path(path, "feature_peak_index.txt")
if (!file.exists(f))
stop("No \"feature_peak_index.txt\" file found in ", path)
pkidx <- read.table(f)
pkidx <- read.table(f, sep = "\t")
fts$peakidx <- unname(split(pkidx$peak_index, pkidx$feature_index))
x@featureDefinitions <- fts
x