Policy for `metadata` when combining objects #79

LTLA · 2021-04-08T07:30:12Z

Consider:

library(S4Vectors)
X <- DataFrame(X=1)
metadata(X)$X <- "WHEE"

Y <- DataFrame(Y=1)
metadata(Y)$Y <- "FOO"

metadata(cbind(X, Y))
## $X
## [1] "WHEE"

That's fine, I guess. But then:

library(SummarizedExperiment)
xx <- SummarizedExperiment()
metadata(xx)$X <- "WHEE"

yy <- SummarizedExperiment()
metadata(yy)$Y <- "FOO"

metadata(cbind(xx, yy))
## $X
## [1] "WHEE"
## 
## $Y
## [1] "FOO"

Should there be a consistent policy here? IMO it would make most sense to `c()` the metadata lists and then remove duplicate names, keeping the first occurrence (plus a warning if the duplicated values are not identical). This has the nice properties of:

- Preserving most information, provided that elements have different names across the objects. TBH, the lost information might not be too bad; list elements with the same name but different values aren't that helpful in downstream analyses anyway, especially once we no longer know which of the original objects they came from.
- Ensuring that, e.g., `cbind(df[,0], df)` gives back `df`. This wouldn't be the case if you just kept appending the metadata lists together, which would grow the metadata list in the combined object with every bind.

One could even imagine writing a `combineMetadata()` function that all `Annotated` subclasses can call, so as to easily combine the `metadata()` fields in a standard way for `c`, `rbind`, `cbind`, `combineRows`, `combineCols`, etc.
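
For concreteness, here is a rough sketch of what such a helper might look like. It is only an illustration of the proposed policy: `combineMetadata()` does not exist in S4Vectors, the signature and the warning text are my assumptions, and it works on plain lists (i.e., what `metadata()` returns) rather than on the objects themselves.

```r
# Hypothetical helper, not an existing S4Vectors function.
# Takes any number of metadata lists and combines them in order.
combineMetadata <- function(...) {
    # unname() so that argument names are not pasted onto element names by c().
    combined <- do.call(c, unname(list(...)))
    if (is.null(combined)) {
        return(list())  # no inputs at all
    }

    nms <- names(combined)
    if (is.null(nms)) {
        return(combined)  # nothing is named, so nothing to deduplicate
    }

    # Only named elements take part in deduplication; unnamed ones are all kept.
    named <- !is.na(nms) & nms != ""
    dup <- named & duplicated(nms)

    # Warn when a dropped element differs from the first one with that name.
    for (i in which(dup)) {
        first <- match(nms[i], nms)
        if (!identical(combined[[first]], combined[[i]])) {
            warning("metadata element '", nms[i],
                "' differs between objects; keeping the first value")
        }
    }

    combined[!dup]
}

x <- list(X = "WHEE")
y <- list(Y = "FOO")
combineMetadata(x, y)  # keeps both X and Y
combineMetadata(x, x)  # same as x, mirroring the cbind(df[,0], df) case
```

Keeping the first value on a conflict is an arbitrary choice here; the important part is that repeated binds of the same object don't keep growing the list.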
