No row/colSums for a DataFrame? #48

Open
LTLA opened this issue Jul 31, 2019 · 4 comments

Comments

@LTLA
Contributor

LTLA commented Jul 31, 2019

I'm a bit surprised we don't have colSums and rowSums methods for DataFrame.

colSums(DataFrame(a=1:10))
## Error in base::colSums(x, na.rm = na.rm, dims = dims, ...) :
##  'x' must be an array of at least two dimensions
rowSums(DataFrame(a=1:10))
## Error in base::rowSums(x, na.rm = na.rm, dims = dims, ...) :
##   'x' must be an array of at least two dimensions

This should be as simple as:

setMethod("colSums", "DataFrame", function(x, na.rm = FALSE, dims = 1, ...) {
    # Some work required to respect 'dims', whatever that does.
    unlist(lapply(x, sum, na.rm = na.rm))
})

setMethod("rowSums", "DataFrame", function(x, na.rm = FALSE, dims = 1, ...) {
    output <- integer(nrow(x))
    for (i in seq_along(x)) {
        y <- x[[i]]
        if (na.rm) {
            y[is.na(y)] <- 0
        }
        output <- output + y
    }
    output
})

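For anyone who wants to try the column-wise accumulation idea without S4Vectors installed, here is a minimal base R sketch. It assumes a plain data.frame stands in for a DataFrame, and `row_sums_by_column` is a hypothetical helper name, not part of any package. It compares the loop against the usual `rowSums(as.matrix(...))` route.

```r
# Hypothetical helper: sum across columns one at a time, without
# coercing the whole table to a matrix first.
row_sums_by_column <- function(x, na.rm = FALSE) {
    output <- numeric(nrow(x))
    for (i in seq_along(x)) {
        y <- x[[i]]
        if (na.rm) {
            y[is.na(y)] <- 0  # treat missing values as zero
        }
        output <- output + y
    }
    output
}

df <- data.frame(a = c(1, NA, 3), b = c(10, 20, NA))
by_column <- row_sums_by_column(df, na.rm = TRUE)
via_matrix <- unname(rowSums(as.matrix(df), na.rm = TRUE))
stopifnot(isTRUE(all.equal(by_column, via_matrix)))
```

The loop only ever holds one column plus the running total, which is presumably the appeal over building a full matrix for wide tables.
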
@lawremi
Collaborator

lawremi commented Jul 31, 2019

This was intentionally left out because those methods really only make sense on matrices. It's unfortunate that base::colSums() only coerces to matrix when x is a data.frame. Otherwise, it would just work, for better or worse.

I guess base::colSums() could coerce anything that has >= 2 dimensions to an array. One issue is that as.array() fails on a data.frame; it should probably just delegate to as.matrix().

@LTLA
Contributor Author

LTLA commented Jul 31, 2019

> I guess base::colSums() could coerce anything that has >= 2 dimensions to an array.

That would make sense. I have some DataFrames of statistics that I just want to take colSums over, and it is annoying to have to remember to as.matrix() them when they should just work.

@LTLA
Contributor Author

LTLA commented Sep 4, 2019

The source of base::colSums() shows that it already checks for array-ness, so it should be a simple (and nearly costless) matter of splitting up:

    if (!is.array(x) || length(dn <- dim(x)) < 2L) 
        stop("'x' must be an array of at least two dimensions")

and putting an as.matrix() call later under a lone !is.array(x) block.
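
The split check described above might look roughly like the following. This is only a sketch of the idea written as a wrapper, not the actual base R source; `col_sums_sketch` is a hypothetical name, and it assumes the object supports `dim()` and `as.matrix()` (a plain data.frame is used here as a stand-in for a DataFrame).

```r
# Hypothetical wrapper illustrating the split check; not base R code.
col_sums_sketch <- function(x, na.rm = FALSE, dims = 1L) {
    # First half of the original condition: require >= 2 dimensions.
    if (length(dim(x)) < 2L)
        stop("'x' must be an array of at least two dimensions")
    # Second half, now on its own: coerce instead of failing.
    if (!is.array(x))
        x <- as.matrix(x)
    base::colSums(x, na.rm = na.rm, dims = dims)
}

df <- data.frame(a = 1:10, b = c(NA, 2:10))
res <- col_sums_sketch(df, na.rm = TRUE)
res
```

Objects that are already arrays skip the coercion entirely, so the extra cost for the existing matrix case should be a single `is.array()` call.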

@lawremi
Collaborator

lawremi commented Sep 4, 2019

I think it should coerce to an array though, not necessarily a matrix, since the object could have more than two dimensions. The data.frame case could also be removed if as.array() gained a method to support data.frame.
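
As a rough illustration of the `as.array()` route, the sketch below defines a hypothetical S3 method for data.frame that simply delegates to `as.matrix()`. It assumes, as this thread does, that base R provides no such method; defining one in user code is for demonstration only.

```r
# Hypothetical S3 method: delegate as.array() on a data frame to as.matrix().
as.array.data.frame <- function(x, ...) as.matrix(x, ...)

df <- data.frame(a = 1:3, b = 4:6)
arr <- as.array(df)
is.array(arr)  # TRUE
dim(arr)       # 3 2
sums <- colSums(arr)
sums
```

With a method like this in place, a generic "coerce to array if not already one" step would cover data.frame without a special case.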
