rev on DataFrame does not reverse column order, as it does on data.frame #68

malcook · 2020-06-18T00:21:01Z

Perhaps I should infer this from the documentation.

Or it is natural consequence from underlying implementation.

Or perhaps it is an unintentional behavior.

Can I get a witness?

Is there a contract? Might this change?

Thanks!

hpages · 2020-06-18T17:17:32Z

Yes, a consequence of underlying implementation. More precisely, it's a consequence of the rev() method for Vector derivatives doing extractROWS(x, rev(seq_len(NROW(x)))). If we wanted rev() to reverse column order, we'd need to override this with a method for DataFrame derivatives. I don't mind either way. What others think? @lawremi @LTLA

2 things though:

DataFrame does not aim at 100% consistent behavior with data.frame. Some operations purposely diverge from data.frame (e.g. ==, order, sort). Don't know if rev() is one of them though.
Changing rev() now has the potential to break a few things around.

malcook · 2020-06-18T17:23:34Z

For me, most important is having a correct mental model so I don't make invalid assumptions. Possibly improving documentation somehow would serve, but I'm unsure how best to teach/explain where the analogy to data.frame should not be expected to hold.

LTLA · 2020-06-19T07:08:34Z

IMO the current system provides intuitive and useful behavior for sort() and match(), and I don't think rev() should be an exception. It makes a lot of sense to consider the rows of a DataFrame as the individual elements for vector-like semantics, we typically want to interrogate it row-wise rather than column-wise.

In terms of documentation, I wonder whether this row-based interpretation of a DataFrame should be the conceptual default, with list-like behaviors being the deviation. Off the top of my head, there's [[, mcols and length that exhibit list-like characteristics. The practical implication of this choice would be that all other methods would be expected to follow the row-based interpretation, and those that don't use extractROWS will probably not be compliant.

malcook · 2020-06-19T14:36:04Z

I generally agree with your intuitions and even recall in the past being surprised by some of the behavior of data.frame. But I got used to it...

The DataFrame documentation begins:

 On the whole, the DataFrame behaves very similarly to
 data.frame, in terms of construction, subsetting, splitting,
 combining, etc. The most notable exception is that the row names
 are optional.

Perhaps a section detailing other (perhaps less notable) exceptions might be in order, along the lines of:

Other exceptions that are a consequence of DataFrame extending S4Vector include:

rev - reverses the row order, not the column order.

... (c.f. S4Vectors:Vector-class)

However, reviewing S4Vectors:Vector-class documentation now, I find aspects of to not fully clarify. For example:

‘rev(x)’: Return a new Vector object made of the original elements in
the reverse order.
‘length(x)’: Get the number of elements in ‘x’.

... but length(DF) tells me the number of columns in the DataFrame and rev(DF) reverses the rows of the DF.

The point is that referring to the documentation of the underlying implementation does not fully clarify, at least for me, what I might expect the behavior of rev to be.

Am I missing some clarifying principle?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rev on DataFrame does not reverse column order, as it does on data.frame #68

rev on DataFrame does not reverse column order, as it does on data.frame #68

malcook commented Jun 18, 2020

hpages commented Jun 18, 2020

malcook commented Jun 18, 2020

LTLA commented Jun 19, 2020

malcook commented Jun 19, 2020 •

edited

Loading

rev on DataFrame does not reverse column order, as it does on data.frame #68

rev on DataFrame does not reverse column order, as it does on data.frame #68

Comments

malcook commented Jun 18, 2020

hpages commented Jun 18, 2020

malcook commented Jun 18, 2020

LTLA commented Jun 19, 2020

malcook commented Jun 19, 2020 • edited Loading

malcook commented Jun 19, 2020 •

edited

Loading