scale_shape_manual mislabels if label argument is given and names in values are not alphabetical #5208

Open
Istalan opened this issue Feb 27, 2023 · 4 comments · May be fixed by #6237

@Istalan

Istalan commented Feb 27, 2023

I found a problem in scale_shape_manual (and presumably the other manual scales as well): when the names in values are sorted alphabetically, which happens automatically, the labels argument is not reordered along with them. See my example below, where A, B and C get the shapes I want, but in the legend the As are labeled as Cs and the Cs as As.

I expected that the values argument would let me define which shape I want for each of my string values, and that the label argument would let me give more human-readable versions of these strings, so that any change in the order of values is applied to the labels as well. I know I can fix this by either respecting alphabetical order or defining the offending variables as factors/ordered, but I'd prefer not to have to when I don't care what order the legend is in.

Most importantly: this should certainly not happen without a warning, as it makes it very easy to create wrongly labelled plots.

Here is the code to reproduce the bug:

library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 4.2.2
#> Warning: package 'ggplot2' was built under R version 4.2.2
#> Warning: package 'tibble' was built under R version 4.2.2
#> Warning: package 'tidyr' was built under R version 4.2.2
#> Warning: package 'readr' was built under R version 4.2.2
#> Warning: package 'purrr' was built under R version 4.2.2
#> Warning: package 'dplyr' was built under R version 4.2.2
#> Warning: package 'stringr' was built under R version 4.2.2
#> Warning: package 'forcats' was built under R version 4.2.2
n <- 10
df <- data.frame(x = rnorm(n), y = rnorm(n), z = sample(c("A", "B", "C"), replace = TRUE, size = n))

ggplot(data = df, aes(x = x, y = y, shape = z)) +
  geom_point(size = 3) +
  geom_text(aes(label = z), hjust = 1, vjust = 1) +
  scale_shape_manual(values = c("C" = 1, "B" = 2, "A" = 3), 
                     label = c("long Text about C", "long Text about  B", "long Text about  A")
                     )

Created on 2023-02-27 with reprex v2.0.2

@teunbrand
Collaborator

I think setting labels without specifying breaks is always a risky move. Perhaps the confusing bit is that you gave the values argument in a particular order, and the breaks/labels don't follow that order. Therefore, I don't think this is a bug.

However, we might be able to do better by deriving the 'breaks' argument from names(values) where appropriate.
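
Until something like that exists in ggplot2, the same effect can be had on the user side by passing the names of `values` as `breaks`. A minimal sketch, assuming the data and shapes from the reprex above; `z` is made deterministic here, and the names `shapes` and `legend_labels` are chosen only for illustration:

```r
library(ggplot2)

n <- 10
df <- data.frame(
  x = rnorm(n),
  y = rnorm(n),
  z = rep(c("A", "B", "C"), length.out = n)  # deterministic stand-in for sample()
)

# Shapes and labels are written in the same order: C, B, A.
shapes <- c("C" = 1, "B" = 2, "A" = 3)
legend_labels <- c("long Text about C", "long Text about B", "long Text about A")

p <- ggplot(df, aes(x = x, y = y, shape = z)) +
  geom_point(size = 3) +
  geom_text(aes(label = z), hjust = 1, vjust = 1) +
  scale_shape_manual(
    values = shapes,
    breaks = names(shapes),  # pins the break order, so labels line up by position
    labels = legend_labels
  )

p
```

With `breaks` given, the legend follows the order C, B, A and each label sits next to the shape it describes.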

@clauswilke
Member

@teunbrand I think it's risky to add more heuristics that try to infer break order. It'll lead to even more confusion, and possibly breaking plots when you change the ordering of breaks.

A better approach might be to simply issue a warning when people specify labels but not breaks. I have long operated under the rule of thumb that one should never specify labels without specifying breaks. We could just formalize that rule. Maybe a more nuanced way of stating it: one should not specify labels via a vector of specific values without also specifying breaks. Providing a formatting function is fine, of course.
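
To illustrate the distinction: a label vector is matched to the breaks by position, whereas a function is applied to whatever breaks the scale ends up with, so it cannot drift out of sync with them. A small sketch, assuming data shaped like the reprex above; `describe` is a hypothetical helper name:

```r
library(ggplot2)

n <- 10
df <- data.frame(
  x = rnorm(n),
  y = rnorm(n),
  z = rep(c("A", "B", "C"), length.out = n)  # deterministic stand-in for sample()
)

# Hypothetical formatting function: receives the breaks, returns one label per break.
describe <- function(breaks) paste("long Text about", breaks)

p <- ggplot(df, aes(x = x, y = y, shape = z)) +
  geom_point(size = 3) +
  scale_shape_manual(
    values = c("C" = 1, "B" = 2, "A" = 3),
    labels = describe  # no breaks needed: labels are computed from the breaks
  )

p
```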

@teunbrand
Collaborator

Agreed, that might be a wiser decision. It should be relatively straightforward to adjust ggplot2:::check_breaks_labels() I presume.
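
A rough standalone sketch of that rule, not the actual ggplot2 internals: `warn_labels_without_breaks()` is a hypothetical name, and the check for a missing argument assumes the "waiver" class that `ggplot2::waiver()` attaches to its return value.

```r
library(ggplot2)

# Hypothetical helper: warn when `labels` is a fixed vector but `breaks` is left
# at its default. Returns TRUE (invisibly) if it warned, FALSE otherwise.
warn_labels_without_breaks <- function(breaks = waiver(), labels = waiver()) {
  breaks_missing <- inherits(breaks, "waiver")
  labels_fixed <- !is.null(labels) &&
    !inherits(labels, "waiver") &&
    !is.function(labels) &&
    is.atomic(labels)

  if (breaks_missing && labels_fixed) {
    warning(
      "`labels` is a vector but `breaks` is not set; ",
      "labels are matched to breaks by position.",
      call. = FALSE
    )
    return(invisible(TRUE))
  }
  invisible(FALSE)
}

warn_labels_without_breaks(labels = c("one", "two"))                          # warns
warn_labels_without_breaks(labels = toupper)                                  # silent: function
warn_labels_without_breaks(breaks = c("a", "b"), labels = c("one", "two"))    # silent: breaks set
```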

@teunbrand added the "messages" label (requests for improvements to error, warning, or feedback messages) on Mar 27, 2023
@teunbrand
Collaborator

I've tried throwing a warning when labels are atomic and breaks is missing, but it'll throw quite a lot of warnings in test code that is fine in principle. The most common example is that you're using a discrete scale for a factor where you know the level order, which makes a warning about not specifying breaks seem a little bit befuddling. We might just encourage specifying breaks in the documentation though.
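
For reference, the factor case looks like this: once `z` is a factor, the scale takes its break order from the factor levels, so a positional label vector is unambiguous even without `breaks`. A sketch assuming data shaped like the reprex above, with the level order chosen to match the order of the labels:

```r
library(ggplot2)

n <- 10
df <- data.frame(
  x = rnorm(n),
  y = rnorm(n),
  # Level order C, B, A is set explicitly instead of relying on alphabetical sorting.
  z = factor(rep(c("A", "B", "C"), length.out = n), levels = c("C", "B", "A"))
)

p <- ggplot(df, aes(x = x, y = y, shape = z)) +
  geom_point(size = 3) +
  scale_shape_manual(
    values = c("C" = 1, "B" = 2, "A" = 3),
    # Labels are listed in the same order as the factor levels.
    labels = c("long Text about C", "long Text about B", "long Text about A")
  )

p
```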

@teunbrand added the "documentation" label and removed the "messages" label (requests for improvements to error, warning, or feedback messages) on Dec 12, 2024