Filtering stats results breaks add_xy_position #197

Open
s-andrews opened this issue Oct 27, 2023 · 1 comment

Comments

@s-andrews

To plot only the significant results from a stats test, I can filter the output tibble before running add_xy_position. However, the resulting y positions are then spaced inconsistently, rather than all the brackets being evenly separated.

For example, with the following data:

library(tidyverse)
library(rstatix)
library(ggpubr)

set.seed(1)

tibble(
  A = rnorm(10,mean=1),
  B = rnorm(10,mean=1),
  C = rnorm(10,mean=3),
  D = rnorm(10,mean=5),
  E = rnorm(10,mean=1)
) %>%
  pivot_longer(
    cols=everything(),
    names_to="group",
    values_to="value"
  ) %>%
  filter(!is.na(group)) -> data

If I run the test on all groups and calculate the y positions, all brackets are equally spaced:

data %>%
  tukey_hsd(value~group) %>%
  add_xy_position() -> stats_all

# All the differences are the same
diff(stats_all$y.position, lag=1)

# Gives [1] 0.4788 0.4788 0.4788 0.4788 0.4788 0.4788 0.4788 0.4788 0.4788

...and the plot shows equal spacing:

data %>%
  ggplot(aes(x=group, y=value)) +
  geom_boxplot() +
  stat_pvalue_manual(stats_all)

However, if I filter for only the significant results, I get unequal spacing:

data %>%
  tukey_hsd(value~group) %>%
  filter(p.adj<0.05) %>%
  add_xy_position() -> stats_significant

# Spacings are now different between different comparisons
# Space is left for other comparisons even though they're
# not there
diff(stats_significant$y.position, lag=1)

# Gives [1] 0.4788 0.9576 0.4788 0.9576 0.4788 0.4788

...and the brackets in the plot are therefore also unevenly spaced:

data %>%
  ggplot(aes(x=group, y=value)) +
  geom_boxplot() +
  stat_pvalue_manual(stats_significant)
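
One possible stopgap, not taken from rstatix itself, is to re-space the filtered positions by hand so that the brackets keep their order but sit one fixed step apart. The sketch below uses base R only; `respace_evenly` is a hypothetical helper name, and the input vector is made-up illustration data, not output from the example above.

```r
# Hypothetical helper: keep the bracket order, but place the brackets one
# fixed step apart, starting from the lowest existing position.
respace_evenly <- function(y) {
  gaps <- diff(sort(unique(y)))
  if (length(gaps) == 0) return(y)  # zero or one distinct position: nothing to do
  step <- min(gaps)                 # assumption: the smallest existing gap is the step
  # Tied positions, if any, are placed on consecutive steps
  min(y) + (rank(y, ties.method = "first") - 1) * step
}

# Made-up positions with a double-sized gap in the middle
y <- c(10, 10.5, 11.5, 12)
respaced <- respace_evenly(y)
diff(respaced)
```

Applied to the filtered result, this would look something like `stats_significant %>% mutate(y.position = respace_evenly(y.position))` before the call to `stat_pvalue_manual`. It assumes a single panel with no facets or grouping, and it does not address the underlying behaviour of add_xy_position.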

I should also note that the same effect occurs with the hide.ns option of stat_pvalue_manual, but this is likely a separate issue that needs a different fix.
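
For reference, the hide.ns variant mentioned above might be reproduced with something like the following sketch. It rebuilds the simulated data so that it runs on its own (with the third group written as `C`), and it only shows the call; no output is claimed here.

```r
library(tidyverse)
library(rstatix)
library(ggpubr)

set.seed(1)

# Simulated data following the example above
data <- tibble(
  A = rnorm(10, mean = 1),
  B = rnorm(10, mean = 1),
  C = rnorm(10, mean = 3),
  D = rnorm(10, mean = 5),
  E = rnorm(10, mean = 1)
) %>%
  pivot_longer(cols = everything(), names_to = "group", values_to = "value")

# Positions are computed for every comparison...
stats_all <- data %>%
  tukey_hsd(value ~ group) %>%
  add_xy_position()

# ...and the non-significant brackets are only hidden at plotting time
p <- ggplot(data, aes(x = group, y = value)) +
  geom_boxplot() +
  stat_pvalue_manual(stats_all, hide.ns = TRUE)
p
```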

@SamGG

SamGG commented Dec 10, 2023

Hi,
I ran into the same issue. It seems to happen because the filtering takes effect only after the positions have already been computed for all comparisons.
My current workaround is to set the comparisons explicitly:

data %>%
  tukey_hsd(value~group) %>%
  filter(p.adj<0.05) %>%
  add_xy_position(comparisons = with(., Map(c, group1, group2))) -> stats_significant
diff(stats_significant$y.position)

There is probably a neater way to write this; my knowledge of the tidyverse is limited.
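
One possibly more readable way to write the same workaround is to store the filtered result first and build the comparisons list with `purrr::map2`. This is only a sketch of the same idea, not a fix in rstatix; the intermediate names `sig` and `comparisons` are chosen here for illustration, and the data are rebuilt so that the block runs on its own.

```r
library(tidyverse)
library(rstatix)

set.seed(1)

# Simulated data following the original example
data <- tibble(
  A = rnorm(10, mean = 1),
  B = rnorm(10, mean = 1),
  C = rnorm(10, mean = 3),
  D = rnorm(10, mean = 5),
  E = rnorm(10, mean = 1)
) %>%
  pivot_longer(cols = everything(), names_to = "group", values_to = "value")

# Keep only the significant comparisons
sig <- data %>%
  tukey_hsd(value ~ group) %>%
  filter(p.adj < 0.05)

# One c(group1, group2) pair per remaining row
comparisons <- map2(sig$group1, sig$group2, c)

# Pass the pairs explicitly, as in the workaround above
stats_significant <- sig %>%
  add_xy_position(comparisons = comparisons)

diff(stats_significant$y.position)
```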

@kassambara In the add_y_position function, I think the comparisons should be set before the call to

positions <- get_y_position(

which should make the join/merge call unnecessary.
