Overlay/superimpose grouped bar plots in ggplot2

ggplot2 r

LLL · Tech community · 7 years ago

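At each time point, participants were asked two questions ("pain" and "fear"), and answered each with a score of 1, 2 or 3.

My existing code plots the counts for the "before" time point nicely, but I can't seem to add the counts for the "after" data.

This is a sketch of what I'd like the plot to look like once the "after" data is added, with the black bars representing the "after" data:

I'd like to make this plot in ggplot2. I have tried to adapt How to superimpose bar plots in R? but I can't get it to work for grouped data.

Many thanks!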

#DATA PREP
library(dplyr)
library(ggplot2)
library(tidyr)


df <- data.frame(before_fear=c(1,1,1,2,3),before_pain=c(2,2,1,3,1),after_fear=c(1,3,3,2,3),after_pain=c(1,1,2,3,1))


df <- df %>% gather("question", "answer_option")

# Get the counts for each answer of each question
df2 <- df  %>%
  group_by(question,answer_option) %>%
  summarise (n = n()) 
df2 <- as.data.frame(df2)


df3 <- df2 %>% mutate(time = factor(ifelse(grepl("before", question), "before", "after"),
                                        c("before", "after")))

# change classes and split data into two data frames
df3$n <- as.numeric(df3$n)
df3$answer_option <- as.factor(df3$answer_option)
df3after <- df3[ which(df3$time=='after'), ]
df3before <- df3[ which(df3$time=='before'), ]


# CODE FOR 'BEFORE' DATA ONLY PLOT - WORKS  
    ggplot(df3before, aes(fill=answer_option, y=n, x=question)) + geom_bar(position="dodge", stat="identity")



# CODE FOR 'BEFORE' AND 'AFTER' DATA PLOT - DOESN'T WORK
ggplot(mapping = aes(x, y,fill)) +
  geom_bar(data = data.frame(x = df3before$question, y = df3before$n, fill= df3before$index_value), width = 0.8, stat = 'identity') +
  geom_bar(data = data.frame(x = df3after$question, y = df3after$n, fill=df3after$index_value), width = 0.4, stat = 'identity', fill = 'black') +
  theme_classic() + scale_y_continuous(expand = c(0, 0))

2 replies | up to 7 years ago

Henrik plannapus 7 years ago

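I think the clue is to set the width of the "after" bars, but to dodge them as if their width were 0.9 (i.e. the same (default) width as the "before" bars). Also, because we don't map fill in the "after" bars, we need to use the group aesthetic instead to achieve the dodging.

I prefer to have only one data set and just subset it in each call to geom_col.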

ggplot(mapping = aes(x = question, y = n, fill = factor(ans))) +
  geom_col(data = d[d$t == "before", ], position = "dodge") +
  geom_col(data = d[d$t == "after", ], aes(group = ans),
           fill = "black", width = 0.5, position = position_dodge(width = 0.9))

Data:

set.seed(2)
d <- data.frame(t = rep(c("before", "after"), each = 6),
                question = rep(c("pain", "fear"), each = 3),
                ans = 1:3, n = sample(12))
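
One way to check that the narrow bars really sit on top of the wide ones is to compare the bar positions ggplot2 computes for each layer. The sketch below is not part of the original answer: it types in the counts from the question's example data by hand (so the numbers differ from the `sample(12)` data above) and uses `layer_data()` to read back the dodged `xmin`/`xmax` values.

```r
library(ggplot2)

# Assumption: counts typed in from the question's example data,
# one row per time point / question / answer option.
d <- data.frame(
  t        = rep(c("before", "after"), each = 6),
  question = rep(rep(c("fear", "pain"), each = 3), times = 2),
  ans      = rep(1:3, times = 4),
  n        = c(3, 1, 1, 2, 2, 1,   # before: fear, pain
               1, 1, 3, 3, 1, 1)   # after:  fear, pain
)

p <- ggplot(mapping = aes(x = question, y = n, fill = factor(ans))) +
  geom_col(data = d[d$t == "before", ], position = "dodge") +
  geom_col(data = d[d$t == "after", ], aes(group = ans),
           fill = "black", width = 0.5,
           position = position_dodge(width = 0.9))

# layer_data() returns each layer's data after the position adjustment
wide   <- layer_data(p, 1)
narrow <- layer_data(p, 2)

# Bar centres per layer; they should coincide if the dodge widths match
centres_wide   <- sort(as.numeric((wide$xmin + wide$xmax) / 2))
centres_narrow <- sort(as.numeric((narrow$xmin + narrow$xmax) / 2))
isTRUE(all.equal(centres_wide, centres_narrow))
```

Both layers spread their bar centres over the same 0.9-wide slot per question; only the drawn bar width differs. Changing the second `position_dodge(width = ...)` to another value would make the comparison fail and the black bars drift off-centre.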

Alternative data preparation using data.table, starting from the original "df":

library(data.table)
d <- melt(setDT(df), measure.vars = names(df), value.name = "ans")
d[ , c("t", "question") := tstrsplit(variable, "_")]

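Either pre-calculate the counts (grouped by time point as well, and named n to match the plot code) and proceed as above with geom_col:

# d2 <- d[ , .(n = .N), by = .(t, question, ans)]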
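
For readers who want to run the pre-counting route end to end, here is a minimal sketch starting from the question's "df". It is an illustration rather than the answerer's exact code: it uses `as.data.table()` instead of `setDT()` so the original data frame is left untouched, and it names the count column `n` (an assumption made so that it lines up with `aes(y = n)`).

```r
library(data.table)

df <- data.frame(before_fear = c(1, 1, 1, 2, 3), before_pain = c(2, 2, 1, 3, 1),
                 after_fear  = c(1, 3, 3, 2, 3), after_pain  = c(1, 1, 2, 3, 1))

# Long format: one row per participant and question
d <- melt(as.data.table(df), measure.vars = names(df), value.name = "ans")

# Split e.g. "before_fear" into t = "before" and question = "fear"
d[, c("t", "question") := tstrsplit(variable, "_", fixed = TRUE)]

# Count answers per time point, question and answer option;
# the count column is called n so it can be mapped with aes(y = n)
d2 <- d[, .(n = .N), by = .(t, question, ans)]
```

`d2` can then be used in place of `d` in the `geom_col()` calls shown earlier.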

Or let geom_bar do the counting:

ggplot(mapping = aes(x = question, fill = factor(ans))) +
  geom_bar(data = d[d$t == "before", ], position = "dodge") +
  geom_bar(data = d[d$t == "after", ], aes(group = ans),
           fill = "black", width = 0.5, position = position_dodge(width = 0.9))

Data:

df <- data.frame(before_fear = c(1,1,1,2,3), before_pain = c(2,2,1,3,1),
                 after_fear = c(1,3,3,2,3), after_pain = c(1,1,2,3,1))

Spacedman 7 years ago

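My solution is very similar to @Henrik's, but I'd like to point out a few things.

First, you're creating your data frames inside your geom_col calls, which is probably messier than it needs to be. If you've already created df3after etc., you might as well use them inside your ggplot.

Second, I had a hard time following your tidying. I think there are a couple of tidyr functions that might make this task simpler for you, so I went a different route, such as using separate to create the columns time and measure rather than searching for them manually, which makes it more scalable. This also lets you put just "pain" and "fear" on your x-axis, rather than still having "before_pain" and "before_fear", which is no longer an accurate representation once you also have the "after" values on the plot. But feel free to disregard this and stick with your own approach.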

library(tidyverse)

df <- data.frame(before_fear = c(1,1,1,2,3),
                 before_pain = c(2,2,1,3,1),
                 after_fear = c(1,3,3,2,3),
                 after_pain = c(1,1,2,3,1))
df_long <- df %>%
  gather(key = question, value = answer_option) %>%
  mutate(answer_option = as.factor(answer_option)) %>%
  count(question, answer_option) %>%
  separate(question, into = c("time", "measure"), sep = "_", remove = FALSE)

df_long
#> # A tibble: 12 x 5
#>    question    time   measure answer_option     n
#>    <chr>       <chr>  <chr>   <fct>         <int>
#>  1 after_fear  after  fear    1                 1
#>  2 after_fear  after  fear    2                 1
#>  3 after_fear  after  fear    3                 3
#>  4 after_pain  after  pain    1                 3
#>  5 after_pain  after  pain    2                 1
#>  6 after_pain  after  pain    3                 1
#>  7 before_fear before fear    1                 3
#>  8 before_fear before fear    2                 1
#>  9 before_fear before fear    3                 1
#> 10 before_pain before pain    1                 2
#> 11 before_pain before pain    2                 2
#> 12 before_pain before pain    3                 1
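
As a side note, `gather()` is superseded in current tidyr, and `pivot_longer()` can do the reshaping and the time/measure split in one step. The following is a sketch of that alternative, assuming tidyr 1.0.0 or later; the name `df_long2` is made up here, and unlike `df_long` it does not keep the combined `question` column.

```r
library(dplyr)
library(tidyr)

df <- data.frame(before_fear = c(1, 1, 1, 2, 3),
                 before_pain = c(2, 2, 1, 3, 1),
                 after_fear  = c(1, 3, 3, 2, 3),
                 after_pain  = c(1, 1, 2, 3, 1))

df_long2 <- df %>%
  # Split column names such as "before_fear" into time and measure while pivoting
  pivot_longer(cols = everything(),
               names_to = c("time", "measure"),
               names_sep = "_",
               values_to = "answer_option") %>%
  mutate(answer_option = as.factor(answer_option)) %>%
  # One row per time point, measure and answer option, with the count in n
  count(time, measure, answer_option)
```

The resulting `time`, `measure`, `answer_option` and `n` columns are the ones the plotting code relies on.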

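I split this into before and after datasets, as you did, then plotted them with 2 geom_cols. I still pass df_long to ggplot, treating it as a dummy to get uniform x and y aesthetics. Like @Henrik said, you can use a different width in geom_col than in its position_dodge, to dodge the bars at 90% width but give the bars themselves only 40% width.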

df_before <- df_long %>% filter(time == "before")
df_after <- df_long %>% filter(time == "after")

ggplot(df_long, aes(x = measure, y = n)) +
  geom_col(aes(fill = answer_option), 
    data = df_before, width = 0.9, 
    position = position_dodge(width = 0.9)) +
  geom_col(aes(group = answer_option), 
    data = df_after, fill = "black", width = 0.4, 
    position = position_dodge(width = 0.9))

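Instead of making two separate data frames, you can filter inside each geom_col. This is generally my preference unless the filtering is more complicated. This code will get the same plot as above.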

ggplot(df_long, aes(x = measure, y = n)) +
  geom_col(aes(fill = answer_option), 
    data = . %>% filter(time == "before"), width = 0.9, 
    position = position_dodge(width = 0.9)) +
  geom_col(aes(group = answer_option), 
    data = . %>% filter(time == "after"), fill = "black", width = 0.4, 
    position = position_dodge(width = 0.9))
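
The `. %>% filter(...)` shorthand works because a layer's `data` argument also accepts a function, which ggplot2 calls with the plot's default data. For readers who prefer not to rely on the magrittr idiom, here is a sketch of the same plot with plain anonymous functions. The counts are typed in by hand from the `df_long` output above to keep the example self-contained.

```r
library(ggplot2)

# Assumption: counts copied by hand from the df_long printout
df_long <- data.frame(
  time          = rep(c("after", "before"), each = 6),
  measure       = rep(rep(c("fear", "pain"), each = 3), times = 2),
  answer_option = factor(rep(1:3, times = 4)),
  n             = c(1, 1, 3, 3, 1, 1,   # after:  fear, pain
                    3, 1, 1, 2, 2, 1)   # before: fear, pain
)

p <- ggplot(df_long, aes(x = measure, y = n)) +
  # Each function receives df_long and returns the subset for that layer
  geom_col(aes(fill = answer_option),
           data = function(d) d[d$time == "before", ],
           width = 0.9, position = position_dodge(width = 0.9)) +
  geom_col(aes(group = answer_option),
           data = function(d) d[d$time == "after", ],
           fill = "black", width = 0.4,
           position = position_dodge(width = 0.9))

p
```

Which form to use is mostly a matter of taste; the explicit function avoids a dependency on the pipe but is a little more verbose.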