Question

I have a data frame:

df <- data.frame(respondent = factor(c(1, 2, 3, 4, 5, 6, 7)),
                 location = factor(c("US: California", "US: Oregon", "Mexico",
                                     "US: Texas", "Canada", "Mexico", "Canada")))

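with three different levels for the US. I don't want to collapse them, because the distinction between states is useful for the data analysis. However, I'd like a basic bar plot that stacks the US states on top of each other, so that the plot has three bars (Canada, Mexico, and US), with the last one divided into the three states, instead of this: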

ggplot(df, aes(location, 1))+
    geom_bar(stat = "identity")+
    theme(axis.text.x = element_text(angle = 45, hjust = 1),
          text = element_text(size=10))

which gives me five bars, three of them for the US.

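There is an old, similar question on Stack Overflow (Grouping/stacking factor levels in ggplot bar chart), but the solution given there is quite complicated. I'm hoping there is a simpler way to achieve this. Any ideas on how to do it?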

Answer 1

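A tidyverse solution: use separate from tidyr to split location at the ": " into two columns, one for the country and one for the state. Locations without a colon separator get NA for the state, which you can replace with "Outside US".

I moved this non-US level to the end so that it shows up last in the legend, but that may not be necessary for your purposes. Then map the fill to the state, so you can see the US values stacked by state.

You may also want to set a scale that makes the non-US values muted or gray, to contrast with the colors of the stacked states, but I'll leave the design questions to you.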

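library(tidyverse)

df <- data.frame(respondent = factor(c(1, 2, 3, 4, 5, 6, 7)),
                 location = factor(c("US: California", "US: Oregon", "Mexico",
                                     "US: Texas", "Canada", "Mexico", "Canada")))

with_states <- df %>%
  # Split "US: California" into Country = "US" and State = "California";
  # fill = "right" pads locations that have no state with NA, without a warning
  separate(location, into = c("Country", "State"), sep = ": ", fill = "right") %>%
  replace_na(list(State = "Outside US")) %>%
  # Move the non-US level to the end so it appears last in the legend
  mutate(State = as.factor(State) %>% fct_relevel("Outside US", after = Inf))

# One bar per country; the US bar is stacked by state
ggplot(with_states, aes(x = Country, y = 1, fill = State)) +
  geom_col()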
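
The suggestion about muting the non-US values could look roughly like the sketch below. It assumes a manual fill scale is acceptable; the palette (the hex codes and "grey70") and the y-axis label are arbitrary illustrative choices, not part of the original answer.

```r
library(tidyverse)

df <- data.frame(
  respondent = factor(1:7),
  location = c("US: California", "US: Oregon", "Mexico",
               "US: Texas", "Canada", "Mexico", "Canada")
)

with_states <- df %>%
  separate(location, into = c("Country", "State"), sep = ": ", fill = "right") %>%
  replace_na(list(State = "Outside US")) %>%
  mutate(State = fct_relevel(factor(State), "Outside US", after = Inf))

# Hypothetical palette: distinct colors for the states, gray for everything else.
# Names must match the levels of State so each level gets the intended color.
state_colors <- c(
  "California" = "#1b9e77",
  "Oregon"     = "#d95f02",
  "Texas"      = "#7570b3",
  "Outside US" = "grey70"
)

p <- ggplot(with_states, aes(x = Country, y = 1, fill = State)) +
  geom_col() +
  scale_fill_manual(values = state_colors) +
  labs(y = "Respondents")

p
```

Because the palette is a named vector, adding a new state only means adding one entry; a level with no matching name would be drawn in the scale's NA color, so it is worth keeping the names in sync with the data.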

Created on 2018-05-24 by the reprex package (v0.2.0).
