dplyr: summarise grouped data using another column

Asked: 2017-10-24 04:06:45

Tags: r group-by dplyr data-manipulation summarize

I have a data frame, pop.subset:

state  location   pop
WA     Seattle    100
WA     Kent       20
OR     foo        30
CA     foo2       80

I need the city with the lowest population in each state, stored in a data.frame. So far I have:

result <- pop.subset %>% 
          group_by(state) %>%
          summarise(min = min(pop))

which returns this data.frame:

state   min
WA      20
...    .... etc

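But I also need the city. I tried including location in the group_by call, like so: group_by(state, location), but that groups by every state/city pair rather than by state alone, so every city comes back: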

state location pop
WA    Seattle  100
WA    Kent     20
foo   foo      foo

Am I missing a simple solution? I would like my result to look like this:

state location pop
WA    Kent     20
...   ...      ... etc.

3 answers:

Answer 0 (score: 0)

Have you tried something like this?

result <- pop.subset %>% 
              group_by(state, location) %>%
              summarise(min = min(pop))

Answer 1 (score: 0)

I think you want to group by state and then filter for the rows where pop equals min(pop):

pop.subset %>% 
  group_by(state) %>% 
  filter(pop == min(pop)) %>%
  ungroup()

# A tibble: 3 x 3
  state location   pop
  <chr>    <chr> <int>
1    WA     Kent    20
2    OR      foo    30
3    CA     foo2    80
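
On dplyr 1.0.0 or later, slice_min() is another way to write the same idea. The sketch below is only an illustration: it assumes pop.subset can be rebuilt from the values shown in the question, and the name result is reused from the question for convenience.

```r
library(dplyr)

# Rebuild the question's example data (values copied from the question).
pop.subset <- data.frame(
  state    = c("WA", "WA", "OR", "CA"),
  location = c("Seattle", "Kent", "foo", "foo2"),
  pop      = c(100, 20, 30, 80),
  stringsAsFactors = FALSE
)

# Within each state, keep the row with the smallest pop.
result <- pop.subset %>%
  group_by(state) %>%
  slice_min(pop, n = 1) %>%
  ungroup()

print(result)
```

Because whole rows are kept rather than summarised, the location column comes along without any extra work.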

Answer 2 (score: 0)

As I understand the question, this solves it:

library(tibble)

data <- tribble(
  ~state, ~location, ~pop,
  "WA",   "Seattle", 100,
  "WA",   "Kent",    20,
  "OR",   "foo",     30,
  "CA",   "foo2",    80
)

library(dplyr)

data %>%
  group_by(state) %>%
  summarise(location = location[which.min(pop)],
            min = min(pop))


# A tibble: 3 x 3
  state location   min
  <chr>    <chr> <dbl>
1    CA     foo2    80
2    OR      foo    30
3    WA     Kent    20
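
One difference worth knowing about shows up when two cities in a state tie for the lowest population. which.min() returns only the first minimum, while the filter(pop == min(pop)) approach from the other answer keeps every tied row. The data below is hypothetical (the Renton row and its value are invented purely to create a tie), so treat this as a sketch of the behaviour rather than part of the original problem.

```r
library(dplyr)

# Hypothetical data: Renton is made up so that two WA cities tie at pop = 20.
tied <- data.frame(
  state    = c("WA", "WA", "WA"),
  location = c("Seattle", "Kent", "Renton"),
  pop      = c(100, 20, 20),
  stringsAsFactors = FALSE
)

# filter() keeps every row that matches the group minimum.
by_filter <- tied %>%
  group_by(state) %>%
  filter(pop == min(pop)) %>%
  ungroup()

# which.min() picks the first minimum only, so one row per state.
by_which_min <- tied %>%
  group_by(state) %>%
  summarise(location = location[which.min(pop)],
            min = min(pop))

print(by_filter)
print(by_which_min)
```

Which behaviour is right depends on whether you want exactly one city per state or all cities that share the minimum.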