Question

I have the following data frame:

df<-data.frame(Name= c(rep("A",3), rep("B",5)), Month = c(1,2,3,1,2,3,3,3), Volume = c(50,0,50,50,50,50,50,50))

I want to add a column "Count" that gives the number of unique months for each name:

library(dplyr)

df <- df %>%
  group_by(Name) %>%
  mutate(Count = n_distinct(Month))

However, how can I add a filter so that I only count the months whose corresponding Volume is > 0? This is my desired output:

df<-data.frame(Name= c(rep("A",3), rep("B",5)), Month = c(1,2,3,1,2,3,3,3), Volume = c(50,0,50,50,50,50,50,50), Count = c(2,2,2,3,3,3,3,3))

Thanks!

Answer 1

Instead of using the n_distinct function, we can use duplicated and include Volume > 0 in the logical expression:

df %>%
    group_by(Name) %>%
    mutate(Count = sum(!duplicated(Month) & Volume > 0)) # not duplicated, Volume > 0

    Name Month Volume Count
  <fctr> <dbl>  <dbl> <int>
1      A     1     50     2
2      A     2      0     2
3      A     3     50     2
4      B     1     50     3
5      B     2     50     3
6      B     3     50     3
7      B     3     50     3
8      B     3     50     3
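One caveat worth checking: `!duplicated(Month)` only marks the first occurrence of each month, so if that first row happens to have Volume 0 while a later row for the same month is positive, the month is not counted. That does not occur in the question's data, but here is a minimal base R sketch with made-up vectors (the names `first_try` and `robust` are only for illustration) that shows the difference and one possible way around it:

```r
# Hypothetical data for a single group: month 1 appears twice,
# and its first row has zero volume.
Month  <- c(1, 1, 2)
Volume <- c(0, 50, 50)

# The expression from the answer above: month 1 is dropped because
# its first (non-duplicated) row has Volume == 0.
first_try <- sum(!duplicated(Month) & Volume > 0)

# Filtering first and de-duplicating afterwards counts month 1 as well.
robust <- sum(!duplicated(Month[Volume > 0]))

print(c(first_try = first_try, robust = robust))
```

If your data can contain such rows, subsetting Month before calling duplicated is probably the safer choice.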

Answer 2

You just need to add a condition to Month:

df <- df %>%
      group_by(Name) %>%
      mutate(Count = n_distinct(Month[Volume>0]))

df
# A tibble: 8 x 4
# Groups:   Name [2]
    Name Month Volume Count
  <fctr> <dbl>  <dbl> <int>
1      A     1     50     2
2      A     2      0     2
3      A     3     50     2
4      B     1     50     3
5      B     2     50     3
6      B     3     50     3
7      B     3     50     3
8      B     3     50     3
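For anyone who wants to see what the subsetting does without dplyr, the following is a rough base R sketch of the same idea, not the answer's own method. It assumes the data frame from the question and uses `ave` over row indices so that each group sees only its own rows; the helper name `count_positive_months` is made up for this example.

```r
# Data frame from the question
df <- data.frame(Name   = c(rep("A", 3), rep("B", 5)),
                 Month  = c(1, 2, 3, 1, 2, 3, 3, 3),
                 Volume = c(50, 0, 50, 50, 50, 50, 50, 50))

# Hypothetical helper: given the row indices of one group, keep the
# months whose Volume is positive and count the distinct ones.
count_positive_months <- function(i) {
  length(unique(df$Month[i][df$Volume[i] > 0]))
}

# ave() splits the row indices by Name, applies the helper per group,
# and recycles the result back onto every row of that group.
df$Count <- ave(seq_len(nrow(df)), df$Name, FUN = count_positive_months)

print(df)
```

The key step is the same in both versions: filter Month by the condition first, then count distinct values within each group.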

Answer 3

Try:

df%>%
  group_by(Name) %>%
  mutate(Count = n_distinct(Month[Volume > 0]))  # dplyr provides n_distinct, not n_unique

Count n_distinct with a condition
