In R, find adjacent rows given a specific column

Time: 2021-04-14 18:23:54

Tags: r dplyr

I have a data frame listing clients along with the category, the product, and the department that conducted the outreach:

Client    |  Category   |  Product      |  Department
Mike S.   |  Home Goods |  Carpet       |  Sales
Mike S.   |  Outdoor    |  Shovel       |  Sales
Mike S.   |  Outdoor    |  Garden Hose  |  Marketing
Bill T.   |  Outdoor    |  Garden Hose  |  Marketing
Bill T.   |  Outdoor    |  Garden Hose  |  Sales
Bill T.   |  Outdoor    |  Leaf Blower  |  Sales
Bill T.   |  Home Goods |  Recliner     |  Marketing

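I'd like to give Marketing an email list of the cases where Sales has done outreach at the category level (not the product level) but Marketing has not yet.

This is the desired output: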

Client    |  Category   |  Product      |  Department          
Mike S.   |  Home Goods |  Carpet       |  Sales                   
Bill T.   |  Outdoor    |  Leaf Blower  |  Sales     

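2 Answers:

Answer 0 (score: 1)

Here's one approach: within each Client-Category group, count the Marketing and Sales outreach rows, then keep only the groups where Sales has made contact but Marketing has not.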

library(dplyr)
my_data %>%
  group_by(Client, Category) %>%
  mutate(Sales = sum(Department == "Sales"),
         Mktg  = sum(Department == "Marketing")) %>%
  ungroup() %>%
  filter(Sales >= 1, Mktg == 0)

Result

# A tibble: 1 x 6
  Client  Category   Product Department Sales  Mktg
  <chr>   <chr>      <chr>   <chr>      <int> <int>
1 Mike S. Home Goods Carpet  Sales          1     0
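
To see why only one row survives the filter, it can help to look at the per-group counts directly. The following is a minimal sketch, assuming dplyr 1.0.0 or later (for the `.groups` argument); the name `group_counts` is introduced here for illustration and is not part of the original answer.

```r
library(dplyr)

# Same data as in the question
my_data <- data.frame(
  Client     = c("Mike S.", "Mike S.", "Mike S.",
                 "Bill T.", "Bill T.", "Bill T.", "Bill T."),
  Category   = c("Home Goods", "Outdoor", "Outdoor",
                 "Outdoor", "Outdoor", "Outdoor", "Home Goods"),
  Product    = c("Carpet", "Shovel", "Garden Hose",
                 "Garden Hose", "Garden Hose", "Leaf Blower", "Recliner"),
  Department = c("Sales", "Sales", "Marketing",
                 "Marketing", "Sales", "Sales", "Marketing"),
  stringsAsFactors = FALSE
)

# One row per Client-Category group, with the two counts used by the filter
# (group_counts is a hypothetical helper name)
group_counts <- my_data %>%
  group_by(Client, Category) %>%
  summarise(Sales = sum(Department == "Sales"),
            Mktg  = sum(Department == "Marketing"),
            .groups = "drop")

print(group_counts)
```

Of the four groups, only Mike S. / Home Goods has `Sales >= 1` and `Mktg == 0`, which matches the single row in the result above.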

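Data in a loadable format, produced by pasting the table into a spreadsheet, removing the | characters, and then using the datapasta package: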

my_data <- data.frame(
  stringsAsFactors = FALSE,
             Client = c("Mike S.","Mike S.",
                        "Mike S.","Bill T.","Bill T.","Bill T.","Bill T."),
           Category = c("Home Goods","Outdoor",
                        "Outdoor","Outdoor","Outdoor","Outdoor","Home Goods"),
            Product = c("Carpet","Shovel",
                        "Garden Hose","Garden Hose","Garden Hose","Leaf Blower",
                        "Recliner"),
         Department = c("Sales","Sales","Marketing",
                        "Marketing","Sales","Sales","Marketing")
 )
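
If dplyr is not available, the same group-wise counting can be sketched in base R with `ave()`. This is only an illustration of the idea, not part of the original answer; the names `sales_n`, `mktg_n`, and `result` are introduced here.

```r
# Same data as in the question
my_data <- data.frame(
  Client     = c("Mike S.", "Mike S.", "Mike S.",
                 "Bill T.", "Bill T.", "Bill T.", "Bill T."),
  Category   = c("Home Goods", "Outdoor", "Outdoor",
                 "Outdoor", "Outdoor", "Outdoor", "Home Goods"),
  Product    = c("Carpet", "Shovel", "Garden Hose",
                 "Garden Hose", "Garden Hose", "Leaf Blower", "Recliner"),
  Department = c("Sales", "Sales", "Marketing",
                 "Marketing", "Sales", "Sales", "Marketing"),
  stringsAsFactors = FALSE
)

# For every row, the number of Sales / Marketing rows in its Client-Category group
sales_n <- ave(as.integer(my_data$Department == "Sales"),
               my_data$Client, my_data$Category, FUN = sum)
mktg_n  <- ave(as.integer(my_data$Department == "Marketing"),
               my_data$Client, my_data$Category, FUN = sum)

# Keep rows from groups that Sales has contacted but Marketing has not
result <- my_data[sales_n >= 1 & mktg_n == 0, ]
print(result)
```

`ave()` returns a vector the same length as its input, so the counts line up row by row with `my_data` and can be used directly as a logical row index.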

Answer 1 (score: 1)

Another option is to use all() in a grouped filter; all(Department != "Marketing") will do the job:

library(dplyr)

my_data %>%
  group_by(Client, Category) %>%
  filter(all(Department != "Marketing")) 

#> # A tibble: 1 x 4
#> # Groups:   Client, Category [1]
#>   Client  Category   Product Department
#>   <chr>   <chr>      <chr>   <chr>     
#> 1 Mike S. Home Goods Carpet  Sales

# data used
# my_data <- data.frame(
#   stringsAsFactors = FALSE,
#   Client = c("Mike S.","Mike S.",
#              "Mike S.","Bill T.","Bill T.","Bill T.","Bill T."),
#   Category = c("Home Goods","Outdoor",
#                "Outdoor","Outdoor","Outdoor","Outdoor","Home Goods"),
#   Product = c("Carpet","Shovel",
#               "Garden Hose","Garden Hose","Garden Hose","Leaf Blower",
#               "Recliner"),
#   Department = c("Sales","Sales","Marketing",
#                  "Marketing","Sales","Sales","Marketing")
# )

Created on 2021-04-14 by the reprex package (v0.3.0)
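
One caveat worth noting: all(Department != "Marketing") keeps every group that has no Marketing rows, whether or not Sales appears in it. With only two departments in the data, that is equivalent to "Sales but not Marketing". The sketch below uses invented data with a hypothetical third department, "Support", to show where the two conditions diverge; `toy`, `no_mktg`, and `sales_no_mktg` are illustrative names, not from the original post.

```r
library(dplyr)

# Hypothetical data: "Support" is an invented department for illustration only
toy <- data.frame(
  Client     = c("A", "A", "B", "C", "C"),
  Category   = c("Outdoor", "Outdoor", "Outdoor", "Outdoor", "Outdoor"),
  Department = c("Sales", "Marketing", "Support", "Sales", "Support"),
  stringsAsFactors = FALSE
)

# Condition from the answer: no Marketing rows in the group.
# Keeps client B (Support only) as well as client C.
no_mktg <- toy %>%
  group_by(Client, Category) %>%
  filter(all(Department != "Marketing")) %>%
  ungroup()

# Stricter condition: at least one Sales row AND no Marketing rows.
# Keeps only client C.
sales_no_mktg <- toy %>%
  group_by(Client, Category) %>%
  filter(any(Department == "Sales"),
         all(Department != "Marketing")) %>%
  ungroup()

print(no_mktg)
print(sales_no_mktg)
```

If the real data can contain departments other than Sales and Marketing, adding the any(Department == "Sales") condition is probably the safer choice.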