I have a data frame that lists clients together with the category, the product, and the department that carried out the outreach:
Client | Category | Product | Department
Mike S. | Home Goods | Carpet | Sales
Mike S. | Outdoor | Shovel | Sales
Mike S. | Outdoor | Garden Hose | Marketing
Bill T. | Outdoor | Garden Hose | Marketing
Bill T. | Outdoor | Garden Hose | Sales
Bill T. | Outdoor | Leaf Blower | Sales
Bill T. | Home Goods | Recliner | Marketing
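I want to give Marketing an email list of the cases where Sales has done outreach at the category level (not the product level) but Marketing has not.
Here is the desired output: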
Client | Category | Product | Department
Mike S. | Home Goods | Carpet | Sales
Bill T. | Outdoor | Leaf Blower | Sales
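Answer 0: (score: 1)
Here is one approach: within each Client-Category group, add counts of the Marketing and Sales outreach rows, then keep the groups that Sales has contacted but Marketing has not.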
library(dplyr)
my_data %>%
  group_by(Client, Category) %>%
  mutate(Sales = sum(Department == "Sales"),
         Mktg = sum(Department == "Marketing")) %>%
  ungroup() %>%
  filter(Sales >= 1, Mktg == 0)
Result
# A tibble: 1 x 6
Client Category Product Department Sales Mktg
<chr> <chr> <chr> <chr> <int> <int>
1 Mike S. Home Goods Carpet Sales 1 0
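The Sales and Mktg helper columns exist only to feed the filter, so the same condition can also be written directly inside a grouped filter(). The following is a minimal sketch of that variant, not the answerer's code. It rebuilds the question's sample data inline so that it runs on its own, and the name sales_only is just an illustrative choice.

```r
library(dplyr)

# Sample data from the question, rebuilt inline so the sketch is self-contained.
my_data <- data.frame(
  stringsAsFactors = FALSE,
  Client = c("Mike S.", "Mike S.", "Mike S.",
             "Bill T.", "Bill T.", "Bill T.", "Bill T."),
  Category = c("Home Goods", "Outdoor", "Outdoor",
               "Outdoor", "Outdoor", "Outdoor", "Home Goods"),
  Product = c("Carpet", "Shovel", "Garden Hose",
              "Garden Hose", "Garden Hose", "Leaf Blower", "Recliner"),
  Department = c("Sales", "Sales", "Marketing",
                 "Marketing", "Sales", "Sales", "Marketing")
)

# Keep rows from Client-Category groups that Sales has contacted at least once
# and that Marketing has never contacted. No helper columns are created.
sales_only <- my_data %>%
  group_by(Client, Category) %>%
  filter(any(Department == "Sales"), !any(Department == "Marketing")) %>%
  ungroup()

print(sales_only)
```

On the sample data this keeps only the Mike S. / Home Goods row, the same row as the result above. Bill T.'s Outdoor rows are dropped because Marketing has already contacted him in that category (Garden Hose).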
Data in loadable form, produced by pasting the table into a spreadsheet, removing the | separators, and then using the datapasta package:
my_data <- data.frame(
  stringsAsFactors = FALSE,
  Client = c("Mike S.","Mike S.",
             "Mike S.","Bill T.","Bill T.","Bill T.","Bill T."),
  Category = c("Home Goods","Outdoor",
               "Outdoor","Outdoor","Outdoor","Outdoor","Home Goods"),
  Product = c("Carpet","Shovel",
              "Garden Hose","Garden Hose","Garden Hose","Leaf Blower",
              "Recliner"),
  Department = c("Sales","Sales","Marketing",
                 "Marketing","Sales","Sales","Marketing")
)
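Answer 1: (score: 1)
Another option is to use all in a grouped filter. all(Department != "Marketing") keeps only the Client-Category groups in which no row belongs to Marketing, which does the job here: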
library(dplyr)
my_data %>%
  group_by(Client, Category) %>%
  filter(all(Department != "Marketing"))
#> # A tibble: 1 x 4
#> # Groups: Client, Category [1]
#> Client Category Product Department
#> <chr> <chr> <chr> <chr>
#> 1 Mike S. Home Goods Carpet Sales
# data used
# my_data <- data.frame(
# stringsAsFactors = FALSE,
# Client = c("Mike S.","Mike S.",
# "Mike S.","Bill T.","Bill T.","Bill T.","Bill T."),
# Category = c("Home Goods","Outdoor",
# "Outdoor","Outdoor","Outdoor","Outdoor","Home Goods"),
# Product = c("Carpet","Shovel",
# "Garden Hose","Garden Hose","Garden Hose","Leaf Blower",
# "Recliner"),
# Department = c("Sales","Sales","Marketing",
# "Marketing","Sales","Sales","Marketing")
# )
Created on 2021-04-14 by the reprex package (v0.3.0)
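One caveat, offered as an observation rather than a correction: all(Department != "Marketing") only checks that Marketing is absent from a group. With the two departments in the sample data that is equivalent to "Sales only", but if a third department ever appeared, groups contacted by neither Sales nor Marketing would pass as well. The sketch below illustrates this on a toy data frame. The client "Ann R." and the department "Support" are invented for the illustration and do not come from the question.

```r
library(dplyr)

# Two rows taken from the question plus one invented row ("Ann R.", "Support").
toy <- data.frame(
  stringsAsFactors = FALSE,
  Client = c("Mike S.", "Bill T.", "Ann R."),
  Category = c("Home Goods", "Home Goods", "Outdoor"),
  Product = c("Carpet", "Recliner", "Shovel"),
  Department = c("Sales", "Marketing", "Support")
)

# Filter from the answer: keeps every group without a Marketing row.
no_marketing <- toy %>%
  group_by(Client, Category) %>%
  filter(all(Department != "Marketing")) %>%
  ungroup()

# Stricter variant: additionally requires at least one Sales row in the group.
sales_no_marketing <- toy %>%
  group_by(Client, Category) %>%
  filter(any(Department == "Sales"), all(Department != "Marketing")) %>%
  ungroup()

print(no_marketing$Client)
print(sales_no_marketing$Client)
```

The first filter keeps both Mike S. and the invented Ann R. row, while the stricter one keeps only Mike S. If the real data can only ever contain Sales and Marketing, the shorter form is sufficient.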