   Success modified_date    user_id description
     <int> <chr>              <int> <chr>
 1       0 10/15/2015 13:12  158236 Phone Live
 2       0 10/15/2015 13:21  158236 Phone Live
 3       1 10/25/2015 20:11  240497 Phone Live
 4       1 11/24/2015 17:05  240497 Phone Live
 5       1 6/23/2015 10:40   240497 Phone Live
 6       1 7/7/2015 8:59     240497 Phone Live
 7       0 5/1/2015 11:00    243412 Phone Live
 8       0 5/1/2015 11:00    243412 Phone Live
 9       0 6/11/2016 9:19    289273 Webform
10       1 6/11/2016 9:23    289273 Webform
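I am grouping by the Success and user_id columns. The condition is: if the Success value for a user_id changes from 0 to 1, the description for that user_id should show the transition, with the descriptions joined by ";". If the Success value for a user_id does not change, its rows should stay as they are.
Desired output: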
  Success modified_date    user_id description
    <int> <chr>              <int> <chr>
1       0 10/15/2015 13:12  158236 Phone Live
2       0 10/15/2015 13:21  158236 Phone Live
3       1 10/25/2015 20:11  240497 Phone Live
4       1 11/24/2015 17:05  240497 Phone Live
5       1 6/23/2015 10:40   240497 Phone Live
6       1 7/7/2015 8:59     240497 Phone Live
7       0 5/1/2015 11:00    243412 Phone Live
8       0 5/1/2015 11:00    243412 Phone Live
9       0 6/11/2016 9:19    289273 Webform;Webform
Here is my code:
library(dplyr)

time_data3 <- time_data2 %>%
  arrange(user_id, modified_date, Success) %>%
  filter(user_id != 0) %>%
  group_by(Success, user_id) %>%
  summarize(sequence = paste(description, collapse = ";"))
dput output:
structure(list(Success = c(0L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 0L,
1L), modified_date = c("10/15/2015 13:12", "10/15/2015 13:21",
"10/25/2015 20:11", "11/24/2015 17:05", "6/23/2015 10:40", "7/7/2015 8:59",
"5/1/2015 11:00", "5/1/2015 11:00", "6/11/2016 9:19", "6/11/2016 9:23"
), user_id = c(158236L, 158236L, 240497L, 240497L, 240497L,
240497L, 243412L, 243412L, 289273L, 289273L), description = c("Phone Live",
"Phone Live", "Phone Live", "Phone Live", "Phone Live", "Phone Live",
"Phone Live", "Phone Live", "Webform", "Webform")), .Names = c("Success",
"modified_date", "user_id", "description"), row.names = c(NA,
-10L), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), vars = "user_id", drop = TRUE, indices = list(
0:1, 2:5, 6:7, 8:9), group_sizes = c(2L, 4L, 2L, 2L), biggest_group_size = 4L, labels = structure(list(
user_id = c(158236L, 240497L, 243412L, 289273L)), row.names = c(NA,
-4L), class = "data.frame", vars = "user_id", drop = TRUE, .Names = "user_id"))
Answer 0 (score: 1)
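library(dplyr)

df %>%
  group_by(user_id) %>%
  # x advances when Success repeats and holds when it changes,
  # so the rows on either side of a change share the same x
  group_by(user_id, x = cumsum(c(TRUE, Success[-1] == Success[-length(Success)]))) %>%
  summarize(Success = Success[1],
            modified_date = modified_date[1],
            description = paste(description, collapse = ";")) %>%
  select(-x)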
# # A tibble: 9 x 4
# # Groups:   user_id [4]
#   user_id Success modified_date    description
#     <int>   <int> <chr>            <chr>
# 1  158236       0 10/15/2015 13:12 Phone Live
# 2  158236       0 10/15/2015 13:21 Phone Live
# 3  240497       1 10/25/2015 20:11 Phone Live
# 4  240497       1 11/24/2015 17:05 Phone Live
# 5  240497       1 6/23/2015 10:40  Phone Live
# 6  240497       1 7/7/2015 8:59    Phone Live
# 7  243412       0 5/1/2015 11:00   Phone Live
# 8  243412       0 5/1/2015 11:00   Phone Live
# 9  289273       0 6/11/2016 9:19   Webform;Webform
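Within each user_id group we build a logical vector that is TRUE wherever Success is the same as in the previous row. Its cumulative sum is a counter that advances while Success is stable and stays constant exactly where Success changes, so we group by that counter and summarize.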
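To make the grouping key concrete, here is a minimal base R sketch of the same cumsum expression applied to two of the users from the question. The helper name change_key() is hypothetical and only wraps the expression used in the answer; it is not part of dplyr.

```r
# change_key() is a hypothetical helper name, not part of dplyr
change_key <- function(success) {
  # TRUE where a value equals the one before it; the first row is always TRUE
  same_as_prev <- c(TRUE, success[-1] == success[-length(success)])
  # The running total advances on each TRUE and holds on each FALSE
  cumsum(same_as_prev)
}

# user_id 158236: Success never changes, so every row gets its own key
key_stable <- change_key(c(0L, 0L))
print(key_stable)

# user_id 289273: Success goes from 0 to 1, so both rows share one key
key_changed <- change_key(c(0L, 1L))
print(key_changed)

# Rows that share a key are collapsed into a single description
merged <- tapply(c("Webform", "Webform"), key_changed, paste, collapse = ";")
print(as.vector(merged))
```

One thing to keep in mind: the key holds on any change in Success, so a switch from 1 to 0 would be merged the same way as a switch from 0 to 1. If only 0-to-1 transitions should be collapsed, the comparison would need an extra condition.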