How do I merge rows dynamically in R?

Date: 2019-06-01 14:04:14

Tags: r

I have the following data frame in R:

type,status,count
41,438421,512
41,438422,512
41,438429,269
74,440586,172
74,440590,217
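
To make the sample reproducible, it can be loaded roughly as in the sketch below. The variable name `df` is an assumption (the question does not name the data frame), and `read.csv(text = ...)` stands in for however the real data is actually read.

```r
# Build the sample data frame from the question.
# The name `df` is an assumption; the real data has over 100,000 rows.
df <- read.csv(text = "type,status,count
41,438421,512
41,438422,512
41,438429,269
74,440586,172
74,440590,217")

str(df)  # three integer columns: type, status, count
```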

What I want to do is to merge the rows and rearrange data. My desired output is shown below:

[41] = {["512"] = "438421, 438422", ["269"] = "438429",},
[74] = {["172"] = "440586", ["217"] = "440590",},

The rows must be merged so that the type column is unique. Then the status and counts should be added as shown above.

Note that none of these values are known in advance, so I can't reference anything by value (such as 438421). The actual data frame has over 100,000 rows, all with different values, so the solution needs to work regardless of the values shown above.

Many thanks.

1 answer:

Answer 0 (score: 2)

You can use dplyr ...

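library(dplyr)

# Join the statuses that share the same type and count
df %>% group_by(type, count) %>%
  summarise(status = paste(status, collapse = ", ")) %>%
  # Format each count/status pair as ["count"] = "statuses"
  mutate(count = paste0('["', count, '"] = "', status, '"')) %>%
  # Join the pairs for each type, then wrap them as [type] = {...,},
  group_by(type) %>%
  summarise(count = paste(count, collapse = ", ")) %>%
  mutate(type = paste0('[', type, '] = {', count, ',},')) %>%
  select(type)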

  type                                                               
  <chr>                                                              
1 "[41] = {[\"269\"] = \"438429\", [\"512\"] = \"438421, 438422\",},"
2 "[74] = {[\"172\"] = \"440586\", [\"217\"] = \"440590\",},"       

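Don't worry about the backslashes above. They are only there because the literal double quotes are escaped when the output is printed.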
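
A small sketch of that point, using one line copied from the output above: `print()` shows the escaped console representation, while `writeLines()` shows the characters the string actually contains. The variable name `x` is only for illustration.

```r
# One line in the format the pipeline produces (copied from the output above).
x <- '[74] = {["172"] = "440586", ["217"] = "440590",},'

print(x)       # console representation: the double quotes appear as \"
writeLines(x)  # actual contents: plain double quotes, no backslashes

# The string itself contains no backslash characters.
grepl("\\", x, fixed = TRUE)  # FALSE
```

If the pipeline's result were stored in a variable, say a hypothetical `result`, then `writeLines(result$type)` would print the lines in the unescaped form, and passing a file name as the second argument would write them to a file instead.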