Collapse a data frame by unique values and combine all associated values of another variable

Time: 2015-02-25 18:21:23

Tags: r

Say I have a matrix or data frame with two columns:

    marker <- c("A1", "A2", "A2", "A3")  
    gene <- c("gene1", "gene2", "gene3", "gene4")  
    cbind(marker, gene)  

         marker gene
    [1,] "A1"   "gene1"
    [2,] "A2"   "gene2"
    [3,] "A2"   "gene3"
    [4,] "A3"   "gene4"

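How can I convert this into a matrix or data frame that has one row per unique marker, with all of that marker's associated genes? Ideally, I would like to get something like this: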

         marker gene
    [1,] "A1"   "gene1"
    [2,] "A2"   "gene2";"gene3"
    [3,] "A3"   "gene4"

1 Answer:

Answer 0 (score: 3)

How about this?

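    # Group the genes by marker: one character vector per unique marker
    spl <- split(gene, marker)
    # Paste each group into a single ";"-separated string, one row per marker
    data.frame(name = names(spl), gene = do.call(c, lapply(spl, function(x) paste0(x, collapse = ";"))))

       name        gene
    A1   A1       gene1
    A2   A2 gene2;gene3
    A3   A3       gene4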
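
If the data already lives in a data frame, base R's `aggregate()` might be a more compact way to do the same grouping in one call. This is only a sketch under that assumption; the names `df` and `collapsed` are mine and do not appear in the question.

```r
# Same toy data as in the question, stored as a data frame
# (df is a hypothetical name chosen for this example)
df <- data.frame(
  marker = c("A1", "A2", "A2", "A3"),
  gene   = c("gene1", "gene2", "gene3", "gene4"),
  stringsAsFactors = FALSE
)

# For each unique marker, paste its genes into one ";"-separated string.
# The extra argument collapse = ";" is passed through to paste().
collapsed <- aggregate(gene ~ marker, data = df, FUN = paste, collapse = ";")
print(collapsed)
```

Unlike the `split()` version, the result keeps the original column names (`marker`, `gene`) and gets plain numeric row names, which may or may not be what you want.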