Merging tables in R

Asked: 2016-07-25 12:51:04

Tags: r merge

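Edit: I have asked before about merging multiple data frames, but what I am stuck on here is how to merge multiple tables without first converting them to data frames, so that the code stays concise. If you want to know how to merge multiple data frames, see the excellent answer here (also linked below).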

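I have a way to merge multiple data frames in R, but I am hoping someone can help me find a more elegant one. Here is an example of the code I have. Given that df1, df2, and df3 are data frames with the same columns (including a column named "class") but different numbers of rows, I can do: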

table1 <- table(df1$class)
table2 <- table(df2$class)
table3 <- table(df3$class)

And, following this answer, I can merge them:

merged.table <- Reduce(function(...) merge(..., all = TRUE), list(table1, table2, table3))

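My problem is that the merge comes out wrong: because table1, table2, and table3 all have the same column names, merged.table ends up with the counts combined into a single column.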
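To make the failure concrete, here is a minimal sketch with made-up class vectors (not the real df1 and df2). It relies on two documented behaviours: merge() coerces table objects with as.data.frame(), which yields the columns Var1 and Freq, and merge() on data frames defaults to by = intersect(names(x), names(y)). With both columns shared, the counts become part of the merge key instead of being placed side by side.

```r
# Hypothetical toy vectors standing in for df1$class and df2$class
x <- as.data.frame(table(c("LINE", "LINE", "LTR")))
y <- as.data.frame(table(c("LINE", "LTR", "LTR")))

# Both data frames carry the same two column names
shared <- intersect(names(x), names(y))

# Default merge: keyed on every shared column, so Freq is treated as a key
merged_default <- merge(x, y, all = TRUE)

# Explicit key: only Var1 is matched, the two Freq columns get suffixes
merged_by_id <- merge(x, y, by = "Var1", all = TRUE)

print(shared)
print(merged_default)
print(merged_by_id)
```

The default merge keeps a single Freq column and stacks the non-matching rows, while the merge keyed on Var1 alone returns one row per class with a separate count column for each input.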

My workaround is to convert the tables to data frames, like this:

table1 <- as.data.frame(table(df1$class))
colnames(table1) <- c("ID","counts1")
table2 <- as.data.frame(table(df2$class))
colnames(table2) <- c("ID","counts2")
table3 <- as.data.frame(table(df3$class))
colnames(table3) <- c("ID","counts3")

Then the merge works fine. But this gets clumsy and tedious very quickly, and I need to do a lot of these.

Is there a way to achieve the same result without converting the tables to data frames and assigning column names by hand?

Here is an example of what the data frames look like, truncated for simplicity:

transcript <- rep(c("a","b","c","d","e","f"))
family <- rep(c("L1","L2","ERV"),2)
class <- rep(c("LINE","LINE","LTR"),2)

df1 <- data.frame(transcript, family, class)

transcript  family  class
a            L1     LINE
b            L2     LINE
c            ERV    LTR
d            L1     LINE
e            L2     LINE
f            ERV    LTR

1 answer:

Answer 0 (score: 2)

We need to add the by = "Var1" argument to merge:

# dummy data
transcript <- rep(c("a","b","c","d","e","f"))
family <- rep(c("L1","L2","ERV"),2)
class <- rep(c("LINE","LINE","LTR"),2)
df1 <- data.frame(transcript, family, class)

# get table as data.frame
table1 <- as.data.frame(table(df1$class))
table2 <- as.data.frame(table(df1$class))
table3 <- as.data.frame(table(df1$class))

# merge without by
Reduce(function(...) merge(..., all = TRUE),
       list(table1, table2, table3))
#   Var1 Freq
# 1 LINE    4
# 2  LTR    2

# merge with by = "Var1"
Reduce(function(...) merge(..., all = TRUE, by = "Var1"),
       list(table1, table2, table3))

#   Var1 Freq.x Freq.y Freq
# 1 LINE      4      4    4
# 2  LTR      2      2    2
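
If the goal is mainly to avoid renaming columns for every table, one possible alternative is to skip merge() altogether and build a count matrix directly from a named list of data frames. The following is only a sketch under assumptions: the three small data frames and the names dfs, all_classes, and counts are invented for illustration, and the only requirement is that each data frame has a "class" column.

```r
# Hypothetical inputs: three data frames that each have a "class" column
df1 <- data.frame(class = c("LINE", "LINE", "LTR"))
df2 <- data.frame(class = c("LINE", "LTR", "LTR", "DNA"))
df3 <- data.frame(class = c("LTR"))

# The list names become the column names of the result
dfs <- list(df1 = df1, df2 = df2, df3 = df3)

# Every class label that occurs in any of the data frames
all_classes <- sort(unique(unlist(lapply(dfs, function(d) as.character(d$class)))))

# Count each label per data frame; fixing the factor levels keeps rows aligned
per_df <- lapply(dfs, function(d) {
  tabulate(factor(d$class, levels = all_classes), nbins = length(all_classes))
})

# Bind the per-data-frame counts side by side into one matrix
counts <- do.call(cbind, per_df)
rownames(counts) <- all_classes

print(counts)
```

Note one difference from merge(..., all = TRUE): a class that is absent from a data frame shows up here as 0 rather than NA. If a data frame is needed afterwards, as.data.frame(counts) converts the matrix and keeps the class labels as row names.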