Left join and keep only unique columns

Time: 2018-10-03 16:06:27

Tags: r dataframe merge

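I need to left join two data frames (X1 and X2) and keep only the unique columns, so that a column present in both appears just once in the result.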

If I only needed a plain join, the following code would work:

merge(X1,  X2)

Sample data

X1<- data.frame("Group.Name"=c("Group1","Group2","Group1","Group2","Group2","Group2","Group1"),
                     "Sub_group_name"=c("A","A","B","C","D","E","B"),
                      "new_col"=c("Aa","Aa","Ba","Ca","Da","Ea","Ba"),
                     "Total"=c(35,26,10,9,5,11,13))

X2<- data.frame("Group.Name"=c("Group1","Group2","Group1","Group2","Group2"),
                "Sub_group_name"=c("A","A","B","C","D"),
                "new_col_b"=c(351,261,101,91,51),
                "Total_b"=c(35,26,10,9,5))

Example query

Merge column -> Group.Name
merged dataframe columns -> Group.Name,Sub_group_name,new_col,new_col_b,Total_b

The code below also gives me all the duplicated columns:

merge(x=X1,y=X2,by="Group.Name",all.x=TRUE)

I also cannot list the column names individually, because one of the data frames has more than 100 columns.

I searched but could not find an answer. Any help would be appreciated.

1 answer:

Answer 0: (score: 1)

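A simple approach is to do a regular merge, then drop the extra columns that came from X2 (those that merge suffixed with .y), and finally strip the .x suffix from any remaining names.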

out <- merge(x=X1,y=X2,by='c',all.x=TRUE)

# remove columns from X2
out <- out[!endsWith(names(out), '.y')]
# rename columns from X1
library(magrittr)
names(out)[endsWith(names(out), '.x')] %<>% substr(1, nchar(.) - 2)

out
#   c a b d e
# 1 1 1 2 1 1
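If you would rather not load magrittr, the renaming step can probably be done with base R alone. The following is a minimal sketch using the same toy data as this answer; it assumes R 3.3.0 or later for endsWith(), and that no column in the original data already ends in ".x" or ".y".

```r
# Same toy data as the answer
X1 <- data.frame(a = 1, b = 2, c = 1, d = 1)
X2 <- data.frame(b = 1, c = 1, e = 1)

# Left join on the shared key column "c"
out <- merge(x = X1, y = X2, by = "c", all.x = TRUE)

# Drop the duplicated columns that came from X2 (suffix ".y")
out <- out[!endsWith(names(out), ".y")]

# Strip a trailing ".x" with a regular expression instead of %<>%
names(out) <- sub("\\.x$", "", names(out))

names(out)
```

The regular expression is anchored with $ so that only a trailing ".x" is removed, not one in the middle of a name.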

Data used:

X1 <- data.frame(a = 1, b = 2, c = 1, d = 1)
X2 <- data.frame(b = 1, c = 1, e = 1)
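A variation on the same idea, applied to the sample data from the question: instead of removing the .y columns after merging, drop the overlapping columns from X2 before merging, so no suffixes are created at all. This is a sketch of a different technique from the one in the answer, and left_join_unique is a hypothetical helper name introduced here for illustration.

```r
# Sample data from the question
X1 <- data.frame(Group.Name = c("Group1","Group2","Group1","Group2","Group2","Group2","Group1"),
                 Sub_group_name = c("A","A","B","C","D","E","B"),
                 new_col = c("Aa","Aa","Ba","Ca","Da","Ea","Ba"),
                 Total = c(35,26,10,9,5,11,13))
X2 <- data.frame(Group.Name = c("Group1","Group2","Group1","Group2","Group2"),
                 Sub_group_name = c("A","A","B","C","D"),
                 new_col_b = c(351,261,101,91,51),
                 Total_b = c(35,26,10,9,5))

# Hypothetical helper (not from the original answer): left join on `key`,
# keeping x's version of any other column that exists in both data frames.
left_join_unique <- function(x, y, key) {
  # Keep the key columns plus the columns of y that x does not already have
  keep <- union(key, setdiff(names(y), names(x)))
  merge(x, y[, keep, drop = FALSE], by = key, all.x = TRUE)
}

# Join on Group.Name only, as described in the question
out <- left_join_unique(X1, X2, key = "Group.Name")
names(out)

# Join on both shared identifier columns instead
out2 <- left_join_unique(X1, X2, key = c("Group.Name", "Sub_group_name"))
```

Because the column names are computed with setdiff(), nothing has to be typed out by hand, which matters when one data frame has 100+ columns. Note that Group.Name is not unique in either data frame, so joining on it alone pairs every matching row and the result has more rows than X1. Joining on both Group.Name and Sub_group_name gives one row per row of X1 with this data, with NA in the X2 columns where there is no match.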