Question

Dataset 1:

ID Name     Territory   Sales
1  Richard  NY            59
8  Sam      California    44

Dataset 2:

Terr ID  Name   Comments
 LA   5   Rick    yes
 MH   11  Oly     no

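I want the final dataset to contain only the columns of the first dataset. It should recognize that Territory is the same column as Terr, and it should not carry over the Comments column.

The final data should look like this: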

ID Name     Territory  Sales
1  Richard  NY           59
8  Sam      California   44
5  Rick     LA           NA
11 Oly      MH           NA

Thanks in advance.

Answer 1

A possible solution:

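# create a named vector: the names are the column names of 'set2',
# the values are the positions of the matching columns in 'set1'
# (agrep() matches approximately, so 'Terr' finds 'Territory';
# names without a match, such as 'Comments', drop out in unlist())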
nms2 <- sort(unlist(sapply(names(set2), agrep, x = names(set1))))

# only keep the columns in 'set2' for which a match is found
# and give them the same names as in 'set1'
set2 <- setNames(set2[names(nms2)], names(set1[nms2]))

# bind the two datasets together

# option 1:
library(dplyr)
bind_rows(set1, set2)

# option 2:
library(data.table)
rbindlist(list(set1, set2), fill = TRUE)

which gives (dplyr output shown):

  ID    Name  Territory Sales
1  1 Richard         NY    59
2  8     Sam California    44
3  5    Rick         LA    NA
4 11     Oly         MH    NA
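The approximate matching above works for these column names, but fuzzy matches can be ambiguous on wider tables. As an alternative, here is a minimal base R sketch that spells out the renaming explicitly. The lookup vector `rename_map` and the helper names `aligned` and `result` are made up for this illustration and are not part of the original answer.

```r
set1 <- data.frame(ID = c(1L, 8L),
                   Name = c("Richard", "Sam"),
                   Territory = c("NY", "California"),
                   Sales = c(59L, 44L),
                   stringsAsFactors = FALSE)
set2 <- data.frame(Terr = c("LA", "MH"),
                   ID = c(5L, 11L),
                   Name = c("Rick", "Oly"),
                   Comments = c("yes", "no"),
                   stringsAsFactors = FALSE)

# hypothetical lookup: name in 'set2' -> corresponding name in 'set1'
rename_map <- c(Terr = "Territory")

# rename the columns of 'set2' that appear in the lookup
aligned <- set2
hits <- names(aligned) %in% names(rename_map)
names(aligned)[hits] <- unname(rename_map[names(aligned)[hits]])

# add columns that exist only in 'set1' (here: Sales) as NA
for (col in setdiff(names(set1), names(aligned))) {
  aligned[[col]] <- NA
}

# keep only the columns of 'set1', in the same order; this drops 'Comments'
aligned <- aligned[names(set1)]

# stack the two data frames
result <- rbind(set1, aligned)
print(result)
```

Because the mapping is written out by hand, a column is never renamed by accident. The trade-off is that the lookup has to be maintained whenever the column names change.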

Data used:

set1 <- data.frame(ID = c(1L, 8L),
                   Name = c("Richard", "Sam"),
                   Territory = c("NY", "California"),
                   Sales = c(59L, 44L),
                   stringsAsFactors = FALSE)
set2 <- data.frame(Terr = c("LA", "MH"),
                   ID = c(5L, 11L),
                   Name = c("Rick", "Oly"),
                   Comments = c("yes", "no"),
                   stringsAsFactors = FALSE)
