How to match two columns of one data frame against one column of another data frame

Asked: 2019-02-26 17:16:45

Tags: r dataframe

I have two large datasets:

df1

full.name      first.name   age
bob marley     bob          10
jus bieber     jus          12 
xyz abcdef     xyz          14
abc qwerty     abc          15
hey hello      hey          10
jack ma        jack         12
zuke mark      mark         15

df2
name         age1
asd dfg      23
bob          10
jus bieber   12
xyz          23
abc qwerty   21
hey hello    10
jack         12
zuke mark    17  
bradd pit    50

I want a result like this:

full.name      first.name   age     name       age1
bob marley     bob          10      bob          10
jus bieber     jus          12      jus bieber   12
xyz abcdef     xyz          14      xyz          23
abc qwerty     abc          15      abc qwerty   21
hey hello      hey          10      hey hello    10
jack ma        jack         12      jack         12
zuke mark      mark         15      zuke mark    17  

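I want to match df1's full.name and first.name against df2's name. A df1 row matches if either

  • full.name equals name, or
  • first.name equals name

For each matching row, I want to print age1 from the df2 row whose name matched.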

1 answer:

Answer 0 (score: 0)

This may be what you want, although the name column is not in the final output.

library(tidyverse)

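df3 <- df1 %>%
  rowid_to_column() %>%                  # remember each original df1 row
  gather(type, name, -age, -rowid) %>%   # stack full.name and first.name into one name column
  left_join(df2, by = "name") %>%        # look up age1 for either kind of name
  group_by(rowid) %>%
  # copy the matched age1 onto the unmatched row of the same df1 record
  mutate(age1 = ifelse(is.na(age1), unique(age1[!is.na(age1)]), age1)) %>%
  spread(type, name) %>%                 # back to one row per df1 record
  ungroup()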

df3
# # A tibble: 7 x 5
#   rowid   age  age1 first.name full.name 
#   <int> <int> <int> <chr>      <chr>     
# 1     1    10    10 bob        bob marley
# 2     2    12    12 jus        jus bieber
# 3     3    14    23 xyz        xyz abcdef
# 4     4    15    21 abc        abc qwerty
# 5     5    10    10 hey        hey hello 
# 6     6    12    12 jack       jack ma   
# 7     7    15    17 mark       zuke mark
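
If the matched name column from df2 should also appear in the result, as in the desired output in the question, one possible alternative is a base R lookup with match(). This is only a sketch, not part of the answer above. It assumes that a full.name match takes priority over a first.name match when both exist, and that df1 rows with no match at all should be dropped. The helper names idx_full, idx_first, idx and result are made up for illustration.

```r
# Sample data from the question, rebuilt so the sketch runs on its own
df1 <- data.frame(
  full.name  = c("bob marley", "jus bieber", "xyz abcdef", "abc qwerty",
                 "hey hello", "jack ma", "zuke mark"),
  first.name = c("bob", "jus", "xyz", "abc", "hey", "jack", "mark"),
  age        = c(10L, 12L, 14L, 15L, 10L, 12L, 15L),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  name = c("asd dfg", "bob", "jus bieber", "xyz", "abc qwerty",
           "hey hello", "jack", "zuke mark", "bradd pit"),
  age1 = c(23L, 10L, 12L, 23L, 21L, 10L, 12L, 17L, 50L),
  stringsAsFactors = FALSE
)

# Position in df2 of each df1 name; NA where there is no match
idx_full  <- match(df1$full.name,  df2$name)
idx_first <- match(df1$first.name, df2$name)

# Assumption: prefer the full.name match, fall back to first.name
idx <- ifelse(is.na(idx_full), idx_first, idx_full)

# Attach the matched df2 row (name and age1) to each df1 row
result <- cbind(df1, df2[idx, , drop = FALSE])
rownames(result) <- NULL

# Assumption: drop df1 rows that matched nothing in df2
result <- result[!is.na(idx), ]

print(result)
```

Note that match() returns only the first matching position, so duplicated names in df2 would be resolved silently to the first occurrence.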

Data

df1 <- read.table(text = "full.name      first.name   age
'bob marley'     bob          10
'jus bieber'     jus          12 
'xyz abcdef'     xyz          14
'abc qwerty'     abc          15
'hey hello'      hey          10
'jack ma'        jack         12
'zuke mark'      mark         15",
                  header = TRUE, stringsAsFactors = FALSE)

df2 <- read.table(text = "name         age1
'asd dfg'      23
bob          10
'jus bieber'   12
xyz          23
'abc qwerty'   21
'hey hello'    10
jack         12
'zuke mark'    17  
'bradd pit'    50",
                  header = TRUE, stringsAsFactors = FALSE)