Map columns from a list of data frames onto another data frame in R

Time: 2021-03-23 13:39:00

Tags: r list dataframe

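An example list of data frames is given below. listofDataFrames contains multiple data frames. Each data frame has a column lev, which is the key to use in the mapping; the values are the columns other than lev. New columns should be generated for DF according to the mappings in listofDataFrames. To be more specific, consider colors in listofDataFrames. It has two value columns, "colors number 3" and "colors number 10", whose entries are drawn from "r", "l" and "?". In DF, we should create two new columns, "colors number 3" and "colors number 10", by matching the colors column of DF against the lev column of listofDataFrames[["colors"]]. For example, if a row of DF has "orange" in colors, then "r" should be mapped into the new column "colors number 3" for that row. The expected output is given below.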

# Create an example list of dataframes and populate it
listofDataFrames <- list() 

genres <- data.frame("genres number 12" =  c("r","l","?","r","r"),
           "genres number 17" =  c("l","r","?","l","?"),
           lev = c("pop","rock","jazz","blues","r&b"),
           check.names = FALSE)

colors <- data.frame("colors number 3" =  c("l","r","?","r"),
                     "colors number 10" =  c("l","r","l","r"),
                     lev = c("red","blue","green","orange"),
                     check.names = FALSE)

listofDataFrames[["genres"]] <- genres
listofDataFrames[["colors"]] <- colors

## DF

DF <- data.frame("genres" = c("pop", "pop", "jazz", "rock", "jazz", "blues", "rock", "pop", "blues", "pop"),
                 "colors" = c("orange", "red", "red", "orange", "green", "blue", "orange", "red", "blue", "green"),
                 "values" = c(12, 15, 24, 33, 47, 2, 9, 6, 89, 75))


## EXPECTED OUTPUT

expectedOutput <- 
  data.frame("genres" = c("pop", "pop","jazz","rock","jazz","blues","rock","pop","blues","pop"),
           "colors" = c("orange","red","red","orange","green","blue","orange","red","blue","green"),
           "values" = c(12, 15, 24, 33 ,47, 2 , 9 ,6, 89, 75),
           "genres number 12" = c("r","r","?","l","?","r","l","r","r","r"),
           "genres number 17" = c("l","l","?","r","?","l","r","l","l","l"),
           "colors number 3" = c("r","l","l","r","?","r","r","l","r","?"),
           "colors number 10" = c("r","l","l","r","l","r","r","l","r","l"),
           check.names = FALSE
           )

1 answer:

Answer 0 (score: 1)

Here, we could use a double merge, first on the 'genres' column and then on the 'colors' column of 'DF', each time with the corresponding list element

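# Left-join DF with the 'genres' lookup, then with the 'colors' lookup.
# Note: merge() sorts rows by the key by default and puts the key column first.
merge(merge(DF, listofDataFrames[['genres']], all.x = TRUE,
            by.x = 'genres', by.y = 'lev'),
      listofDataFrames[['colors']], all.x = TRUE,
      by.x = 'colors', by.y = 'lev')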
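Because merge() sorts the result by the key column by default and moves that column to the front, the merged result will not generally have the same row and column order as the expected output in the question. One possible way to restore the order is sketched below. It is only a sketch: row_id is a hypothetical helper column, joined and small are illustrative names, and small holds just the first four rows of DF so that the result is easy to check by eye.

```r
# Lookup tables copied from the question so the sketch runs on its own
listofDataFrames <- list(
  genres = data.frame("genres number 12" = c("r", "l", "?", "r", "r"),
                      "genres number 17" = c("l", "r", "?", "l", "?"),
                      lev = c("pop", "rock", "jazz", "blues", "r&b"),
                      check.names = FALSE),
  colors = data.frame("colors number 3" = c("l", "r", "?", "r"),
                      "colors number 10" = c("l", "r", "l", "r"),
                      lev = c("red", "blue", "green", "orange"),
                      check.names = FALSE)
)

# First four rows of DF from the question
small <- data.frame(genres = c("pop", "pop", "jazz", "rock"),
                    colors = c("orange", "red", "red", "orange"),
                    values = c(12, 15, 24, 33))

joined <- small
joined$row_id <- seq_len(nrow(joined))   # hypothetical helper: remember original order

# Left-join each lookup table on the column that shares its list name
for (nm in names(listofDataFrames)) {
  joined <- merge(joined, listofDataFrames[[nm]], all.x = TRUE,
                  by.x = nm, by.y = "lev")
}

# Restore the original row order, put the original columns first, drop the helper
new_cols <- setdiff(names(joined), c(names(small), "row_id"))
joined <- joined[order(joined$row_id), c(names(small), new_cols)]
rownames(joined) <- NULL
joined
```

The helper column costs one extra step but avoids relying on sort = FALSE, whose row order the merge() documentation describes as unspecified.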

Or we can use a loop

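# Each list name ('genres', 'colors') is also the name of the key column in DF
nm1 <- names(listofDataFrames)
out <- DF
for (i in seq_along(nm1)) {
  # Left-join the i-th lookup table on DF's matching column and the table's 'lev'
  out <- merge(out, listofDataFrames[[nm1[i]]], all.x = TRUE,
               by.x = nm1[i], by.y = 'lev')
}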
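As an alternative to merge(), which is not part of the original answer, the same lookup can be sketched with match(), which leaves the rows and columns of DF in place and appends the new columns at the end. The names mapped, lookup, idx and value_cols are illustrative only, and the data is repeated from the question so that the block runs on its own.

```r
# Example data copied from the question
listofDataFrames <- list(
  genres = data.frame("genres number 12" = c("r", "l", "?", "r", "r"),
                      "genres number 17" = c("l", "r", "?", "l", "?"),
                      lev = c("pop", "rock", "jazz", "blues", "r&b"),
                      check.names = FALSE),
  colors = data.frame("colors number 3" = c("l", "r", "?", "r"),
                      "colors number 10" = c("l", "r", "l", "r"),
                      lev = c("red", "blue", "green", "orange"),
                      check.names = FALSE)
)
DF <- data.frame(genres = c("pop", "pop", "jazz", "rock", "jazz",
                            "blues", "rock", "pop", "blues", "pop"),
                 colors = c("orange", "red", "red", "orange", "green",
                            "blue", "orange", "red", "blue", "green"),
                 values = c(12, 15, 24, 33, 47, 2, 9, 6, 89, 75))

mapped <- DF
for (nm in names(listofDataFrames)) {
  lookup <- listofDataFrames[[nm]]
  # For every row of DF, the row of the lookup table whose lev equals the key
  # (NA when the key has no entry in the lookup table)
  idx <- match(mapped[[nm]], lookup$lev)
  value_cols <- setdiff(names(lookup), "lev")
  for (col in value_cols) {
    mapped[[col]] <- lookup[[col]][idx]   # append one new column per value column
  }
}
mapped
```

With this approach a key that is missing from a lookup table yields NA in the new columns, which mirrors what all.x = TRUE does in the merge() versions.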