Using dplyr to add a new column in R based on the values of another column

Time: 2018-05-15 13:36:27

Tags: r dplyr mutate

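Using dplyr in R, I am trying to add a new column based on the values of another column. For example, I have a data frame with thousands of rows of state codes (like table1). I now want to add a new column called Region and assign each state code to its region (like table2). How can this be done in dplyr?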

table1 <- data.frame(State = c('NY','IL','CA','PA','FL','MI','AZ'))

table2 <- data.frame(State = c('NY','IL','CA','PA','FL','MI','AZ'),
                     Region = c('Northeast','Midwest','West','Northeast','Southeast','Midwest','West'))

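1 answer:

Answer 0: (score: 1)

This is a join problem. Just use left_join from the dplyr package. In the example below, I have reordered the states in table1 to show that each state is matched to its region regardless of row order: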

library(dplyr)
table1 <- data.frame(State = c('PA','FL','MI','AZ','NY','IL','CA'))
table2 <- data.frame(State = c('NY','IL','CA','PA','FL','MI','AZ'),
                     Region = c('Northeast','Midwest','West','Northeast','Southeast','Midwest','West'))

left_join(table1, table2, by = "State")
  State    Region
1    PA Northeast
2    FL Southeast
3    MI   Midwest
4    AZ      West
5    NY Northeast
6    IL   Midwest
7    CA      West
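
Since the question is tagged mutate, here is a minimal sketch of an alternative that skips the join and looks each state up in a named vector inside mutate. The name region_lookup and the extra 'TX' row are assumptions added for illustration and are not part of the original question. 'TX' is left out of the lookup on purpose to show that an unmatched state gets NA, which is also what left_join produces for a key missing from table2.

```r
library(dplyr)

# Hypothetical lookup vector: names are state codes, values are regions.
region_lookup <- c(NY = "Northeast", IL = "Midwest", CA = "West",
                   PA = "Northeast", FL = "Southeast", MI = "Midwest",
                   AZ = "West")

# 'TX' is deliberately absent from the lookup to show the unmatched case.
# stringsAsFactors = FALSE keeps State as character, so indexing is by name
# rather than by factor level codes.
table1 <- data.frame(State = c("PA", "FL", "TX"), stringsAsFactors = FALSE)

# Index the lookup by state code; unname() drops the names from the result.
result <- table1 %>%
  mutate(Region = unname(region_lookup[State]))

print(result)
```

For a mapping with more than a handful of entries, keeping the regions in a separate table and using left_join, as in the answer above, is probably easier to maintain than a hard-coded vector.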