Left join on multiple columns

Time: 2016-08-11 05:08:15

Tags: python pandas merge left-join

I'm used to doing this with dplyr in R:

library(dplyr)
mtcars2=mtcars
mtcars3 = mtcars %>% left_join(mtcars2[,c("mpg","vs","hp")], by =c("mpg",'hp') )

# This does a left join on multiple columns and brings over only *one* additional column, so mtcars3 has just one extra field: a duplicated 'vs'.

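I can't figure out how to do the same thing with pd.merge. I want to join on two columns and then bring over only the third column, not every column of the joined table, if that makes sense...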

import pandas as pd
mtcars = pd.read_csv('mtcars.csv')
mtcars2=mtcars

mtcars3  = pd.merge(mtcars, mtcars2['vs','hp','mpg'],how='left', on = ['mpg','hp'])

1 answer:

Answer 0: (score: 4)

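IIUC, you can select the subset by adding a second pair of [] so that a list of column names is passed. You can also omit mtcars2 and simply use mtcars again: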

import pandas as pd
mtcars = pd.read_csv('mtcars.csv')
mtcars3  = pd.merge(mtcars, mtcars[['vs','hp','mpg']], how='left', on = ['mpg','hp'])
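
To illustrate why the extra brackets matter, here is a minimal sketch on a small made-up frame (the values are illustrative, not the real mtcars.csv, and the name tuple_lookup_failed is just for this example). With single brackets, 'vs','hp','mpg' becomes one tuple key and pandas looks for a single column with that label; with double brackets, a list of labels is passed and a sub-DataFrame comes back.

```python
import pandas as pd

# Illustrative data only; not the real mtcars.csv
df = pd.DataFrame({'mpg': [21.0, 22.8],
                   'hp': [110, 93],
                   'vs': [0, 1],
                   'cyl': [6, 4]})

# Single brackets with commas build the tuple ('vs', 'hp', 'mpg'),
# which is looked up as ONE column label and is not found
try:
    df['vs', 'hp', 'mpg']
    tuple_lookup_failed = False
except KeyError:
    tuple_lookup_failed = True

# Double brackets pass a list of labels and return a DataFrame
subset = df[['vs', 'hp', 'mpg']]

print(tuple_lookup_failed)
print(list(subset.columns))
```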

Sample:

import pandas as pd

mtcars = pd.DataFrame({'vs':[1,2,3],
                       'hp':[1,1,1],
                       'mpg':[7,7,9],
                       'aaa':[1,3,5]})

print (mtcars)
   aaa  hp  mpg  vs
0    1   1    7   1
1    3   1    7   2
2    5   1    9   3

mtcars3  = pd.merge(mtcars, mtcars[['vs','hp','mpg']], how='left', on = ['mpg','hp'])
print (mtcars3)
   aaa  hp  mpg  vs_x  vs_y
0    1   1    7     1     1
1    1   1    7     1     2
2    3   1    7     2     1
3    3   1    7     2     2
4    5   1    9     3     3
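
The sample output shows two side effects: the overlapping non-key column comes back as vs_x / vs_y, and the repeated (mpg, hp) key turns 3 rows into 5. If that is not wanted, one possible approach is sketched below. It assumes that keeping the first right-hand row per key is acceptable for your data, which may not hold; the names right and merged and the '_right' suffix are chosen only for this example.

```python
import pandas as pd

mtcars = pd.DataFrame({'vs': [1, 2, 3],
                       'hp': [1, 1, 1],
                       'mpg': [7, 7, 9],
                       'aaa': [1, 3, 5]})

# Right-hand table: the join keys plus the one column to bring over.
# Assumption: keeping the first row per (mpg, hp) key is acceptable.
right = mtcars[['mpg', 'hp', 'vs']].drop_duplicates(subset=['mpg', 'hp'])

# suffixes=('', '_right') leaves the left 'vs' unchanged and renames the right one;
# validate='many_to_one' raises if the right-hand keys are not unique
merged = pd.merge(mtcars, right, how='left', on=['mpg', 'hp'],
                  suffixes=('', '_right'), validate='many_to_one')
print(merged)
```

With the right-hand keys deduplicated, the result keeps one row per row of the left table and adds a single vs_right column.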