Below are the example DataFrames I want to merge.
#!/usr/bin/env python
import pandas as pd

# Left frame: one row per country, keyed by rank_one
countries = ['Germany', 'France', 'Indonesia']
rank_one = [1, 5, 7]
capitals = ['Berlin', 'Paris', 'Jakarta']
df1 = pd.DataFrame({'country': countries,
                    'rank_one': rank_one,
                    'capital': capitals})
df1 = df1[['country', 'capital', 'rank_one']]  # fix the column order

# Right frame: populations, keyed by rank_two
population = ['8M', '82M', '66M', '255M']
rank_two = [0, 1, 6, 9]
df2 = pd.DataFrame({'population': population,
                    'rank_two': rank_two})
df2 = df2[['rank_two', 'population']]  # fix the column order
I want to merge the two DataFrames on an exact or approximate match, that is, when
rank_two is equal to rank_one
OR
rank_two is the closest number larger than rank_one.
Example:
df1 :
country capital rank_one
0 Germany Berlin 1
1 France Paris 5
2 Indonesia Jakarta 7
df2 :
rank_two population
0 0 8M
1 1 82M
2 6 66M
3 9 255M
df3_result :
country capital rank_one rank_two population
0 Germany Berlin 1 1 82M
1 France Paris 5 6 66M
2 Indonesia Jakarta 7 9 255M
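Answer 0 (score: 6)
Use merge_asof with direction='forward'. Both DataFrames must already be sorted on their key columns, as they are here: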
pd.merge_asof(df1,df2,left_on='rank_one',right_on='rank_two',direction='forward')
Out[1206]:
country capital rank_one rank_two population
0 Germany Berlin 1 1 82M
1 France Paris 5 6 66M
2 Indonesia Jakarta 7 9 255M
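For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the question's data and applies the same call. The explicit sort_values calls and the names left, right, and df3 are additions for illustration. The sample data is already sorted, so the sort only matters for unsorted input.

```python
import pandas as pd

df1 = pd.DataFrame({
    'country': ['Germany', 'France', 'Indonesia'],
    'capital': ['Berlin', 'Paris', 'Jakarta'],
    'rank_one': [1, 5, 7],
})
df2 = pd.DataFrame({
    'rank_two': [0, 1, 6, 9],
    'population': ['8M', '82M', '66M', '255M'],
})

# merge_asof raises an error if the key columns are not sorted,
# so sort defensively (a no-op for this sample data).
left = df1.sort_values('rank_one')
right = df2.sort_values('rank_two')

# direction='forward' picks, for each rank_one, the first rank_two
# that is greater than or equal to it (exact matches are allowed by default).
df3 = pd.merge_asof(left, right,
                    left_on='rank_one', right_on='rank_two',
                    direction='forward')
print(df3)
```

If a rank_one value had no equal or larger rank_two, that row would be kept with missing values in the df2 columns, since merge_asof behaves like a left join.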
Answer 1 (score: 2)
You can use pandas' merge_asof function:
pd.merge_asof(df1, df2, left_on="rank_one", right_on="rank_two", direction='forward')
Or, if you want to merge on the nearest value and don't mind whether it is higher or lower, you can use:
pd.merge_asof(df1, df2, left_on="rank_one", right_on="rank_two", direction='nearest')
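To make the effect of the direction argument concrete, the sketch below runs merge_asof on the question's data with both 'forward' and 'nearest'. The variable names forward and nearest are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    'country': ['Germany', 'France', 'Indonesia'],
    'capital': ['Berlin', 'Paris', 'Jakarta'],
    'rank_one': [1, 5, 7],
})
df2 = pd.DataFrame({
    'rank_two': [0, 1, 6, 9],
    'population': ['8M', '82M', '66M', '255M'],
})

# 'forward': match the closest rank_two that is >= rank_one
forward = pd.merge_asof(df1, df2, left_on='rank_one',
                        right_on='rank_two', direction='forward')

# 'nearest': match the closest rank_two in either direction
nearest = pd.merge_asof(df1, df2, left_on='rank_one',
                        right_on='rank_two', direction='nearest')

print(forward[['country', 'rank_one', 'rank_two']])
print(nearest[['country', 'rank_one', 'rank_two']])
```

The two results differ only for Indonesia: with rank_one = 7, 'forward' selects rank_two = 9, while 'nearest' selects rank_two = 6 because 6 is one step away and 9 is two.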