我有两个数据框,一个包含一些缺失值,另一个包含需要替换缺失值的值。因此,第二个数据帧的长度比第一个短。
第一个数据框中的缺失值由“未找到高度信息”或“未找到玩家信息”标记
有没有办法在不循环的情况下用第二个数据帧中的相应值替换第一个数据帧中的缺失值?
我尝试使用 .map() 但未替换的值返回 NaN。
filled_df['height']= filled_df['height'].astype(str) #dataframe with real values
main_df['height']= main_df['height'].astype(str) #dataframe with missing values
mapping = dict(filled_df[['name','height']].values)
main_df['height'] = main_df['url_names'].map(mapping,na_action='ignore')
print(main_df)
name url_names height
0 John Mcenroe John_Mcenroe Height Info Not Found
1 Jimmy Connors Jimmy_Connors Player Info Not Found
2 Ivan Lendl Ivan_Lendl 1.88 m (6 ft 2 in)
3 Mats Wilander Mats_Wilander 1.83 m (6 ft 0 in)
4 Andres Gomez Andres_Gomez 1.93 m (6 ft 4 in)
5 Anders Jarryd Anders_Jarryd 1.80 m (5 ft 11 in)
6 Henrik Sundstrom Henrik_Sundstrom 1.88 m (6 ft 2 in)
7 Pat Cash Pat_Cash Height Info Not Found
8 Eliot Teltscher Eliot_Teltscher 1.75 m (5 ft 9 in)
9 Yannick Noah Yannick_Noah 1.93 m (6 ft 4 in)
10 Joakim Nystrom Joakim_Nystrom 1.87 m (6 ft 2 in)
11 Aaron Krickstein Aaron_Krickstein 6 ft 2 in (1.88 m)
12 Johan Kriek Johan_Kriek 1.75 m (5 ft 9 in)
name height
0 John_Mcenroe 1.80
1 Jimmy_Connors 1.78
2 Pat_Cash 183
3 Jimmy_Arias 175
4 Juan_Aguilera 1.82
5 Henri_Leconte 1.84
6 Balazs_Taroczy 1.82
7 Sammy_Giammalva_Jr 1.78
8 Thierry_Tulasne 1.77
答案 0 :(得分:0)
我认为你只需要用字典匹配的值替换缺失的值:
main_df['height'] = main_df['height'].fillna(main_df['url_names'].map(mapping))
答案 1 :(得分:0)
此代码可以完成这项工作
import pandas as pd
d = {'url_names': ['John_Mcenroe', 'Jimmy_Connors', 'Ivan_Lendl'], 'height': ['Height Info Not Found', 'Player Info Not Found', '1.88 m (6 ft 2 in)']}
main_df = pd.DataFrame(d)
d = {'url_names': ['John_Mcenroe', 'Jimmy_Connors'], 'height': ['1.80', '1.78']}
filled_df = pd.DataFrame(d)
df1 = main_df[(main_df.height == 'Height Info Not Found') | (main_df.height == 'Player Info Not Found')].drop(['height'], axis=1).merge(filled_df, on="url_names")
df2 = main_df[(main_df.height != 'Height Info Not Found') & (main_df.height != 'Player Info Not Found')]
pd.concat([df1, df2])