Interpolating missing data in a pandas DataFrame

Time: 2019-11-17 17:14:23

Tags: python pandas dataframe interpolation missing-data

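I am working with a building energy consumption dataset that has missing values I want to interpolate, but the NaN values are still there after interpolation. Can you help?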

import pandas as pd

for src in ['train', 'test']:
    weather = pd.read_csv(f'../input/ashrae-energy-prediction/weather_{src}.csv', parse_dates=['timestamp'])
    weather.timestamp = pd.to_datetime(weather.timestamp)
    weather = weather.set_index('timestamp')
    # Count missing values per column before interpolation
    missing_values = weather.isnull().sum()
    print(missing_values)
    weather_interp = pd.DataFrame()
    for idx in range(15):
        weather_site = weather[weather.site_id == idx]
        for i in ['air_temperature', 'cloud_coverage', 'dew_temperature', 'precip_depth_1_hr', 'sea_level_pressure', 'wind_direction', 'wind_speed']:
            # Interpolate each weather column for the current site
            weather_interp[i] = weather_site[i].interpolate(method='polynomial', order=3)
    # Count missing values per column after interpolation
    missing_values_interp = weather_interp.isnull().sum()
    print(missing_values_interp)

Code output:

[image: output of code]
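
A possible reason for the leftover NaN values can be reproduced on synthetic data. The sketch below is an assumption about the cause, not a confirmed diagnosis: the series `site_a` and `site_b` and their values are made up, linear interpolation stands in for the polynomial method, and daily timestamps are used for brevity. It shows that assigning a Series to a DataFrame column aligns on the frame's existing index, and that writing to the same column on every pass keeps only the last site.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for two weather sites (hypothetical values, not ASHRAE data).
# The two sites cover different, partly overlapping timestamps.
idx_a = pd.date_range("2016-01-01", periods=4)  # Jan 1 .. Jan 4
idx_b = pd.date_range("2016-01-03", periods=4)  # Jan 3 .. Jan 6
site_a = pd.Series([1.0, np.nan, 3.0, 4.0], index=idx_a)
site_b = pd.Series([10.0, np.nan, 30.0, 40.0], index=idx_b)

result = pd.DataFrame()
for series in (site_a, site_b):
    # Every pass writes to the same column, so the previous site is replaced.
    result["air_temperature"] = series.interpolate(method="linear")

# The frame keeps the index set by the first assignment (site_a's timestamps).
# site_b only covers the last two of those, so the first two rows become NaN
# even though site_b itself was fully interpolated.
print(result)
print(result["air_temperature"].isnull().sum())  # 2
```

If this is what happens in the loop above, the final frame would hold only the last site's values, reindexed to the first site's timestamps.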

However, if I do this for just one case, rather than inside the two for loops, it works fine:

import pandas as pd

weather = pd.read_csv('../input/ashrae-energy-prediction/weather_train.csv', parse_dates=['timestamp'])
weather.timestamp = pd.to_datetime(weather.timestamp)
weather = weather.set_index('timestamp')
# Count missing values per column before interpolation
missing_values = weather.isnull().sum()
print(missing_values)
# Interpolate a single column for a single site
weather_site_0 = weather[weather.site_id == 0]
weather_interp = weather_site_0['air_temperature'].interpolate(method='polynomial', order=3)
missing_values_0 = weather_interp.isnull().sum()
print(missing_values_0)
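
The single-site version assigns the interpolated Series to a plain variable, so no index alignment or overwriting takes place, which may be why it behaves differently. One way the multi-site case might be restructured is sketched below, assuming the goal is one interpolated frame covering all sites. The miniature `weather` frame and the helper `interpolate_site` are hypothetical; `method='linear'` is swapped in for `method='polynomial'` to avoid the SciPy dependency, and `limit_direction='both'` is an added choice so that leading and trailing gaps are filled as well.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the weather table: two sites, a few gaps each.
timestamps = list(pd.date_range("2016-01-01", periods=3)) * 2
weather = pd.DataFrame(
    {
        "site_id": [0, 0, 0, 1, 1, 1],
        "air_temperature": [1.0, np.nan, 3.0, 10.0, np.nan, 30.0],
        "wind_speed": [np.nan, 2.0, 4.0, 5.0, 6.0, np.nan],
    },
    index=pd.DatetimeIndex(timestamps, name="timestamp"),
)
cols = ["air_temperature", "wind_speed"]


def interpolate_site(site: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of one site's rows with gaps in `cols` filled."""
    site = site.copy()
    site[cols] = site[cols].interpolate(method="linear", limit_direction="both")
    return site


# Interpolate each site separately and collect the pieces,
# instead of overwriting one shared column on every pass.
pieces = [interpolate_site(site) for _, site in weather.groupby("site_id")]
weather_interp = pd.concat(pieces)

print(weather_interp[cols].isnull().sum())
```

A column that is entirely NaN for a given site would still be NaN after this step, since there is nothing to interpolate from.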

0 answers:

No answers