Question

I have a table in a pandas DataFrame, df:

    id     count
0     10       3
1     20       4
2     30       5
3     40       NaN
4     50       NaN
5     60       NaN
6     70       NaN

I also have another pandas Series, s.

What I want to do is replace the NaN values in df with the corresponding values from the Series s. My final output should be:

    id     count
0     10       3
1     20       4
2     30       5
3     40       1000
4     50       2000
5     60       3000
6     70       4000

Any ideas on how to achieve this?

Thanks in advance.

Answer 1

The problem is that the length of the Series differs from the number of NaN values in the count column. So the Series needs a reindex to the number of NaNs:

s = pd.Series({0: 1000, 1: 2000, 2: 3000, 3: 4000, 5: 5000})
print (s)
0    1000
1    2000
2    3000
3    4000
5    5000
dtype: int64

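# Boolean mask of the rows whose count is NaN
mask = df['count'].isnull()
# Take labels 0..(number of NaNs - 1) from s and assign them by position
df.loc[mask, 'count'] = s.reindex(np.arange(mask.sum())).values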
print (df)
   id   count
0  10     3.0
1  20     4.0
2  30     5.0
3  40  1000.0
4  50  2000.0
5  60  3000.0
6  70  4000.0
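
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt from the table in the question, and the names `mask`, `n_missing` and `filler` are only illustrative. Note that `reindex` selects by label, so this assumes s carries the labels 0, 1, 2, ... for the values you want; any label missing from s would come back as NaN.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question
df = pd.DataFrame({
    "id": [10, 20, 30, 40, 50, 60, 70],
    "count": [3, 4, 5, np.nan, np.nan, np.nan, np.nan],
})

# The Series from this answer: five values, but only four NaNs to fill
s = pd.Series({0: 1000, 1: 2000, 2: 3000, 3: 4000, 5: 5000})

mask = df["count"].isnull()      # True on the rows that need a value
n_missing = int(mask.sum())      # number of NaNs in the column

# Keep only the labels 0..n_missing-1, so the lengths match
filler = s.reindex(np.arange(n_missing))

# Assign by position: .to_numpy() drops the index so no alignment happens
df.loc[mask, "count"] = filler.to_numpy()
print(df)
```

The column stays float64 because it held NaN before the assignment.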

Answer 2

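This is simple, provided s has exactly as many values as there are NaNs:

# df.count is the DataFrame.count method, so the column must be selected with df['count']
df.loc[df['count'].isnull(), 'count'] = s.values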
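
Since the one-liner only works when the lengths agree, it may be worth making that check explicit. The sketch below wraps the assignment in a hypothetical helper, `fill_nans_by_position` (not a pandas function, just a name made up for this example), which raises a clear error on a mismatch and returns a copy instead of modifying df in place.

```python
import numpy as np
import pandas as pd


def fill_nans_by_position(df, column, values):
    """Hypothetical helper: fill the NaNs in `column` with `values`, in order."""
    mask = df[column].isnull()
    values = np.asarray(values)
    # Fail early with a readable message instead of relying on pandas' own error
    if len(values) != int(mask.sum()):
        raise ValueError(
            f"expected {int(mask.sum())} replacement values, got {len(values)}"
        )
    out = df.copy()               # leave the caller's frame untouched
    out.loc[mask, column] = values
    return out


df = pd.DataFrame({
    "id": [10, 20, 30, 40, 50, 60, 70],
    "count": [3, 4, 5, np.nan, np.nan, np.nan, np.nan],
})
s = pd.Series([1000, 2000, 3000, 4000])

filled = fill_nans_by_position(df, "count", s)
print(filled)
```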

Answer 3

In this case I prefer the following for its readability.

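counter = 0
for index, row in df.iterrows():
    # row['count'] is a scalar, which has no .isnull() method; use pd.isnull
    if pd.isnull(row['count']):
        # df.set_value was removed in pandas 1.0; df.at is its replacement
        # s.iloc takes the next value by position, whatever the labels of s are
        df.at[index, 'count'] = s.iloc[counter]
        counter += 1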

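I might add that this kind of 'merging' of a DataFrame and a Series is a bit odd and prone to strange bugs. If you can somehow give the Series the same format as the DataFrame (that is, add some index/column labels), the merge functions will probably work better.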
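
To illustrate the point about labels, here is a rough sketch of what that could look like. It assumes something the question does not state: that the replacement values can be keyed by `id`. The Series `s_by_id` is made up for this example. With labels in place, pandas does the matching itself and the order or length of the Series no longer matters.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": [10, 20, 30, 40, 50, 60, 70],
    "count": [3, 4, 5, np.nan, np.nan, np.nan, np.nan],
})

# Assumption for illustration: replacement values are labelled by id
s_by_id = pd.Series({40: 1000, 50: 2000, 60: 3000, 70: 4000})

# Look up each row's id in s_by_id (ids without an entry map to NaN),
# then use the result only where count is currently NaN
df["count"] = df["count"].fillna(df["id"].map(s_by_id))
print(df)
```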

Answer 4

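You can re-index the Series with the positions of the NaN values in the DataFrame, then fillna() with that Series. This requires s to have exactly as many values as there are NaNs: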

s.index = np.where(df['count'].isnull())[0]
df['count'] = df['count'].fillna(s)
print(df)

   id   count
0  10     3.0
1  20     4.0
2  30     5.0
3  40  1000.0
4  50  2000.0
5  60  3000.0
6  70  4000.0
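
A small variation on the same idea, as a sketch: build a new Series instead of overwriting `s.index`, so s itself is left untouched, and cast the column back to integers at the end so the result matches the output asked for in the question. It assumes s has exactly as many values as there are NaNs; the name `filler` is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": [10, 20, 30, 40, 50, 60, 70],
    "count": [3, 4, 5, np.nan, np.nan, np.nan, np.nan],
})
s = pd.Series([1000, 2000, 3000, 4000])

mask = df["count"].isnull()

# New Series carrying the values of s under the index labels of the NaN rows
filler = pd.Series(s.to_numpy(), index=df.index[mask])

# fillna aligns on the index; once no NaN is left the column can be cast to int
df["count"] = df["count"].fillna(filler).astype("int64")
print(df)
```

Using `df.index[mask]` rather than `np.where(...)` also keeps this working when df does not have the default 0..n-1 index.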

Replacing NaN values in a DataFrame with Python
