Question

I want to replace some values in the following dataframe:

dataframe_a

Y2000   Y2001   Y2002    Y2003    Y2004    Item    Item Code
34        43      0      0          25     Test      Val

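I want to replace the zero values in its columns with values obtained by multiplying a scalar (say 0.5) by the corresponding values in this dataframe: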

dataframe_b

Y2000   Y2001   Y2002    Y2003    Y2004    Item    Item Code
34        43      10      20        25     Test      Val

So in dataframe_a, the value in column Y2002 should become 10 * 0.5 and the value in column Y2003 should become 20 * 0.5.

Currently, I am doing this:

df = dataframe_a[dataframe_a == 0]
df = df * dataframe_b * 0.5

However, I am not sure how to update dataframe_a with the new values.

Answer 1

You can use a boolean mask and then call fillna:

In [58]:
fill = df2.select_dtypes(include = [np.number]) * 0.5
df1 = df1[df1!=0].fillna(fill)
df1

Out[58]:
   Y2000  Y2001  Y2002  Y2003  Y2004  Item Item Code
0     34     43      5     10     25  Test       Val

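Here df1[df1 != 0] produces a dataframe of the same shape, with NaN wherever the condition is not met. You can then call fillna on it and pass the other dataframe; the NaN values are replaced by the values that align on index and columns.

Result of the boolean mask: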

In [63]:
df1[df1!=0]

Out[63]:
   Y2000  Y2001  Y2002  Y2003  Y2004  Item Item Code
0     34     43    NaN    NaN     25  Test       Val
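For anyone who wants to run the mask-then-fillna approach end to end, here is a minimal sketch. It rebuilds the two frames from the question (df1 as dataframe_a, df2 as dataframe_b) and, as an assumption of mine rather than part of the answer, restricts the mask to the numeric columns so the text columns are never compared with 0.

```python
import pandas as pd

# Data reconstructed from the question; df1 is dataframe_a, df2 is dataframe_b.
cols = ["Y2000", "Y2001", "Y2002", "Y2003", "Y2004", "Item", "Item Code"]
df1 = pd.DataFrame([[34, 43, 0, 0, 25, "Test", "Val"]], columns=cols)
df2 = pd.DataFrame([[34, 43, 10, 20, 25, "Test", "Val"]], columns=cols)

# Assumption: only numeric columns can hold zeros that need replacing.
num_cols = df1.select_dtypes(include="number").columns

# mask() turns every zero into NaN, like df1[df1 != 0] on these columns.
masked = df1[num_cols].mask(df1[num_cols] == 0)

# fillna() aligns on index and column labels and fills only the NaN cells.
result = df1.copy()
result[num_cols] = masked.fillna(df2[num_cols] * 0.5)
print(result)
```

The filled columns come back as floats because NaN forces an upcast along the way; df1 itself is left unchanged.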

Answer 2

A general approach, if you don't know where the zero values are:

new_df = 0.5*df2[df==0]
new_df.fillna(df, inplace=True)
print(new_df)

    0   1  2  3   4     5    6
0  34  43  5  5  25  Test  Val

where dataframe_a is df and dataframe_b is df2.

Answer 3

import pandas as pd
import numpy as np

randn = np.random.randn
s = pd.Series(randn(5), index=['a', 'b', 'c', 'd', 'e'])
d = {'one': pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
     'two': pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
df
# replace() returns a new dataframe, so assign the result back to keep it
df = df.replace(1, 12*4)  # replace all values 1 by 12*4
df

For replace(), see: Replace all occurrences of a string in a pandas dataframe (Python)

Answer 4

dataframe_a[dataframe_a == 0] = 0.5 * dataframe_b[dataframe_a == 0]
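A small sketch of how this one-liner behaves, in case it helps. The frames below are numeric-only stand-ins I made up for illustration (float columns, two rows), not the asker's full data; the mask is computed once so both sides of the assignment use the same positions.

```python
import pandas as pd

# Hypothetical numeric-only stand-ins for dataframe_a and dataframe_b.
dataframe_a = pd.DataFrame(
    {"Y2002": [0.0, 7.0], "Y2003": [0.0, 0.0], "Y2004": [25.0, 3.0]}
)
dataframe_b = pd.DataFrame(
    {"Y2002": [10.0, 1.0], "Y2003": [20.0, 40.0], "Y2004": [25.0, 9.0]}
)

# True wherever dataframe_a holds a zero.
mask = dataframe_a == 0

# Assigning through the boolean mask updates only the masked cells, in place.
dataframe_a[mask] = 0.5 * dataframe_b[mask]
print(dataframe_a)
```

Cells that were not zero (7.0, 25.0, 3.0) are left as they were, even though dataframe_b has different values in some of those positions.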

Answer 5

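pandas.DataFrame.where may be what you need. You have to build another dataframe holding the replacement values for the specific columns.

I don't have pandas installed here, so I can't show a dataframe example, but it works similarly with numpy arrays.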

>>> a
array([1, 2, 0, 3, 4, 0, 5])
>>> subst
array([10, 20, 30, 40, 50, 60, 70])
>>> k = -.5
>>> np.where(a == 0, subst * k, a)
array([  1.,   2., -15.,   3.,   4., -30.,   5.])
>>>

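One difference with the dataframe version is that it can do the replacement in place; you only need to specify the other dataframe (the one with the replacement values).

Finally, a pandas example: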

>>> 
>>> df
   d  e  f
a  0  1  1
b  1  1  0
c  1  0  1
>>> s
    d   e   f
a  10  20  30
b  10  20  30
c  10  20  30
>>> k = -.5
>>> df.where(df != 0, other = s * k)
   d   e   f
a -5   1   1
b  1   1 -15
c  1 -10   1
>>> 
>>> df.where(df != 0, other = s * k, inplace = True)
>>> df
   d   e   f
a -5   1   1
b  1   1 -15
c  1 -10   1
>>>
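Applied to the data from the question, the same where() idea might look like the sketch below. The helper name fill_zeros_from is hypothetical, and limiting it to numeric columns is my own assumption so that the text columns are never multiplied by the scalar.

```python
import pandas as pd


def fill_zeros_from(a: pd.DataFrame, b: pd.DataFrame, k: float) -> pd.DataFrame:
    """Hypothetical helper: copy of `a` with numeric zeros replaced by k * b."""
    num_cols = a.select_dtypes(include="number").columns
    out = a.copy()
    # where() keeps cells where the condition is True and takes `other` elsewhere.
    out[num_cols] = a[num_cols].where(a[num_cols] != 0, other=b[num_cols] * k)
    return out


cols = ["Y2000", "Y2001", "Y2002", "Y2003", "Y2004", "Item", "Item Code"]
dataframe_a = pd.DataFrame([[34, 43, 0, 0, 25, "Test", "Val"]], columns=cols)
dataframe_b = pd.DataFrame([[34, 43, 10, 20, 25, "Test", "Val"]], columns=cols)

updated = fill_zeros_from(dataframe_a, dataframe_b, 0.5)
print(updated)
```

Returning a copy instead of passing inplace=True keeps the original dataframe_a available for comparison; that is a design preference, not a requirement of where().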

Some examples from the pydata site.

Replace zero values in a dataframe using another dataframe
