Round an entire column in a DataFrame up or down

Asked: 2018-03-03 13:30:29

Tags: python pandas

        A
0  31.353
1  28.945
2  17.377

I want to create a new column df["B"] containing the values of column A rounded up to the nearest multiple of 5. Desired output:

        A      B
0  31.353   35.0
1  28.945   30.0
2  17.377   20.0

I tried:

def roundup5(x):
    return int(math.ceil(x / 5.0)) * 5
df["B"] = df["A"].apply(roundup5)

I get:

TypeError: unsupported operand type(s) for /: 'str' and 'float'

1 answer:

Answer 0 (score: 3)

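I think you need to convert the values to float first, then divide by 5, apply numpy.ceil, and multiply by 5. Either of these equivalent lines works: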

df["B"] = df["A"].astype(float).div(5.0).apply(np.ceil).mul(5)
df["B"] = np.ceil(df["A"].astype(float).div(5.0)).mul(5)
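A self-contained sketch of the vectorized approach follows. It assumes column A was loaded as strings, which is what the TypeError in the question suggests; the sample frame is rebuilt by hand from the question's data.

```python
import numpy as np
import pandas as pd

# Sample data from the question, stored as strings (assumption: this is
# why the original apply() call raised the TypeError).
df = pd.DataFrame({"A": ["31.353", "28.945", "17.377"]})

# Convert to float, divide by 5, round up, then scale back by 5.
df["B"] = np.ceil(df["A"].astype(float).div(5.0)).mul(5)

print(df)
```

If the column may contain values that cannot be parsed as numbers, pd.to_numeric(df["A"], errors="coerce") could be used instead of astype(float); it turns unparseable entries into NaN rather than raising.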

Loop version:

import math

def roundup5(x):
    # float(x) handles values stored as strings
    return int(math.ceil(float(x) / 5.0)) * 5.0
df["B"] = df["A"].apply(roundup5)
print(df)
        A     B
0  31.353  35.0
1  28.945  30.0
2  17.377  20.0
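Since the title asks about rounding up or down, the same idea can be generalized. The helper below, round_to_multiple, is a hypothetical name introduced here for illustration and is not part of pandas or NumPy. It is a minimal sketch that swaps np.ceil for np.floor or np.rint; note that np.rint rounds exact halves to the nearest even integer.

```python
import numpy as np
import pandas as pd

def round_to_multiple(s, base=5, how="up"):
    """Hypothetical helper: round a Series to a multiple of `base`."""
    funcs = {"up": np.ceil, "down": np.floor, "nearest": np.rint}
    # Convert strings to floats before doing arithmetic.
    values = pd.to_numeric(s).astype(float)
    return funcs[how](values / base) * base

# Sample data from the question, stored as strings (assumption).
df = pd.DataFrame({"A": ["31.353", "28.945", "17.377"]})
df["up"] = round_to_multiple(df["A"], how="up")
df["down"] = round_to_multiple(df["A"], how="down")
df["nearest"] = round_to_multiple(df["A"], how="nearest")

print(df)
```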

Timings:

df = pd.concat([df] * 10000, ignore_index=True)
# [30000 rows x 1 columns]

In [327]: %timeit df["B1"] = df["A"].apply(roundup5)
35.7 ms ± 4.54 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [328]: %timeit df["B2"] = df["A"].astype(float).div(5.0).apply(np.ceil).mul(5)
1.25 ms ± 76.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [329]: %timeit df["B3"] = np.ceil(df["A"].astype(float).div(5.0)).mul(5)
1.19 ms ± 22.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)