Question

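I am unable to add a weight (int) to a new Pandas DataFrame column when the value in one column lies between the values in two other columns. I am able to create a column with True/False values (or 0/1 values if I use astype).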

import pandas as pd

df = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6], 'c': [3,6,4]})
df

   a  b  c
0  1  4  3
1  2  5  6
2  3  6  4

This works:

df['between_bool'] = df['c'].between(df['a'], df['b'])
df

   a  b  c between_bool
0  1  4  3         True     # 3 is between 1 and 4
1  2  5  6        False     # 6 is NOT between 2 and 5
2  3  6  4         True     # 4 is between 3 and 6

However, this does not work:

df['between_int'] = df['c'].apply(lambda x: 2 if df['c'].between(df['a'], df['b']) else 0)

The code above produces the following error:

Traceback (most recent call last):
  File "C:\Python36\envs\PortfolioManager\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-14-0aa1e7cfd5c2>", line 1, in <module>
    df['between_int'] = df['c'].apply(lambda x: 2 if df['c'].between(df['a'], df['b']) else 0)
  File "C:\Python36\envs\PortfolioManager\lib\site-packages\pandas\core\series.py", line 2294, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas\src\inference.pyx", line 1207, in pandas.lib.map_infer (pandas\lib.c:66124)
  File "<ipython-input-14-0aa1e7cfd5c2>", line 1, in <lambda>

The desired output is:

   a  b  c between_int
0  1  4  3           2      # 3 is between 1 and 4
1  2  5  6           0      # 6 is NOT between 2 and 5
2  3  6  4           2      # 4 is between 3 and 6

Any ideas?

Answer 1

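I hope I understood you correctly. If you just want to assign a fixed weight of 2 when this condition holds, you could use the following approach: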

import numpy as np
df['between_int'] = np.where(df['c'].between(df['a'], df['b']), 2, 0)

Or, if you would rather not import numpy, you can do the following:

df['between_int'] = 0
df.loc[df['c'].between(df['a'], df['b']), 'between_int'] = 2
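
For reference, here is a minimal self-contained sketch that runs both approaches on the question's example data and checks that they agree. The constant name WEIGHT and the third variant (multiplying the boolean mask by the weight, building on the astype idea mentioned in the question) are illustrative additions, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [3, 6, 4]})

# Boolean mask: True where c lies between a and b (both bounds included by default).
mask = df['c'].between(df['a'], df['b'])

WEIGHT = 2  # hypothetical name for the fixed weight

# Variant 1: np.where picks WEIGHT where the mask is True, 0 elsewhere.
via_where = np.where(mask, WEIGHT, 0)

# Variant 2: start from zeros and overwrite the matching rows with .loc.
via_loc = pd.Series(0, index=df.index)
via_loc.loc[mask] = WEIGHT

# Variant 3 (assumption, not from the answer): cast the mask to 0/1 and scale it.
via_astype = mask.astype(int) * WEIGHT

df['between_int'] = via_where
print(df)
```

All three variants are vectorized, so none of them needs a Python-level loop over the rows.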

Hope this helps!

Answer 2

I think what you originally wanted to do with apply is:

df['between_int'] = df.apply(lambda x: 2 if x['c'] in range(x['a'], x['b']) else 0, axis=1)

Note the differences from yours:

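1. apply is called on the DataFrame df rather than on the Series df['c'].
2. The value to check is taken from x['c'] rather than df['c'], because your lambda is a function of x.
3. Because I changed df['c'] to x['c'], I can no longer use between.
4. The two bounds are accessed as x['a'] and x['b'] and tested with in range, for the same reason as in point 2.
5. Finally, don't forget axis=1 now that apply is called on the DataFrame.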
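
The points above can be sketched as follows. This is only an illustration under a few assumptions: the helper name weight_row and its weight parameter are hypothetical, and the chained comparison is swapped in for in range. The swap matters because range excludes its upper bound and only covers integers, whereas between includes both bounds by default, so the two disagree when c equals b.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [3, 6, 4]})

# The condition in the question's lambda never uses x: it is a whole boolean
# Series, and a multi-element Series cannot be reduced to a single True/False.
cond = df['c'].between(df['a'], df['b'])
try:
    bool(cond)
    ambiguous = False
except ValueError:
    ambiguous = True

def weight_row(row, weight=2):
    """Hypothetical helper: return weight if a <= c <= b for this row, else 0."""
    return weight if row['a'] <= row['c'] <= row['b'] else 0

# axis=1 passes each row to the function as a Series indexed by column name.
df['between_int'] = df.apply(weight_row, axis=1)
print(df)

# Boundary case: c equal to the upper bound b.
edge = pd.Series({'a': 1, 'b': 4, 'c': 4})
in_range = edge['c'] in range(edge['a'], edge['b'])  # range stops before b
inclusive = weight_row(edge)                          # bounds included
```

On the question's data both forms give the same column; they differ only on rows where c equals b, or where the values are not integers.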

In any case, Swebbo's solution works perfectly!

Create a Pandas DataFrame column with weights if the value in one column is between the values in two other columns
