Find the smallest value greater than the current value

Time: 2018-11-15 08:55:48

Tags: pandas

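I have an objects table and a lookup table. For each row of the objects table, I want to add the smallest value from the lookup table that is greater than the object's number.

I found this similar question, but it is about finding the smallest value greater than a constant, not one that changes for every row.

In code: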

import pandas as pd

objects = pd.DataFrame([{"id": 1, "number": 10}, {"id": 2, "number": 30}])

lookup = pd.DataFrame([{"number": 3}, {"number": 12}, {"number": 40}])

expected = pd.DataFrame(
    [
        {"id": 1, "number": 10, "smallest_greater": 12},
        {"id": 2, "number": 30, "smallest_greater": 40},
    ]
)

2 answers:

Answer 0: (score: 1)

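First compare every value of lookup['number'] with every value of objects['number'] by broadcasting, which gives a 2D boolean mask. Then take the cumsum along each row and compare it with 1 to mark the first match, get its position with numpy.argmax, and use that position to pick the value from lookup['number']. This relies on lookup['number'] being sorted in ascending order, so that the first match is also the smallest one.

Use numpy.where to build the output, setting every row that has no match to NaN.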

objects = pd.DataFrame([{"id": 1, "number": 10}, {"id": 2, "number": 30},
                        {"id": 3, "number": 100},{"id": 4, "number": 1}])

print (objects)
   id  number
0   1      10
1   2      30
2   3     100
3   4       1

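import numpy as np

# m1[i, j] is True when lookup value j is strictly greater than object i's number
m1 = lookup['number'].values > objects['number'].values[:, None]
# m2 first becomes True at the first match in each row; m3 flags rows with any match
m2 = np.cumsum(m1, axis=1) == 1
m3 = np.any(m1, axis=1)
out = lookup['number'].values[m2.argmax(axis=1)]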

objects['smallest_greater'] = np.where(m3, out, np.nan)
print (objects)
   id  number  smallest_greater
0   1      10              12.0
1   2      30              40.0
2   3     100               NaN
3   4       1               3.0
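
The broadcast mask above has one cell per (object, lookup) pair, so it grows with the product of the two table sizes. As a rough alternative sketch (a swapped-in technique, not the answerer's method), the same result can be obtained with numpy.searchsorted on a sorted copy of the lookup values. The names sorted_lookup, pos, found and candidates are introduced here for illustration only, and the sketch assumes lookup is not empty.

```python
import numpy as np
import pandas as pd

objects = pd.DataFrame({"id": [1, 2, 3, 4], "number": [10, 30, 100, 1]})
lookup = pd.DataFrame({"number": [3, 12, 40]})

# Binary search needs ascending order, so sort a copy of the lookup values once.
sorted_lookup = np.sort(lookup["number"].to_numpy())

# side="right" gives, for each query, the index of the first element strictly greater than it.
pos = np.searchsorted(sorted_lookup, objects["number"].to_numpy(), side="right")

# pos == len(sorted_lookup) means no lookup value is greater than the query.
found = pos < len(sorted_lookup)

# Clip so the indexing stays in bounds, then overwrite the unmatched rows with NaN.
candidates = sorted_lookup[np.clip(pos, 0, len(sorted_lookup) - 1)]
objects["smallest_greater"] = np.where(found, candidates, np.nan)

print(objects)
```

Using side="left" instead would return the smallest value greater than or equal to the number, which may or may not be what is wanted when a number appears in both tables.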

Answer 1: (score: 0)

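smallest_greater = []
for i in objects['number']:
    # Smallest lookup value strictly greater than i (raises IndexError if there is none)
    greater = lookup[lookup['number'] > i].sort_values(by='number')
    smallest_greater.append(lookup['number'][greater.index[0]])
objects['smallest_greater'] = smallest_greater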
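
The loop above filters and sorts lookup once per object and fails when no greater value exists. For comparison, here is a minimal sketch that gets the same column in one call with pandas.merge_asof, a different technique from the one this answer uses. The names left, right and result are illustrative, and rows without a greater value come out as NaN rather than raising.

```python
import pandas as pd

objects = pd.DataFrame({"id": [1, 2, 3, 4], "number": [10, 30, 100, 1]})
lookup = pd.DataFrame({"number": [3, 12, 40]})

# merge_asof requires both frames to be sorted on their join keys.
left = objects.sort_values("number")
right = (
    lookup.rename(columns={"number": "smallest_greater"})
    .sort_values("smallest_greater")
)

result = (
    pd.merge_asof(
        left,
        right,
        left_on="number",
        right_on="smallest_greater",
        direction="forward",        # search towards larger keys
        allow_exact_matches=False,  # exclude equal keys, so the match is strictly greater
    )
    .sort_values("id")              # restore the original row order
    .reset_index(drop=True)
)

print(result)
```

Both key columns must have the same dtype for merge_asof; here they are both integers.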