Question

>>> import pandas as pd
>>> d = {'a':[5,4,3,1,2],'b':[1,2,3,4,5]}
>>> df = pd.DataFrame(d)
>>> df
   a  b
0  5  1
1  4  2
2  3  3
3  1  4
4  2  5

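If neither a nor b contains duplicate values, is there a way to compute a variable indices such that:

df['a'] = df['b'][indices]

is satisfied (that is, reordering b by indices reproduces the values of a)? In this case,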

>>> indices = [4,3,2,0,1]

>>> df['b'][indices]
4    5
3    4
2    3
0    1
1    2

Answer 1

I guess the naive approach would be:

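def getIndices(a, b):
    # For each value in a, look up its position in b.
    rVal = []
    for i in a:
        index = b.index(i)  # linear scan; raises ValueError if i is not in b
        rVal.append(index)
    return rVal

a = [5,4,3,1,2]
b = [1,2,3,4,5]

result = getIndices(a,b)
print(result)
# prints [4, 3, 2, 0, 1]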

Note that each b.index(i) call is a linear scan of b, so this runs in O(n^2) time overall rather than O(n log n).
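Since `list.index` rescans the list on every call, one common alternative is to build a value-to-position dictionary once and look each value up in it. The sketch below assumes the values are hashable and unique, as stated in the question; the name `get_indices_fast` is made up for illustration.

```python
def get_indices_fast(a, b):
    # Build the value -> position map for b once (O(n)).
    position_in_b = {value: pos for pos, value in enumerate(b)}
    # Each dictionary lookup is O(1) on average, so this pass is O(n).
    # A value of a that is missing from b raises KeyError.
    return [position_in_b[value] for value in a]

a = [5, 4, 3, 1, 2]
b = [1, 2, 3, 4, 5]

result = get_indices_fast(a, b)
print(result)
# prints [4, 3, 2, 0, 1]
```

With duplicates in b, the dictionary would keep only the last position of each repeated value, so this only fits the no-duplicates case.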

Answer 2

You can try:

indices = [df['b'][df['b'] == row['a']].index[0]  for idx, row in df.iterrows()]
indices
>> [4, 3, 2, 0, 1]
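If a vectorized lookup is preferred over iterating row by row, one option is `Index.get_indexer`, which returns the position of each target value within an index. This is only a sketch under the question's assumption that the values in b are unique (`get_indexer` raises an error on a non-unique index); the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [5, 4, 3, 1, 2], 'b': [1, 2, 3, 4, 5]})

# Position of each value of df['a'] within df['b'].
# A value with no match would be reported as -1 and should be checked for
# before the positions are used.
positions = pd.Index(df['b']).get_indexer(df['a'])

# Translate positions into index labels (identical here, since the frame
# uses the default 0..n-1 index).
indices = df.index[positions].tolist()
print(indices)
# prints [4, 3, 2, 0, 1]
```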

Answer 3

You can use numpy.argsort():

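import numpy as np
a = np.array(["c", "b", "a", "x", "e", "d"])
b = np.array(["d", "a", "b", "c", "x", "e"])
idx_a = np.argsort(a)       # positions that sort a
idx_b = np.argsort(b)       # positions that sort b
rank_a = np.argsort(idx_a)  # rank of each element of a (inverse of idx_a)
indices = idx_b[rank_a]     # positions in b, listed in the order of a
print(b[indices])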

The result is:

['c' 'b' 'a' 'x' 'e' 'd']
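Applied to the data from the question, the same idea can be sketched as follows. Note that the sort order of a has to be inverted (turned into ranks) before it is used to index `np.argsort(b)`; indexing with `np.argsort(a)` directly only gives the right answer when that permutation is its own inverse, which happens to be the case for the six-letter arrays above. The variable names are illustrative, and both arrays are assumed to hold the same unique values.

```python
import numpy as np

a = np.array([5, 4, 3, 1, 2])
b = np.array([1, 2, 3, 4, 5])

# rank_a[i] is the position a[i] would take in sorted(a).
rank_a = np.argsort(np.argsort(a))

# np.argsort(b)[k] is where the k-th smallest value sits in b,
# so indices[i] is where a[i] sits in b.
indices = np.argsort(b)[rank_a]

print(indices.tolist())
# prints [4, 3, 2, 0, 1]
```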

Answer 4

This can be done in plain Python (not sure whether there is a smarter pandas-specific way).

d = {k:v for v,k in enumerate(list(df['a']))}
indices = [i[0] for i in sorted(enumerate(list(df['b'])), 
                                key=lambda x: d.get(x[1]))]

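If some elements of a are not in b, or vice versa, you will need a smarter key function that tolerates missing values (and, for that matter, you will have to decide how you want to handle that case).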
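As one possible way to tolerate missing values, the key function can fall back to a default sort position. The sketch below makes an arbitrary choice, labeled as an assumption: values of b that do not occur in a are sorted to the end and then filtered out. The sample data and names are made up for illustration.

```python
a = [5, 4, 3, 1, 2]
b = [1, 2, 3, 4, 5, 9]  # 9 has no counterpart in a

# Position of each value within a.
position_in_a = {value: pos for pos, value in enumerate(a)}

# Assumption: unmatched values of b get the key len(a), so they sort last.
# (A plain d.get(...) would return None here, which cannot be compared
# with integers in Python 3.)
ordered = sorted(enumerate(b), key=lambda pair: position_in_a.get(pair[1], len(a)))
indices = [i for i, _ in ordered]
print(indices)
# prints [4, 3, 2, 0, 1, 5]

# Keep only the positions whose value actually occurs in a.
matched = [i for i in indices if b[i] in position_in_a]
print(matched)
# prints [4, 3, 2, 0, 1]
```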

pandas: find the reordering indices that match values between columns
