Cannot compare types 'ndarray(dtype=int64)' and 'int64'

Time: 2018-09-07 15:38:18

Tags: python python-3.x pandas dictionary series

I have a column of barcodes in a DataFrame, and I created a dictionary that maps each barcode to an item ID.

I am creating a new column:

df['item_id'] = df['bar_code']

The dictionary (built from a second DataFrame, imdb):

keys = (int(i) for i in imdb['bar_code'])
values = (int(i) for i in imdb['item_id'])
map_barcode = dict(zip(keys, values))

map_barcode (the first 5 entries, for example):

{0: 1000159, 9000000017515: 11, 7792690324216: 16, 7792690324209: 20, 70942503334: 33}

Then I map the item IDs using the dictionary:

df = df.replace({'item_id':map_barcode})

I expected this to put the item IDs in the column

(going back to the dictionary example:)

df['item_id'][0] = 1000159
df['item_id'][1] = 11
df['item_id'][2] = 16
df['item_id'][3] = 20
df['item_id'][4] = 33

But I end up with this error:

Cannot compare types 'ndarray(dtype=int64)' and 'int64' 

I tried changing the dictionary's keys and values to np.int64:

keys = (np.int64(i) for i in imdb['bar_code'])
values = (np.int64(i) for i in imdb['item_id'])
map_barcode = dict(zip(keys, values))

But I get the same error.

Is there anything I am missing here?

1 Answer:

Answer 0: (score: 3)

replace example

First, I cannot reproduce your error. This works fine:

import pandas as pd

map_dict = {0: 1000159, 9000000017515: 11, 7792690324216: 16, 7792690324209: 20, 70942503334: 33}

df = pd.DataFrame({'item_id': [0, 7792690324216, 70942503334, 9000000017515, -1, 7792690324209]})

df = df.replace({'item_id': map_dict})

Result:

   item_id
0  1000159
1       16
2       33
3       11
4       -1
5       20

Use map + fillna instead

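Second, manually iterating over a Pandas Series in a generator expression is relatively expensive. In addition, replace is inefficient when mapping through a dictionary.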
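As a rough illustration of the point above, the dictionary lookup can be done with map, with fillna restoring values that have no entry. This is only a sketch using the sample data from this answer. The names mapped, via_map and via_replace are made up for the example, and the final astype call assumes every value fits in int64.

```python
import pandas as pd

map_dict = {0: 1000159, 9000000017515: 11, 7792690324216: 16,
            7792690324209: 20, 70942503334: 33}
df = pd.DataFrame({'item_id': [0, 7792690324216, 70942503334,
                               9000000017515, -1, 7792690324209]})

# map() looks each value up in the dictionary; values without an entry become NaN
mapped = df['item_id'].map(map_dict)

# fillna() puts the original value back where no mapping exists;
# astype() undoes the float upcast caused by the intermediate NaN
via_map = mapped.fillna(df['item_id']).astype('int64')

# For comparison: the replace-based version on the same column
via_replace = df['item_id'].replace(map_dict)

print(via_map.tolist())
print(via_map.equals(via_replace))
```

Both versions should give the same column here. The difference is in how the lookup is performed, which matters more as the dictionary grows.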

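In fact, there is no need to create a dictionary at all. Pandas provides optimized Series-based methods for this kind of task, such as map combined with fillna.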
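The code that originally accompanied this point is not preserved here, so the following is only a sketch of what a dictionary-free version might look like. The two DataFrames are hypothetical stand-ins for df and imdb from the question, lookup is a name chosen for the example, and the approach assumes that bar_code values in imdb are unique.

```python
import pandas as pd

# Hypothetical stand-ins for the two DataFrames in the question
imdb = pd.DataFrame({
    'bar_code': [0, 9000000017515, 7792690324216, 7792690324209, 70942503334],
    'item_id': [1000159, 11, 16, 20, 33],
})
df = pd.DataFrame({
    'bar_code': [0, 7792690324216, 70942503334, 9000000017515, -1, 7792690324209],
})

# Lookup Series: index is bar_code, values are item_id (assumes unique bar codes)
lookup = imdb.set_index('bar_code')['item_id']

# map() matches each bar code against the lookup index; unknown bar codes
# become NaN, and fillna() falls back to the bar code itself
df['item_id'] = (
    df['bar_code'].map(lookup)
    .fillna(df['bar_code'])
    .astype('int64')  # undo the float upcast introduced by NaN
)

print(df)
```

Whether unknown bar codes should fall back to the bar code itself (as replace would leave them) or stay missing is a design choice. Dropping the fillna step keeps them as NaN.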
