Question

Why does pandas behave differently when setting versus getting an item in a Series with the wrong number of indexers:

import pandas as pd

df = pd.DataFrame({'a': [10]})
# df['a'] is a series, can be indexed with 1 index only

# will raise IndexingError, as expected
df['a'].iloc[0, 0]
df['a'].loc[0, 0]

# will raise nothing, not as expected
df['a'].iloc[0, 0] = 1000 # equivalent to pass
df['a'].loc[0, 0] = 1000 # equivalent to df['a'].loc[0] = 1000

# pandas version 0.18.1, python 3.5

Edit: Reported.

Answer 1

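Getting values

If the key is a tuple (as in your example), then the __getitem__ method of the superclass of the loc and iloc objects at some point calls _has_valid_tuple(self, key).

This method contains the following code: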

for i, k in enumerate(key):
    if i >= self.obj.ndim:
        raise IndexingError('Too many indexers')

This raises the IndexingError you expect.
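To make the getter-side check concrete, here is a minimal pure-Python sketch. It does not use pandas: ToySeries, ToyIndexer and the local IndexingError are hypothetical stand-ins invented for illustration, and only the loop inside _has_valid_tuple mirrors the snippet quoted above.

```python
class IndexingError(Exception):
    """Local stand-in for pandas' IndexingError (assumption for this sketch)."""


class ToySeries:
    """Hypothetical one-dimensional container; only `ndim` matters here."""
    ndim = 1

    def __init__(self, values):
        self.values = list(values)


class ToyIndexer:
    """Hypothetical indexer that validates tuple keys before reading."""

    def __init__(self, obj):
        self.obj = obj

    def _has_valid_tuple(self, key):
        # Same loop shape as the quoted code: at most one indexer per dimension.
        for i, k in enumerate(key):
            if i >= self.obj.ndim:
                raise IndexingError('Too many indexers')

    def __getitem__(self, key):
        if isinstance(key, tuple):
            self._has_valid_tuple(key)  # raises for (0, 0) on a 1-D object
            key = key[0]
        return self.obj.values[key]


indexer = ToyIndexer(ToySeries([10]))
single = indexer[0]      # one indexer: allowed

try:
    indexer[0, 0]        # two indexers on a 1-D object
    too_many_raised = False
except IndexingError:
    too_many_raised = True
```

The point of the sketch is only that the validation sits on the read path; nothing here models what pandas does on the write path.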

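Setting values

The superclass's __setitem__ calls _get_setitem_indexer, which in turn calls _convert_to_indexer.

The superclass's implementation of _convert_to_indexer is a bit messy, but in this case it returns the numpy array [0, 0].

The class of the iloc indexer, however, overrides _convert_to_indexer. Its version returns the original tuple...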

def _convert_to_indexer(self, obj, axis=0, is_setter=False):
    ...
    elif self._has_valid_type(obj, axis):
        return obj

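Now the indexer variable is a numpy array in the .loc case and a tuple in the .iloc case. This causes the difference in setting behaviour in the subsequent superclass call to _setitem_with_indexer(indexer, value).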
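The divergence described above can be sketched without pandas. The classes below are hypothetical toys, not the real pandas indexers: a plain list stands in for the numpy array, and the later _setitem_with_indexer step is deliberately not modelled, since its internals are not shown in this answer.

```python
class ToyBaseIndexer:
    """Hypothetical stand-in for the shared superclass (the .loc path)."""

    def _convert_to_indexer(self, key):
        # Assumption: a tuple key is flattened into an array-like of positions.
        return list(key) if isinstance(key, tuple) else key

    def _get_setitem_indexer(self, key):
        # The setter path goes through the conversion, with no
        # "too many indexers" validation along the way.
        return self._convert_to_indexer(key)


class ToyILocIndexer(ToyBaseIndexer):
    """Hypothetical stand-in for the .iloc indexer, which overrides the conversion."""

    def _convert_to_indexer(self, key):
        # The override hands the original key back unchanged.
        return key


loc_indexer = ToyBaseIndexer()._get_setitem_indexer((0, 0))
iloc_indexer = ToyILocIndexer()._get_setitem_indexer((0, 0))
```

After this step the two paths hold differently typed indexers for the same key, which is where the differing assignment results come from.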
