Question

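The built-in `is` operator shows strange behavior for elements of an np.ndarray.

Even though the ids of the rhs and lhs are the same, the `is` operator returns False (this behavior is specific to np.ndarray).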

import numpy as np

a = np.array([1.,])
b = a.view()
print(id(a[0]) == id(b[0]))  # True
print(a[0] is b[0])  # False

This strange behavior happens even without making a view.

a = np.array([1.,])
print(a[0] is a[0])  # False

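Does anyone know the mechanism behind this strange behavior (and possibly some evidence or documentation)?

Postscript: please consider two more examples.

This phenomenon is not observed with a list.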

a = [0., 1., 2.,]
b = []
b.append(a[0])
print(a[0] is b[0])  # True

a[0] and b[0] refer to exactly the same object.

a = np.array([1.,])
b = a.view()
b[0] = 0.
print(a[0])  # 0.0
print(id(a[0]) == id(b[0]))  # True

Note: this question may be a duplicate, but I am still somewhat confused.

a = np.array([1.,])
b = a.view()
x = a[0]
y = b[0]
print(id(a[0]))  # 139746064667728
print(id(b[0]))  # 139746064667728
print(id(a[0]) == id(b[0])) # True
print(id(a[0]) == id(x)) # False
print(id(x) == id(y))  # False

Is a[0] a temporary object?
Is the id of a temporary object reused?
Doesn't this contradict the specification? (https://docs.python.org/3.7/reference/expressions.html#is)

6.10.3. Identity comparisons
The operators is and is not test for object identity: x is y is true if and only if x and y are the same object. Object identity is determined using the id() function. x is not y yields the inverse truth value.

If ids are reused for temporary objects, why do the ids differ in this case?

>>> id(100000000000000000 + 1) == id(100000000000000001)
True
>>> id(100000000000000000 + 1) == id(100000000000000000)
False

Answer 1

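This is simply because `is` and == work differently: the `is` operator does not compare values, it only checks whether the two operands refer to the same object.

For example, if you do this: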

print(a is a)

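The output will be: True. For more information, see here.

When Python evaluates the comparison, it allocates the operands at different locations, and the same behavior can be observed with a simple test using the id function.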

print(id(a[0]),a[0] is a[0],id(a[0]))

The output will be:

140296834593128 False 140296834593248

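As for your question of why lists do not behave like numpy arrays, it comes down entirely to how they are constructed. np.arrays are designed to be more efficient than ordinary Python lists in both processing and storage.

So every time you load an element from a numpy array or perform an operation on it, a new object is created and gets a different id, as the following code shows: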

a = np.array([0., 1., 2.,])
b = []
b.append(a[0])
print(id(a[0]),a[0] is b[0],id(b[0]))

This is the output of re-running the same code several times in jupyter-lab:

140296834595096 False 140296834594496
140296834595120 False 140296834594496
140296834595120 False 140296834594496
140296834595216 False 140296834594496
140296834595288 False 140296834594496

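Isn't that a bit odd? The id of the numpy array element is different on every re-run, while the id of the object stored in the list stays the same. This explains the unusual behavior of numpy arrays in the question.

If you want to learn more about this behavior, I suggest the numpy docs.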

Answer 2

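a[0] is of type <class 'numpy.float64'>. When you do the comparison, it creates two instances of that class, so the `is` check fails. However, if you do the following you get what you want, because now both operands refer to the same object.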

x = a[0]
print(x is x)  # True

Answer 3

This is covered by "id() vs `is` operator. Is it safe to compare `id`s? Does the same `id` mean the same object?". In this case:

a[0] and b[0] create new objects each time:
```
In [7]: a[0] is a[0]
Out[7]: False
```
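In id(a[0]) == id(b[0]), each object is discarded immediately after its id is taken, and b[0] just happens to take over the id of the recently discarded a[0]. Even if this happens every time for this particular expression in your version of CPython (because of the specific evaluation order and heap organization), it is an implementation detail and you cannot rely on it.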
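
The "discarded immediately" step can be observed with a weak reference. This is only a sketch: `Temp` and `make` are hypothetical stand-ins for the temporary produced by a[0], and the immediate cleanup relies on CPython's reference counting, so other implementations may behave differently.

```python
import weakref


class Temp:
    """Hypothetical stand-in for a temporary object such as a[0]."""


def make():
    # Returns a fresh object, like indexing a numpy array does.
    return Temp()


# No strong reference is kept, so on CPython the object is freed
# as soon as weakref.ref() returns.
ref = weakref.ref(make())
dead_immediately = ref() is None

# Binding the object to a name keeps it alive, and its id stays reserved.
keep = make()
ref2 = weakref.ref(keep)
still_alive = ref2() is keep

print(dead_immediately, still_alive)
```

Once the temporary is gone, its memory slot (and therefore its id in CPython) is available to the next object that gets created.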

Answer 4

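Numpy stores array data as a raw data buffer. When you access data such as a[0], it reads from the buffer and constructs a Python object for it. So calling a[0] twice constructs 2 Python objects. `is` checks identity, so 2 different objects compare false.

This illustration should make the process clearer:

Note: the id numbers are sequential and are used only as an example. Obviously you will get random-looking numbers instead. The multiple occurrences of id 3 in the example also do not necessarily have to be the same number. They most likely are, because id 3 is repeatedly freed and therefore reusable.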

a = np.array([1.,])
b = a.view()
x = a[0]    # python reads a[0], creates new object id 1.
y = b[0]    # python reads b[0] which reads a[0], creates new object id 2. (1 is used by object x)

print(id(a[0]))  # python reads a[0], creates new object id 3.
                 # After this call, the object id 3 a[0] is no longer used.
                 # Its lifetime has ended and id 3 is freed.

print(id(b[0]))  # python reads b[0] which reads a[0], creates new object id 3. 
                 # id 3 has been freed and is reusable.
                 # After this call, the object id 3 b[0] is no longer used.
                 # Its lifetime has ended and id 3 is freed (again).

print(id(a[0]) == id(b[0])) # This runs in 2 steps.
                            # First id(a[0]) is run. This is just like above, creates an object with id 3.
                            # Then a[0] is disposed of since no references are created to it. id 3 is freed again.
                            # Then id(b[0]) is run. Again, it creates an object with id 3. (Since id 3 is free).
                            # So, id(a[0]) == 3, id(b[0]) == 3. They are equal.

print(id(a[0]) == id(x)) # Following the same thing above, id(a[0]) can create an object of id 3, x maintains its reference to id 1 object. 3 != 1.

print(id(x) == id(y))  # x references id 1 object, y references id 2 object. 1 != 2
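
The buffer-then-box mechanism described at the start of this answer can be imitated in plain Python. This is a toy sketch, not how NumPy is implemented: `RawBuffer` and `Scalar` are hypothetical names, and the standard `struct` module stands in for the raw data buffer.

```python
import struct

ITEM_SIZE = struct.calcsize("d")  # bytes per stored double


class Scalar:
    """Hypothetical boxed value, playing the role of numpy.float64."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Scalar) and self.value == other.value


class RawBuffer:
    """Hypothetical array type that keeps doubles as packed bytes."""

    def __init__(self, values):
        self._data = bytearray(struct.pack(f"{len(values)}d", *values))

    def __getitem__(self, i):
        (value,) = struct.unpack_from("d", self._data, i * ITEM_SIZE)
        return Scalar(value)  # a new Python object on every access

    def __setitem__(self, i, value):
        struct.pack_into("d", self._data, i * ITEM_SIZE, value)


buf = RawBuffer([1.0, 2.0])
x = buf[0]
y = buf[0]
same_value = x == y    # both were read from the same bytes
same_object = x is y   # but each read boxed its own object

buf[0] = 5.0           # changes the buffer, not the already boxed x
print(same_value, same_object, x.value, buf[0].value)
```

The container never holds the objects it hands out, which is why there is no single "the element object" whose identity could be compared.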

Regarding

>>> id(100000000000000000 + 1) == id(100000000000000001)
True
>>> id(100000000000000000 + 1) == id(100000000000000000)
False

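id allocation and garbage collection are implementation details. What is guaranteed is that, at a single point in time, references to two different objects have different ids and references to the same object have the same id. The problem is that some expressions may not be atomic (i.e. they do not run at a single point in time).

Depending on the implementation, Python may decide for itself whether or not to reuse a freed id number. Here it decided to reuse it in one case and not in the other. (Most likely, in id(100000000000000000 + 1) == id(100000000000000001), Python realizes that the numbers are the same and can efficiently reuse the object, because 100000000000000001 will be at the same location in memory.)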
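
The guarantee itself is easy to check once both objects are kept alive. In the sketch below (the names `x` and `y` are only illustrative), whether the two integers end up as one object is left to the implementation, but comparing ids and using `is` must agree:

```python
x = 100000000000000000 + 1
y = 100000000000000001

# Both objects are alive here, so their ids cannot be recycled
# between the two id() calls.
ids_match = id(x) == id(y)
identical = x is y

# True or False depending on the implementation, but always the same answer.
print(ids_match, identical)
```

Comparing ids is only unreliable when at least one of the objects may already have been freed by the time the second id() call runs.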

Answer 5

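A big part of the confusion is the nature of a[0] for an array.

For a list, b[0] is an actual element of b. We can illustrate this by making a list of mutable items (other lists):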

In [22]: b = [[0],[1],[2],[3]]
In [23]: b1 = b[0]
In [24]: b1
Out[24]: [0]
In [25]: b[0].append(10)
In [26]: b
Out[26]: [[0, 10], [1], [2], [3]]
In [27]: b1
Out[27]: [0, 10]
In [28]: b1.append(20)
In [29]: b
Out[29]: [[0, 10, 20], [1], [2], [3]]

Operating on b[0] and on b1 acts on the same object.

For an array:

In [35]: a = np.array([0,1,2,3])
In [36]: c = a.view()
In [37]: a1 = a[0]
In [38]: a += 1
In [39]: a
Out[39]: array([1, 2, 3, 4])
In [40]: c
Out[40]: array([1, 2, 3, 4])
In [41]: a1
Out[41]: 0

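The in-place change to a does not change a1, even though it does change c.

__array_interface__ shows us where the data buffer of an array is stored; in a loose sense, think of it as the memory address of that buffer.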

In [42]: a.__array_interface__['data']
Out[42]: (31233216, False)
In [43]: c.__array_interface__['data']
Out[43]: (31233216, False)
In [44]: a1.__array_interface__['data']
Out[44]: (28513712, False)

The view has the same data buffer. But a1 does not. a[0:1] is a single-element view of a, and it does share the data buffer.

In [45]: a[0:1].__array_interface__['data']
Out[45]: (31233216, False)
In [46]: a[1:2].__array_interface__['data']  # 8 bytes over
Out[46]: (31233224, False)

So id(a[0]) tells us almost nothing about a. Comparing ids only tells us something about how memory slots get recycled when Python objects are constructed.
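
The difference between the scalar a[0] and the one-element view a[0:1] can also be checked without reading raw addresses. A short sketch, with variable names chosen only for illustration and `np.shares_memory` used for the array-to-array check:

```python
import numpy as np

a = np.array([0, 1, 2, 3])
view = a[0:1]   # one-element view, backed by a's buffer
scalar = a[0]   # numpy scalar, holds its own copy of the value

a += 1          # in-place change to the shared buffer

print(view[0])                     # follows the change made through a
print(scalar)                      # still the value copied earlier
print(np.shares_memory(a, view))   # the view and a overlap in memory
```

If you need an object that tracks later changes to the array, take a slice (a view) rather than a single indexed element.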

Strange behavior of `is` with np.ndarray
