Question

I get an out-of-memory error on a 16 GB machine. I doubt the conversion is really happening in place.

import numpy as np
x = np.ones(int(1.5e9), dtype=np.int64)  # 1.5e9 * 8 bytes = 12 GB
x.astype(np.float64, copy=False)  # raises an out-of-memory error

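How can I do the conversion in place in memory? I want to convert the data type and preserve the values. For example, the float 1.0 should become the integer 1.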

In-place type conversion of a NumPy array

Answer 1

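About the copy parameter, the documentation says:

By default, astype always returns a newly allocated array. If this is set to false, and the dtype, order, and subok requirements are satisfied, the input array is returned instead of a copy.

So it is conditional: the input array is returned only when it already satisfies the request. When the dtype changes, a new array is allocated.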

In [540]: x=np.arange(10)
In [542]: x.dtype
Out[542]: dtype('int32')
In [543]: z=x.astype('float32',copy=False)
In [544]: z
Out[544]: array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.], dtype=float32)
In [545]: x.__array_interface__
Out[545]: 
{'data': (188221848, False),
 'descr': [('', '<i4')],
 'shape': (10,),
 'strides': None,
 'typestr': '<i4',
 'version': 3}
In [546]: z.__array_interface__
Out[546]: 
{'data': (191273640, False),
 'descr': [('', '<f4')],
 'shape': (10,),
 'strides': None,
 'typestr': '<f4',
 'version': 3}

z has a different memory location than x (compare the 'data' addresses).
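
The same check can be written more compactly. This is a minimal sketch, assuming np.shares_memory is an acceptable stand-in for comparing the 'data' addresses by hand; the variable names are only illustrative.

```python
import numpy as np

x = np.arange(10, dtype=np.int32)

# dtype already matches, so copy=False can hand back the input array
same = x.astype(np.int32, copy=False)

# dtype differs, so a new buffer is allocated despite copy=False
converted = x.astype(np.float32, copy=False)

print(same is x)                       # True
print(np.shares_memory(x, converted))  # False
```

So copy=False only avoids the copy when no conversion is needed. It does not make a dtype change happen in place.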

The accepted answer in the linked question seems to work:

In [549]: z=x.view('float32')
In [550]: z[:]=x
In [551]: z
Out[551]: array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.], dtype=float32)
In [552]: x
Out[552]: 
array([         0, 1065353216, 1073741824, 1077936128, 1082130432,
       1084227584, 1086324736, 1088421888, 1090519040, 1091567616])
In [553]: z
Out[553]: array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.], dtype=float32)
In [555]: x.__array_interface__
Out[555]: 
{'data': (188221848, False),
 'descr': [('', '<i4')],
 'shape': (10,),
 'strides': None,
 'typestr': '<i4',
 'version': 3}
In [556]: z.__array_interface__
Out[556]: 
{'data': (188221848, False),
 'descr': [('', '<f4')],
 'shape': (10,),
 'strides': None,
 'typestr': '<f4',
 'version': 3}

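This works because z shares memory with x but uses a different dtype. When the values are copied from x to z, they are converted to match the new dtype, and the memory location is preserved. Note that x now shows the float32 bit patterns read as integers. However, I can't guarantee that no temporary buffer is used.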
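
If the possible temporary buffer is a concern, one option is to convert in chunks so that any temporary is explicit and bounded. This is only a sketch under stated assumptions: convert_inplace and its chunk parameter are hypothetical names (not a NumPy API), and it handles only 1-D contiguous arrays whose old and new dtypes have the same item size, such as int64 to float64 in the question.

```python
import numpy as np

def convert_inplace(x, new_dtype, chunk=1_000_000):
    """Hypothetical helper: convert x to new_dtype reusing its buffer."""
    new_dtype = np.dtype(new_dtype)
    if x.ndim != 1 or not x.flags.c_contiguous:
        raise ValueError("sketch only handles 1-D contiguous arrays")
    if new_dtype.itemsize != x.dtype.itemsize:
        raise ValueError("old and new dtypes must have the same item size")

    z = x.view(new_dtype)  # same memory, reinterpreted as new_dtype
    for start in range(0, x.size, chunk):
        stop = min(start + chunk, x.size)
        # astype makes a chunk-sized temporary holding the converted values,
        # which is then written back over the same bytes
        z[start:stop] = x[start:stop].astype(new_dtype)
    return z

x = np.arange(10, dtype=np.int64)
z = convert_inplace(x, np.float64, chunk=4)
print(z)                       # converted values 0.0 through 9.0
print(np.shares_memory(x, z))  # True
```

After the call, x should no longer be used for its values, since its bytes now hold the new representation. The extra memory is at most one chunk of converted elements.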

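In case it isn't clear, converting from int32 to float32 requires changing the underlying bytes. The bit representation of an integer differs from that of a float with the same value.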

In [594]: np.array(1, 'int32').tobytes()
Out[594]: b'\x01\x00\x00\x00'
In [595]: np.array(1, 'float32').tobytes()
Out[595]: b'\x00\x00\x80?'
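
To make the difference between reinterpreting and converting concrete, here is a small sketch (the variable names are only illustrative). view keeps the bytes and changes how they are read, while astype keeps the value and writes new bytes.

```python
import numpy as np

f = np.array([1.0], dtype=np.float32)

reinterpreted = f.view(np.int32)  # same bytes, read as int32
converted = f.astype(np.int32)    # same value, new bytes

print(reinterpreted[0])  # 1065353216, i.e. 0x3F800000
print(converted[0])      # 1
```

The value 1065353216 is the same one that appears in x after z[:]=x in the session above.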
