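Given a NumPy array, I want to determine which rows contain NaN values and objects. For example, a row may contain both float values and a list.
For an input array arr, I tried arr[~np.isnan(arr).any(axis=1)], but I get this error message: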
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could
not be safely coerced to any supported types according to the casting rule ''safe''
Answer 0 (score: 1)
In [314]: x = np.array([[1, [2,3], np.nan], [3, [5,6,7], 8]], dtype=object)  # recent NumPy needs dtype=object for nested lists of unequal length
In [315]: x
Out[315]:
array([[1, list([2, 3]), nan],
       [3, list([5, 6, 7]), 8]], dtype=object)
In [316]: x.shape
Out[316]: (2, 3)
In [317]: x[0]
Out[317]: array([1, list([2, 3]), nan], dtype=object)
In [318]: x[1]
Out[318]: array([3, list([5, 6, 7]), 8], dtype=object)
isnan works on float dtype arrays; an object dtype array cannot be safely cast to that type:
In [320]: np.isnan(x)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-320-3b2be83a8ed7> in <module>
----> 1 np.isnan(x)
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
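As a small illustration of the point above, the sketch below suggests that the object dtype itself triggers the error, not the list stored in the array. The name floats_as_objects is made up for this example. Casting to float first only helps when every element is a number, which is not the case for x.

```python
import numpy as np

# Object array that holds only floats (hypothetical example data).
floats_as_objects = np.array([1.0, np.nan], dtype=object)

# isnan has no loop for the object dtype, so this raises TypeError
# even though no list is present.
try:
    np.isnan(floats_as_objects)
    raised = False
except TypeError:
    raised = True

# Casting to float works here because every element is numeric.
# With a list among the elements, astype(float) would fail instead.
mask = np.isnan(floats_as_objects.astype(float))
```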
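We can, however, test the elements one by one with an is np.nan test. Note that this identity check only matches the np.nan object itself, not NaN values created some other way, such as float('nan'):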
In [325]: np.frompyfunc(lambda i: i is np.nan,1,1)(x)
Out[325]:
array([[False, False, True],
       [False, False, False]], dtype=object)
frompyfunc returns an object dtype array; let's convert it to bool:
In [328]: np.frompyfunc(lambda i: i is np.nan,1,1)(x).astype(bool)
Out[328]:
array([[False, False, True],
       [False, False, False]])
In [329]: np.any(_, axis=1) # test whole rows
Out[329]: array([ True, False])
In [330]: x[~_, :] # use that as mask to keep other rows
Out[330]: array([[3, list([5, 6, 7]), 8]], dtype=object)
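The steps above can also be written as a standalone script. This is only a sketch: is_nan_scalar is a hypothetical helper, not part of NumPy, and it swaps the identity test for a type check plus the "NaN is not equal to itself" property, on the assumption that the NaN values may not all be the np.nan object.

```python
import numpy as np

def is_nan_scalar(value):
    """Return True for any float NaN; lists and other objects give False."""
    # NaN is the only float value that compares unequal to itself.
    return isinstance(value, (float, np.floating)) and value != value

# float("nan") is a different object from np.nan, so `is np.nan` would miss it.
x = np.array([[1, [2, 3], float("nan")], [3, [5, 6, 7], 8]], dtype=object)

# Apply the test elementwise, then convert the object result to bool.
nan_mask = np.frompyfunc(is_nan_scalar, 1, 1)(x).astype(bool)

rows_with_nan = nan_mask.any(axis=1)  # one flag per row
clean_rows = x[~rows_with_nan]        # keep only rows without NaN
```

The isinstance check runs first, so lists never reach the comparison.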
pandas isnull, suggested in another answer, does something similar by testing element by element:
In [335]: pd.isnull(x)
Out[335]:
array([[False, False, True],
       [False, False, False]])
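Assuming pandas is installed, the isnull result can be used directly as the row mask. This is a minimal sketch; keep and filtered are names chosen for this example. Be aware that pd.isnull also flags None as missing, which may or may not be what you want.

```python
import numpy as np
import pandas as pd

x = np.array([[1, [2, 3], np.nan], [3, [5, 6, 7], 8]], dtype=object)

# isnull already returns a bool array, so no astype(bool) step is needed.
keep = ~pd.isnull(x).any(axis=1)

# Boolean indexing on the first axis keeps the rows without missing values.
filtered = x[keep]
```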