Question

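I have a problem.

I have two numpy arrays that are OpenCV convex hulls, and I want to check whether they intersect without writing for loops or rendering them as images and applying numpy.bitwise_and, both of which are quite slow in Python. The arrays look like this: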

[[[x1 y1]]
 [[x2 y2]]
 [[x3 y3]]
...
 [[xn yn]]]

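Treating each [[x1 y1]] as a single element, I want to compute the intersection of the two numpy ndarrays. How can I do that? I have found a few questions of a similar nature, but I could not work out a solution from them.

Thanks in advance!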

Answer 1

You can pass a view of each array, in which every row is a single element, to the intersect1d function, like this:

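import numpy

def multidim_intersect(arr1, arr2):
    # View each row of the (C-contiguous) 2D array as one structured element
    arr1_view = arr1.view([('',arr1.dtype)]*arr1.shape[1])
    arr2_view = arr2.view([('',arr2.dtype)]*arr2.shape[1])
    intersected = numpy.intersect1d(arr1_view, arr2_view)
    # Convert the structured result back to plain rows
    return intersected.view(arr1.dtype).reshape(-1, arr1.shape[1])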

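This creates a view of each array in which every row becomes a tuple of values. It then performs the intersection and converts the result back to the original format. Here is an example of its use: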

test_arr1 = numpy.array([[0, 2],
                         [1, 3],
                         [4, 5],
                         [0, 2]])

test_arr2 = numpy.array([[1, 2],
                         [0, 2],
                         [3, 1],
                         [1, 3]])

print(multidim_intersect(test_arr1, test_arr2))

This prints:

[[0 2]
 [1 3]]
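The arrays in the question have shape (N, 1, 2) rather than (N, 2), so the function above needs a small adaptation before it can be applied to them. The following is a minimal sketch of that adaptation, not part of the original answer: the name intersect_hull_points is hypothetical, and it assumes both hulls hold exact (integer) coordinates so that equality comparison is meaningful.

```python
import numpy as np

def intersect_hull_points(hull1, hull2):
    """Return the points shared by two (N, 1, 2) hull arrays as an (M, 2) array.

    Hypothetical helper: it compares vertices for exact equality only.
    """
    # Drop the singleton axis and force contiguous memory so .view() is allowed.
    pts1 = np.ascontiguousarray(hull1.reshape(-1, 2))
    pts2 = np.ascontiguousarray(hull2.reshape(-1, 2))
    # Both structured views must share one dtype to be comparable.
    common = np.result_type(pts1, pts2)
    pts1 = pts1.astype(common, copy=False)
    pts2 = pts2.astype(common, copy=False)
    # Each row becomes one structured element with two fields.
    row_dtype = [('', common)] * 2
    shared = np.intersect1d(pts1.view(row_dtype), pts2.view(row_dtype))
    # Back to a plain (M, 2) array of coordinates.
    return shared.view(common).reshape(-1, 2)

hull_a = np.array([[[0, 2]], [[1, 3]], [[4, 5]]])
hull_b = np.array([[[1, 3]], [[0, 2]], [[3, 1]]])
print(intersect_hull_points(hull_a, hull_b))
```

Note that this only finds vertices that appear in both arrays. Two hulls can overlap geometrically without sharing a single vertex, so this is not a test for overlapping regions.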

Answer 2

You can use http://pypi.python.org/pypi/Polygon/2.0.4. Here is an example:

>>> import Polygon
>>> a = Polygon.Polygon([(0,0),(1,0),(0,1)])
>>> b = Polygon.Polygon([(0.3,0.3), (0.3, 0.6), (0.6, 0.3)])
>>> a & b
Polygon:
  <0:Contour: [0:0.60, 0.30] [1:0.30, 0.30] [2:0.30, 0.60]>
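To make concrete what an intersection like a & b computes when both inputs are convex, here is a plain-Python sketch of Sutherland-Hodgman clipping. This is a different, swapped-in technique for illustration and is not how the Polygon library is implemented. The names clip_convex and polygon_area are hypothetical, and the sketch assumes the clip polygon is convex with its vertices in counter-clockwise order.

```python
def clip_convex(subject, clip):
    """Clip polygon `subject` against convex, counter-clockwise polygon `clip`.

    Both arguments are lists of (x, y) tuples; returns the vertices of the overlap.
    """
    def inside(p, a, b):
        # True if p lies on or to the left of the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def crossing(p, q, a, b):
        # Point where segment p-q meets the infinite line through a and b.
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        denom = dx1 * dy2 - dy1 * dx2
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / denom
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for i in range(len(clip)):
        if not output:
            break  # nothing left: the polygons do not overlap
        a, b = clip[i], clip[(i + 1) % len(clip)]
        previous, output = output, []
        for j in range(len(previous)):
            p, q = previous[j - 1], previous[j]
            if inside(q, a, b):
                if not inside(p, a, b):
                    output.append(crossing(p, q, a, b))
                output.append(q)
            elif inside(p, a, b):
                output.append(crossing(p, q, a, b))
    return output

def polygon_area(points):
    """Area of a simple polygon via the shoelace formula (0.0 for an empty list)."""
    pairs = zip(points, points[1:] + points[:1])
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)) / 2.0

square1 = [(0, 0), (2, 0), (2, 2), (0, 2)]
square2 = [(1, 1), (3, 1), (3, 3), (1, 3)]
print(polygon_area(clip_convex(square2, square1)))  # 1.0
```

An empty result (area 0.0) means the two convex polygons do not overlap. The loops here run over vertices rather than pixels, so the cost depends on the number of hull points, not on the image size.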

To convert the result of cv2.findContours to the point format that Polygon expects, you can do:

points1 = contours[0].reshape(-1,2)

This converts the shape from (N, 1, 2) to (N, 2).

Here is a complete example:

import Polygon
import cv2
import numpy as np
from scipy.misc import bytescale  # removed in SciPy 1.2; this example needs an older SciPy

y, x = np.ogrid[-2:2:100j, -2:2:100j]

f1 = bytescale(np.exp(-x**2 - y**2), low=0, high=255)
f2 = bytescale(np.exp(-(x+1)**2 - y**2), low=0, high=255)


# OpenCV 2.x and 4.x return (contours, hierarchy); 3.x returns (image, contours, hierarchy)
c1, hierarchy = cv2.findContours((f1>120).astype(np.uint8),
                                       cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)

c2, hierarchy = cv2.findContours((f2>120).astype(np.uint8),
                                       cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)


points1 = c1[0].reshape(-1,2) # convert shape (n, 1, 2) to (n, 2)
points2 = c2[0].reshape(-1,2)

import pylab as pl
poly1 = pl.Polygon(points1, color="blue", alpha=0.5)
poly2 = pl.Polygon(points2, color="red", alpha=0.5)
pl.figure(figsize=(8,3))
ax = pl.subplot(121)
ax.add_artist(poly1)
ax.add_artist(poly2)
pl.xlim(0, 100)
pl.ylim(0, 100)

a = Polygon.Polygon(points1)
b = Polygon.Polygon(points2)
intersect = a&b # calculate the intersect polygon

poly3 = pl.Polygon(intersect[0], color="green") # intersect[0] are the points of the polygon
ax = pl.subplot(122)
ax.add_artist(poly3)
pl.xlim(0, 100)
pl.ylim(0, 100)
pl.show()

Output:

[Image: left panel shows the two contours in blue and red; right panel shows their intersection polygon in green]

Answer 3

So here is what I did to get the job done:

import Polygon, numpy

# Here I extracted and combined some contours and created a convex hull from them.
# Now I want to check whether a contour acquired differently intersects this hull or not.

for contour in contours:  # The result of cv2.findContours is a list of contours
    contour1 = contour.flatten()
    contour1 = numpy.reshape(contour1, (int(contour1.shape[0]/2),-1))
    poly1 = Polygon.Polygon(contour1)

    hull = hull.flatten()  # This is the hull that was constructed previously
    hull = numpy.reshape(hull, (int(hull.shape[0]/2),-1))
    poly2 = Polygon.Polygon(hull)

    if (poly1 & poly2).area()<= some_max_val:
        some_operations

I had to use a for loop, which looks a bit tedious, although it gives me the expected results. Any better approach would be greatly appreciated!
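One small simplification of the loop body, offered as a sketch rather than as the author's method: the flatten-then-reshape pair is equivalent to a single reshape(-1, 2), and the hull only needs to be converted once, before the loop, because it does not change between iterations. The helper name to_points is hypothetical.

```python
import numpy as np

def to_points(contour):
    """Turn an OpenCV-style (N, 1, 2) contour into an (N, 2) array of points."""
    return contour.reshape(-1, 2)

# A made-up triangle in the (N, 1, 2) layout that cv2.findContours produces.
contour = np.array([[[0, 0]], [[4, 0]], [[4, 3]]])

# The two-step form used in the loop above.
flat = contour.flatten()
two_step = np.reshape(flat, (int(flat.shape[0] / 2), -1))

# Both forms yield the same (N, 2) array.
print(np.array_equal(to_points(contour), two_step))  # True
```

With such a helper, the hull conversion can move above the for statement and each iteration only converts the current contour.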

Answer 4

Inspired by jiterrace's answer.

I came across this post while working through the Udacity deep learning class (trying to find the overlap between the training and test data).

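I am not familiar with "view" and found the syntax a bit hard to understand, and it would probably be the same for my friends who think in "tables" when I try to explain it to them. My approach is basically to flatten/reshape the ndarray of shape (N, X, Y) into shape (N, X*Y).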

print(train_dataset.shape)
print(test_dataset.shape)
#(200000L, 28L, 28L)
#(10000L, 28L, 28L)

1) INNER JOIN (easier to understand, slow)

import pandas as pd

%%timeit -n 1 -r 1
def multidim_intersect_df(arr1, arr2):
    p1 = pd.DataFrame([r.flatten() for r in arr1]).drop_duplicates()
    p2 = pd.DataFrame([r.flatten() for r in arr2]).drop_duplicates()
    res = p1.merge(p2)
    return res
inters_df = multidim_intersect_df(train_dataset, test_dataset)
print(inters_df.shape)
#(1153, 784)
#1 loop, best of 1: 2min 56s per loop

2) Set intersection (fast)

%%timeit -n 1 -r 1
def multidim_intersect(arr1, arr2):
    arr1_new = arr1.reshape((-1, arr1.shape[1]*arr1.shape[2])) # -1 means row counts are inferred from other dimensions
    arr2_new = arr2.reshape((-1, arr2.shape[1]*arr2.shape[2]))
    intersected = set(map(tuple, arr1_new)).intersection(set(map(tuple, arr2_new)))  # array rows are not hashable, so convert them to tuples
    return list(intersected)  # in shape of (N, 28*28)

inters = multidim_intersect(train_dataset, test_dataset)
print(len(inters))
# 1153
#1 loop, best of 1: 34.6 s per loop
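The set-based idea can be tried on a tiny array without the Udacity data. The sketch below is a variation, not the code timed above: the name intersect_samples is hypothetical, it accepts any trailing shape rather than only (N, X, Y), and it returns the matches as an ndarray in the original sample shape. It assumes samples should match only when every value is exactly equal.

```python
import numpy as np

def intersect_samples(arr1, arr2):
    """Return the distinct first-axis entries that occur in both arrays.

    Hypothetical helper: matching is by exact equality of every value.
    """
    # One row per sample, whatever the trailing dimensions are.
    flat1 = arr1.reshape(len(arr1), -1)
    flat2 = arr2.reshape(len(arr2), -1)
    # Tuples of Python scalars are hashable, so they can go into sets.
    common = set(map(tuple, flat1.tolist())) & set(map(tuple, flat2.tolist()))
    if not common:
        return np.empty((0,) + arr1.shape[1:], dtype=arr1.dtype)
    # Sort for a deterministic order, then restore the sample shape.
    return np.array(sorted(common), dtype=arr1.dtype).reshape((-1,) + arr1.shape[1:])

train = np.arange(12).reshape(3, 2, 2)
test = np.stack([train[2], train[0], train[0] + 100])
print(intersect_samples(train, test).shape)  # (2, 2, 2)
```

Sorting the result is a design choice for reproducibility; a set has no defined order, so the list returned by the timed version may come back in a different order between runs.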

Intersection of 2D numpy ndarrays
