Find the distance to the nearest neighbor in a 2D array

Time: 2019-07-21 00:01:15

Tags: python numpy scipy nearest-neighbor euclidean-distance

I have a 2D array, and for each (x, y) point I want to find the distance to its nearest neighbor.

I can do this with scipy.spatial.distance.cdist:

import numpy as np
from scipy.spatial.distance import cdist

# Random data
data = np.random.uniform(0., 1., (1000, 2))
# Distance between the array and itself
dists = cdist(data, data)
# Sort by distances
dists.sort()
# Take column 1: column 0 is always 0
# (the distance from a point to itself)
nn_dist = dists[:, 1]

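This works, but it feels like a lot of work. KDTree should be able to handle this, but I'm not sure how. I'm not interested in the coordinates of the nearest neighbor; I only want the distance (and as fast as possible).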

1 answer:

Answer 0 (score: 2)

KDTree can do this. The procedure is almost the same as with cdist, but cdist is faster. As pointed out in the comments, cKDTree is faster still:

import numpy as np
from scipy.spatial.distance import cdist
from scipy.spatial import KDTree
from scipy.spatial import cKDTree
import timeit

# Random data
data = np.random.uniform(0., 1., (1000, 2))

def scipy_method():
    # Distance between the array and itself
    dists = cdist(data, data)
    # Sort by distances
    dists.sort()
    # Take column 1: column 0 is always 0
    # (the distance from a point to itself)
    nn_dist = dists[:, 1]
    return nn_dist

def KDTree_method():
    # You have to create the tree to use this method.
    tree = KDTree(data)
    # Then you find the closest two as the first is the point itself
    dists = tree.query(data, 2)
    nn_dist = dists[0][:, 1]
    return nn_dist

def cKDTree_method():
    tree = cKDTree(data)
    dists = tree.query(data, 2)
    nn_dist = dists[0][:, 1]
    return nn_dist

print(timeit.timeit('cKDTree_method()', number=100, globals=globals()))
print(timeit.timeit('scipy_method()', number=100, globals=globals()))
print(timeit.timeit('KDTree_method()', number=100, globals=globals()))

Output (total seconds for 100 runs each, in the order cKDTree, cdist, KDTree):

0.34952507635557595
7.904083715193579
20.765962179145546

Once again, evidence (not that any was needed) that C is awesome!
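Since only the distance is wanted, the cKDTree approach can be wrapped in a small helper. The following is a minimal sketch, not part of the original answer: the names nn_distances and nn_distances_brute are made up for illustration, and it assumes a SciPy version in which cKDTree.query accepts a list for k (so that k=[2] returns only the second-nearest neighbor, skipping the point itself). The brute-force version is included only as a cross-check.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.distance import cdist


def nn_distances(points):
    """Distance from each row of `points` to its nearest other row.

    Hypothetical helper name; assumes cKDTree.query accepts a list for k.
    """
    tree = cKDTree(points)
    # k=[2] asks only for the 2nd-nearest neighbour; the 1st is the
    # query point itself at distance 0.
    dists, _ = tree.query(points, k=[2])
    # dists has shape (n, 1); flatten it to shape (n,).
    return dists[:, 0]


def nn_distances_brute(points):
    """Brute-force cross-check: full distance matrix, self-distance masked."""
    d = cdist(points, points)
    # Replace the zero diagonal with inf so min() ignores self-distances.
    np.fill_diagonal(d, np.inf)
    return d.min(axis=1)


rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, (1000, 2))
# Both methods should agree up to floating-point tolerance.
assert np.allclose(nn_distances(data), nn_distances_brute(data))
```

Masking the diagonal and taking the row minimum also avoids sorting the full distance matrix, although it still builds all n × n distances, so the tree is likely the better choice as the number of points grows.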
