Why does scipy.spatial.cKDTree run slower than scipy.spatial.KDTree?

Asked: 2018-07-29 02:25:02

Tags: python numpy scipy cython

Normally, scipy.spatial.cKDTree runs much faster than scipy.spatial.KDTree.

In my case, however, scipy.spatial.cKDTree is slower than scipy.spatial.KDTree. My code is as follows:

import numpy as np
from laspy.file import File
from scipy import spatial
from timeit import default_timer as timer

# Load the LAS point cloud and stack x, y, z into an (N, 3) array.
inFile = File("Toronto_Strip_01.las")
dataset = np.vstack([inFile.x, inFile.y, inFile.z]).transpose()
print(dataset.shape)

# Time only the construction of the cKDTree.
start = timer()
tree = spatial.cKDTree(dataset)
# balanced_tree = False
end = timer()
distance, index = tree.query(dataset[100, :], k=5)
print(distance, index)
print(end - start)

# Time only the construction of the pure-Python KDTree.
start = timer()
tree = spatial.KDTree(dataset)
end = timer()
dis, indices = tree.query(dataset[100, :], k=5)
print(dis, indices)
print(end - start)

dataset.shape is (2727891, 3) and dataset.max() is 4834229.32.

In a test case with random data, however, scipy.spatial.cKDTree is much faster than scipy.spatial.KDTree. The code is as follows:

import numpy as np
from timeit import default_timer as timer
from scipy import spatial

# Uniformly random points with a coordinate range similar to the real data.
np.random.seed(0)
A = np.random.random((2000000, 3)) * 2000000

# Time only the construction of the pure-Python KDTree.
start1 = timer()
kdt = spatial.KDTree(A)
end1 = timer()
distance, index = kdt.query(A[100, :], k=5)
print(distance, index)
print(end1 - start1)

# Time only the construction of the cKDTree; the query runs outside the timed region.
start2 = timer()
kdt = spatial.cKDTree(A)
end2 = timer()
distance, index = kdt.query(A[100, :], k=5)
print(distance, index)
print(end2 - start2)

My question: in my code, do I need to preprocess the dataset to speed up cKDTree?

My Python version is 3.6.5, SciPy is 1.1.0, and Cython is 0.28.4.

1 Answer:

Answer 0: (score: 0)

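This is perhaps more of a long comment, but you should consider how the cKDTree parameters affect performance on your particular dataset.

In particular, look at balanced_tree and compact_nodes, as pointed out here.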
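A minimal sketch of how one might measure the effect of these two parameters on construction time. The array `points` is a hypothetical stand-in for the LAS data: random coordinates rounded to one decimal place, on the assumption (not confirmed in the question) that the real point cloud contains quantized or repeated coordinate values. The helper `time_build` is likewise only illustrative.

```python
import numpy as np
from timeit import default_timer as timer
from scipy import spatial


def time_build(data, **kwargs):
    """Build a cKDTree with the given keyword arguments; return (tree, seconds)."""
    start = timer()
    tree = spatial.cKDTree(data, **kwargs)
    return tree, timer() - start


# Hypothetical stand-in for the real dataset (assumption): rounding the
# coordinates produces many repeated values along each axis.
rng = np.random.RandomState(0)
points = np.round(rng.random_sample((100000, 3)) * 1000.0, 1)

results = {}
for balanced in (True, False):
    for compact in (True, False):
        tree, seconds = time_build(
            points, balanced_tree=balanced, compact_nodes=compact
        )
        # The query result should not depend on how the tree was built.
        distance, index = tree.query(points[100, :], k=5)
        results[(balanced, compact)] = (seconds, distance)
        print("balanced_tree=%s compact_nodes=%s build time: %.4f s"
              % (balanced, compact, seconds))
```

Replacing `points` with the real `dataset` array would show whether `balanced_tree=False` and/or `compact_nodes=False` shortens construction for that data. Both options trade build time against query efficiency, so it may be worth timing the queries as well.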