Question

from libcpp.algorithm cimport sort as stdsort
from libcpp.algorithm cimport unique
from libcpp.vector cimport vector
# from libcpp cimport bool
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.initializedcheck(False)
cdef class Vector:
    cdef vector[cython.int] wrapped_vector

    # the easiest thing to do is add short wrappers for the methods you need
    def push_back(self, int num):
        self.wrapped_vector.push_back(num)

    def sort(self):
        stdsort(self.wrapped_vector.begin(), self.wrapped_vector.end())

    def unique(self):
        self.wrapped_vector.erase(unique(self.wrapped_vector.begin(), self.wrapped_vector.end()), self.wrapped_vector.end())


    def __str__(self):
        return "[" + ", ".join([str(i) for i in self.wrapped_vector]) + "]"

    def __repr__(self):
        return str(self)

    def __len__(self):
        return self.wrapped_vector.size()

    @cython.boundscheck(False)
    @cython.wraparound(False)
    @cython.initializedcheck(False)
    def __setitem__(self, int key, int item):
        self.wrapped_vector[key] = item

    @cython.boundscheck(False)
    @cython.wraparound(False)
    @cython.initializedcheck(False)
    def __getitem__(self, int key):
        return self.wrapped_vector[key]

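I am trying to wrap vectors so that I can use them in Python dictionaries.

This seems to create a lot of overhead. For example, see lines 72 and 75. They simply add an integer to a number that is already in the vector.

Can this overhead be eliminated, or is it the price I pay for wrapping the vector?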

Answer 1

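This appears to be based on my answer to another question. The purpose of adding __getitem__ and __setitem__ to cdef class Vector was purely so that it could be indexed from Python. Within Cython, you can index directly into the C++ vector for speed.

Add the following line at the start of files_to_bins: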

cdef Vector v

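This makes Cython ensure that any object assigned to v is a Vector object (it raises a TypeError if it is not), and therefore lets you access its cdef attributes directly.

Then change the line: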

v[i] = v[i] + half_fragment_size

to:

v.wrapped_vector[i] = v.wrapped_vector[i] + half_fragment_size

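(and similarly for the other indexing lines)

Note that boundscheck(False) and wraparound(False) have no effect on C++ objects. The C++ indexing operator does not perform bounds checking (and Cython does not add any), and it does not support negative indices either. boundscheck and wraparound apply to indexing that Cython generates itself, such as indexing memoryviews or numpy arrays.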
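
To make the point about unchecked indexing concrete, here is a minimal C++ sketch contrasting `operator[]` (which performs no bounds check) with `std::vector::at` (which does). The helper names `get_unchecked` and `get_checked` are hypothetical and exist only for this illustration; they are not part of the question's code.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Unchecked read through operator[], the indexing operator discussed above.
// The caller must guarantee index < v.size(); otherwise the behavior is
// undefined (no exception is raised, and negative indices do not wrap).
int get_unchecked(const std::vector<int>& v, std::size_t index) {
    return v[index];
}

// Checked read through std::vector::at, which throws std::out_of_range
// for an invalid index. Returns true and stores the element in `out` on
// success, false if the index was out of range.
bool get_checked(const std::vector<int>& v, std::size_t index, int& out) {
    try {
        out = v.at(index);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

If bounds safety matters more than raw speed, a wrapper method could check the index against `size()` (or call `at`) itself; the direct `wrapped_vector[i]` access leaves that responsibility with the caller.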

Eliminating Python overhead when wrapping C++ vectors
