
Question

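What is the most optimized way to implement the < operator for std::bitset, corresponding to a comparison of the unsigned integer representation (it should work for bitsets of more than 64 bits)?

A trivial implementation would be: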

template<std::size_t N>
bool operator<(const std::bitset<N>& x, const std::bitset<N>& y)
{
    for (int i = N-1; i >= 0; i--) {
        if (x[i] && !y[i]) return false;
        if (!x[i] && y[i]) return true;
    }
    return false;
}

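When I say "most optimized way", I am looking for implementations that use bitwise operations and metaprogramming tricks (and things like that).

EDIT: I think I have found the trick: template metaprogramming for compile-time recursion, plus right bit shifts, in order to compare the bitsets as several unsigned long long integers. But I have no clear idea of how to do that...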
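One way to read the shift-and-compare idea from the edit, as a minimal sketch rather than a tuned implementation: mask out 64 bits at a time with the public std::bitset interface and compare the chunks from the most significant end. The name less_by_chunks is hypothetical, the sketch uses a plain loop instead of template recursion, and it assumes unsigned long long is 64 bits wide. Each shift touches the whole bitset, so this may well not beat the bit-by-bit loop; it would need benchmarking.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical helper (not from the question): compares two bitsets as
// unsigned integers, 64 bits at a time, using only the public interface.
template <std::size_t N>
bool less_by_chunks(const std::bitset<N>& x, const std::bitset<N>& y)
{
    const std::size_t chunk_bits = 64;
    const std::size_t chunks = (N + chunk_bits - 1) / chunk_bits;
    // Mask selecting the low 64 bits, so to_ullong() cannot overflow.
    const std::bitset<N> mask(~0ULL);

    // Walk from the most significant chunk down to the least significant.
    for (std::size_t c = chunks; c-- > 0; ) {
        const unsigned long long xc = ((x >> (c * chunk_bits)) & mask).to_ullong();
        const unsigned long long yc = ((y >> (c * chunk_bits)) & mask).to_ullong();
        if (xc != yc) return xc < yc;
    }
    return false;  // all chunks are equal
}
```

The first chunk that differs decides the result, exactly as the first differing bit does in the trivial loop.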

Answer 1

The obvious optimization would be

template<std::size_t N>
bool operator<(const std::bitset<N>& x, const std::bitset<N>& y)
{
    for (int i = N-1; i >= 0; i--) {
        if (x[i] ^ y[i]) return y[i];
    }
    return false;
}

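Other than that, it should be quite impossible to test more than one bit at a time, since there is no standard-conforming way to access the underlying words. You could benchmark x.to_string() < y.to_string() and hope that both to_string() and string comparison are optimized better than bitwise access to a bitset, but that's a long shot.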
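For reference, the to_string() variant mentioned above could look like the following sketch. The helper name less_by_string is hypothetical. It relies on two documented properties: to_string() writes the highest-index bit first, and '0' compares less than '1', so for two bitsets of the same size the lexicographic string order matches the unsigned integer order. Whether it is any faster is a separate question that only a benchmark can answer.

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Hypothetical helper: orders bitsets by comparing their string forms.
// Both strings have exactly N characters, most significant bit first.
template <std::size_t N>
bool less_by_string(const std::bitset<N>& x, const std::bitset<N>& y)
{
    return x.to_string() < y.to_string();
}
```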

Answer 2

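I just looked at the source code, but unfortunately (unless, hopefully, I'm mistaken), it doesn't seem to give you in-place access to a const & unsigned long for a particular block of bits. If it did, you could perform template recursion and effectively compare one unsigned long at a time rather than one bit at a time.

After all, A < B is decided at the most significant position where the two differ, and that holds for whole blocks just as it does for single bits: at the most significant block i where A[i] != B[i], we must have A[i] < B[i].

I hate to say it, but I would probably roll my own using recursion on C++11's std::array. If you have access to the blocks, then you can write a template recursive function to do this pretty easily (as I'm sure you know, since you're asking about metaprogramming), which gives the compiler a great chance to optimize.

All in all, not a great answer, but that's what I would do.

Excellent question, by the way.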

===========

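EDIT

This times three approaches: the one with the most upvotes at the moment, the block strategy I described, and a template recursive variant. I fill a vector with bitsets and then sort it repeatedly using the specified comparator functor.

Happy hacking!

Output on my computer: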

RUNTIMES:
compiled g++ -std=c++11 -Wall -g test.cpp
    std::bitset         4530000 (6000000 original in OP)
    Block-by-block      900000
    Template recursive  730000

compiled g++ -std=c++11 -Wall -g -O3 test.cpp
RUNTIMES:
    std::bitset         700000 (740000 original in OP)
    Block-by-block      470000
    Template recursive  530000

C++11 code:

#include <iostream>
#include <array>
#include <bitset>
#include <vector>
#include <algorithm>
#include <cstdlib>
#include <time.h>

/* Existing answer. Note that I've flipped the order of bit significance to match my own */
template<std::size_t N>
class BitByBitComparator
{
public:
  bool operator()(const std::bitset<N>& x, const std::bitset<N>& y) const
  {
    for (std::size_t i = 0; i < N; ++i) {
      if (x[i] ^ y[i]) return y[i];
    }
    return false;
  }
};

/* New simple bit set class (note: mostly untested). Also note bad
   design: should only allow read access via immutable facade.
   Assumes unsigned long int is 64 bits wide. */
template<std::size_t N>
class SimpleBitSet
{
public:
  static const int BLOCK_SIZE = 64;
  static const int LOG_BLOCK_SIZE = 6;
  static constexpr int NUM_BLOCKS = N >> LOG_BLOCK_SIZE;
  std::array<unsigned long int, NUM_BLOCKS> allBlocks;

  SimpleBitSet()
  {
    allBlocks.fill(0);
  }

  // Item 0 is stored in the most significant bit of block 0.
  void addItem(int itemIndex)
  {
    // TODO: can do faster
    int blockIndex = itemIndex >> LOG_BLOCK_SIZE;
    unsigned long int & block = allBlocks[blockIndex];
    int indexWithinBlock = itemIndex % BLOCK_SIZE;
    block |= (0x8000000000000000 >> indexWithinBlock);
  }

  bool getItem(int itemIndex) const
  {
    int blockIndex = itemIndex >> LOG_BLOCK_SIZE;
    unsigned long int block = allBlocks[blockIndex];
    int indexWithinBlock = itemIndex % BLOCK_SIZE;
    return bool((block << indexWithinBlock) & 0x8000000000000000);
  }
};

/* New comparator type 1: block-by-block. */
template<std::size_t N>
class BlockByBlockComparator
{
public:
  bool operator()(const SimpleBitSet<N>& x, const SimpleBitSet<N>& y) const
  {
    return ArrayCompare(x.allBlocks, y.allBlocks);
  }

  template <std::size_t S>
  bool ArrayCompare(const std::array<unsigned long int, S> & lhs, const std::array<unsigned long int, S> & rhs) const
  {
    for (std::size_t i = 0; i < S; ++i) {
      unsigned long int lhsBlock = lhs[i];
      unsigned long int rhsBlock = rhs[i];
      if (lhsBlock < rhsBlock) return true;
      if (lhsBlock > rhsBlock) return false;
    }
    return false;
  }
};

/* New comparator type 2: template recursive block-by-block. */
template <std::size_t I, std::size_t S>
class TemplateRecursiveArrayCompare;

// Base case: every block compared equal.
template <std::size_t S>
class TemplateRecursiveArrayCompare<S, S>
{
public:
  bool operator()(const std::array<unsigned long int, S> & lhs, const std::array<unsigned long int, S> & rhs) const
  {
    return false;
  }
};

// Recursive case: compare block I, then move on to block I+1.
template <std::size_t I, std::size_t S>
class TemplateRecursiveArrayCompare
{
public:
  bool operator()(const std::array<unsigned long int, S> & lhs, const std::array<unsigned long int, S> & rhs) const
  {
    unsigned long int lhsBlock = lhs[I];
    unsigned long int rhsBlock = rhs[I];
    if (lhsBlock < rhsBlock) return true;
    if (lhsBlock > rhsBlock) return false;
    return TemplateRecursiveArrayCompare<I+1, S>()(lhs, rhs);
  }
};

template<std::size_t N>
class TemplateRecursiveBlockByBlockComparator
{
public:
  bool operator()(const SimpleBitSet<N>& x, const SimpleBitSet<N>& y) const
  {
    // Start at block 0; starting at NUM_BLOCKS would select the base case
    // immediately and always return false.
    return TemplateRecursiveArrayCompare<0, SimpleBitSet<N>::NUM_BLOCKS>()(x.allBlocks, y.allBlocks);
  }
};

/* Construction, timing, and verification code */
int main()
{
  srand(0);

  const int BITSET_SIZE = 4096;

  std::cout << "Constructing..." << std::endl;

  // Fill a vector with random bitsets
  const int NUMBER_TO_PROCESS = 10000;
  const int SAMPLES_TO_FILL = BITSET_SIZE;
  std::vector<std::bitset<BITSET_SIZE> > allBitSets(NUMBER_TO_PROCESS);
  std::vector<SimpleBitSet<BITSET_SIZE> > allSimpleBitSets(NUMBER_TO_PROCESS);
  for (int k=0; k<NUMBER_TO_PROCESS; ++k) {
    std::bitset<BITSET_SIZE> bs;
    SimpleBitSet<BITSET_SIZE> homemadeBs;
    for (int j=0; j<SAMPLES_TO_FILL; ++j) {
      int indexToAdd = rand()%BITSET_SIZE;
      bs[indexToAdd] = true;
      homemadeBs.addItem(indexToAdd);
    }

    allBitSets[k] = bs;
    allSimpleBitSets[k] = homemadeBs;
  }

  clock_t t1,t2,t3,t4;
  t1=clock();

  std::cout << "Sorting using bit-by-bit compare and std::bitset..." << std::endl;
  const int NUMBER_REPS = 100;
  for (int rep = 0; rep<NUMBER_REPS; ++rep) {
    auto tempCopy = allBitSets;
    std::sort(tempCopy.begin(), tempCopy.end(), BitByBitComparator<BITSET_SIZE>());
  }

  t2=clock();

  std::cout << "Sorting block-by-block using SimpleBitSet..." << std::endl;
  for (int rep = 0; rep<NUMBER_REPS; ++rep) {
    auto tempCopy = allSimpleBitSets;
    std::sort(tempCopy.begin(), tempCopy.end(), BlockByBlockComparator<BITSET_SIZE>());
  }

  t3=clock();

  std::cout << "Sorting block-by-block w/ template recursion using SimpleBitSet..." << std::endl;
  for (int rep = 0; rep<NUMBER_REPS; ++rep) {
    auto tempCopy = allSimpleBitSets;
    std::sort(tempCopy.begin(), tempCopy.end(), TemplateRecursiveBlockByBlockComparator<BITSET_SIZE>());
  }

  t4=clock();

  std::cout << std::endl << "RUNTIMES:" << std::endl;
  std::cout << "\tstd::bitset        \t" << t2-t1 << std::endl;
  std::cout << "\tBlock-by-block     \t" << t3-t2 << std::endl;
  std::cout << "\tTemplate recursive \t" << t4-t3 << std::endl;
  std::cout << std::endl;

  std::cout << "Checking result... ";
  std::sort(allBitSets.begin(), allBitSets.end(), BitByBitComparator<BITSET_SIZE>());
  auto copy = allSimpleBitSets;
  std::sort(allSimpleBitSets.begin(), allSimpleBitSets.end(), BlockByBlockComparator<BITSET_SIZE>());
  std::sort(copy.begin(), copy.end(), TemplateRecursiveBlockByBlockComparator<BITSET_SIZE>());
  for (int k=0; k<NUMBER_TO_PROCESS; ++k) {
    auto stdBitSet = allBitSets[k];
    auto blockBitSet = allSimpleBitSets[k];
    // Compare against the vector sorted with the template recursive comparator.
    auto tempRecBlockBitSet = copy[k];

    for (int j=0; j<BITSET_SIZE; ++j)
      if (stdBitSet[j] != blockBitSet.getItem(j) || blockBitSet.getItem(j) != tempRecBlockBitSet.getItem(j))
        std::cerr << "error: sorted order does not match" << std::endl;
  }
  std::cout << "success" << std::endl;

  return 0;
}

Answer 3

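Although you say bitset, aren't you really talking about arbitrary-precision unsigned integer comparison? If so, then you're probably not going to easily do better than wrapping GMP.

From their website:

GMP is carefully designed to be as fast as possible, both for small operands and for huge operands. The speed is achieved by using fullwords as the basic arithmetic type, by using fast algorithms, with highly optimised assembly code for the most common inner loops for a lot of CPUs, and by a general emphasis on speed.

Consider their integer functions.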

Answer 4

How about checking the highest bit of the XOR?

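template<std::size_t N>
int fls(const std::bitset<N>& n) {
    // find the last (most significant) set bit; returns -1 if no bit is set.
    // Naive linear scan as a placeholder; a word-level version would be faster.
    for (int i = static_cast<int>(N) - 1; i >= 0; --i)
        if (n[i]) return i;
    return -1;
}

template<std::size_t N>
bool operator<(const std::bitset<N>& x, const std::bitset<N>& y)
{
    // The highest differing bit decides; if there is none, x == y.
    const int i = fls(x ^ y);
    return i >= 0 && y[i];
}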

Some ideas for fls can be found at http://uwfsucks.blogspot.be/2007/07/fls-implementation.html.

Answer 5

If you are willing to accept a solution that may break when the STL bitset implementation changes, you can use

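// Assumes (not guaranteed by the standard) that std::bitset<n> is laid out
// as a plain array of 64-bit words with no other members.
template<std::size_t n>
bool compare(const std::bitset<n>& l, const std::bitset<n>& r){
  if(n > 64){
    typedef std::array<unsigned long long, (n + 63) / 64> AsArray;
    // std::array compares lexicographically starting at word 0, so this is
    // numeric order only if word 0 is the most significant word.
    return *reinterpret_cast<const AsArray*>(&l)
         < *reinterpret_cast<const AsArray*>(&r);
  }//else
  return l.to_ullong() < r.to_ullong();
}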

The compiler throws away the irrelevant branch of the if.

Answer 6

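Well, there is good old memcmp. It is fragile in the sense that it depends on the implementation of std::bitset, and therefore might be unusable. But it is reasonable to assume that the template creates an opaque array of ints and has no other bookkeeping fields.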

template<std::size_t N>
bool operator<(const std::bitset<N>& x, const std::bitset<N>& y)
{
    int cmp = std::memcmp(&x, &y, sizeof(x));
    return (cmp < 0);
}

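This will uniquely determine an ordering for bitsets, but it might not be a human-intuitive order. It depends on which bits are used for which set member index. For example, index 0 could be the LSB of the first 32-bit integer, or it could be the LSB of the first 8-bit byte.

I strongly recommend unit tests to ensure this actually works for how it's used. ;->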
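A unit test along those lines might start from something like the sketch below. All three function names are hypothetical. It compares the memcmp ordering against a bit-by-bit reference on every pair of single-bit bitsets and counts the disagreements. A count of zero is necessary but not sufficient for the memcmp order to be the numeric order on a given implementation, and a nonzero count is quite possible (for instance on little-endian machines).

```cpp
#include <bitset>
#include <cstddef>
#include <cstring>

// Reference ordering: bit N-1 is the most significant bit.
template <std::size_t N>
bool reference_less(const std::bitset<N>& x, const std::bitset<N>& y)
{
    for (std::size_t i = N; i-- > 0; ) {
        if (x[i] != y[i]) return y[i];
    }
    return false;
}

// The memcmp-based ordering from this answer (implementation-dependent).
template <std::size_t N>
bool memcmp_less(const std::bitset<N>& x, const std::bitset<N>& y)
{
    return std::memcmp(&x, &y, sizeof(x)) < 0;
}

// Counts the single-bit pairs (i, j) on which the two orderings disagree.
template <std::size_t N>
std::size_t count_disagreements()
{
    std::size_t bad = 0;
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            std::bitset<N> a, b;
            a.set(i);
            b.set(j);
            if (reference_less(a, b) != memcmp_less(a, b)) ++bad;
        }
    }
    return bad;
}
```

If count_disagreements<N>() is nonzero on the target platform, memcmp still gives a consistent ordering (good enough for std::set or sorting), just not the unsigned integer one.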

Answer 7

Performing the bitwise comparison only when the two bitsets differ already yields some performance boost:

template<std::size_t N>
bool operator<(const std::bitset<N>& x, const std::bitset<N>& y)
{
    if (x == y)
        return false;
    ….
}

Answer 8

I know this is a bit of an old question, but if you know the maximum size of the bitset, you can create something like this:

class Bitset{
    vector<bitset<64>> bits;
    /*
     * operators that you need
    */
};

This lets you convert each bitset<64> to unsigned long long for a fast comparison. If you want to get at a specific bit (to change it or whatever), you can use bits[id / 64][id % 64].
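To make the idea concrete, here is a minimal sketch of such a wrapper. The class name ChunkedBitset, its constructor, and the set/test members are all hypothetical additions for illustration. It assumes chunk 0 holds bits 0..63, so the last chunk is the most significant, and that both operands were built with the same size.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

// Hypothetical completion of the class above.
class ChunkedBitset {
public:
    explicit ChunkedBitset(std::size_t bit_count)
        : bits((bit_count + 63) / 64) {}

    void set(std::size_t id) { bits[id / 64].set(id % 64); }
    bool test(std::size_t id) const { return bits[id / 64].test(id % 64); }

    // Numeric comparison, most significant chunk first.
    friend bool operator<(const ChunkedBitset& a, const ChunkedBitset& b)
    {
        for (std::size_t c = a.bits.size(); c-- > 0; ) {
            const unsigned long long ac = a.bits[c].to_ullong();
            const unsigned long long bc = b.bits[c].to_ullong();
            if (ac != bc) return ac < bc;
        }
        return false;  // all chunks are equal
    }

private:
    std::vector<std::bitset<64>> bits;
};
```

Since every chunk is exactly 64 bits, to_ullong() cannot throw here, and the comparison costs one integer compare per chunk.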

Fastest way to compare bitsets (< operator on bitsets)?

8 answers:
