复制整数位的最快方法是什么。
例如,
17
-> 10001
复制后:
1100000011
答案 0 :(得分:3)
看起来像是位交织的一种变体。
交织比特是显而易见的方式
(从http://graphics.stanford.edu/~seander/bithacks.html修改)
unsigned int x = 17; unsigned int z = 0; // z gets the resulting Morton Number. for (int i = 0; i < sizeof(x) * CHAR_BIT; i++) // unroll for more speed... { z |= (x & 1U << i) << i | (x & 1U << i) << (i + 1); }
有关更多方法,请参见http://graphics.stanford.edu/~seander/bithacks.html#InterleaveTableObvious。
答案 1 :(得分:3)
如果整数大小为32位,则实现此目标的一种非常快速的方法包括256个16位字的数组和2个查找:
uint16_t dup16[256] = { 0, 3, 12, 15, 48, ... };
uint32_t x = 17
uint32_t z = dup16[x & 255] | ((uint32_t)dup16[(x >> 8) & 255] << 16);
答案 2 :(得分:2)
我应该认为最快的方法是查找表。
您可以在编译时进行计算。
#include <array>
#include <cstdint>
#include <iostream>
#include <iomanip>
constexpr std::uint64_t dup_bit(bool b)
{
unsigned result = 0;
if (b) result = 0x3;
return result;
}
constexpr std::uint64_t slow_dup_bits(std::uint64_t input)
{
std::uint64_t result = 0;
for(int i = 16 ; i-- ; )
{
result <<= 2;
result |= dup_bit(input & (1u << i));
}
return result;
}
template<std::uint64_t Start, std::uint64_t...Is>
constexpr auto make_dup_table(std::integer_sequence<std::uint64_t, Is...>)
{
return std::array<std::uint64_t, sizeof...(Is)>
{{
slow_dup_bits(Start + Is)...,
}};
}
std::uint64_t dup_bits(std::uint64_t x)
{
static const auto table = make_dup_table<0>(std::make_integer_sequence<std::uint64_t, 65536>());
auto low32 = table[x & 65535];
auto high32 = table[x >> 16];
return (high32 << 32) + low32;
}
int main()
{
for(std::uint64_t i = 0 ; i < 64 ; ++i)
std::cout << std::setw(8) << std::setfill(' ') << i
<< " -> "
<< std::hex << std::setw(64) << std::setfill('0') << dup_bits(i) << '\n';
}
预期输出:
0 -> 0000000000000000000000000000000000000000000000000000000000000000
1 -> 0000000000000000000000000000000000000000000000000000000000000003
2 -> 000000000000000000000000000000000000000000000000000000000000000c
3 -> 000000000000000000000000000000000000000000000000000000000000000f
4 -> 0000000000000000000000000000000000000000000000000000000000000030
5 -> 0000000000000000000000000000000000000000000000000000000000000033
6 -> 000000000000000000000000000000000000000000000000000000000000003c
7 -> 000000000000000000000000000000000000000000000000000000000000003f
8 -> 00000000000000000000000000000000000000000000000000000000000000c0
9 -> 00000000000000000000000000000000000000000000000000000000000000c3
a -> 00000000000000000000000000000000000000000000000000000000000000cc
b -> 00000000000000000000000000000000000000000000000000000000000000cf
c -> 00000000000000000000000000000000000000000000000000000000000000f0
d -> 00000000000000000000000000000000000000000000000000000000000000f3
e -> 00000000000000000000000000000000000000000000000000000000000000fc
f -> 00000000000000000000000000000000000000000000000000000000000000ff
10 -> 0000000000000000000000000000000000000000000000000000000000000300
11 -> 0000000000000000000000000000000000000000000000000000000000000303
12 -> 000000000000000000000000000000000000000000000000000000000000030c
13 -> 000000000000000000000000000000000000000000000000000000000000030f
14 -> 0000000000000000000000000000000000000000000000000000000000000330
15 -> 0000000000000000000000000000000000000000000000000000000000000333
16 -> 000000000000000000000000000000000000000000000000000000000000033c
17 -> 000000000000000000000000000000000000000000000000000000000000033f
18 -> 00000000000000000000000000000000000000000000000000000000000003c0
19 -> 00000000000000000000000000000000000000000000000000000000000003c3
1a -> 00000000000000000000000000000000000000000000000000000000000003cc
1b -> 00000000000000000000000000000000000000000000000000000000000003cf
1c -> 00000000000000000000000000000000000000000000000000000000000003f0
1d -> 00000000000000000000000000000000000000000000000000000000000003f3
1e -> 00000000000000000000000000000000000000000000000000000000000003fc
1f -> 00000000000000000000000000000000000000000000000000000000000003ff
20 -> 0000000000000000000000000000000000000000000000000000000000000c00
21 -> 0000000000000000000000000000000000000000000000000000000000000c03
22 -> 0000000000000000000000000000000000000000000000000000000000000c0c
23 -> 0000000000000000000000000000000000000000000000000000000000000c0f
24 -> 0000000000000000000000000000000000000000000000000000000000000c30
25 -> 0000000000000000000000000000000000000000000000000000000000000c33
26 -> 0000000000000000000000000000000000000000000000000000000000000c3c
27 -> 0000000000000000000000000000000000000000000000000000000000000c3f
28 -> 0000000000000000000000000000000000000000000000000000000000000cc0
29 -> 0000000000000000000000000000000000000000000000000000000000000cc3
2a -> 0000000000000000000000000000000000000000000000000000000000000ccc
2b -> 0000000000000000000000000000000000000000000000000000000000000ccf
2c -> 0000000000000000000000000000000000000000000000000000000000000cf0
2d -> 0000000000000000000000000000000000000000000000000000000000000cf3
2e -> 0000000000000000000000000000000000000000000000000000000000000cfc
2f -> 0000000000000000000000000000000000000000000000000000000000000cff
30 -> 0000000000000000000000000000000000000000000000000000000000000f00
31 -> 0000000000000000000000000000000000000000000000000000000000000f03
32 -> 0000000000000000000000000000000000000000000000000000000000000f0c
33 -> 0000000000000000000000000000000000000000000000000000000000000f0f
34 -> 0000000000000000000000000000000000000000000000000000000000000f30
35 -> 0000000000000000000000000000000000000000000000000000000000000f33
36 -> 0000000000000000000000000000000000000000000000000000000000000f3c
37 -> 0000000000000000000000000000000000000000000000000000000000000f3f
38 -> 0000000000000000000000000000000000000000000000000000000000000fc0
39 -> 0000000000000000000000000000000000000000000000000000000000000fc3
3a -> 0000000000000000000000000000000000000000000000000000000000000fcc
3b -> 0000000000000000000000000000000000000000000000000000000000000fcf
3c -> 0000000000000000000000000000000000000000000000000000000000000ff0
3d -> 0000000000000000000000000000000000000000000000000000000000000ff3
3e -> 0000000000000000000000000000000000000000000000000000000000000ffc
3f -> 0000000000000000000000000000000000000000000000000000000000000fff
答案 3 :(得分:2)
对于GCC至少为-march=haswell
的人:
#include <cstddef>
#include <iomanip>
#include <iostream>
#include <immintrin.h>
std::uint64_t interleave_pdep(std::uint32_t input) {
return _pdep_u64(input, 0x5555555555555555)
| _pdep_u64(input, 0xaaaaaaaaaaaaaaaa);
}
int main()
{
std::uint32_t i;
std::cin >> std::hex;
std::cout << std::hex;
while (std::cin >> i)
std::cout << interleave_pdep(i) << '\n';
}
(向Daniel Lemire's blog赠送积分)
例如11-> 303(作为练习的剩余二进制数I / O……)
答案 4 :(得分:0)
没有循环或LUT的解决方案。
它适用于4位半字节。首先,将半字节复制并通过多倍移位。然后,将有趣的位屏蔽并复制。然后,通过移位和运算将所有位收集在一起。
int duplicate4(unsigned char c){
// c is a 4bits nibble.
// Assume c=2#dcba
unsigned long long spread=(1+(1<<9)+(1<<18)+(1<<27));
// to duplicate the nibble
//=2#00001000.00000100.00000010.00000001
unsigned long long ll=c*spread;
// ll = 0dcba000.00dcba00.000dcba0.0000dcba
unsigned long long mask=(1+(1<<10)+(1<<20)+(1<<30));
// to extract bits
//=2#01000000.00010000.00000100.00000001
ll &= mask;
// ll = 0d000000.000c0000.00000b00.0000000a
ll |= (ll<<1) ;
// ll = dd000000.00cc0000.0000bb00.000000aa
ll |= (ll >> 16) ;
// ll = dd000000.00cc0000.dd00bb00.00cc00aa
ll |= (ll >> 8) ;
// ll = dd000000.00cc0000.dd00bb00.ddccbbaa
ll &= 0xff ;
// ll = 00000000.00000000.00000000.ddccbbaa
return ll;
}
int duplicate8(unsigned char c){
return duplicate4(c&0xf) + 256*duplicate4((c&0xf0)>>4);
}