Packing/unpacking binary strings in Perl

Time: 2014-06-09 15:37:44

Tags: perl pack unpack

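I am trying to understand a snippet of Perl code. I think its purpose is to produce a binary string from an input integer, but in reversed bit order (low bits on the left, high bits on the right). I don't understand what pack/unpack is doing to the input values; the result seems to be incorrect.

Consider this test code: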

for (my $i = 0; $i < 16; $i++) {

    for (my $j = 0; $j < 16; $j++) {

        $x = $i * 16 + $j;
        $x = unpack("b8", pack("U", $x));
        printf $x;
        print " ";
    }
    print "\n";
}

This produces:

00000000 10000000 01000000 11000000 00100000 10100000 01100000 11100000 00010000 10010000 01010000 11010000 00110000 10110000 01110000 11110000
00001000 10001000 01001000 11001000 00101000 10101000 01101000 11101000 00011000 10011000 01011000 11011000 00111000 10111000 01111000 11111000
00000100 10000100 01000100 11000100 00100100 10100100 01100100 11100100 00010100 10010100 01010100 11010100 00110100 10110100 01110100 11110100
00001100 10001100 01001100 11001100 00101100 10101100 01101100 11101100 00011100 10011100 01011100 11011100 00111100 10111100 01111100 11111100
00000010 10000010 01000010 11000010 00100010 10100010 01100010 11100010 00010010 10010010 01010010 11010010 00110010 10110010 01110010 11110010
00001010 10001010 01001010 11001010 00101010 10101010 01101010 11101010 00011010 10011010 01011010 11011010 00111010 10111010 01111010 11111010
00000110 10000110 01000110 11000110 00100110 10100110 01100110 11100110 00010110 10010110 01010110 11010110 00110110 10110110 01110110 11110110
00001110 10001110 01001110 11001110 00101110 10101110 01101110 11101110 00011110 10011110 01011110 11011110 00111110 10111110 01111110 11111110
01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011
01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011
01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011
01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011 01000011
11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011
11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011
11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011
11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011 11000011

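So what is going on here? It seems that all the high-ASCII values (128 and above) are converted incorrectly, but despite reading the documentation for pack and unpack, I can't see what is happening.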

1 Answer:

Answer 0: (score: 1)

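pack's U template packs the value as a UTF-8 character, which may or may not be a single byte. (Read in normal bit order, the repeated bytes in your output are 11000010 and 11000011. The leading 110 marks the first byte of a two-byte UTF-8 sequence, so the result is two bytes long and b8 only shows you the first of them, but that's a different story.)

From the documentation: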

U - A Unicode character number. Encodes to a character in character mode and UTF-8 (or UTF-EBCDIC in EBCDIC platforms) in byte mode.
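
To see why values of 128 and above come out wrong, it can help to look at the UTF-8 bytes directly. The following is only an illustrative sketch, not part of the original answer. It uses utf8::encode (a core Perl function) instead of pack("U", ...) so that the result does not depend on how a particular Perl version treats character strings in unpack. The helper name utf8_bytes_hex is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: the UTF-8 encoding of a code point, as hex bytes.
sub utf8_bytes_hex {
    my ($code_point) = @_;
    my $s = chr($code_point);
    utf8::encode($s);    # in place: characters -> UTF-8 bytes
    return join " ", map { sprintf "%02X", ord($_) } split //, $s;
}

for my $n (65, 127, 128, 191, 192, 255) {
    my $s = chr($n);
    utf8::encode($s);
    # "b8" looks only at the first byte and lists its bits lowest-first
    printf "%3d -> %-5s -> %s\n", $n, utf8_bytes_hex($n), unpack("b8", $s);
}
```

Values up to 127 encode to a single byte, so b8 shows the value itself. From 128 to 191 the first byte is always C2, and from 192 to 255 it is always C3, which matches the two repeated patterns (01000011 and 11000011) in the question's output.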

You should use the C template to make sure you only get one byte:

C - An unsigned char (octet) value.

This gives us:

for ( my $i = 0; $i < 16; $i++ ) {
    for ( my $j = 0; $j < 16; $j++ ) {
        my $x = $i * 16 + $j;
        # "C" packs exactly one byte; "b8" reads its bits lowest-first
        $x = unpack("b8", pack("C", $x));
        print $x, " ";
    }
    print "\n";
}
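
If the conversion is needed in more than one place, it could be wrapped in a small function. The sketch below is one possible way to do that, not the original author's code. The name bits_low_first and the range check are assumptions added for illustration. The loop cross-checks the pack/unpack result against an independent formulation (reversing the output of sprintf "%08b") for every byte value.

```perl
use strict;
use warnings;

# Hypothetical helper: 8-character binary string, lowest bit on the left.
sub bits_low_first {
    my ($n) = @_;
    die "value out of range 0..255: $n\n" if $n < 0 || $n > 255;
    return unpack("b8", pack("C", $n));
}

# Cross-check every byte value against a reversed "%08b" string.
for my $n (0 .. 255) {
    my $expected = scalar reverse sprintf("%08b", $n);
    die "mismatch at $n\n" unless bits_low_first($n) eq $expected;
}

print bits_low_first(200), "\n";    # 200 is 11001000, so this prints 00010011
```

The explicit range check is there because C is meant for values that fit in one octet, so rejecting anything outside 0..255 up front avoids relying on how pack handles an overflow.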