Compressing a string with GZipStream without using Base64

Time: 2019-03-27 02:59:37

Tags: c# gzip

I am trying to compress a string as much as possible. I have working code that compresses to a Base64 string and decompresses from a Base64 string.

    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Text;

    public static string CompressString(string text)
    {
        byte[] buffer = Encoding.UTF8.GetBytes(text);
        byte[] compressedData;
        using (var memoryStream = new MemoryStream())
        {
            // leaveOpen: true keeps memoryStream usable after the GZipStream
            // is disposed; disposing it flushes the remaining compressed data.
            using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Compress, true))
            {
                gZipStream.Write(buffer, 0, buffer.Length);
            }

            compressedData = memoryStream.ToArray();
        }

        // Prefix the compressed bytes with the uncompressed length (4 bytes).
        var gZipBuffer = new byte[compressedData.Length + 4];
        Buffer.BlockCopy(compressedData, 0, gZipBuffer, 4, compressedData.Length);
        Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gZipBuffer, 0, 4);

        return Convert.ToBase64String(gZipBuffer); // RETURNS AS BASE64
        //return Encoding.UTF8.GetString(gZipBuffer); // RETURN AS UTF8 STRING
    }

    public static string DecompressString(string compressedText)
    {
        byte[] gZipBuffer = Convert.FromBase64String(compressedText); // BASE64 STRING TO BYTE ARRAY
        //byte[] gZipBuffer = Encoding.UTF8.GetBytes(compressedText); // UTF8 STRING TO BYTE ARRAY

        // The first 4 bytes hold the uncompressed length; the GZip data follows.
        int dataLength = BitConverter.ToInt32(gZipBuffer, 0);
        var buffer = new byte[dataLength];

        using (var memoryStream = new MemoryStream(gZipBuffer, 4, gZipBuffer.Length - 4))
        using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Decompress))
        {
            // Read may return fewer bytes than requested, so loop until the buffer is full.
            int totalRead = 0;
            while (totalRead < buffer.Length)
            {
                int read = gZipStream.Read(buffer, totalRead, buffer.Length - totalRead);
                if (read == 0) break; // end of stream
                totalRead += read;
            }
        }

        return Encoding.UTF8.GetString(buffer);
    }

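This works fine. However, if I change CompressString to return Encoding.UTF8.GetString(gZipBuffer) instead of Convert.ToBase64String(gZipBuffer), and change DecompressString to fill the buffer with Encoding.UTF8.GetBytes(compressedText) instead of Convert.FromBase64String(compressedText), decompression throws an exception (although compression itself runs without error):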

Additional information: The magic number in GZip header is not correct. Make sure you are passing in a GZip stream.
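
The exception is consistent with the compressed bytes not surviving a trip through a UTF-8 string. A GZip stream starts with the bytes 0x1F 0x8B, and 0x8B on its own is not valid UTF-8, so Encoding.UTF8.GetString substitutes a replacement character and the original byte is lost. The following is a minimal sketch of that effect, not a fix; the class and method names (RoundTripDemo, SurvivesUtf8RoundTrip, SurvivesBase64RoundTrip) are made up for illustration.

    using System;
    using System.Linq;
    using System.Text;

    public static class RoundTripDemo
    {
        // True when bytes -> UTF-8 string -> bytes gives back the same bytes.
        public static bool SurvivesUtf8RoundTrip(byte[] data)
        {
            string asText = Encoding.UTF8.GetString(data);
            byte[] back = Encoding.UTF8.GetBytes(asText);
            return back.SequenceEqual(data);
        }

        // True when bytes -> Base64 string -> bytes gives back the same bytes.
        public static bool SurvivesBase64RoundTrip(byte[] data)
        {
            string asText = Convert.ToBase64String(data);
            byte[] back = Convert.FromBase64String(asText);
            return back.SequenceEqual(data);
        }

        // Prints the result of both round trips for the GZip magic number.
        public static void Demo()
        {
            byte[] gzipMagic = { 0x1F, 0x8B };
            Console.WriteLine(SurvivesUtf8RoundTrip(gzipMagic));   // False
            Console.WriteLine(SurvivesBase64RoundTrip(gzipMagic)); // True
        }
    }

Plain ASCII text passes the UTF-8 round trip unchanged, which is why the problem only shows up once arbitrary binary data such as compressed output is involved.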

The problem with Base64 is that the resulting compressed string is 40% longer than the one produced with Encoding.UTF8.GetString / Encoding.UTF8.GetBytes.
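
For reference, Base64 emits 4 characters for every 3 input bytes, so relative to the raw compressed bytes the expansion is roughly one third. The 40% figure above is measured against the UTF-8 string, which is not a byte-for-byte copy of the data, so the two numbers are not directly comparable. A small sketch of the arithmetic, with the helper name Base64Length chosen here for illustration:

    using System;

    public static class Base64OverheadDemo
    {
        // Length of the padded Base64 text for a given number of input bytes:
        // every group of up to 3 bytes becomes 4 characters.
        public static int Base64Length(int byteCount)
        {
            return 4 * ((byteCount + 2) / 3);
        }

        // Compares the formula with what Convert.ToBase64String actually produces.
        public static void Demo()
        {
            var data = new byte[300];
            Console.WriteLine(Base64Length(data.Length));            // 400
            Console.WriteLine(Convert.ToBase64String(data).Length);  // 400
        }
    }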

Is there any way to compress a string without having to Base64-encode the resulting string?

0 answers:

No answers