Replacing substrings with non-printing characters

Time: 2012-10-26 19:02:55

Tags: c# string encoding

  

Possible duplicate:
  Encode to single byte extended ascii values

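In C#, I am trying to replace substrings in a string with non-printing characters (characters with byte codes above 0xE0). I have seen plenty of questions going the other way, i.e. trying to strip non-printing characters out of a string, but none that try to insert them. The code below (which does not work) is where I am now: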

string[] _symbol = {"Hello", "the", "man"};
string _source = "\"Hello, Hello,\" the man said.\n\"Hello,\" the woman replied.";
string _expect = "\"\xF3, \xF3,\" \xF2 \xF1 said.\n\"\xF3,\" \xF2 wo\xF1 replied.";

byte[] tblix = { 0xF3, 0x00 };
string _repl, _dest;

_repl = System.Text.Encoding.UTF8.GetString(tblix, 0, 1);
_dest = _source.Replace(_symbol[0], _repl);

tblix[0]--;
_repl = System.Text.Encoding.UTF8.GetString(tblix, 0, 1);
_dest = _dest.Replace(_symbol[1], _repl);

tblix[0]--;
_repl = System.Text.Encoding.UTF8.GetString(tblix, 0, 1);
_dest = _dest.Replace(_symbol[2], _repl);

bool check = (_dest == _expect);

File.WriteAllText("temp.dat", _dest);

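I expect to end up with a string in _dest that is equal to _expect. If I use the ASCII encoding, the non-printing characters come back as "?". UTF8 does not work either. In addition, I want to write the output to a file as a sequence of single-byte characters, so converting everything to a multi-byte encoding would eventually require converting back to a single-byte representation. Is there a convenient way to do what I am trying to accomplish? Thanks in advance for any suggestions.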

2 answers:

Answer 0: (score: 1)

Create the char directly instead of going through an encoding...

        string file = @"C:\Temp\temp.dat";

        string[] _symbol = { "Hello", "the", "man" };
        string _source = "\"Hello, Hello,\" the man said.\n\"Hello,\" the woman replied.";
        string _expect = "\"\xF3, \xF3,\" \xF2 \xF1 said.\n\"\xF3,\" \xF2 wo\xF1 replied.";

        //byte[] tblix = { 0xF3, 0x00 };

        // The char whose code equals the byte value 0xF3 is U+00F3 (0xF300 would be U+F300)
        char c = (char)0xF3;

        string _repl, _dest;

        //_repl = System.Text.Encoding.UTF8.GetString(tblix, 0, 1);
        _dest = _source.Replace(_symbol[0], c.ToString());

        c--; // 0xF2
        //_repl = System.Text.Encoding.UTF8.GetString(tblix, 0, 1);
        _dest = _dest.Replace(_symbol[1], c.ToString());

        c--; // 0xF1
        //_repl = System.Text.Encoding.UTF8.GetString(tblix, 0, 1);
        _dest = _dest.Replace(_symbol[2], c.ToString());

        bool check = (_dest == _expect);

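        // ISO-8859-1 writes each char below U+0100 as one byte; the default (UTF-8) would write two bytes for these
        File.WriteAllText(file, _dest, System.Text.Encoding.GetEncoding("iso-8859-1"));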

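I think the encoding call was trying to convert the byte into a printable character, whereas creating the char directly forces the exact character, printable or not (not, in this case). This is a copy of the code from your question, which I pasted into a new console application. It behaved exactly as you described; I made these modifications and then it worked.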
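To make the explanation above more concrete, here is a small sketch (not part of the original answer) of what three decoders do with the lone byte 0xF3. The names `DecodeDemo` and `DecodeSingleByte` are made up for illustration, and ISO-8859-1 is used here as an assumed stand-in for "the char whose code equals the byte value".

```csharp
using System;
using System.Text;

static class DecodeDemo
{
    // Hypothetical helper: decodes one byte with the given encoding and
    // returns the code of the first char of the result.
    public static int DecodeSingleByte(Encoding encoding, byte value)
    {
        string decoded = encoding.GetString(new[] { value });
        return decoded[0];
    }

    static void Main()
    {
        // ASCII only covers 0x00-0x7F, so 0xF3 falls back to '?' (0x3F).
        Console.WriteLine("{0:X4}", DecodeSingleByte(Encoding.ASCII, 0xF3));

        // In UTF-8, 0xF3 is the lead byte of a multi-byte sequence; on its
        // own it is invalid and decodes to the replacement char U+FFFD.
        Console.WriteLine("{0:X4}", DecodeSingleByte(Encoding.UTF8, 0xF3));

        // ISO-8859-1 maps each byte to the code point with the same value.
        Console.WriteLine("{0:X4}", DecodeSingleByte(Encoding.GetEncoding("iso-8859-1"), 0xF3));
    }
}
```

This is why the ASCII and UTF8 attempts in the question never compare equal to _expect, while a direct cast such as (char)0xF3 does.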

Answer 1: (score: 0)

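I was able to get it working with the Windows-1252 encoding, as shown in the modified code below. I also had to make sure the file itself was written with the 1252 encoding.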

string file = @"C:\Temp\temp.dat";

string[] _symbol = { "Hello", "the", "man" };
string _source = "\"Hello, Hello,\" the man said.\n\"Hello,\" the woman replied.";
string _expect = "\"\xF3, \xF3,\" \xF2 \xF1 said.\n\"\xF3,\" \xF2 wo\xF1 replied.";
byte[] tblix = { 0xF3 };

string _repl, _dest;

Encoding e1252 = Encoding.GetEncoding(1252);
_repl = e1252.GetString(tblix);
_dest = _source.Replace(_symbol[0], _repl);

tblix[0]--;
_repl = e1252.GetString(tblix);
_dest = _dest.Replace(_symbol[1], _repl);

tblix[0]--;
_repl = e1252.GetString(tblix);
_dest = _dest.Replace(_symbol[2], _repl);

bool check = (_dest == _expect);

// using closes (and flushes) the writer even if Write throws
using (TextWriter tw = new StreamWriter(file, false, e1252))
{
    tw.Write(_dest);
}
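As a rough sketch of the same idea in loop form, the following replaces each symbol and then checks that the result really encodes to one byte per character. It is an illustration under assumptions, not the answerer's code: the names `SingleByteReplace`, `Substitute` and `ToSingleBytes` are invented here, and ISO-8859-1 is swapped in for Windows-1252. The two encodings agree for the bytes 0xF1 to 0xF3 used in the question, and ISO-8859-1 is available without extra setup, whereas on .NET Core and later `Encoding.GetEncoding(1252)` needs `CodePagesEncodingProvider` to be registered first.

```csharp
using System;
using System.Text;

static class SingleByteReplace
{
    // ISO-8859-1 maps every byte 0x00-0xFF to the code point of the same value.
    static readonly Encoding Latin1 = Encoding.GetEncoding("iso-8859-1");

    // Hypothetical helper: replaces symbols[i] with the single char whose
    // code is (firstCode - i), mirroring the tblix[0]-- steps above.
    public static string Substitute(string source, string[] symbols, byte firstCode)
    {
        string result = source;
        for (int i = 0; i < symbols.Length; i++)
        {
            char replacement = (char)(firstCode - i);
            result = result.Replace(symbols[i], replacement.ToString());
        }
        return result;
    }

    // Hypothetical helper: encodes the text as one byte per character.
    public static byte[] ToSingleBytes(string text)
    {
        return Latin1.GetBytes(text);
    }

    static void Main()
    {
        string[] symbols = { "Hello", "the", "man" };
        string source = "\"Hello, Hello,\" the man said.\n\"Hello,\" the woman replied.";

        string dest = Substitute(source, symbols, 0xF3);
        byte[] bytes = ToSingleBytes(dest);

        Console.WriteLine(bytes.Length == dest.Length);        // True
        Console.WriteLine(BitConverter.ToString(bytes, 0, 4)); // 22-F3-2C-20
    }
}
```

The resulting array can be passed to File.WriteAllBytes if the file should contain exactly these bytes. Note that replacing the symbols one after another only works here because none of the replacement characters occurs inside a later symbol.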