Convert a Unicode string into a proper string

Time: 2015-05-18 06:33:54

Tags: c# unicode

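I have a string that contains Unicode data.

I want to write it to a file. When the data is written to the file, any text in a language other than English shows up as plain Unicode escape values instead of the actual characters.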

string originalString = ((char)(buffer[index])).ToString();
//sb.Append(DecodeEncodedNonAsciiCharacters(originalString.ToString()));
foreach (char c1 in originalString)
{
    // test if char is ascii, otherwise convert to Unicode Code Point
    int cint = Convert.ToInt32(c1);
    if (cint <= 127 && cint >= 0)
        asAscii.Append(c1.ToString());
    else
    {
        //String s = Char.ConvertFromUtf32(cint);
        asAscii.Append(String.Format("\\u{0:x4} ", cint).Trim());
        //asAscii.Append(s);
    }
}

sb.Append((asAscii));
Console.WriteLine();

When I look at the output file, the data appears like this:

  

1 00:00:27,709 --> 00:00:32,959 1.2 \u00e0\u00a4\u0085\u00e0\u00a4\u00b0\u00e0\u00a4\u00ac\u00e0\u00a4\u00b2\u00e0\u00a5\u008b\u00e0\u00a4\u0097 28 \u00e0\u00a4\u00b0\u00e0\u00a4\u00be\u00e0\u00a4\u009c\u00e0\u00a5\u008d\u00e0\u00a4\u00af \u00e0\u00a4\u0094\u00e0\u00a4\u00b0 \u00e0\u00a4\u00b8\u00e0\u00a4\u00be\u00e0\u00a4\u00a4 \u00e0\u00a4\u0095\u00e0\u00a5\u0087\u00e0\u00a4\u0082\u00e0\u00a4\u00a6\u00e0\u00a5\u008d\u00e0\u00a4\u00b0 \u00e0\u00a4\u00b6\u00e0\u00a4\u00be\u00e0\u00a4\u00b8\u00e0\u00a4\u00bf\u00e0\u00a4\u00a4 \u00e0\u00a4\u00aa\u00e0\u00a5\u008d\u00e0\u00a4\u00b0\u00e0\u00a4\u00a6\u00e0\u00a5\u0087\u00e0\u00a4\u00b6

But it should look like this:

  

1 00:00:27,400 --> 00:00:32,760 1.2 अरबलोग 28 राज्य और सात केंद्र शासित प्रदेश

I have tried many things, but none of them worked.

1 answer:

Answer 0: (score: 1)

using System;
using System.Text;

string unicodeString = "This string contains the unicode character Pi(\u03a0)";

// Create two different encodings.
Encoding ascii = Encoding.ASCII;
Encoding unicode = Encoding.Unicode;

// Convert the string into a byte[].
byte[] unicodeBytes = unicode.GetBytes(unicodeString);

// Perform the conversion from one encoding to the other.
byte[] asciiBytes = Encoding.Convert(unicode, ascii, unicodeBytes);

// Convert the new byte[] into a char[] and then into a string.
// This is a slightly different approach to converting to illustrate
// the use of GetCharCount/GetChars.
char[] asciiChars = new char[ascii.GetCharCount(asciiBytes, 0, asciiBytes.Length)];
ascii.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0);
string asciiString = new string(asciiChars);

// Display the strings created before and after the conversion.
// ASCII cannot represent U+03A0, so the converted string ends in "Pi(?)".
Console.WriteLine("Original string: {0}", unicodeString);
Console.WriteLine("Ascii converted string: {0}", asciiString);
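The escaped values in the question's output (\u00e0 \u00a4 \u0085 and so on) look like the individual UTF-8 bytes of Devanagari text: E0 A4 85 is the UTF-8 encoding of अ (U+0905). That pattern suggests each byte of buffer was cast to char on its own. Below is a minimal sketch, assuming buffer really holds UTF-8 bytes, that decodes the buffer as a whole and writes the result as UTF-8. The class name, method name, and output file name are hypothetical and not taken from the question.

```csharp
using System;
using System.IO;
using System.Text;

public static class SubtitleDecoder
{
    // Decodes the whole buffer as UTF-8 instead of casting each byte to char.
    // Assumption: the bytes in the buffer are UTF-8 encoded text.
    public static string DecodeUtf8(byte[] buffer)
    {
        return Encoding.UTF8.GetString(buffer);
    }

    public static void Main()
    {
        // The three bytes that the question's output shows as \u00e0\u00a4\u0085.
        byte[] buffer = { 0xE0, 0xA4, 0x85 };
        string text = DecodeUtf8(buffer);

        // Write the decoded text with an explicit UTF-8 encoding
        // ("output.srt" is a placeholder file name).
        File.WriteAllText("output.srt", text, Encoding.UTF8);
    }
}
```

If the buffer is filled in chunks from a stream, a multi-byte sequence can be split across two reads, so decoding each chunk separately may still corrupt characters. In that case a StreamReader constructed with Encoding.UTF8 is probably the safer choice, since it keeps track of incomplete sequences between reads.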