Question

With this code:

test.py

import sys
import codecs

sys.stdout = codecs.getwriter('utf-16')(sys.stdout)

print "test1"
print "test2"

I then run it as:

test.py > test.txt

With Python 2.6 on Windows 2000, I find that the newline characters are being output as the byte sequence \x0D\x0A\x00, which is of course wrong for UTF-16.

Have I missed something, or is this a bug?

Answer 1

The newline translation happens inside the stdout file. You are writing "test1\n" to sys.stdout (a StreamWriter). The StreamWriter encodes that to "t\x00e\x00s\x00t\x001\x00\n\x00" and sends it to the real file, the original sys.stdout.

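That file doesn't know you have converted the data to UTF-16; all it knows is that any \n value in the output stream needs to be converted to \x0D\x0A, which produces the output you are seeing.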
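The effect described above can be reproduced on any platform without touching stdout. This is only a sketch: text_mode_translate is a hypothetical helper that imitates what a Windows text-mode file does to the bytes it receives, and the explicit "utf-16-le" codec is an assumption made so that the result has no byte order mark and does not depend on the machine's endianness.

```python
def text_mode_translate(raw):
    # Hypothetical stand-in for a Windows text-mode file:
    # every single 0x0A byte is expanded to 0x0D 0x0A.
    return raw.replace(b"\n", b"\r\n")

# What the StreamWriter hands to the underlying file.
encoded = u"test1\n".encode("utf-16-le")
# b't\x00e\x00s\x00t\x001\x00\n\x00'

# What ends up on disk after the byte-level translation.
corrupted = text_mode_translate(encoded)
# b't\x00e\x00s\x00t\x001\x00\r\n\x00'

# What valid UTF-16 with a Windows line ending would look like.
expected = u"test1\r\n".encode("utf-16-le")
# b't\x00e\x00s\x00t\x001\x00\r\x00\n\x00'

print(repr(corrupted))
print(repr(expected))
```

The corrupted bytes end in \x0D\x0A\x00, the sequence from the question: the \r was inserted as a single byte in the middle of a two-byte code unit, so everything after it is misaligned.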

Answer 2

Try this:

import sys
import codecs

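if sys.platform == "win32":
    import os, msvcrt
    # Put stdout into binary mode so the C runtime stops turning \n into \r\n.
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)

class CRLFWrapper(object):
    """Expand \\n to \\r\\n at the character level, before any encoding."""
    def __init__(self, output):
        self.output = output

    def write(self, s):
        self.output.write(s.replace("\n", "\r\n"))

    def __getattr__(self, key):
        # Forward everything else (flush, fileno, ...) to the wrapped stream.
        return getattr(self.output, key)

# Order matters: CRLFWrapper sees the text first, then the UTF-16 writer encodes it.
sys.stdout = CRLFWrapper(codecs.getwriter('utf-16')(sys.stdout))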
print "test1"
print "test2"
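To check that this ordering really produces well-formed output, the same wrapper can be pointed at an in-memory byte stream instead of stdout. This is a sketch rather than part of the answer: the io.BytesIO target, the unicode literals, and the "utf-16-le" codec (chosen to avoid the byte order mark) are assumptions made so the check runs on any platform.

```python
import codecs
import io

class CRLFWrapper(object):
    """Same idea as above: expand \\n to \\r\\n before the text is encoded."""
    def __init__(self, output):
        self.output = output

    def write(self, s):
        self.output.write(s.replace(u"\n", u"\r\n"))

    def __getattr__(self, key):
        return getattr(self.output, key)

# An in-memory binary stream stands in for a binary-mode stdout.
raw = io.BytesIO()
out = CRLFWrapper(codecs.getwriter("utf-16-le")(raw))
out.write(u"test1\n")
out.write(u"test2\n")

result = raw.getvalue()
# Decoding the bytes gives back the text with Windows line endings.
print(repr(result.decode("utf-16-le")))
```

Because the replacement happens on characters, each \r and \n is encoded as its own two-byte unit, and the stray \x0D\x0A\x00 sequence never appears.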

Answer 3

So far I have found two solutions, but neither of them produces UTF-16 output with Windows-style line endings.

First, redirect the Python print statement to a file with UTF-16 encoding (this outputs Unix-style line endings):

import sys
import codecs

sys.stdout = codecs.open("outputfile.txt", "w", encoding="utf16")

print "test1"
print "test2"

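Second, redirect to stdout with UTF-16 encoding and without the line-ending translation corruption (this also outputs Unix-style line endings; thanks to this ActiveState recipe):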

import sys
import codecs

sys.stdout = codecs.getwriter('utf-16')(sys.stdout)

if sys.platform == "win32":
    import os, msvcrt
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)

print "test1"
print "test2"
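For the file case, one possible way to get both UTF-16 and Windows-style line endings is io.open with an explicit newline argument, which translates "\n" to the given string before the text is encoded. This is a sketch under assumptions, not something taken from the answers above: write_utf16_crlf is a hypothetical helper, the temporary output path is illustrative, and the text is passed as unicode strings because the io module requires that.

```python
import io
import os
import tempfile

def write_utf16_crlf(path, lines):
    # newline="\r\n" makes the text layer turn each "\n" into "\r\n"
    # before encoding, so the line ending is two proper UTF-16 code units.
    with io.open(path, "w", encoding="utf-16", newline="\r\n") as f:
        for line in lines:
            f.write(line + u"\n")

# Illustrative location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "outputfile.txt")
write_utf16_crlf(path, [u"test1", u"test2"])

# Read the raw bytes back to inspect what was written.
with io.open(path, "rb") as f:
    data = f.read()

print(repr(data.decode("utf-16")))
```

Since the newline is given explicitly, the result should not depend on the platform default line separator.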

Bug with Python UTF-16 output and Windows line endings?
