Question

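According to the Node.js docs (v5.10.0) for readable streams:

It is preferable to use readable.setEncoding('utf8') rather than working with buffers directly and calling buf.toString(encoding). This is because "multi-byte characters (...) would otherwise be potentially mangled". If you want to read the data as strings, always use this method.

My question is how to achieve this with the new API for transform streams, which no longer requires the verbose inheritance approach.

So, for example, this works as a way to convert stdin to an uppercase string: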

const transform = require("stream").Transform({
  transform: function(chunk, encoding, next) {
    this.push(chunk.toString().toUpperCase());
    next();
  }
});

process.stdin.pipe(transform).pipe(process.stdout);

However, this seems to go against the recommendation not to call toString() on buffers. I tried modifying the Transform instance by setting its encoding to "utf-8", like this:

const transform = require("stream").Transform({
  transform: function(chunk, encoding, next) {
    this.push(chunk.toUpperCase()); //chunk is a buffer so this doesn't work
    next();
  }
});
transform.setEncoding("utf-8");

process.stdin.pipe(transform).pipe(process.stdout);

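On inspection, the encoding of transform is null in the first case, while in the second case it does change to "utf-8". Yet the chunk passed to the transform function is still a Buffer. I assumed that setting the encoding would let me skip the toString() call, but that is not the case.

I also tried extending the read method, as in the readable and duplex examples, but this is not allowed.

Is there a way to get rid of toString()?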

Answer 1

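You are right. Calling Buffer#toString directly in the _transform method is a bad idea. However, setEncoding is meant to be used by the consumer of a readable stream (i.e. the code that reads from your transform stream). You are implementing a transform stream, and setEncoding does not change the input of your _transform method.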
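To make the distinction concrete, here is a minimal sketch (the names `echo`, `inputTypes` and `output` are made up for this illustration). It calls setEncoding on a pass-through transform and records what the transform function receives versus what a consumer reads back:

```javascript
const { Transform } = require("stream");

// Hypothetical names, used only for this illustration.
const inputTypes = [];
const echo = new Transform({
  transform(chunk, encoding, next) {
    // Record what the writable side hands to the transform function.
    inputTypes.push(Buffer.isBuffer(chunk) ? "buffer" : typeof chunk);
    next(null, chunk); // pass the chunk through unchanged
  }
});

// setEncoding configures the readable side only.
echo.setEncoding("utf8");

echo.write("hello");
const output = echo.read();

console.log(inputTypes);    // what the transform function saw
console.log(typeof output); // what the consumer gets back
```

The transform function still sees a Buffer, while the value returned by read() is a string: the decoding happens on the way out, not on the way in.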

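When the consumer enables automatic decoding, the readable stream uses a StringDecoder internally. You can use one in your transform method as well.

Here is the code comment that explains how it works:

[StringDecoder] decodes the given buffers and returns them as JS strings that are guaranteed not to contain any partial multi-byte characters. Any partial character found at the end of the buffer is buffered up, and will be returned when calling write again with the remaining bytes.

So your example can be rewritten as follows: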
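As a small illustration of that behaviour, the sketch below splits the three UTF-8 bytes of the euro sign across two chunks, then decodes them once with toString() per chunk and once with a StringDecoder. The variable names are arbitrary and the chunk boundary is chosen by hand to force the split:

```javascript
const { StringDecoder } = require("string_decoder");

// The euro sign is three bytes in UTF-8: E2 82 AC.
const euro = Buffer.from([0xe2, 0x82, 0xac]);
const firstPart = euro.subarray(0, 2); // incomplete character
const secondPart = euro.subarray(2);   // the remaining byte

// Naive approach: each chunk is decoded on its own, so the split
// character cannot be reassembled.
const naive = firstPart.toString("utf8") + secondPart.toString("utf8");

// StringDecoder keeps the incomplete bytes until the rest arrives.
const demoDecoder = new StringDecoder("utf8");
const decoded = demoDecoder.write(firstPart) + demoDecoder.write(secondPart);

console.log(naive === "\u20ac");   // false
console.log(decoded === "\u20ac"); // true
```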

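var StringDecoder = require('string_decoder').StringDecoder
const transform = require("stream").Transform({
  transform: function(chunk, encoding, next) {
    // Create the decoder once and reuse it, so partial characters carry over between chunks
    if(!this.myStringDecoder) this.myStringDecoder = new StringDecoder('utf8')
    // Pass the chunk to write(); it returns only complete characters
    this.push(this.myStringDecoder.write(chunk).toUpperCase());
    next();
  }
});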

process.stdin.pipe(transform).pipe(process.stdout);
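A possible variation, sketched here under the assumption that you also want any bytes still held by the decoder to be emitted when the input ends, is to add a flush function that calls the decoder's end() method. The factory name `createUpperCaseTransform` is hypothetical and not part of the original answer:

```javascript
const { Transform } = require("stream");
const { StringDecoder } = require("string_decoder");

// Hypothetical helper name, introduced only for this sketch.
function createUpperCaseTransform() {
  // One decoder per stream, so a partial character left over from one
  // chunk is completed by the next chunk of the same stream.
  const decoder = new StringDecoder("utf8");

  return new Transform({
    transform(chunk, encoding, next) {
      // write() returns only complete characters and buffers the rest.
      this.push(decoder.write(chunk).toUpperCase());
      next();
    },
    flush(done) {
      // end() returns whatever is still buffered once the input is finished.
      this.push(decoder.end().toUpperCase());
      done();
    }
  });
}
```

It would be used the same way as above, for example `process.stdin.pipe(createUpperCaseTransform()).pipe(process.stdout);`. Creating the decoder inside the factory avoids the lazy initialisation on `this` and keeps separate streams from sharing decoder state.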

Answer 2

Pass 'decodeStrings: false' as a property of the 'options' object given to the Transform constructor:

const transform = require("stream").Transform({
  transform: function(chunk, encoding, next) {
    this.push(chunk.toUpperCase()); // chunk is a string only when a string was written; Buffers still arrive as Buffers
    next();
  },
  decodeStrings: false
});
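The effect of this option can be checked with a short sketch (the names `typeProbe` and `seenTypes` are invented for the illustration). It writes one string and one Buffer to a transform created with decodeStrings: false and records the type of each chunk as the transform function receives it:

```javascript
const { Transform } = require("stream");

// Hypothetical names, used only for this illustration.
const seenTypes = [];
const typeProbe = new Transform({
  decodeStrings: false, // do not convert written strings into Buffers
  transform(chunk, encoding, next) {
    seenTypes.push(typeof chunk === "string" ? "string" : "buffer");
    next(null, chunk);
  }
});

typeProbe.write("abc");              // written as a string
typeProbe.write(Buffer.from("abc")); // written as a Buffer

console.log(seenTypes);
```

The string arrives as a string, but the Buffer is still a Buffer. So when piping from process.stdin, this option presumably only helps if the source emits strings, for example after calling process.stdin.setEncoding('utf8'); otherwise chunk.toUpperCase() would still fail.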

Setting the encoding for a Node.js transform stream in a safe way