Why does a Flume source need to recognize the format of the message?

Asked: 2013-10-22 14:29:09

Tags: flume

According to the Flume documentation:
  

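A Flume source consumes events delivered to it by an external source like a web server. The external source sends events to Flume in a format that is recognized by the target Flume source. For example, an Avro Flume source can be used to receive Avro events from Avro clients or other Flume agents in the flow that send events from an Avro sink.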

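Why does the Flume source need to recognize or understand the format of the message, when all it does is forward the message to one of the channels?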

1 Answer:

Answer 0 (score: 0)

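As far as I understand, Flume wraps the data it transports in an event envelope made up of headers and a payload (the transported data). From the documentation: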

  

A Flume event is defined as a unit of data flow having a byte payload and an optional set of string attributes.

This sentence appears right before the passage you quoted.

The format you specify is the format of the event envelope, not the format of the data inside it.
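
Here is a minimal Python sketch of that distinction. It is not Flume code: `ToyEvent`, `encode_envelope`, and `decode_envelope` are hypothetical names, and the length-prefixed layout is an invented stand-in for a real RPC encoding such as Avro or Thrift. It only shows that the receiver has to understand the envelope, while the body stays an opaque run of bytes.

```python
import json
import struct
from dataclasses import dataclass, field


@dataclass
class ToyEvent:
    """Toy stand-in for a Flume event: string headers plus an opaque byte body."""
    body: bytes
    headers: dict = field(default_factory=dict)


def encode_envelope(event: ToyEvent) -> bytes:
    """Invented wire format: 4-byte big-endian header length, JSON headers, raw body."""
    header_bytes = json.dumps(event.headers, sort_keys=True).encode("utf-8")
    return struct.pack(">I", len(header_bytes)) + header_bytes + event.body


def decode_envelope(wire: bytes) -> ToyEvent:
    """Parse only the envelope; the body is returned untouched and never interpreted."""
    (header_len,) = struct.unpack(">I", wire[:4])
    headers = json.loads(wire[4:4 + header_len].decode("utf-8"))
    return ToyEvent(body=wire[4 + header_len:], headers=headers)


# The payload can be text, JSON, or arbitrary binary: the envelope code does not care.
payloads = [b"plain text line", b'{"json": true}', bytes([0, 255, 17, 128])]
for payload in payloads:
    wire = encode_envelope(ToyEvent(payload, {"host": "web01"}))
    received = decode_envelope(wire)
    assert received.body == payload
```

The receiving side never looks inside `body`, which is the sense in which the data format does not matter.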

Suppose you have this agent:

plain_to_avro_translator.sources = plain-source avro-source
plain_to_avro_translator.sinks = avro-sink local-file-sink
plain_to_avro_translator.channels = mem-channel1 mem-channel2

plain_to_avro_translator.sources.plain-source.channels = mem-channel1
plain_to_avro_translator.sources.plain-source.type = exec
plain_to_avro_translator.sources.plain-source.restart = true
plain_to_avro_translator.sources.plain-source.restartThrottle = 40000
plain_to_avro_translator.sources.plain-source.command = cat /home/user/data.log

plain_to_avro_translator.sinks.avro-sink.channel = mem-channel1
plain_to_avro_translator.sinks.avro-sink.type = thrift
plain_to_avro_translator.sinks.avro-sink.hostname = 192.168.200.43
plain_to_avro_translator.sinks.avro-sink.port = 6000

plain_to_avro_translator.channels.mem-channel1.type = memory
plain_to_avro_translator.channels.mem-channel1.capacity = 100
plain_to_avro_translator.channels.mem-channel1.transactionCapacity = 100

plain_to_avro_translator.sources.avro-source.channels = mem-channel2
plain_to_avro_translator.sources.avro-source.type = thrift
plain_to_avro_translator.sources.avro-source.bind = 0.0.0.0
plain_to_avro_translator.sources.avro-source.port = 6000

plain_to_avro_translator.channels.mem-channel2.type = memory
plain_to_avro_translator.channels.mem-channel2.capacity = 100
plain_to_avro_translator.channels.mem-channel2.transactionCapacity = 100

plain_to_avro_translator.sinks.local-file-sink.channel = mem-channel2
plain_to_avro_translator.sinks.local-file-sink.type = file_roll
plain_to_avro_translator.sinks.local-file-sink.sink.directory = /home/user/flume_output

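This works without any problem and does not depend on the format of data.log (you can write content in whatever format you like). If you set the avro-sink type to avro instead of thrift, you will get an error from avro-source, because it expects Thrift-format events.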

Sinks and sources need to know how to parse the event envelope.
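
To illustrate what goes wrong when the two sides disagree on the envelope, here is a toy Python sketch. Both codecs are invented for illustration and are not the Avro or Thrift wire formats; all function names are hypothetical. The point is only that a receiver using the wrong decoder fails on the envelope, whatever the payload is.

```python
import json
import struct


def encode_length_prefixed(headers: dict, body: bytes) -> bytes:
    """Toy envelope A: 4-byte header length, JSON headers, raw body."""
    header_bytes = json.dumps(headers).encode("utf-8")
    return struct.pack(">I", len(header_bytes)) + header_bytes + body


def decode_length_prefixed(wire: bytes):
    (header_len,) = struct.unpack(">I", wire[:4])
    headers = json.loads(wire[4:4 + header_len].decode("utf-8"))
    return headers, wire[4 + header_len:]


def encode_json_line(headers: dict, body: bytes) -> bytes:
    """Toy envelope B: one JSON document with a hex-encoded body."""
    return json.dumps({"headers": headers, "body": body.hex()}).encode("utf-8")


def decode_json_line(wire: bytes):
    doc = json.loads(wire.decode("utf-8"))
    return doc["headers"], bytes.fromhex(doc["body"])


def try_decode(decoder, wire: bytes) -> bool:
    """Return True if the decoder understands the envelope, False otherwise."""
    try:
        decoder(wire)
        return True
    except (ValueError, KeyError, struct.error):
        return False


wire_a = encode_length_prefixed({"host": "web01"}, b"any payload")

# Matching sender and receiver: the envelope parses.
assert try_decode(decode_length_prefixed, wire_a)
# Mismatched receiver: the envelope cannot be parsed, regardless of the payload.
assert not try_decode(decode_json_line, wire_a)
```

In the agent above, the analogous mismatch is an avro sink pointed at a thrift source.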

I hope I got this right. If I am wrong, please correct me.