Using Flume to stream Twitter data into Hadoop through the HDFS sink

Time: 2014-10-12 21:52:29

Tags: hadoop twitter flume flume-ng flume-twitter

I installed Flume in order to run Cloudera's Twitter sentiment analysis.

I run twitter.conf with this command:

 bin/flume-ng agent start --conf conf/ -f conf/twitter.conf -Dflume.root.logger=DEBUG,console -n TwitterAgent

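I tried changing the command, and I tried importing the JARs from Hadoop into Flume. Neither had any effect.

This is exactly where the problem occurs: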

2014-10-13 02:40:16,511 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:119)] 
Monitored counter group for type: SINK, name: HDFS: Successfully registered new MBean.
2014-10-13 02:40:16,511 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:95)] 
Component type: SINK, name: HDFS started
2014-10-13 02:40:16,514 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:143)] 
Polling sink runner starting

After this, the following lines keep repeating until the user interrupts the agent:

2014-10-13 02:40:46,509 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)]
Checking file:conf/twitter.conf for changes
2014-10-13 02:41:16,510 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)]
Checking file:conf/twitter.conf for changes
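
Since only the configuration poller keeps logging, it may be worth checking whether a source was started at all. The full log in this post contains three lines that point that way: a WARN reading `Configuration empty for: Twitternf.Removed.`, a DEBUG line reading `Sources null`, and a startup line with `sourceRunners:{}`. The WARN names `Twitternf`, while the source parameters are listed under `Twitter`, so the source name declared in twitter.conf may not match the prefix used for its properties. The following is a minimal Python sketch for pulling such lines out of a saved log. `find_source_warnings` is a hypothetical helper name, and the patterns are copied from the log text in this post rather than from any documented Flume log format.

```python
import re

# Pattern copied from the WARN line in the posted log (an assumption about
# the wording, not a documented Flume log format).
REMOVED_RE = re.compile(r"Configuration empty for: (\S+?)\.Removed")
NO_SOURCES_RE = re.compile(r"\] Sources null\s*$")


def find_source_warnings(log_text):
    """Return notes for log lines suggesting that no source was started."""
    notes = []
    for line in log_text.splitlines():
        removed = REMOVED_RE.search(line)
        if removed:
            notes.append(
                f"source '{removed.group(1)}' was removed during validation"
            )
        if NO_SOURCES_RE.search(line):
            notes.append("validated configuration lists no sources")
        if "sourceRunners:{}" in line:
            notes.append("agent started with no source runners")
    return notes
```

Passing the text of the saved console output to `find_source_warnings` returns one note per matching line. An empty list would mean none of these three patterns appeared.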

I am posting the output log (without the JAR-loading portion):

Info: Sourcing environment configuration script /home/gautham/Downloads/apache-flume-1.5.0.1-bin/conf/flume-env.sh
Info: Including Hadoop libraries found via (/usr/local/hadoop-2.4.1/bin/hadoop) for HDFS access
Info: Excluding /usr/local/hadoop-2.4.1/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.4.1/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
2014-10-13 02:40:15,948 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start(PollingPropertiesFileConfigurationProvider.java:61)] Configuration provider starting
2014-10-13 02:40:15,955 (lifecycleSupervisor-1-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start(PollingPropertiesFileConfigurationProvider.java:78)] Configuration provider started
2014-10-13 02:40:15,958 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:40:15,960 (conf-file-poller-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:133)] Reloading configuration file:conf/twitter.conf
2014-10-13 02:40:15,971 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,971 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1020)] Created context for HDFS: hdfs.rollCount
2014-10-13 02:40:15,972 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,972 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,972 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,972 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,973 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:930)] Added sinks: HDFS Agent: TwitterAgent
2014-10-13 02:40:15,973 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,973 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,973 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,974 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:313)] Starting validation of configuration for agent: TwitterAgent, initial-configuration: AgentConfiguration[TwitterAgent]SOURCES: {Twitter={ parameters:{consumerSecret=bVlUbZwHzCnpOfWc8MrWStzV7Mj4GUtAHex2pfLKOsgGJ3CA6T, keywords=kathi, channels=MemChannel, accessToken=1954292516-So7GAid1x2NzxQXauP6qkQ0Ha7wzyMOPXwoeNqt, consumerKey=GSmUZJz8XQsMM89d3gpJ1sdW1, type=com.cloudera.flume.source.TwitterSource, accessTokenSecret=uo126JopSBYQVBf3PaWBaMYdEiVxCONJnaTBu4tOaiMmB} }CHANNELS: {MemChannel={ parameters:{type=memory, transactionCapacity=100, capacity=10000} }}
SINKS: {HDFS={ parameters:{hdfs.batchSize=10, hdfs.path=hdfs://gautham-Lenovo-IdeaPad-Z500:54310/home/kathireal/tweets/%Y/%m/%d/%H, hdfs.writeFormat=Text, hdfs.rollSize=0, hdfs.rollCount=10000, channel=MemChannel, hdfs.fileType=DataStream, type=hdfs} }}

2014-10-13 02:40:15,984 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateChannels(FlumeConfiguration.java:468)] Created channel MemChannel
2014-10-13 02:40:15,990 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:596)] Configuration empty for: Twitternf.Removed.
2014-10-13 02:40:15,992 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSinks(FlumeConfiguration.java:674)] Creating sink: HDFS using HDFS
2014-10-13 02:40:15,996 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:371)] Post validation configuration for TwitterAgent AgentConfiguration created without Configuration stubs for which only basic syntactical validation was performed[TwitterAgent] CHANNELS: {MemChannel={ parameters:{type=memory, transactionCapacity=100, capacity=10000}}}

SINKS: {HDFS={ parameters:{hdfs.batchSize=10, hdfs.path=hdfs://gautham-Lenovo-IdeaPad-Z500:54310/home/kathireal/tweets/%Y/%m/%d/%H, hdfs.writeFormat=Text, hdfs.rollSize=0, hdfs.rollCount=10000, channel=MemChannel, hdfs.fileType=DataStream, type=hdfs} }}

2014-10-13 02:40:15,996 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:135)] Channels:MemChannel

2014-10-13 02:40:15,996 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:136)] Sinks HDFS

2014-10-13 02:40:15,997 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:137)] Sources null

2014-10-13 02:40:15,997 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:140)] Post-validation flume configuration contains configuration for agents: [TwitterAgent]
2014-10-13 02:40:15,997 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:150)] Creating channels
2014-10-13 02:40:16,009 (conf-file-poller-0) [INFO - org.apache.flume.channel.DefaultChannelFactory.create(DefaultChannelFactory.java:40)] Creating instance of channel MemChannel type memory
2014-10-13 02:40:16,017 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:205)] Created channel MemChannel
2014-10-13 02:40:16,019 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:40)] Creating instance of sink: HDFS, type: hdfs
2014-10-13 02:40:16,331 (conf-file-poller-0) [INFO - org.apache.flume.sink.hdfs.HDFSEventSink.authenticate(HDFSEventSink.java:555)] Hadoop Security enabled: false
2014-10-13 02:40:16,335 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:119)] Channel MemChannel connected to [HDFS]
2014-10-13 02:40:16,349 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:138)] Starting new configuration:{ sourceRunners:{} sinkRunners:{HDFS=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@32f5a9 counterGroup:{ name:null counters:{} } }} channels:{MemChannel=org.apache.flume.channel.MemoryChannel{name: MemChannel}} }
2014-10-13 02:40:16,375 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:145)] Starting Channel MemChannel
2014-10-13 02:40:16,505 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:119)] Monitored counter group for type: CHANNEL, name: MemChannel: Successfully registered new MBean.
2014-10-13 02:40:16,506 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:95)] Component type: CHANNEL, name: MemChannel started
2014-10-13 02:40:16,507 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:173)] Starting Sink HDFS
2014-10-13 02:40:16,511 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:119)] Monitored counter group for type: SINK, name: HDFS: Successfully registered new MBean.
2014-10-13 02:40:16,511 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:95)] Component type: SINK, name: HDFS started
2014-10-13 02:40:16,514 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:143)] Polling sink runner starting
2014-10-13 02:40:46,509 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:41:16,510 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes


2014-10-13 02:41:46,510 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:42:16,511 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:42:46,512 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:43:16,512 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:43:46,513 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:44:16,514 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:44:46,514 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:45:16,515 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:45:40,220 (agent-shutdown-hook) [INFO - org.apache.flume.lifecycle.LifecycleSupervisor.stop(LifecycleSupervisor.java:79)] Stopping lifecycle supervisor 12
2014-10-13 02:45:40,224 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:149)] Component type: CHANNEL, name: MemChannel stopped
2014-10-13 02:45:40,225 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:155)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.start.time == 1413148216506
2014-10-13 02:45:40,225 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:161)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.stop.time == 1413148540224
2014-10-13 02:45:40,225 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.capacity == 10000
2014-10-13 02:45:40,225 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.current.size == 0
2014-10-13 02:45:40,225 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.attempt == 0
2014-10-13 02:45:40,226 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.success == 0
2014-10-13 02:45:40,226 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.attempt == 42
2014-10-13 02:45:40,226 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.success == 0
2014-10-13 02:45:40,226 (agent-shutdown-hook) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.stop(PollingPropertiesFileConfigurationProvider.java:83)] Configuration provider stopping
2014-10-13 02:45:40,226 (agent-shutdown-hook) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.stop(PollingPropertiesFileConfigurationProvider.java:95)] Configuration provider stopped
2014-10-13 02:45:40,227 (agent-shutdown-hook) [DEBUG - org.apache.flume.SinkRunner.stop(SinkRunner.java:104)] Waiting for runner thread to exit
2014-10-13 02:45:40,227 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:157)] Interrupted while processing an event. Exiting.
2014-10-13 02:45:40,227 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:173)] Polling runner exiting. Metrics:{ name:null counters:{runner.interruptions=1, runner.backoffs.consecutive=42, runner.backoffs=42} }
2014-10-13 02:45:40,228 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:149)] Component type: SINK, name: HDFS stopped
2014-10-13 02:45:40,228 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:155)] Shutdown Metric for type: SINK, name: HDFS. sink.start.time == 1413148216511
2014-10-13 02:45:40,228 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:161)] Shutdown Metric for type: SINK, name: HDFS. sink.stop.time == 1413148540228
2014-10-13 02:45:40,228 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.batch.complete == 0
2014-10-13 02:45:40,228 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.batch.empty == 42
2014-10-13 02:45:40,229 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.batch.underflow == 0
2014-10-13 02:45:40,229 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.connection.closed.count == 0
2014-10-13 02:45:40,229 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.connection.creation.count == 0
2014-10-13 02:45:40,229 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.connection.failed.count == 0
2014-10-13 02:45:40,229 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.event.drain.attempt == 0
2014-10-13 02:45:40,229 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.event.drain.sucess == 0

Nothing changes in HDFS.
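
The shutdown metrics in the log above appear consistent with this. `channel.event.put.attempt == 0` suggests that nothing was ever put into `MemChannel`, while `channel.event.take.attempt == 42` and `sink.batch.empty == 42` suggest the sink kept polling an empty channel. As a rough sketch, the metrics can be collected from a saved log with a few lines of Python. The helper names `parse_shutdown_metrics` and `channel_never_received_events` are hypothetical, and the regular expression is based only on the `Shutdown Metric` lines shown in this post.

```python
import re

# Matches lines such as:
#   Shutdown Metric for type: CHANNEL, name: MemChannel. channel.capacity == 10000
METRIC_RE = re.compile(
    r"Shutdown Metric for type: (\w+), name: (\w+)\. ([\w.]+) == (\d+)"
)


def parse_shutdown_metrics(log_text):
    """Map (component type, component name, metric name) to an integer value."""
    metrics = {}
    for match in METRIC_RE.finditer(log_text):
        component_type, name, metric, value = match.groups()
        metrics[(component_type, name, metric)] = int(value)
    return metrics


def channel_never_received_events(metrics, channel="MemChannel"):
    """Return True when the channel recorded zero put attempts."""
    key = ("CHANNEL", channel, "channel.event.put.attempt")
    return metrics.get(key) == 0
```

If `channel_never_received_events` returns True for the posted log, the missing data is more likely a problem on the source side than in the HDFS sink itself.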

0 answers:

No answers