Can we access the entire Hadoop job log in the Oozie workflow XML?

Asked: 2016-10-26 23:39:46

Tags: hadoop apache-pig oozie

oozie #emailAction #hadoop

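I am running a Hadoop Pig job through an Oozie workflow. How can I access the entire log of the Hadoop job from within the workflow XML, so that I can include it in the success/failure email action?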

Thanks

Sample of the log I need in the email:

2016-10-26 13:58:30,385 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2016-10-26 13:58:30,480 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2016-10-26 13:58:30,522 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2016-10-26 13:58:30,522 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2016-10-26 13:58:30,608 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2016-10-26 13:58:30,639 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2016-10-26 13:58:30,640 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2016-10-26 13:58:30,647 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=2369469310
2016-10-26 13:58:30,648 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 3
2016-10-26 13:58:30,876 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job5719456061273645490.jar
2016-10-26 13:58:33,816 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job5719456061273645490.jar created
2016-10-26 13:58:33,834 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2016-10-26 13:58:33,865 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2016-10-26 13:58:33,896 [JobControl] WARN  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2016-10-26 13:58:34,053 [JobControl] WARN  org.apache.hadoop.conf.Configuration - fs.default.name is deprecated. Instead, use fs.defaultFS
2016-10-26 13:58:34,053 [JobControl] WARN  org.apache.hadoop.conf.Configuration - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2016-10-26 13:58:34,115 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-10-26 13:58:34,166 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 18
2016-10-26 13:58:34,367 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2016-10-26 13:58:35,007 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201610241241_0117
2016-10-26 13:58:35,007 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A
2016-10-26 13:58:35,007 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[1,4] C:  R: 
2016-10-26 13:58:35,007 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: XXX/jobdetails.jsp?jobid=job_201610241241_0117
2016-10-26 13:58:45,851 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 6% complete
2016-10-26 13:58:46,865 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 8% complete
2016-10-26 13:58:48,907 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 12% complete
2016-10-26 13:58:51,982 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 17% complete
2016-10-26 13:58:55,059 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 21% complete
2016-10-26 13:58:58,098 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 25% complete
2016-10-26 13:59:01,120 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 26% complete
2016-10-26 13:59:42,816 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 32% complete
2016-10-26 13:59:44,324 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2016-10-26 13:59:45,832 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 35% complete
2016-10-26 13:59:49,351 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 39% complete
2016-10-26 13:59:53,374 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 42% complete
2016-10-26 14:01:04,726 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-10-26 14:01:04,728 [main] INFO  org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics: 

HadoopVersion   PigVersion  UserId  StartedAt   FinishedAt  Features
2.0.0-cdh4.7.1  0.11.0-cdh4.7.1 hadoop  2016-10-26 13:58:30 2016-10-26 14:01:04 UNKNOWN

Success!

Job Stats (time in seconds):
JobId   Maps    Reduces MaxMapTime  MinMapTIme  AvgMapTime  MedianMapTime   MaxReduceTime   MinReduceTime   AvgReduceTime   MedianReducetime    Alias   Feature Outputs
job_201610241241_0117   18  0   138 24  76  79  0   0   0   0   A   MAP_ONLY    /home/hadoop/xx/xx/xx/20161015/00,

Input(s):
Successfully read 116235853 records (2369955422 bytes) from: "/home/hadoop/xx/data/xx/20161015/00/part*"

Output(s):
Successfully stored 116235853 records (5855768014 bytes) in: "/home/hadoop/xx/xx/xx/20161015/00"

Counters:
Total records written : 116235853
Total bytes written : 5855768014
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_201610241241_0117


2016-10-26 14:01:04,747 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!

1 answer:

Answer 0 (score: 0)

Based on the question and the comments, I would suggest the following:

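When the job fails, do not let it transition straight to the OK node. Instead, route it to the fail node (if you only want failures to be visible when you look at the cluster), or route it to a mail node first and from there to OK or fail, depending on your preference.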

In the mail sent by the mail node you can include the job ID. Recipients then know which job to look up on the server to find out what failed.

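Optionally you can always send a mail. In that case, use a setup that transitions to either a mailOK or a mailFail node, so that people know the process ran to completion and whether there is a failure they need to look at.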
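The routing described above could look roughly like the workflow below. This is only a sketch under assumptions: the node names (`pig-node`, `mailOK`, `mailFail`, `fail`, `end`), the script name `script.pig`, and the properties `${jobTracker}`, `${nameNode}` and `${emailTo}` are placeholders that would have to be defined in your own job.properties, and the schema versions may need adjusting for your Oozie release. Note that it does not embed the Pig log itself in the mail. It only passes along IDs and the error message through the standard workflow EL functions, so the recipient can look the job up.

```xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="pig-with-mail-wf">
    <start to="pig-node"/>

    <!-- The Pig job; success and failure each go to their own mail node -->
    <action name="pig-node">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>script.pig</script> <!-- hypothetical script name -->
        </pig>
        <ok to="mailOK"/>
        <error to="mailFail"/>
    </action>

    <!-- Success mail, then finish normally -->
    <action name="mailOK">
        <email xmlns="uri:oozie:email-action:0.1">
            <to>${emailTo}</to>
            <subject>Workflow ${wf:id()} succeeded</subject>
            <body>The Pig action completed. External job ID: ${wf:actionExternalId('pig-node')}</body>
        </email>
        <ok to="end"/>
        <error to="end"/>
    </action>

    <!-- Failure mail, then kill the workflow so it shows up as failed -->
    <action name="mailFail">
        <email xmlns="uri:oozie:email-action:0.1">
            <to>${emailTo}</to>
            <subject>Workflow ${wf:id()} failed</subject>
            <body>Failed node: ${wf:lastErrorNode()}
Error message: ${wf:errorMessage(wf:lastErrorNode())}
External job ID: ${wf:actionExternalId('pig-node')}</body>
        </email>
        <ok to="fail"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Workflow failed at node [${wf:lastErrorNode()}]</message>
    </kill>

    <end name="end"/>
</workflow-app>
```

As far as I know, for a Pig action `wf:actionExternalId` returns the ID of the Oozie launcher job rather than the `HadoopJobId` of the MapReduce job that Pig itself submits, so the recipient would normally open the launcher job's log on the cluster to find output like the sample in the question.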