My Dataflow job fails with the following exception, even though I am passing the staging, temp, and output GCS bucket locations as parameters.
Java code:
final String[] used = Arrays.copyOf(args, args.length + 1);
used[used.length - 1] = "--project=OVERWRITTEN";
final T options = PipelineOptionsFactory.fromArgs(used).withValidation().as(clazz);
options.setProject(PROJECT_ID);
options.setStagingLocation("gs://abc/staging/");
options.setTempLocation("gs://abc/temp");
options.setRunner(DataflowRunner.class);
options.setGcpTempLocation("gs://abc");
Error:
INFO: Staging pipeline description to gs://ups-heat-dev-tmp/mniazstaging_ingest_validation/staging/
May 10, 2018 11:56:35 AM org.apache.beam.runners.dataflow.util.PackageUtil tryStagePackage
INFO: Uploading <42088 bytes, hash E7urYrjAOjwy6_5H-UoUxA> to gs://ups-heat-dev-tmp/mniazstaging_ingest_validation/staging/pipeline-E7urYrjAOjwy6_5H-UoUxA.pb
Dataflow SDK version: 2.4.0
May 10, 2018 11:56:38 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Printed job specification to gs://ups-heat-dev-tmp/mniazstaging_ingest_validation/templates/DataValidationPipeline
May 10, 2018 11:56:40 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Template successfully created.
Exception in thread "main" java.lang.NullPointerException
at org.apache.beam.runners.dataflow.DataflowPipelineJob.getJobWithRetries(DataflowPipelineJob.java:501)
at org.apache.beam.runners.dataflow.DataflowPipelineJob.getStateWithRetries(DataflowPipelineJob.java:477)
at org.apache.beam.runners.dataflow.DataflowPipelineJob.waitUntilFinish(DataflowPipelineJob.java:312)
at org.apache.beam.runners.dataflow.DataflowPipelineJob.waitUntilFinish(DataflowPipelineJob.java:248)
at org.apache.beam.runners.dataflow.DataflowPipelineJob.waitUntilFinish(DataflowPipelineJob.java:202)
at org.apache.beam.runners.dataflow.DataflowPipelineJob.waitUntilFinish(DataflowPipelineJob.java:195)
at com.example.DataValidationPipeline.main(DataValidationPipeline.java:66)
Answer 0: (score: 0)
I faced the same problem; the error was thrown by p.run().waitUntilFinish();. I then tried the following code:
PipelineResult result = p.run();
System.out.println(result.getState().hasReplacementJob());
result.waitUntilFinish();
This threw the following exception:
java.lang.UnsupportedOperationException: The result of template creation should not be used.
at org.apache.beam.runners.dataflow.util.DataflowTemplateJob.getState (DataflowTemplateJob.java:67)
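To work around this, I used the following code:
PipelineResult result = pipeline.run();
try {
    // For a template result, getState() throws UnsupportedOperationException.
    result.getState();
    result.waitUntilFinish();
} catch (UnsupportedOperationException e) {
    // The run only created a template, so there is no job to wait for.
} catch (Exception e) {
    e.printStackTrace();
}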
Answer 1: (score: 0)
Today I also ran into java.lang.UnsupportedOperationException: The result of template creation should not be used. I tried to solve it by first checking whether the job is of type DataflowTemplateJob:
val (sc, args) = ContextAndArgs(cmdlineArgs)
// ...
val result = sc.run()
if (!result.isInstanceOf[DataflowTemplateJob]) result.waitUntilFinish()
I think this should work for plain Java jobs, but if you use Scio the result is some anonymous type, so in the end I had to fall back to the try/catch version as well.
try {
  val result = sc.run().waitUntilFinish()
} catch {
  case _: UnsupportedOperationException => // this happens during template creation
}
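For a plain Java job, the type check described in this answer might look roughly like the sketch below. So that it compiles without Beam on the classpath, it uses hypothetical stand-in types (`Result`, `RunningJob`, `TemplateJob`) and a hypothetical helper `waitIfNotTemplate`; none of these are Beam APIs. In a real pipeline you would presumably test the value returned by `pipeline.run()` against `org.apache.beam.runners.dataflow.util.DataflowTemplateJob` instead.

```java
public class TemplateSafeWait {

    /** Hypothetical stand-in for Beam's PipelineResult (not the real interface). */
    interface Result {
        String waitUntilFinish();
    }

    /** Hypothetical stand-in for the result of a normally submitted job. */
    static class RunningJob implements Result {
        @Override
        public String waitUntilFinish() {
            return "DONE";
        }
    }

    /** Hypothetical stand-in for DataflowTemplateJob: its result must not be used. */
    static class TemplateJob implements Result {
        @Override
        public String waitUntilFinish() {
            throw new UnsupportedOperationException(
                "The result of template creation should not be used.");
        }
    }

    /**
     * Waits for the job only when the result is not a template job.
     * Returns the final state, or null when waiting was skipped.
     */
    static String waitIfNotTemplate(Result result) {
        if (result instanceof TemplateJob) {
            // Template creation only: there is no running job to wait for.
            return null;
        }
        return result.waitUntilFinish();
    }

    public static void main(String[] args) {
        System.out.println(waitIfNotTemplate(new RunningJob()));
        System.out.println(waitIfNotTemplate(new TemplateJob()));
    }
}
```

Compared with catching `UnsupportedOperationException`, the type check does not swallow the same exception when it comes from an unrelated source, but it only helps when the concrete result type is visible, which, as noted above, is not the case with Scio.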