Why doesn't WriteToBigQuery show any errors?

Asked: 2019-06-17 09:54:14

Tags: google-cloud-dataflow apache-beam

In the end, after discovering that my schema was incorrect, I managed to upload the data to BQ. However, it was very hard to debug, because no logs showed up on the DirectRunner. How can I debug WriteToBigQuery when I have, for example, a schema error?

My code:

lines = messages | 'decode' >> beam.Map(lambda x: x.decode('utf-8'))

output = (lines
          | 'process' >> beam.FlatMap(lambda xml: [jsons.dump(model) for model in process_xmls(xml)])
          | beam.WindowInto(window.FixedWindows(1, 0)))

output | 'Write to BigQuery' >> beam.io.WriteToBigQuery(
    table='dataflow.test_V1',
    schema=fp_schema,
    create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
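One sanity check that can catch a bad schema before the pipeline even runs: WriteToBigQuery accepts a comma-separated `'name:TYPE'` schema string, so a small pure-Python helper can compare a sample row against it. This is a debugging sketch, not part of Beam; the schema string and row fields below are hypothetical stand-ins for `fp_schema` and the output of `process_xmls`:

```python
def parse_schema(schema_str):
    """Parse a 'name:TYPE,name:TYPE' BigQuery schema string into a dict."""
    return dict(field.split(':') for field in schema_str.split(','))

def check_row(row, schema):
    """Return the set of row keys that are absent from the schema."""
    return set(row) - set(schema)

# Hypothetical schema and row, for illustration only.
fp_schema = 'id:STRING,price:FLOAT,created:TIMESTAMP'
schema = parse_schema(fp_schema)
print(check_row({'id': '1', 'price': 9.5, 'colour': 'red'}, schema))  # → {'colour'}
```

Running this over a handful of sample rows locally is much faster than waiting for silent insert failures at write time.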

1 answer:

Answer 0 (score: 1)

The beam.io.WriteToBigQuery PTransform returns a dictionary whose BigQueryWriteFn.FAILED_ROWS entry contains a PCollection of all the rows that failed to be written. The errors themselves are logged at https://github.com/apache/beam/blob/release-2.13.0/sdks/python/apache_beam/io/gcp/bigquery.py#L861, so they should also show up in the worker logs.