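Beam Dataflow Python: name 'PipelineOptions' is not defined

Date: 2017-05-09 14:15:05

Tags: python pipeline dataflow beam

This is my first question, and I have been stuck for a while trying to find an answer. I want to create a very simple pipeline, but I am already stuck at the very beginning. Here is my code: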

import apache_beam as beam
options = PipelineOptions()
google_cloud_options = options.view_as(GoogleCloudOptions)
google_cloud_options.project = 'myproject'
google_cloud_options.job_name = 'mypipe'
google_cloud_options.staging_location = 'gs://mybucket/staging'
google_cloud_options.temp_location = 'gs://mybucket/temp'
options.view_as(StandardOptions).runner = 'DataflowRunner'

This produces the error: NameError: name 'PipelineOptions' is not defined

Thanks for your help.

3 answers:

Answer 0 (score: 1)

The options module has moved from apache_beam.utils to apache_beam.options.

You should now use:

from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.pipeline_options import SetupOptions
from apache_beam.options.pipeline_options import GoogleCloudOptions
from apache_beam.options.pipeline_options import StandardOptions

Official documentation: https://beam.apache.org/releases/pydoc/2.0.0/_modules/apache_beam/options/pipeline_options.html
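As a rough sketch of how the question's snippet might look with these imports in place, the version below wraps the setup in a function. The function name `build_dataflow_options` and its parameters are made up for illustration, and passing `flags=[]` is an added assumption so that the options ignore the command line; the project, job, and bucket values are the placeholders from the question.

```python
def build_dataflow_options(project, job_name, staging_location, temp_location):
    """Return PipelineOptions configured for the Dataflow runner.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Import path from the answer above (apache_beam.options, not apache_beam.utils).
    from apache_beam.options.pipeline_options import (
        GoogleCloudOptions,
        PipelineOptions,
        StandardOptions,
    )

    # Assumption: flags=[] so that nothing is parsed from sys.argv.
    options = PipelineOptions(flags=[])

    # Same assignments as in the question, now with the names defined.
    google_cloud_options = options.view_as(GoogleCloudOptions)
    google_cloud_options.project = project
    google_cloud_options.job_name = job_name
    google_cloud_options.staging_location = staging_location
    google_cloud_options.temp_location = temp_location
    options.view_as(StandardOptions).runner = 'DataflowRunner'
    return options
```

It could then be called with the values from the question, for example `build_dataflow_options('myproject', 'mypipe', 'gs://mybucket/staging', 'gs://mybucket/temp')`.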

Answer 1 (score: 0)

You need to add a few more imports for the example to work:

from apache_beam.io import ReadFromText
from apache_beam.io import WriteToText
from apache_beam.metrics import Metrics
from apache_beam.utils.pipeline_options import PipelineOptions
from apache_beam.utils.pipeline_options import SetupOptions
from apache_beam.utils.pipeline_options import GoogleCloudOptions
from apache_beam.utils.pipeline_options import StandardOptions
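The answers in this thread give different module paths for the same classes, which suggests the right import depends on the installed SDK version. If the code has to run against more than one version, one possible approach is to try each candidate path in turn. This is only a sketch: `import_first` and `BEAM_OPTION_MODULES` are hypothetical names, not part of Beam, and the candidate list simply repeats the two paths mentioned in the answers.

```python
import importlib


def import_first(name, module_paths):
    """Return attribute `name` from the first module in `module_paths` that has it."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            continue  # this path does not exist in the installed version; try the next
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(
        "%s not found in any of: %s" % (name, ", ".join(module_paths)))


# Candidate locations taken from the answers in this thread (assumption:
# the apache_beam.options path is the newer one, so it is tried first).
BEAM_OPTION_MODULES = [
    "apache_beam.options.pipeline_options",
    "apache_beam.utils.pipeline_options",
]
```

With Beam installed, `PipelineOptions = import_first("PipelineOptions", BEAM_OPTION_MODULES)` would bind the class from whichever path exists. When only one SDK version is targeted, a plain `from ... import ...` line is simpler and easier to read.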

Answer 2 (score: 0)

from apache_beam.pipeline import PipelineOptions
options = PipelineOptions()