How do I configure a Flink job at runtime?

Time: 2017-06-28 12:56:37

Tags: apache-flink flink-streaming

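Is it possible to configure a Flink application at runtime? For example, I have a streaming application that reads input, applies some transformations, and then filters out all elements below a certain threshold. However, I would like this threshold to be configurable at runtime, meaning I can change it without restarting the Flink job. Example code: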

DataStream<MyModel> myModelDataStream = // get input ...
                // do some stuff ...
                .filter(new RichFilterFunction<MyModel>() {
                    @Override
                    public boolean filter(MyModel value) throws Exception {
                        return value.someValue() > someGlobalState.getThreshold();
                    }
                })
                // write to some sink ...

DataStream<MyConfig> myConfigDataStream = // get input ...
                // ...
                .process(new RichProcessFunction<MyConfig>() {
                      someGlobalState.setThreshold(MyConfig.getThreshold());
                })
                // ...

Is there a way to achieve this? Something like a global state that can be changed through a configuration stream.

1 answer:

Answer 0: (score: 4)

Yes, you can do this with a RichCoFlatMapFunction. Roughly like this:

DataStream<MyModel> myModelDataStream = // get input ...
DataStream<Long> controlStream = // get input ...

DataStream<MyModel> result = controlStream
  .broadcast()
  .connect(myModelDataStream)
  .flatMap(new MyCoFlatMap());

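import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction;
import org.apache.flink.util.Collector;

public class MyCoFlatMap extends RichCoFlatMapFunction<Long, MyModel, MyModel> {
    // Latest threshold seen by this parallel instance; null until the first
    // control element arrives. A plain field is used because the connected
    // streams are not keyed, and keyed state (ValueState) can only be used
    // after keyBy(). Note that this field is not checkpointed.
    private Long threshold;

    // Input 1: the broadcast control stream. Every parallel instance receives
    // each new threshold and remembers it.
    @Override
    public void flatMap1(Long newThreshold, Collector<MyModel> out) {
        threshold = newThreshold;
    }

    // Input 2: the data stream. Elements pass through while no threshold is
    // known, and afterwards only if they exceed the current threshold.
    @Override
    public void flatMap2(MyModel model, Collector<MyModel> out) {
        if (threshold == null || model.getData() > threshold) {
            out.collect(model);
        }
    }
}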
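
The filtering logic of the two callbacks can be checked without a Flink cluster. What follows is only a minimal plain-Java sketch, not Flink API: `ThresholdFilter`, `onControl`, `onData` and `emitted` are hypothetical names that stand in for the co-flat-map, its `flatMap1`, its `flatMap2` and the `Collector`, and a bare `long` stands in for `MyModel.getData()`. Elements are fed in by hand in a fixed order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the two-input operator; it has no Flink dependencies.
public class ThresholdFilter {
    // Current threshold; null until the first control element is seen.
    private Long threshold;
    // Stands in for the Collector: records every element that passes the filter.
    private final List<Long> emitted = new ArrayList<>();

    // Mirrors flatMap1: a control element replaces the threshold.
    public void onControl(Long newThreshold) {
        threshold = newThreshold;
    }

    // Mirrors flatMap2: emit while no threshold is known, or if the value exceeds it.
    public void onData(long value) {
        if (threshold == null || value > threshold) {
            emitted.add(value);
        }
    }

    public List<Long> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        ThresholdFilter filter = new ThresholdFilter();
        filter.onData(3L);      // no threshold yet, so 3 passes
        filter.onControl(5L);   // threshold becomes 5
        filter.onData(3L);      // 3 is not above 5, dropped
        filter.onData(7L);      // 7 is above 5, passes
        filter.onControl(10L);  // threshold raised to 10 while "running"
        filter.onData(7L);      // 7 is not above 10, dropped
        System.out.println(filter.emitted()); // prints [3, 7]
    }
}
```

In a real job Flink gives no ordering guarantee between the two connected inputs, so data elements that arrive before the first control element pass through unfiltered, as the first call in `main` illustrates.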