Shortest Pig script that will use Accumulator

Time: 2016-05-17 11:04:52

Tags: apache-pig udf

I'm adding an Accumulator implementation to a Pig UDF, and I want to test it.

What is the shortest and simplest Pig script that will use the accumulator?

For simplicity's sake, assume that it will load a file with N integers, where N > pig.accumulative.batchsize so that the accumulate() method will be called more than once.

data = LOAD 'input' AS (val1:int);

output = ... (code which uses the UDF comes here)

STORE output INTO 'output';

1 answer:

Answer 0: (score: 0)

It looks like this is enough:

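data = LOAD 'input' AS (val1:int);

-- Group everything into one bag so the UDF receives all values; "output" is a
-- reserved word in Pig Latin, so the alias is named result instead.
result = FOREACH (GROUP data ALL) GENERATE ACCUMULATIVE_UDF(data.val1);

STORE result INTO 'output';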
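
To check the accumulator logic itself before running it through Pig, the calling pattern can be imitated in plain Java. This is only a rough sketch under stated assumptions: `BatchAccumulator`, `SumAccumulator` and `AccumulatorSketch` are hypothetical names, the interface is a simplified stand-in for `org.apache.pig.Accumulator` (which receives a `Tuple` holding a bag rather than a list), and the batching loop mimics what `pig.accumulative.batchsize` is assumed to do, not Pig's actual executor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for org.apache.pig.Accumulator<T>; the real interface
// receives a Tuple wrapping a DataBag, simplified here to a list of ints.
interface BatchAccumulator<T> {
    void accumulate(List<Integer> batch);
    T getValue();
    void cleanup();
}

// Hypothetical UDF: sums values and counts how often accumulate() is called.
class SumAccumulator implements BatchAccumulator<Long> {
    private long sum = 0;
    int calls = 0;

    @Override
    public void accumulate(List<Integer> batch) {
        calls++;
        for (int v : batch) sum += v;
    }

    @Override
    public Long getValue() { return sum; }

    @Override
    public void cleanup() { sum = 0; }
}

class AccumulatorSketch {
    // Feeds one group's values to the accumulator in chunks of batchSize,
    // imitating how Pig is assumed to drive an Accumulator UDF.
    static long run(SumAccumulator udf, List<Integer> values, int batchSize) {
        for (int i = 0; i < values.size(); i += batchSize) {
            int end = Math.min(i + batchSize, values.size());
            udf.accumulate(values.subList(i, end));
        }
        long result = udf.getValue();
        udf.cleanup(); // reset state before the next group would start
        return result;
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>();
        for (int i = 1; i <= 10; i++) values.add(i);
        SumAccumulator udf = new SumAccumulator();
        long total = run(udf, values, 4);
        // prints "sum=55 accumulate() calls=3"
        System.out.println("sum=" + total + " accumulate() calls=" + udf.calls);
    }
}
```

With 10 values and a batch size of 4, accumulate() is called three times, which is the same situation as having N > pig.accumulative.batchsize in the real script.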