How to concatenate tuples that are inside a bag

Time: 2019-07-12 02:49:06

Tags: apache-pig

Data:

1   a   1
1   b   2
2   c   3
2   a   4

Using the following commands:

record = LOAD 'test_in' AS (id:int, company:chararray, rank:chararray);
grp = GROUP record BY id;

I get:

(1,{(b,2),(a,1),(d,1)})
(2,{(a,4),(c,3)})

I want to get the following result:

(1,b:2_a:1)
(2,a:4_c:3)

The following code returns an error:

newdata = FOREACH grp GENERATE group AS id,
      BagToString(CONCAT(record.$1, CONCAT(':', record.$2))) AS company;

The error message is:

[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: Expecting a bag of tuples: {()}, found data type: bytearray

1 Answer:

Answer 0: (score: 0)

How about splitting the concat step into two stages?

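A = LOAD 'input.txt' AS (id:int, company:chararray, rank:chararray);
B = GROUP A BY id;
C = FOREACH B {
    -- stage 1: build one "company:rank" string per tuple in the group
    C2 = FOREACH A GENERATE CONCAT(CONCAT(company, ':'), rank);
    GENERATE group AS id, C2;
}
-- stage 2: join the strings in each bag (BagToString uses '_' by default)
D = FOREACH C GENERATE id, BagToString(C2);
STORE D INTO 'myfile';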

Output:

(1,b:2_a:1)
(2,a:4_c:3)
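
As a variation on the script above, the separator can be passed explicitly as the second argument of BagToString, and the tuples can be sorted inside each group before joining, since Pig does not guarantee the order of tuples within a bag. This is only a minimal sketch, assuming the same input file and schema as above; the alias names (sorted, pairs, companies) are made up for illustration, and it has not been run against a specific Pig version.

```pig
A = LOAD 'input.txt' AS (id:int, company:chararray, rank:chararray);
B = GROUP A BY id;
C = FOREACH B {
    -- sort within each group so the joined string has a predictable order
    sorted = ORDER A BY company;
    -- build one "company:rank" string per tuple
    pairs = FOREACH sorted GENERATE CONCAT(CONCAT(company, ':'), rank);
    -- join the strings, naming the delimiter explicitly instead of relying on the default
    GENERATE group AS id, BagToString(pairs, '_') AS companies;
}
DUMP C;
```

With the sort on company, the pairs within each id should come out in alphabetical order by company rather than in the order shown in the output above.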