Data:
1 a 1
1 b 2
2 c 3
2 a 4
I ran the following commands:
record = LOAD 'test_in' AS (id:int, company:chararray, rank:chararray);
grp = GROUP record BY id;
This gives me:
(1,{(b,2),(a,1),(d,1)})
(2,{(a,4),(c,3)})
I want to get the following result:
(1,b:2_a:1)
(2,a:4_c:3)
The following code returns an error:
newdata = FOREACH grp GENERATE group AS id,
BagToString(CONCAT(record.$1, CONCAT(':', record.$2))) AS company;
The error message is:
[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: Expected a bag of tuples: {()}, found data type: bytearray
Answer 0 (score: 0)
How about splitting the concat step into two stages?
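A = LOAD 'input.txt' AS (id:int, company:chararray, rank:chararray);
B = GROUP A BY id;
-- Stage 1: inside each group, build one "company:rank" string per tuple.
C = FOREACH B {
C2 = FOREACH A GENERATE CONCAT(CONCAT(company, ':'), rank);
GENERATE group as id, C2;
}
-- Stage 2: join the strings in each bag (BagToString uses '_' by default).
D = FOREACH C GENERATE id, BagToString(C2);
STORE D into 'myfile';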
Output:
(1,b:2_a:1)
(2,a:4_c:3)
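One caveat: Pig does not guarantee the order of tuples inside a bag, so the order of the joined pieces can vary from run to run. A minimal sketch of one way to make it stable, assuming the desired order is descending rank (which matches the expected output above), is to sort inside the nested block before concatenating. The aliases ordered and pairs are illustrative names, and the '_' delimiter is passed explicitly here only for clarity:

```pig
A = LOAD 'input.txt' AS (id:int, company:chararray, rank:chararray);
B = GROUP A BY id;
C = FOREACH B {
    -- assumption: descending rank is the order wanted in the output string
    ordered = ORDER A BY rank DESC;
    -- build one "company:rank" string per tuple, as in the answer above
    pairs = FOREACH ordered GENERATE CONCAT(CONCAT(company, ':'), rank);
    -- join the strings with an explicit '_' delimiter
    GENERATE group AS id, BagToString(pairs, '_') AS company;
};
DUMP C;
```

Because rank is declared as chararray, the sort is lexicographic; that is fine for single-digit ranks like the sample data, but multi-digit ranks would probably need rank loaded as int and cast back to chararray before the CONCAT.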