Pig: converting tuple data to a map?

Time: 2013-10-07 13:08:17

Tags: dictionary tuples apache-pig

Hello, can anyone guide me on converting tuple data to a map using Pig?

I have the following data:

2013-09-24 19:58:04.440 server120 TRSID=20,RID=e7a8-244ce04-03b6k8962890k,EXTID=e7a8a921-244c-e043-03b6k8962890k

I tried to convert the data to a map as shown below, but the conversion does not work and fails with a "Tupletomapfailed error".

package mymapudf;
import java.io.IOException;
import org.apache.pig.EvalFunc;

import java.util.Map;
import java.util.Map.Entry;

import org.apache.pig.data.Tuple;

public class udfmap extends EvalFunc<Map>
{
    public Map exec(Tuple input) throws IOException 
    {
        try 
        {
           //DataBag values = (DataBag)input.get(1);
           //String value = (String)input.get(1);
           //Map<Object, Object> m = new HashMap<Object, Object>();
           Map<String,String> m2 = (Map<String, String>) input.get(1);
           for (Entry<String, String> entry:m2.entrySet()) 
           {
               m2.put(entry.getKey(),entry.getValue());
               //Tuple tuple = TupleFactory.newTuple(2);
           }
           return m2;
       } 
       catch(Exception e) 
       {
            throw new RuntimeException("Tupletomapfailed error", e);
       }
    }
}

Could you suggest how to achieve the required map conversion for the remaining data, leaving the timestamp aside?

Schema:

data_load = LOAD '/user/uk01/test.log'  USING PigStorage(' ') AS ( dt:chararray, nodes:tuple(servername:chararray,TRIS:chararray,EXTID:chararray));

gen_test = FOREACH data_load GENERATE mymapudf.udfmap(nodes);
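One thing that may be worth checking is how the sample record lines up with the declared schema. The sketch below is plain Java, not Pig: it only splits the sample line on a single space, the same delimiter that is passed to PigStorage in the LOAD statement. The names SpaceSplitDemo and splitOnSpace are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

public class SpaceSplitDemo {
    // Splits one log line on single spaces, the delimiter used in the LOAD statement.
    public static List<String> splitOnSpace(String line) {
        return Arrays.asList(line.split(" "));
    }

    public static void main(String[] args) {
        String line = "2013-09-24 19:58:04.440 server120 "
                + "TRSID=20,RID=e7a8-244ce04-03b6k8962890k,EXTID=e7a8a921-244c-e043-03b6k8962890k";
        List<String> fields = splitOnSpace(line);
        // Print each field with its position so the layout is easy to compare with the schema.
        for (int i = 0; i < fields.size(); i++) {
            System.out.println(i + ": " + fields.get(i));
        }
    }
}
```

Splitting this way gives four fields (date, time, server name, and the key=value list), while the schema declares only two (dt and nodes). If Pig splits the line the same way, the nodes field would not line up with the key=value list, so the schema may need one column per space-separated field.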

Pig Stack Trace:

ERROR 2999: Unexpected internal error. Tupletomapfailed error

java.lang.RuntimeException: Tupletomapfailed error
        at mymapudf.udfmap.exec(udfmap.java:30)
        at mymapudf.udfmap.exec(udfmap.java:1)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:337)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:428)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:352)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:308)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:95)
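The trace shows only the wrapped RuntimeException, so the underlying cause is not visible here. Two things in the UDF may be worth checking: it reads input.get(1) although the script passes a single argument (which would sit at index 0), and it casts the field to Map although the key=value list may arrive as text. A minimal sketch of the parsing step follows, in plain Java with no Pig dependency. KeyValueParser and parse are hypothetical names, and the sketch assumes the last field is a comma-separated list of key=value pairs, as in the sample record.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class KeyValueParser {
    // Parses "k1=v1,k2=v2" into an insertion-ordered map.
    // Pieces without a key before '=' are skipped (an assumption made for this sketch).
    public static Map<String, String> parse(String text) {
        Map<String, String> result = new LinkedHashMap<>();
        if (text == null || text.isEmpty()) {
            return result;
        }
        for (String pair : text.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // no '=' or empty key: ignore this piece
            }
            String key = pair.substring(0, eq).trim();
            String value = pair.substring(eq + 1).trim();
            result.put(key, value);
        }
        return result;
    }

    public static void main(String[] args) {
        String kv = "TRSID=20,RID=e7a8-244ce04-03b6k8962890k,"
                + "EXTID=e7a8a921-244c-e043-03b6k8962890k";
        System.out.println(parse(kv));
    }
}
```

Inside an EvalFunc, exec could pass the string form of the field to a helper like this and return the resulting map. That wiring is not shown here, and it would also be worth rethrowing or logging the original exception message so the real cause shows up in the trace.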

0 answers:

No answers