MapReduce in Java: computing in-degree and out-degree and displaying the totals

Date: 2018-10-21 12:14:54

Tags: java mapreduce

I am trying to compute the in-degree and out-degree totals for a set of data. Here is the sample data:

Source  Target

1        2  
2        1  
3        1  
2        3  

So the expected output is:

ID     In degree   Out degree  
1       2            1  
2       1            2  
3       1            1  

How can I do this with MapReduce in Java and print the result for each node on a single line?

1 answer:

Answer 0: (score: 0)

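One option that needs only a single MR job, assuming each record of the original dataset looks like [node1, node2]:

- The mapper reads the original dataset and, for each line, emits the triples [node1, node2, out] and [node2, node1, in].

- The reducer receives the triples from the mapper in the form [key, node, label], computes the outdegree and indegree by counting the "out" labels and the "in" labels separately for each key, and outputs them in the form [key, indegree, outdegree].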
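To see what these two steps produce before bringing Hadoop into it, the flow can be simulated in plain Java. This is only a sketch: the class name `DegreeSketch` and its `map`, `reduce`, and `run` helpers are hypothetical, and an in-memory `TreeMap` stands in for the shuffle that groups values by key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DegreeSketch {

    // "Map" step: each edge [node1, node2] yields two tagged records,
    // one keyed by the source (out) and one keyed by the target (in).
    static List<String[]> map(String line) {
        List<String[]> emitted = new ArrayList<>();
        String[] parts = line.trim().split("\\s+");
        if (parts.length < 2) {
            return emitted; // skip blank or malformed lines
        }
        emitted.add(new String[] {parts[0], "out_" + parts[1]});
        emitted.add(new String[] {parts[1], "in_" + parts[0]});
        return emitted;
    }

    // "Reduce" step: count the in/out tags collected for one key.
    // Returns "indegree outdegree".
    static String reduce(List<String> values) {
        int ins = 0;
        int outs = 0;
        for (String v : values) {
            if (v.startsWith("out_")) {
                outs++;
            } else if (v.startsWith("in_")) {
                ins++;
            }
        }
        return ins + " " + outs;
    }

    // Runs the map step, groups by key (stand-in for the shuffle),
    // then reduces each group.
    static Map<String, String> run(List<String> lines) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (String[] kv : map(line)) {
                grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
            }
        }
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // The four edges from the question's sample data.
        List<String> edges = Arrays.asList("1 2", "2 1", "3 1", "2 3");
        run(edges).forEach((id, degrees) -> System.out.println(id + " " + degrees));
    }
}
```

With the sample edges from the question this prints `1 2 1`, `2 1 2` and `3 1 1`, which matches the expected ID / in-degree / out-degree table.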

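An implementation would look similar to the following (assuming node1 and node2 are separated by whitespace in the dataset, and also assuming the dataset contains only distinct pairs):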

Mapper:

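import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class YourMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, Text> {

    public void map(LongWritable key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {

        // Split on any run of whitespace so that padded columns still parse.
        String[] line_spl = value.toString().trim().split("\\s+");

        // Skip blank or malformed lines instead of failing on line_spl[1].
        // Note: a header line such as "Source Target" is not filtered here
        // and would be counted as an edge.
        if (line_spl.length < 2) {
            return;
        }

        String node1 = line_spl[0];
        String node2 = line_spl[1];

        Text node1_txt = new Text(node1);
        Text node2_txt = new Text(node2);
        Text emit_out = new Text("out_" + node2); // node1 has an outgoing edge to node2
        Text emit_in  = new Text("in_"  + node1); // node2 has an incoming edge from node1

        output.collect(node1_txt, emit_out);
        output.collect(node2_txt, emit_in);

    }//end map function

}//end mapper class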

Reducer:

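import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class YourReducer extends MapReduceBase implements Reducer<Text, Text, Text, Text> {

    public void reduce(Text key, Iterator<Text> values, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {

        int count_outs = 0;
        int count_ins  = 0;

        // Each value is tagged by the mapper as "out_<node>" or "in_<node>".
        while (values.hasNext()) {

            String value_str = values.next().toString();

            if (value_str.startsWith("out_")) {
                count_outs++;
            } else if (value_str.startsWith("in_")) {
                count_ins++;
            }

        }

        // Emit one line per node: key, then "indegree outdegree".
        Text out = new Text(count_ins + " " + count_outs);
        output.collect(key, out);

    }//end reduce function

}//end reducer class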
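The answer assumes the dataset contains only distinct pairs. If that assumption might not hold, one possible adjustment is to collect the tagged values for a key into a set before counting, so that a repeated edge is counted once. The following is a plain-Java sketch of just that counting step, not part of the original answer; the class name `DistinctDegreeCount` and the method `countDistinct` are hypothetical. Inside a real reducer the values arrive through the `Iterator<Text>`, and since Hadoop may reuse the same `Text` instance between iterations, each value would need to be copied with `toString()` before being stored.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DistinctDegreeCount {

    // Counts distinct "in_"/"out_" tagged neighbours for a single key.
    // Returns "indegree outdegree", the same value format the reducer emits.
    static String countDistinct(List<String> taggedValues) {
        Set<String> seen = new HashSet<>(taggedValues); // drops repeated edges
        int ins = 0;
        int outs = 0;
        for (String v : seen) {
            if (v.startsWith("out_")) {
                outs++;
            } else if (v.startsWith("in_")) {
                ins++;
            }
        }
        return ins + " " + outs;
    }

    public static void main(String[] args) {
        // Values node 2 would receive if the edge "2 3" appeared twice in the input.
        List<String> valuesForNode2 = Arrays.asList("in_1", "out_1", "out_3", "out_3");
        System.out.println(countDistinct(valuesForNode2)); // prints "1 2"
    }
}
```

The trade-off is memory: the set holds every distinct neighbour of a key at once, which may be a concern for nodes with very high degree.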