Mapper and reducer code not working correctly

Time: 2018-04-08 18:14:01

Tags: python mapreduce hadoop-streaming

Mapper code:

#!/usr/bin/python

import sys

for line in sys.stdin:
    data = line.strip().split("\t")
    if len(data) == 6:
        date, time, place, temp, pressure, humidity = data
        print "{0}\t{1}".format(place, temp)

Reducer code:

#!/usr/bin/python

import sys

max_val = -sys.maxfloat
oldKey = None


for line in sys.stdin:
    data_mapped = line.strip().split("\t")
    if len(data_mapped) != 2:
        # Something has gone wrong. Skip this line.
        continue

    thisKey, thisVal = data_mapped

    if oldKey and oldKey != thisKey:
        print oldKey, "\t", max_val

        (oldKey, max_val) = (thisKey, float(thisVal))   
    else:
    (oldKey, max_val) = (thisKey, max(max_val, float(thisVal)))

if oldKey != None:
    print oldKey, "\t", max_val

The reducer code does not run. I have tried everything. The mapper runs correctly, but the reducer does not.
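Two things in the reducer as posted would stop it before it processes any input: the `sys` module has no `maxfloat` attribute (the largest float is `sys.float_info.max`), and the statement under `else:` is not indented, which is a syntax error. Below is a minimal sketch of a corrected reducer, not a definitive fix. It assumes the input lines are already sorted by key, as Hadoop Streaming delivers them, and the helper name `max_per_key` is made up for illustration. It starts each group from the first value seen, so no sentinel is needed, and it writes key and value separated by a single tab.

```python
#!/usr/bin/python
from __future__ import print_function

import sys


def max_per_key(lines):
    """Yield (key, max value) pairs; assumes lines arrive sorted by key."""
    old_key = None
    max_val = None
    for line in lines:
        data_mapped = line.strip().split("\t")
        if len(data_mapped) != 2:
            # Malformed line: skip it, as the original reducer does.
            continue
        this_key, this_val = data_mapped[0], float(data_mapped[1])
        if old_key is not None and old_key != this_key:
            # Key changed: emit the finished group, start a new one.
            yield old_key, max_val
            max_val = this_val
        elif old_key is None:
            # Very first valid line.
            max_val = this_val
        else:
            max_val = max(max_val, this_val)
        old_key = this_key
    if old_key is not None:
        # Emit the last group.
        yield old_key, max_val


if __name__ == "__main__":
    for key, value in max_per_key(sys.stdin):
        print("{0}\t{1}".format(key, value))
```

Keeping the grouping logic in a generator makes it possible to check it locally on a short list of strings before submitting the job.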

My dataset looks like this:

2012-01-01  09:00   San Jose    28  214.05  25
2012-01-01  09:00   Fort Worth  19  153.57  32
2012-01-01  09:00   San Diego   0   66.08   35
2012-01-01  09:00   Pittsburgh  18  493.51  28
2012-01-01  09:00   Omaha   10  235.63  32
2012-01-01  09:00   Stockton    28  247.18  32
2012-01-01  09:00   Austin  44  379.6   32
2012-01-01  09:00   New York    26  296.8   35
2012-01-01  09:00   Corpus Christi  32  25.38   28
2012-01-01  09:00   Fort Worth  32  213.88  32
2012-01-01  09:00   Las Vegas   21  53.26   32
2012-01-01  09:00   Newark  21  39.75   35
2012-01-01  09:00   Austin  44  469.63  32
2012-01-01  09:00   Greensboro  32  290.82  32
2012-01-01  09:00   San Francisco   0   260.65  28
2012-01-01  09:00   Lincoln 33  136.9   32
2012-01-01  09:00   Buffalo 19  483.82  32
2012-01-01  09:00   San Jose    19  215.82  35
2012-01-01  09:00   Boston  44  418.94  25

The columns are, in order: date, time, place, temperature, pressure, humidity.
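One thing worth checking against this column layout: the mapper splits on tab characters, while the rows shown above appear with runs of spaces (possibly just an artifact of how the page rendered them). If the file on disk really uses spaces, every line would fail the `len(data) == 6` test and the mapper would emit nothing. The sketch below illustrates the difference. `parse_record` is a hypothetical helper that mirrors the mapper's logic, and the tab-separated line is an assumed form of the first sample row.

```python
def parse_record(line):
    """Return (place, temp) for a line with six tab-separated fields, else None."""
    data = line.strip().split("\t")
    if len(data) != 6:
        return None
    date, time, place, temp, pressure, humidity = data
    return place, temp


# Assumed tab-separated form of the first sample row.
tab_line = "2012-01-01\t09:00\tSan Jose\t28\t214.05\t25\n"
# The same row as it is displayed above, separated by spaces.
space_line = "2012-01-01  09:00   San Jose    28  214.05  25\n"

print(parse_record(tab_line))    # ('San Jose', '28')
print(parse_record(space_line))  # None
```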

0 answers:

No answers.