Hive Python UDF error

Time: 2017-05-16 17:10:18

Tags: python hadoop hive

I have a simple Python script:

#!/usr/local/bin/python
import sys
import datetime
for line in sys.stdin:
line = line.strip()
fname , lname = line.split('\t')
l_name = lname.lower()
print '\t'.join([fname, str(l_name)])

The Hive table data looks like this:

Akash   Gupta
Ashish  Agarwal
Aarav   Kedia
Rajesh  Lakhia
Sunita  Patel
Raj     Dutta
Nadeem  Siddiqui

The table structure is:

hive> desc fullName;
OK
fname                   string
lname                   string

I added my Python script with:

add FILE /full-path-to-the-script/convertToLowerCase.py;

Now I am running a TRANSFORM operation with the script:

SELECT TRANSFORM(fname, lname) USING 'python convertToLowerCase.py' AS (fname, l_name) FROM fullName;

However, the MapReduce job throws an error: FAILED: Execution Error, return code 20003 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. An error occurred when trying to close the Operator running your custom script.

What am I doing wrong?

1 answer:

Answer 0 (score: 0)

The problem was in the Python code: the body of the for loop was not indented. Fixing the indentation of the for loop solved the issue.
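
For reference, here is a minimal sketch of what the script might look like with the loop body indented. It assumes indentation was the only problem. The helper names `convert_line` and `main` are hypothetical and not part of the original script; `sys.stdout.write` is swapped in for the `print` statement so the sketch runs on both Python 2 and Python 3, and the unused `datetime` import is dropped.

```python
#!/usr/local/bin/python
import sys


def convert_line(line):
    """Lower-case the second tab-separated field of one input row.

    Hypothetical helper (not in the original script). Like the original,
    it expects exactly two tab-separated fields per row.
    """
    fname, lname = line.strip().split('\t')
    return '\t'.join([fname, lname.lower()])


def main():
    # Hive TRANSFORM streams each row to the script's stdin as tab-separated text.
    for line in sys.stdin:
        # Every statement in the loop body must be indented under the `for`;
        # otherwise Python fails before processing any rows.
        sys.stdout.write(convert_line(line) + '\n')


if __name__ == '__main__':
    main()
```

Piping a few tab-separated lines into the script from a shell before running the Hive query is a quick way to catch this kind of error, since the Python traceback is shown directly instead of being hidden behind return code 20003.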