Exporting .csv data from HDFS to Oracle using Sqoop

Date: 2019-03-05 10:15:17

Tags: oracle apache-spark hadoop hdfs sqoop

We have written an Apache Spark job that stores a dataset in HDFS in .csv format. We are trying to export the .csv data to Oracle using Sqoop, but we get the following error:

INFO mapreduce.Job:  map 0% reduce 0%
19/03/05 04:59:44 INFO mapreduce.Job:  map 100% reduce 0%
19/03/05 04:59:44 INFO mapreduce.Job: Job job_1550654261086_0043 failed with state FAILED due to: Task failed task_1550654261086_0043_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

19/03/05 04:59:44 INFO mapreduce.Job: Counters: 9
        Job Counters
                Failed map tasks=1
                Killed map tasks=1
                Launched map tasks=2
                Data-local map tasks=2
                Total time spent by all maps in occupied slots (ms)=71040
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=7104
                Total vcore-milliseconds taken by all map tasks=170496
                Total megabyte-milliseconds taken by all map tasks=727449600
19/03/05 04:59:44 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
19/03/05 04:59:44 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 13.2314 seconds (0 bytes/sec)
19/03/05 04:59:44 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
19/03/05 04:59:44 INFO mapreduce.ExportJobBase: Exported 0 records.
19/03/05 04:59:44 ERROR mapreduce.ExportJobBase: Export job failed!
19/03/05 04:59:44 DEBUG util.ClassLoaderStack: Restoring classloader: sun.misc.Launcher$AppClassLoader@36aa7bc2
19/03/05 04:59:44 ERROR tool.ExportTool: Error during export:
Export job failed!
        at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:445)
        at org.apache.sqoop.manager.SqlManager.updateTable(SqlManager.java:965)
        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:70)
        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:99)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:252)

We are using the following command:

sqoop export --verbose \
  --connect jdbc:oracle:thin:@//10.180.25.169:1521/Sparc261 \
  --username cisadm --password cisadm \
  --table HADOOP_TEMP \
  --m 2 \
  --export-dir /user/hadoop/Test \
  --batch \
  --update-key TXN_DETAIL_ID \
  --input-fields-terminated-by ',' \
  --input-lines-terminated-by '\n'

.csv file data:

184792385,TEST,PERFTXN,PERFTXN,2017-01-01T00:00:00.000+05:30,,,PERF1,,1000.000000000000000000,1000.000000000000000000,USD,N,SYSUSER,+,0,0,AUTO,USA,GBP,VIP,,,,,,,,,,,,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,,0E-18,,0E-18,,0E-18,,0E-18,,,,,,,,UPLD,,1,Y,,C1_F_ANO,1427001000,,,,,,,,,,,,,N,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,,,,,
184792386,TEST,PERFTXN,PERFTXN,1990-01-01T00:00:00.000+05:30,,,PERF1,,2000.000000000000000000,2000.000000000000000000,USD,N,SYSUSER,+,0,0,AUTO,USA,GBP,VIP,,,,,,,,,,,,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,,0E-18,,0E-18,,0E-18,,0E-18,,,,,,,,UPLD,,1,Y,,C1_F_ANO,1427001000,,,,,,,,,,,,,N,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,0E-18,,,,,

Can anyone help?

UPDATE

After several more runs, we suspect that the formats Spark uses when writing DATE and NUMBER values are what causes the problem during the Sqoop export.
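As far as we can tell, Sqoop's generated export code typically parses timestamp columns with `java.sql.Timestamp.valueOf`, which expects `yyyy-mm-dd hh:mm:ss[.fffffffff]` and rejects both the `T` separator and the `+05:30` zone offset in our data. (`0E-18`, by contrast, is valid input to Java's `BigDecimal`, so the NUMBER columns may actually be fine.) A quick check on a sample row in plain Python — the regexes here are our own approximation of those two formats:

```python
import csv
import io
import re

# Format accepted by java.sql.Timestamp.valueOf, which Sqoop's generated
# parsing code typically uses for timestamp columns: yyyy-mm-dd hh:mm:ss[.f...]
SQOOP_TS = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(\.\d+)?$")
# ISO-8601 form that Spark wrote into the .csv files
SPARK_TS = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{2}:\d{2}$")

sample = "184792385,TEST,PERFTXN,PERFTXN,2017-01-01T00:00:00.000+05:30,,,PERF1"
row = next(csv.reader(io.StringIO(sample)))

for i, field in enumerate(row):
    if SPARK_TS.match(field) and not SQOOP_TS.match(field):
        print(f"column {i}: {field!r} will not parse as java.sql.Timestamp")
# → column 4: '2017-01-01T00:00:00.000+05:30' will not parse as java.sql.Timestamp
```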

How can I fix this?
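One workaround we are considering is to have Spark emit Sqoop-friendly values in the first place (e.g. `date_format(col, 'yyyy-MM-dd HH:mm:ss')`, or the CSV writer's `timestampFormat` option, before `df.write.csv`). If the files already in HDFS must be fixed instead, here is a minimal sketch in plain Python — `normalize_field`/`normalize_line` are our own names, and it assumes the target columns are Oracle DATE (so the `+05:30` offset can simply be dropped) and that no field contains an embedded comma:

```python
import csv
import io
import re
from decimal import Decimal

# ISO-8601 timestamp with millis and zone offset, as written by Spark.
ISO_TS = re.compile(r"^(\d{4}-\d{2}-\d{2})T(\d{2}:\d{2}:\d{2})\.\d{3}[+-]\d{2}:\d{2}$")
# Scientific notation, e.g. 0E-18, as Spark prints zero-valued decimals.
SCI_NUM = re.compile(r"^\d*\.?\d+E[+-]\d+$", re.IGNORECASE)

def normalize_field(value: str) -> str:
    """Rewrite one CSV field into a form Sqoop's generated parser accepts."""
    m = ISO_TS.match(value)
    if m:
        # 2017-01-01T00:00:00.000+05:30 -> 2017-01-01 00:00:00
        # NOTE: assumes the target column is DATE and the offset can be dropped.
        return f"{m.group(1)} {m.group(2)}"
    if SCI_NUM.match(value):
        # 0E-18 -> plain decimal notation for NUMBER columns
        return format(Decimal(value), "f")
    return value

def normalize_line(line: str) -> str:
    fields = next(csv.reader(io.StringIO(line)))
    return ",".join(normalize_field(f) for f in fields)

print(normalize_line("184792385,TEST,2017-01-01T00:00:00.000+05:30,0E-18,USD"))
# → 184792385,TEST,2017-01-01 00:00:00,0.000000000000000000,USD
```

The same per-line function could be applied to the HDFS files (e.g. via a small Spark map over the text) before rerunning the export.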

0 Answers

No answers yet.