Sqoop: converting a string to a date when exporting data from HDFS to Teradata

Time: 2016-02-05 05:56:45

Tags: java hadoop teradata etl sqoop

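I have a table in Teradata in which one column, update_date, is defined as DATE with the format YYYYMMDD. I need to load files from HDFS into this table. The files are comma-delimited text files, and the field that corresponds to update_date is a string in the format 'YYYYMMDD'.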

When I try to export the data with this command:

sqoop export --connect jdbc:teradata://tdwc/DATABASE=sandbox --username xxxx --password yyyy --table mytable --export-dir hftp://gdoop-namenode/path/to/myfiles/ 

the command fails with this error:

java.io.IOException: Can't export data, please check failed map task logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438) 
at org.apache.hadoop.mapred.Child.main(Child.java:262)  
Caused by: java.lang.RuntimeException: Can't parse input data: '20160129'

How can I solve this without changing the table schema?
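
A possible explanation, offered as an assumption and not confirmed by the logs above: for a DATE column, the record class that Sqoop generates may parse the text field with java.sql.Date.valueOf, which only accepts the yyyy-[m]m-[d]d form. The sketch below uses only the JDK to show that '20160129' is rejected by that method and how the field could be rewritten to '2016-01-29' before the export. The class and method names (DateFieldFix, toIsoDate, parsesAsSqlDate) are hypothetical and are not part of Sqoop.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not part of Sqoop or the Teradata connector.
class DateFieldFix {

    // Convert a 'yyyyMMdd' string (surrounding whitespace ignored) to 'yyyy-MM-dd'.
    static String toIsoDate(String raw) {
        LocalDate date = LocalDate.parse(raw.trim(), DateTimeFormatter.BASIC_ISO_DATE);
        return date.toString(); // LocalDate.toString() yields ISO-8601, e.g. 2016-01-29
    }

    // Report whether java.sql.Date.valueOf accepts the string.
    static boolean parsesAsSqlDate(String value) {
        try {
            java.sql.Date.valueOf(value);
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown when the input is not in yyyy-[m]m-[d]d form.
            return false;
        }
    }

    public static void main(String[] args) {
        String raw = "20160129"; // the value from the error message
        System.out.println(parsesAsSqlDate(raw));            // prints false
        System.out.println(toIsoDate(raw));                  // prints 2016-01-29
        System.out.println(parsesAsSqlDate(toIsoDate(raw))); // prints true
    }
}
```

If the assumption holds, applying the same conversion to the update_date field of every line in the export directory would leave the table schema untouched. Whether the Teradata connector then accepts the rewritten values has not been tested here.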

0 answers:

No answers