Error when importing a table through Sqoop in the HDP 2.3.2 sandbox

Time: 2016-01-28 14:11:02

Tags: hadoop import hive sqoop

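I am trying to import a 70+ GB table into Hive in the HDP 2.3.2 sandbox. I have established a connection between SQL Server and the sandbox, but the import fails when I run the following command: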

sudo -u hdfs sqoop import --connect "jdbc:sqlserver://XX.XX.XX.XX;database=XX;username=XX;password=XX" --table XX  --split-by ID --target-dir "/user/hdfs/Kunal/2" --hive-import -- --schema dbo

It gives me the following error:

Error: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.UnsupportedOperationException: Java Runtime Environment (JRE) version 1.7 is not supported by this driver. Use the sqljdbc4.jar class library, which provides support for JDBC 4.0.
    at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:167)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:76)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:749)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.RuntimeException: java.lang.UnsupportedOperationException: Java Runtime Environment (JRE) version 1.7 is not supported by this driver. Use the sqljdbc4.jar class library, which provides support for JDBC 4.0.
    at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:220)
    at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:165)
    ... 9 more

1 answer:

Answer 0: (score: 0)

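Option 1: Use a single mapper (-m 1; note the single dash). The whole 70 GB will then be read in a single thread, so the import may take a long time to finish, and the output may be written to a single HDFS file.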
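As a rough sketch of Option 1, the command from the question can be rerun with -m 1 in place of --split-by, since a single mapper has nothing to split. The function name single_mapper_import and the RUN variable are hypothetical helpers added here for illustration; RUN=echo prints the command instead of executing it, and the XX placeholders are kept exactly as in the question.

```shell
# Hypothetical wrapper around the question's command, using one mapper.
# RUN is an assumed hook: unset, the command runs as the hdfs user (as in
# the question); RUN=echo only prints it for review.
single_mapper_import() {
  ${RUN:-sudo -u hdfs} sqoop import \
    --connect "jdbc:sqlserver://XX.XX.XX.XX;database=XX;username=XX;password=XX" \
    --table XX \
    -m 1 \
    --target-dir "/user/hdfs/Kunal/2" \
    --hive-import \
    -- --schema dbo
}

# Dry run: print the command without contacting the database.
RUN=echo single_mapper_import
```

Once the printed command looks right, calling single_mapper_import without RUN set would start the actual import.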

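Option 2: Use --split-by with a sparsely distributed column. --split-by names the table column used to split the work units among the mappers. For example, Employee_id in an emp table would be unique and sparsely distributed.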
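To see why the distribution of the --split-by column matters, here is a simplified illustration of dividing an integer key range evenly among mappers, in the spirit of the splitting described in the Sqoop User Guide. It is not Sqoop's actual code; the function name split_ranges and the sample bounds (0 to 1000, 4 mappers) are assumptions made for the example.

```shell
# Illustration only: divide the [min, max] range of an integer split column
# into one contiguous slice per mapper. Not Sqoop's real implementation.
split_ranges() {
  min=$1; max=$2; mappers=$3
  step=$(( (max - min) / mappers ))
  i=0
  lo=$min
  while [ "$i" -lt "$mappers" ]; do
    if [ "$i" -eq $(( mappers - 1 )) ]; then
      # The last mapper takes everything up to and including max.
      echo "mapper $i: ID >= $lo AND ID <= $max"
    else
      hi=$(( lo + step ))
      echo "mapper $i: ID >= $lo AND ID < $hi"
      lo=$hi
    fi
    i=$(( i + 1 ))
  done
}

# Assumed sample values: IDs from 0 to 1000 split across 4 mappers.
split_ranges 0 1000 4
```

Each slice covers the same span of ID values, not the same number of rows. If most rows fall into one slice, that mapper does most of the work, which is why the choice of split column affects how evenly the import is parallelized.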

Reference: http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html (the latest Sqoop user guide).
