Sqoop import error: User null does not belong to Hadoop at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner (FSDirAttrOp.java:89)

Date: 2018-01-20 17:03:37

Tags: hive sqoop

I am using the HDP 2.6 Sandbox. I created a user space for the root user under the hdfs group, then ran the following Sqoop Hive import and hit the two errors below:

  1. Failed with exception org.apache.hadoop.security.AccessControlException: User null does not belong to Hadoop at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner (FSDirAttrOp.java:89)
  2. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Despite these errors, the data was imported into the Hive table correctly.
  3. Please help me understand what these errors mean and how to overcome them.

    [root@sandbox-hdp ~]# sqoop import \
    > --connect jdbc:mysql://sandbox.hortonworks.com:3306/retail_db \
    > --username retail_dba \
    > --password hadoop \
    > --table departments \
    > --hive-home /apps/hive/warehouse \
    > --hive-import \
    > --create-hive-table \
    > --hive-table retail_db.departments \
    > --target-dir /user/root/hive_import \
    > --outdir java_files
    Warning: /usr/hdp/2.6.3.0-235/accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    18/01/14 09:42:38 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.6.3.0-235
    18/01/14 09:42:38 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
    18/01/14 09:42:38 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
    18/01/14 09:42:38 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
    18/01/14 09:42:38 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    18/01/14 09:42:38 INFO tool.CodeGenTool: Beginning code generation
    18/01/14 09:42:38 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
    18/01/14 09:42:38 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
    18/01/14 09:42:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.6.3.0-235/hadoop-mapreduce
    Note: /tmp/sqoop-root/compile/e1ec5b443f92219f1f061ad4b64cc824/departments.java uses or overrides a deprecated API.
    Note: Recompile with -Xlint:deprecation for details.
    18/01/14 09:42:40 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/e1ec5b443f92219f1f061ad4b64cc824/departments.jar
    18/01/14 09:42:40 WARN manager.MySQLManager: It looks like you are importing from mysql.
    18/01/14 09:42:40 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
    18/01/14 09:42:40 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
    18/01/14 09:42:40 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
    18/01/14 09:42:40 INFO mapreduce.ImportJobBase: Beginning import of departments
    18/01/14 09:42:41 INFO client.RMProxy: Connecting to ResourceManager at sandbox-hdp.hortonworks.com/172.17.0.2:8032
    18/01/14 09:42:42 INFO client.AHSProxy: Connecting to Application History server at sandbox-hdp.hortonworks.com/172.17.0.2:10200
    18/01/14 09:42:46 INFO db.DBInputFormat: Using read commited transaction isolation
    18/01/14 09:42:46 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`department_id`), MAX(`department_id`) FROM `departments`
    18/01/14 09:42:46 INFO db.IntegerSplitter: Split size: 1; Num splits: 4 from: 2 to: 7
    18/01/14 09:42:46 INFO mapreduce.JobSubmitter: number of splits:4
    18/01/14 09:42:47 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1515818851132_0050
    18/01/14 09:42:47 INFO impl.YarnClientImpl: Submitted application application_1515818851132_0050
    18/01/14 09:42:47 INFO mapreduce.Job: The url to track the job: http://sandbox-hdp.hortonworks.com:8088/proxy/application_1515818851132_0050/
    18/01/14 09:42:47 INFO mapreduce.Job: Running job: job_1515818851132_0050
    18/01/14 09:42:55 INFO mapreduce.Job: Job job_1515818851132_0050 running in uber mode : false
    18/01/14 09:42:55 INFO mapreduce.Job:  map 0% reduce 0%
    18/01/14 09:43:05 INFO mapreduce.Job:  map 25% reduce 0%
    18/01/14 09:43:09 INFO mapreduce.Job:  map 50% reduce 0%
    18/01/14 09:43:12 INFO mapreduce.Job:  map 75% reduce 0%
    18/01/14 09:43:14 INFO mapreduce.Job:  map 100% reduce 0%
    18/01/14 09:43:14 INFO mapreduce.Job: Job job_1515818851132_0050 completed successfully
    18/01/14 09:43:16 INFO mapreduce.Job: Counters: 30
            File System Counters
                    FILE: Number of bytes read=0
                    FILE: Number of bytes written=682132
                    FILE: Number of read operations=0
                    FILE: Number of large read operations=0
                    FILE: Number of write operations=0
                    HDFS: Number of bytes read=481
                    HDFS: Number of bytes written=60
                    HDFS: Number of read operations=16
                    HDFS: Number of large read operations=0
                    HDFS: Number of write operations=8
            Job Counters
                    Launched map tasks=4
                    Other local map tasks=4
                    Total time spent by all maps in occupied slots (ms)=44760
                    Total time spent by all reduces in occupied slots (ms)=0
                    Total time spent by all map tasks (ms)=44760
                    Total vcore-milliseconds taken by all map tasks=44760
                    Total megabyte-milliseconds taken by all map tasks=11190000
            Map-Reduce Framework
                    Map input records=6
                    Map output records=6
                    Input split bytes=481
                    Spilled Records=0
                    Failed Shuffles=0
                    Merged Map outputs=0
                    GC time elapsed (ms)=1284
                    CPU time spent (ms)=5360
                    Physical memory (bytes) snapshot=561950720
                    Virtual memory (bytes) snapshot=8531210240
                    Total committed heap usage (bytes)=176685056
            File Input Format Counters
                    Bytes Read=0
            File Output Format Counters
                    Bytes Written=60
    18/01/14 09:43:16 INFO mapreduce.ImportJobBase: Transferred 60 bytes in 34.7351 seconds (1.7274 bytes/sec)
    18/01/14 09:43:16 INFO mapreduce.ImportJobBase: Retrieved 6 records.
    18/01/14 09:43:16 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners
    18/01/14 09:43:16 WARN mapreduce.PublishJobData: Unable to publish import data to publisher org.apache.atlas.sqoop.hook.SqoopHook
    java.lang.ClassNotFoundException: org.apache.atlas.sqoop.hook.SqoopHook
            at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
            at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
            at java.lang.Class.forName0(Native Method)
            at java.lang.Class.forName(Class.java:264)
            at org.apache.sqoop.mapreduce.PublishJobData.publishJobData(PublishJobData.java:46)
            at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:284)
            at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
            at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:127)
            at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:507)
            at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
            at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
            at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
            at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
            at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
            at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
            at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
    18/01/14 09:43:16 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
    18/01/14 09:43:16 INFO hive.HiveImport: Loading uploaded data into Hive
    
    Logging initialized using configuration in jar:file:/usr/hdp/2.6.3.0-235/hive/lib/hive-common-1.2.1000.2.6.3.0-235.jar!/hive-log4j.properties
    OK
    Time taken: 10.427 seconds
    Loading data to table retail_db.departments
    Failed with exception org.apache.hadoop.security.AccessControlException: User null does not belong to Hadoop
            at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:89)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setOwner(FSNamesystem.java:1873)
            at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setOwner(NameNodeRpcServer.java:828)
            at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setOwner(ClientNamenodeProtocolServerSideTranslatorPB.java:476)
            at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
            at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
            at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
            at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
            at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:422)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
            at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
    
    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
    

4 Answers:

Answer 0 (score: 1)

First error

WARN mapreduce.PublishJobData: Unable to publish import data to publisher org.apache.atlas.sqoop.hook.SqoopHook java.lang.ClassNotFoundException: org.apache.atlas.sqoop.hook.SqoopHook

You need to check whether your Sqoop binaries are intact; a ClassNotFoundException like this usually means a jar is missing from the installation. It is easiest to copy the binaries over again in one go rather than hunting for the missing file one by one.
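For example, you can check whether the Atlas Sqoop hook jar is actually present on the classpath. Treat the paths below as a sketch: the exact jar name and location vary by HDP release.

    # Look for Atlas-related jars in the Sqoop client's lib directory
    ls /usr/hdp/current/sqoop-client/lib | grep -i atlas
    # Search the whole HDP install tree for an Atlas Sqoop hook jar
    find /usr/hdp -iname '*atlas*sqoop*' 2>/dev/null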

Second error

Failed with exception org.apache.hadoop.security.AccessControlException: User null does not belong to Hadoop

is because you are executing Sqoop as the "root" user. Change it to a user that exists in the Hadoop cluster.
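A minimal sketch of that fix (the account name sqoopuser is hypothetical; any user known to the cluster works):

    # Create a dedicated user and give it an HDFS home directory
    useradd sqoopuser
    sudo -u hdfs hdfs dfs -mkdir -p /user/sqoopuser
    sudo -u hdfs hdfs dfs -chown sqoopuser:hdfs /user/sqoopuser
    # Switch to that user before running the import
    su - sqoopuser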

Answer 1 (score: 0)

Two thoughts

ClassNotFoundException: org.apache.atlas.sqoop.hook.SqoopHook

A class is missing somewhere on the classpath.

I see you are trying to run the sqoop command as the root account under Linux. Make sure root belongs to the hdfs group; I am not sure root is included by default.
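A quick way to check and, if needed, fix that (assuming the hdfs group already exists on the sandbox):

    # Show the groups the root account currently belongs to
    id root
    # Add root to the hdfs group if it is missing
    usermod -aG hdfs root
    # Check which groups HDFS itself resolves for root
    hdfs groups root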

Answer 2 (score: 0)

Sqoop sometimes mishandles null values when importing data from an RDBMS into Hive, so you should handle them explicitly with the following options (--null-string sets the substitute for NULL in string columns, --null-non-string for all other column types):

--null-string and --null-non-string

Complete command

sqoop import \
  --connect jdbc:mysql://sandbox.hortonworks.com:3306/retail_db \
  --username retail_dba \
  --password hadoop \
  --table departments \
  --hive-home /apps/hive/warehouse \
  --null-string 'na' \
  --null-non-string 'na' \
  --hive-import \
  --create-hive-table \
  --hive-table retail_db.departments \
  --target-dir /user/root/hive_import

Answer 3 (score: 0)

This happens because of the following property in /etc/hive/conf/hive-site.xml:

<property>
  <name>hive.warehouse.subdir.inherit.perms</name>
  <value>true</value>
</property>

Set the value to false and try running the same query;

otherwise, give the --target-dir /user/root/hive_import read/write access, or delete it so that the import lands in Hive's own warehouse directory.
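A minimal sketch of both workarounds (paths assume the default sandbox layout):

    # Workaround 1: disable permission inheritance in /etc/hive/conf/hive-site.xml
    #   <property>
    #     <name>hive.warehouse.subdir.inherit.perms</name>
    #     <value>false</value>
    #   </property>

    # Workaround 2: open up the staging directory ...
    hdfs dfs -chmod -R 777 /user/root/hive_import
    # ... or remove it so the import goes to the Hive warehouse instead
    hdfs dfs -rm -r /user/root/hive_import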
