Insert into Hive - execution plan changed

Time: 2017-03-01 16:07:15

Tags: hadoop hive

We are using Hive 1.3.1 and running an INSERT statement to load data from HDFS into Hive:

explain insert OVERWRITE table managedtable PARTITION(col1) select [columns] from externaltable

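I noticed that the execution plan for the same table changed between yesterday and today. Yesterday the plan produced an M/R job with 341 mappers and 359 reducers, whereas today the plan produces an M/R job with mappers only and no reducers.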

I'm really confused.

  • How does Hive decide how to execute a query?
  • How is an INSERT ... SELECT translated into map reduce?
  • What could cause the plan to change?
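
As a starting point for the third question, it may help to print the session settings that commonly influence whether a dynamic-partition insert gets a reduce stage or a map-only plan with file-merge stages. This is only a sketch: it assumes the difference comes from configuration rather than from the data or the tables involved, which nothing in the question confirms. In Hive, `SET <property>;` with no value prints the current value.

```sql
-- Print the current values; run this in both environments and compare.
-- Assumption: one of these differs between the two runs.

-- When true, Hive sorts dynamic-partition inserts by the partition column,
-- which adds a Reduce Output Operator keyed on that column.
SET hive.optimize.sort.dynamic.partition;

-- When true, map-only jobs get a conditional stage that may merge small
-- output files; for ORC, stripe-level merging is controlled separately.
SET hive.merge.mapfiles;
SET hive.merge.orcfile.stripe.level;

-- Reducer-count settings; these only matter when a reduce stage exists.
SET hive.exec.reducers.bytes.per.reducer;
SET mapreduce.job.reduces;
```

If every value matches across the two runs, the cause is more likely elsewhere, for example in the tables themselves or their statistics.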

These are the plans (column lists omitted, because the table has more than 300 columns).

OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1
  Stage-2 depends on stages: Stage-0

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:

          TableScan
            alias: externalevents
            Statistics: Num rows: 479391 Data size: 91824480256 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              [columns]
              outputColumnNames: [columns]
              Statistics: Num rows: 479391 Data size: 91824480256 Basic stats: COMPLETE Column stats: NONE
              Reduce Output Operator
                key expressions: _col394 (type: bigint)
                sort order: +
                Map-reduce partition columns: _col394 (type: bigint)
                Statistics: Num rows: 479391 Data size: 91824480256 Basic stats: COMPLETE Column stats: NONE
                value expressions: [columns]
      Reduce Operator Tree:
        Select Operator
          expressions: [columns]
          outputColumnNames: [columns]
          Statistics: Num rows: 479391 Data size: 91824480256 Basic stats: COMPLETE Column stats: NONE
          File Output Operator
            compressed: false
            Statistics: Num rows: 479391 Data size: 91824480256 Basic stats: COMPLETE Column stats: NONE
            table:
                input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                name: ceazip.events_test_hive

  Stage: Stage-0
    Move Operator
      tables:
          partition:
            evtf_first_date_id
          replace: true
          table:
              input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
              output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
              serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
              name: ceazip.events_test_hive

  Stage: Stage-2
    Stats-Aggr Operator

And the 2nd plan:

  Stage-1 is a root stage
  Stage-7 depends on stages: Stage-1 , consists of Stage-4, Stage-3, Stage-5
  Stage-4
  Stage-0 depends on stages: Stage-4, Stage-3, Stage-6
  Stage-2 depends on stages: Stage-0
  Stage-3
  Stage-5
  Stage-6 depends on stages: Stage-5

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: events_2017_02_21_11_13_18
            Statistics: Num rows: 468680 Data size: 89772957696 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              Statistics: Num rows: 468680 Data size: 89772957696 Basic stats: COMPLETE Column stats: NONE
              File Output Operator
                compressed: false
                Statistics: Num rows: 468680 Data size: 89772957696 Basic stats: COMPLETE Column stats: NONE
                table:
                    input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                    output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                    serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                    name: default.events_test1

  Stage: Stage-7
    Conditional Operator

  Stage: Stage-4
    Move Operator
      files:
          hdfs directory: true
          destination: hdfs://isr-r0-aps-nam-1.lab.il.nice.com:8020/apps/hive/warehouse/events_test1/.hive-staging_hive_2017-03-01_07-35-18_776_4958999242494325333-1/-ext-10000

  Stage: Stage-0
    Move Operator
      tables:
          partition:
            evtf_first_date_id
          replace: true
          table:
              input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
              output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
              serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
              name: default.events_test1

  Stage: Stage-2
    Stats-Aggr Operator

  Stage: Stage-3
    Merge File Operator
      Map Operator Tree:
          ORC File Merge Operator
      merge level: stripe
      input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat

  Stage: Stage-5
    Merge File Operator
      Map Operator Tree:
          ORC File Merge Operator
      merge level: stripe
      input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat

  Stage: Stage-6
    Move Operator
      files:
          hdfs directory: true
          destination: hdfs://isr-r0-aps-nam-1.lab.il.nice.com:8020/apps/hive/warehouse/events_test1/.hive-staging_hive_2017-03-01_07-35-18_776_4958999242494325333-1/-ext-10000
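
One way to check whether a session setting accounts for a reduce stage appearing or disappearing is to toggle `hive.optimize.sort.dynamic.partition` and rerun EXPLAIN on the same statement. This is a hedged sketch, not a diagnosis: it assumes this property is the relevant one, and `[columns]` is the placeholder from the question that must be replaced with the real column list before the statement will parse.

```sql
-- Assumption: a fully dynamic PARTITION(col1) insert needs these enabled.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Variant A: ask Hive to sort rows by the dynamic partition column.
-- A Reduce Output Operator on the partition column would be expected here.
SET hive.optimize.sort.dynamic.partition=true;
EXPLAIN INSERT OVERWRITE TABLE managedtable PARTITION(col1)
SELECT [columns] FROM externaltable;

-- Variant B: disable the sort.
-- A map-only Stage-1, possibly followed by merge stages, would be expected.
SET hive.optimize.sort.dynamic.partition=false;
EXPLAIN INSERT OVERWRITE TABLE managedtable PARTITION(col1)
SELECT [columns] FROM externaltable;
```

If both variants yield the same plan, the property is probably not what changed, and comparing the source and target tables named in each plan would be the next thing to look at.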

Thanks, Lior

0 answers:

No answers yet