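Hive pivot and sum/count

Time: 2021-05-20 05:19:28

Tags: sql pyspark hive pivot pivot-table

I have a table of dummy data, and I am trying to figure out how to pivot it and compute sums and counts based on the values.

Sample input: (image not available)

Sample output: (image not available)

Thanks for your help!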

1 answer:

Answer 0: (score: 0)

Filter by each fruit flag, aggregate each filtered result, and combine the results with UNION ALL:

select 'apple_flg'                  as fruit_name,
       count(distinct student_name) as cnt_student, --if student_name is unique, no distinct needed
       sum(buy_cnt)                 as sum_buy_cnt,
       sum(buy_payment)             as sum_buy_payment 
 from tablename
where apple_flg=1 

union all

select 'banana_flg'                 as fruit_name,
       count(distinct student_name) as cnt_student,
       sum(buy_cnt)                 as sum_buy_cnt,
       sum(buy_payment)             as sum_buy_payment 
 from tablename
where banana_flg=1

union all

select 'strawberry_flg'             as fruit_name,
       count(distinct student_name) as cnt_student,
       sum(buy_cnt)                 as sum_buy_cnt,
       sum(buy_payment)             as sum_buy_payment 
 from tablename
where strawberry_flg=1 

union all

select 'watermelon_flg'             as fruit_name,
       count(distinct student_name) as cnt_student, 
       sum(buy_cnt)                 as sum_buy_cnt,
       sum(buy_payment)             as sum_buy_payment
 from tablename 
where watermelon_flg=1 

union all

select 'lemon_flg'                  as fruit_name,
       count(distinct student_name) as cnt_student, 
       sum(buy_cnt)                 as sum_buy_cnt,
       sum(buy_payment)             as sum_buy_payment 
 from tablename
where lemon_flg=1 
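
To check the logic of the UNION ALL approach on a small example, here is a minimal sketch in Python. It uses the built-in sqlite3 module as a stand-in for Hive, so it only illustrates the query shape and is not a Hive run. The column names come from the answer above; the sample rows, the integer payment values, and the helper names `build_union_query` and `run_union_demo` are assumptions made up for illustration.

```python
import sqlite3

# Flag columns named in the answer; the rows below are made-up illustration data.
FLAGS = ["apple_flg", "banana_flg", "strawberry_flg", "watermelon_flg", "lemon_flg"]

ROWS = [
    # student, apple, banana, strawberry, watermelon, lemon, buy_cnt, buy_payment
    ("amy", 1, 0, 0, 0, 0, 2, 10),
    ("bob", 1, 1, 0, 0, 0, 3, 15),
    ("amy", 0, 1, 0, 0, 0, 1, 4),
    ("cat", 0, 0, 1, 0, 0, 5, 20),
]


def build_union_query(flags, table="tablename"):
    """Build one filtered aggregate per flag and glue them with UNION ALL."""
    parts = [
        f"select '{flag}' as fruit_name, "
        "count(distinct student_name) as cnt_student, "
        "sum(buy_cnt) as sum_buy_cnt, "
        "sum(buy_payment) as sum_buy_payment "
        f"from {table} where {flag} = 1"
        for flag in flags
    ]
    return "\nunion all\n".join(parts)


def run_union_demo():
    """Load the sample rows into an in-memory table and run the generated query."""
    conn = sqlite3.connect(":memory:")
    flag_cols = ", ".join(f"{flag} integer" for flag in FLAGS)
    conn.execute(
        f"create table tablename (student_name text, {flag_cols}, "
        "buy_cnt integer, buy_payment integer)"
    )
    conn.executemany("insert into tablename values (?, ?, ?, ?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(build_union_query(FLAGS)).fetchall()
    conn.close()
    # Key by fruit name so the result does not depend on row order.
    return {name: (cnt, total_cnt, total_pay) for name, cnt, total_cnt, total_pay in rows}


results = run_union_demo()
for fruit, stats in results.items():
    print(fruit, stats)
```

In this sketch a row with several flags set (bob) contributes to every matching fruit, because each branch filters independently. A fruit nobody bought still yields one row, with a count of 0 and NULL sums, since an aggregate without GROUP BY returns a single row even when no input rows match.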

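Another possible approach. Note that CASE returns only the first matching branch, so this query gives one row per fruit only if each row has exactly one flag set to 1: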

select case when apple_flg=1       then 'apple_flg'
            when banana_flg=1      then 'banana_flg'
            when strawberry_flg=1  then 'strawberry_flg'
            when watermelon_flg=1  then 'watermelon_flg'
            when lemon_flg=1       then 'lemon_flg'
        end                         as fruit,
       count(distinct student_name) as cnt_student,
       sum(buy_cnt)                 as sum_buy_cnt,
       sum(buy_payment)             as sum_buy_payment 
 from tablename
group by apple_flg, banana_flg, strawberry_flg, watermelon_flg, lemon_flg
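
The behavior of the CASE query on rows with more than one flag set can be seen in a small sketch. It again uses Python's sqlite3 as a stand-in for Hive, and is cut down to two flag columns to stay short; the rows and the names `CASE_LABEL`, `by_flags`, and `by_label` are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tablename (student_name text, apple_flg integer, "
    "banana_flg integer, buy_cnt integer, buy_payment integer)"
)
conn.executemany(
    "insert into tablename values (?, ?, ?, ?, ?)",
    [
        ("amy", 1, 0, 2, 10),
        ("bob", 1, 1, 3, 15),  # both flags set: CASE labels this row 'apple_flg'
        ("amy", 0, 1, 1, 4),
    ],
)

CASE_LABEL = (
    "case when apple_flg = 1 then 'apple_flg' "
    "when banana_flg = 1 then 'banana_flg' end"
)

# Grouping by the raw flags, as in the answer: one output row per flag combination.
by_flags = conn.execute(
    f"select {CASE_LABEL} as fruit, count(distinct student_name), "
    "sum(buy_cnt), sum(buy_payment) "
    "from tablename group by apple_flg, banana_flg"
).fetchall()

# Grouping by the label itself: one row per fruit, but a multi-flag row
# is still attributed to its first matching fruit only.
by_label = conn.execute(
    "select fruit, count(distinct student_name), sum(buy_cnt), sum(buy_payment) "
    f"from (select {CASE_LABEL} as fruit, student_name, buy_cnt, buy_payment "
    "from tablename) t group by fruit"
).fetchall()
conn.close()

print(sorted(by_flags))
print(sorted(by_label))
```

With these rows, grouping by the flags produces two separate 'apple_flg' rows (one per flag combination), and in both variants bob's purchase is never counted toward bananas. If rows can have several flags set, the UNION ALL version is the safer choice.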

If some fruits were never purchased and you still need rows with zero counts for them, you may need a more complex solution:

with fruits as (
select stack(5, 'apple_flg',
                'banana_flg',
                'strawberry_flg',
                'watermelon_flg',
                'lemon_flg'
           ) as fruit
),

agg as (
select case when apple_flg=1       then 'apple_flg'
            when banana_flg=1      then 'banana_flg'
            when strawberry_flg=1  then 'strawberry_flg'
            when watermelon_flg=1  then 'watermelon_flg'
            when lemon_flg=1       then 'lemon_flg'
        end                         as fruit,
       count(distinct student_name) as cnt_student,
       sum(buy_cnt)                 as sum_buy_cnt,
       sum(buy_payment)             as sum_buy_payment 
 from tablename
group by apple_flg, banana_flg, strawberry_flg, watermelon_flg, lemon_flg
)

select f.fruit,
       nvl(s.cnt_student, 0)     as cnt_student,
       nvl(s.sum_buy_cnt, 0)     as sum_buy_cnt,
       nvl(s.sum_buy_payment, 0) as sum_buy_payment
  from fruits f
       left join agg s on f.fruit=s.fruit
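
The zero-fill idea (a fixed list of fruits left-joined to the aggregate) can be sketched the same way. This is an illustration under stated assumptions, not the Hive query itself: sqlite3 stands in for Hive, so the fruit list is built with UNION ALL instead of `stack` and `coalesce` replaces `nvl`. The table is reduced to two flags, the aggregate groups by the label through a subquery, and the rows and the name `filled` are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tablename (student_name text, apple_flg integer, "
    "lemon_flg integer, buy_cnt integer, buy_payment integer)"
)
conn.executemany(
    "insert into tablename values (?, ?, ?, ?, ?)",
    [("amy", 1, 0, 2, 10), ("bob", 1, 0, 3, 15)],  # nobody bought lemons
)

query = """
with fruits as (
    -- fixed list of every fruit that must appear in the output
    select 'apple_flg' as fruit
    union all
    select 'lemon_flg'
),
agg as (
    select fruit,
           count(distinct student_name) as cnt_student,
           sum(buy_cnt)                 as sum_buy_cnt,
           sum(buy_payment)             as sum_buy_payment
      from (select case when apple_flg = 1 then 'apple_flg'
                        when lemon_flg = 1 then 'lemon_flg'
                    end as fruit,
                   student_name, buy_cnt, buy_payment
              from tablename) t
     group by fruit
)
select f.fruit,
       coalesce(s.cnt_student, 0),
       coalesce(s.sum_buy_cnt, 0),
       coalesce(s.sum_buy_payment, 0)
  from fruits f
       left join agg s on f.fruit = s.fruit
"""
# Map each fruit to its (cnt_student, sum_buy_cnt, sum_buy_payment) tuple.
filled = {row[0]: row[1:] for row in conn.execute(query)}
conn.close()
print(filled)
```

The left join keeps every fruit from the fixed list, and the missing aggregates for lemons are replaced with zeros.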