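Partition by with a condition

Time: 2020-04-08 15:14:32

Tags: sql database analysis snowflake-cloud-data-platform

I have data on products sold by various stores. Some stores sell them at a discounted price, which is flagged by PROMO_FLG. I would like to display two COUNT ... PARTITION columns.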

+-------------------------+--------------+---------------------+
| Store                   | Item         | PROMO_FLG           |
|-------------------------+--------------+---------------------|
| 1                       |            1 |                   0 |
| 2                       |            1 |                   1 |
| 3                       |            1 |                   0 |
| 4                       |            1 |                   0 |
| 5                       |            1 |                   1 |
| 6                       |            1 |                   1 |
| 7                       |            1 |                   1 |
| 8                       |            1 |                   0 |
| 9                       |            1 |                   0 |
| 10                      |            1 |                   0 |
+-------------------------+--------------+---------------------+

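The first shows all stores that carry this product (done):

COUNT(DISTINCT STORE) OVER (PARTITION BY ITEM) gives 10

The second, which is the one I am looking for, counts only the stores that have PROMO_FLG = 1.

That should give us a value of 4.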

2 Answers:

Answer 0: (score: 3)

I think you want:

select t.*,
       count(*) over (partition by item) as num_stores,
       sum(promo_flg) over (partition by item) as num_promo_1
from t;

If you really need distinct counts:

select t.*,
       count(distinct store) over (partition by item) as num_stores,
       count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1
from t;

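Here is a db<>fiddle. The fiddle uses Oracle because it supports COUNT(DISTINCT) as a window function.

If the window functions do not work in your database, an alternative is to aggregate per item and join the result back: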

select *
from t join
     (select item, count(distinct store) as num_stores, count(distinct case when promo_flg = 1 then store end) as num_stores_promo
      from t
      group by item
     ) tt
     using (item);
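
As a rough check of this fallback, here is a minimal sketch that runs the same aggregate-and-join query against the sample data from the question. It assumes an in-memory SQLite database driven from Python as a stand-in for the real table; SQLite, the Python wrapper, and the `promo_stores` helper are illustration only and not part of the original answer. The join is written with ON instead of USING so the selected columns are explicit.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (store INTEGER, item INTEGER, promo_flg INTEGER)")

# Sample data from the question: stores 2, 5, 6 and 7 sell item 1 on promo.
promo_stores = {2, 5, 6, 7}  # hypothetical helper for building the rows
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(store, 1, 1 if store in promo_stores else 0) for store in range(1, 11)],
)

# Aggregate once per item, then join the counts back onto every row.
rows = conn.execute(
    """
    SELECT t.store, t.item, t.promo_flg, tt.num_stores, tt.num_stores_promo
    FROM t
    JOIN (SELECT item,
                 COUNT(DISTINCT store) AS num_stores,
                 COUNT(DISTINCT CASE WHEN promo_flg = 1 THEN store END)
                     AS num_stores_promo
          FROM t
          GROUP BY item) tt
      ON tt.item = t.item
    ORDER BY t.store
    """
).fetchall()

for row in rows:
    print(row)  # every row carries num_stores = 10 and num_stores_promo = 4
```

Because the CASE expression yields NULL for non-promo rows and COUNT ignores NULLs, only the promo stores are counted.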

Answer 1: (score: 1)

Using Gordon's second SQL, but showing that it works in Snowflake:

select v.*
    ,count(distinct store) over (partition by item) as num_stores
    ,count(distinct iff(promo_flg = 1, store, null)) over (partition by item) as num_dis_promo_stores
    ,sum(iff(promo_flg = 1, 1, 0)) over (partition by item) as num_sum_promo_stores
from values
  (1 , 1, 0 ),
  (2 , 1, 1 ),
  (3 , 1, 0 ),
  (4 , 1, 0 ),
  (5 , 1, 1 ),
  (6 , 1, 1 ),
  (7 , 1, 1 ),
  (8 , 1, 0 ),
  (9 , 1, 0 ),
  (10, 1, 0 )
  v(store, item, promo_flg) ;

This gives:

STORE   ITEM    PROMO_FLG   NUM_STORES  NUM_DIS_PROMO_STORES    NUM_SUM_PROMO_STORES
1       1       0           10          4                       4
2       1       1           10          4                       4
3       1       0           10          4                       4
4       1       0           10          4                       4
5       1       1           10          4                       4
6       1       1           10          4                       4
7       1       1           10          4                       4
8       1       0           10          4                       4
9       1       0           10          4                       4
10      1       0           10          4                       4

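So which one to use depends on whether you need a distinct count or a sum. I used iff, a non-standard SQL form that Snowflake supports, because I prefer shorter SQL. Either way, you can see that both work.

Testing Gordon's second form, count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1, shows that it works exactly as written.

To answer Marcin2x4's question about Gordon's answer: the methods give different results if/when the data differs from how you described it, for example if a store has multiple rows for the same item with promo_flg set, or if promo_flg takes a non-zero value other than 1: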

select v.*
    ,count(distinct store) over (partition by item) as num_stores
    ,count(distinct iff(promo_flg = 1, store, null)) over (partition by item) as num_dis_promo_stores
    ,sum(iff(promo_flg <> 0, 1, 0)) over (partition by item) as num_sum_promo_stores
    ,sum(promo_flg) over (partition by item) as num_promo_1
    ,count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1
from values
  (1 , 1, 0 ),
  (2 , 1, 1 ),
  (3 , 1, 0 ),
  (4 , 1, 0 ),
  (5 , 1, 1 ),
  (6 , 1, 1 ),
  (7 , 1, 1 ),
  (8 , 1, 0 ),
  (9 , 1, 0 ),
  (10, 1, 0 ),
  (7, 1, 1 ),
  (7, 1, 2 )
  v(store, item, promo_flg) ;

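Then num_dis_promo_stores and the second num_promo_1 (the count(distinct case ...) column) give 4, num_sum_promo_stores gives 6, and the first num_promo_1 (the sum(promo_flg) column) gives 7.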
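
To make the arithmetic behind those numbers easy to reproduce outside Snowflake, here is a minimal sketch that computes the same four quantities with plain GROUP BY aggregates. It assumes an in-memory SQLite database driven from Python; that setup is an illustration only and not part of the original answer. CASE replaces Snowflake's iff, and ordinary aggregates replace the window functions, which returns the same per-item values.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Snowflake in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE v (store INTEGER, item INTEGER, promo_flg INTEGER)")

# The extended data set from the answer: store 7 appears three times,
# once with promo_flg = 2.
conn.executemany(
    "INSERT INTO v VALUES (?, ?, ?)",
    [
        (1, 1, 0), (2, 1, 1), (3, 1, 0), (4, 1, 0), (5, 1, 1),
        (6, 1, 1), (7, 1, 1), (8, 1, 0), (9, 1, 0), (10, 1, 0),
        (7, 1, 1), (7, 1, 2),
    ],
)

result = conn.execute(
    """
    SELECT COUNT(DISTINCT store)                                  AS num_stores,
           COUNT(DISTINCT CASE WHEN promo_flg = 1 THEN store END) AS num_dis_promo_stores,
           SUM(CASE WHEN promo_flg <> 0 THEN 1 ELSE 0 END)        AS num_sum_promo_stores,
           SUM(promo_flg)                                         AS sum_promo_flg
    FROM v
    GROUP BY item
    """
).fetchone()

print(result)  # (10, 4, 6, 7)
```

The distinct count stays at 4 because store 7 is counted once, the row count rises to 6 because store 7 contributes three non-zero rows, and the plain sum reaches 7 because the flag value 2 is added as-is.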
