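I have data on products sold by various stores. Some stores sell them at a discounted price, which is indicated by PROMO_FLG.
I want to show two COUNT ... OVER (PARTITION BY ...) columns.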
+-------+------+-----------+
| Store | Item | PROMO_FLG |
|-------+------+-----------|
| 1     | 1    | 0         |
| 2     | 1    | 1         |
| 3     | 1    | 0         |
| 4     | 1    | 0         |
| 5     | 1    | 1         |
| 6     | 1    | 1         |
| 7     | 1    | 1         |
| 8     | 1    | 0         |
| 9     | 1    | 0         |
| 10    | 1    | 0         |
+-------+------+-----------+
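The first column shows all stores that carry this product (already done):
COUNT(DISTINCT STORE) OVER (PARTITION BY ITEM)
which gives 10.
The second, the one I am looking for, should count only the stores where PROMO_FLG = 1.
That should give 4.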
Answer 0: (score: 3)
I think you want:
select t.*,
count(*) over (partition by item) as num_stores,
sum(promo_flg) over (partition by item) as num_promo_1
from t;
If you really need distinct counts:
select t.*,
count(distinct store) over (partition by item) as num_stores,
count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1
from t;
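Here is a db<>fiddle. The fiddle uses Oracle because it supports COUNT(DISTINCT) as a window function.
If the window functions do not work in your database, here is an alternative that joins to an aggregated subquery: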
select *
from t join
(select item, count(distinct store) as num_stores, count(distinct case when promo_flg = 1 then store end) as num_stores_promo
from t
group by item
) tt
using (item);
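As a quick way to check the join-based fallback without a warehouse at hand, the sketch below runs essentially the same query against an in-memory SQLite database from Python. SQLite is only a stand-in here (an assumption, not something the thread uses), the helper name per_item_counts is hypothetical, and the rows are the sample data from the question.

```python
import sqlite3

# Sample rows from the question: (store, item, promo_flg).
ROWS = [
    (1, 1, 0), (2, 1, 1), (3, 1, 0), (4, 1, 0), (5, 1, 1),
    (6, 1, 1), (7, 1, 1), (8, 1, 0), (9, 1, 0), (10, 1, 0),
]

# The fallback from the answer: aggregate per item, then join back to the rows.
QUERY = """
select t.store, item, t.promo_flg, tt.num_stores, tt.num_stores_promo
from t join
     (select item,
             count(distinct store) as num_stores,
             count(distinct case when promo_flg = 1 then store end) as num_stores_promo
      from t
      group by item
     ) tt
     using (item)
order by t.store
"""


def per_item_counts(rows):
    """Load rows into a throwaway table t and return the joined result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table t (store integer, item integer, promo_flg integer)")
        conn.executemany("insert into t values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = per_item_counts(ROWS)
for row in result:
    print(row)
```

Every row comes back with the per-item totals attached, which is the same shape the window-function versions produce.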
Answer 1: (score: 1)
Using Gordon's second SQL, but showing that it works in Snowflake:
select v.*
,count(distinct store) over (partition by item) as num_stores
,count(distinct iff(promo_flg = 1, store, null)) over (partition by item) as num_dis_promo_stores
,sum(iff(promo_flg = 1, 1, 0)) over (partition by item) as num_sum_promo_stores
from values
(1 , 1, 0 ),
(2 , 1, 1 ),
(3 , 1, 0 ),
(4 , 1, 0 ),
(5 , 1, 1 ),
(6 , 1, 1 ),
(7 , 1, 1 ),
(8 , 1, 0 ),
(9 , 1, 0 ),
(10, 1, 0 )
v(store, item, promo_flg) ;
This gives:
STORE ITEM PROMO_FLG NUM_STORES NUM_DIS_PROMO_STORES NUM_SUM_PROMO_STORES
1 1 0 10 4 4
2 1 1 10 4 4
3 1 0 10 4 4
4 1 0 10 4 4
5 1 1 10 4 4
6 1 1 10 4 4
7 1 1 10 4 4
8 1 0 10 4 4
9 1 0 10 4 4
10 1 0 10 4 4
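So either form works, depending on whether you need a distinct count or a sum. I used iff, a non-standard SQL function that Snowflake supports, because I prefer more compact SQL.
But as you can see, both work.
Gordon's second form, count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1, also works exactly as written.
To answer Marcin2x4's question about Gordon's answer: the methods give different results if and when the data differs from what you described. For example, if a store has multiple rows for the same item with promo_flg set, or if promo_flg takes non-zero values other than 1: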
select v.*
,count(distinct store) over (partition by item) as num_stores
,count(distinct iff(promo_flg = 1, store, null)) over (partition by item) as num_dis_promo_stores
,sum(iff(promo_flg <> 0, 1, 0)) over (partition by item) as num_sum_promo_stores
,sum(promo_flg) over (partition by item) as num_promo_1
,count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1
from values
(1 , 1, 0 ),
(2 , 1, 1 ),
(3 , 1, 0 ),
(4 , 1, 0 ),
(5 , 1, 1 ),
(6 , 1, 1 ),
(7 , 1, 1 ),
(8 , 1, 0 ),
(9 , 1, 0 ),
(10, 1, 0 ),
(7, 1, 1 ),
(7, 1, 2 )
v(store, item, promo_flg) ;
Then num_dis_promo_stores and the count(distinct case ...) version of num_promo_1 both give 4, num_sum_promo_stores gives 6, and the sum(promo_flg) version of num_promo_1 gives 7.
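To make the divergence easy to reproduce, here is a minimal sketch that computes the same aggregates over the extended data set with plain GROUP BY-style aggregation in an in-memory SQLite database. SQLite and the portable CASE expressions (in place of Snowflake's iff) are assumptions made for the sake of a runnable example, and the helper name promo_totals is hypothetical.

```python
import sqlite3

# Extended rows from the answer: store 7 now has two extra rows,
# one with promo_flg = 1 and one with promo_flg = 2.
ROWS = [
    (1, 1, 0), (2, 1, 1), (3, 1, 0), (4, 1, 0), (5, 1, 1),
    (6, 1, 1), (7, 1, 1), (8, 1, 0), (9, 1, 0), (10, 1, 0),
    (7, 1, 1), (7, 1, 2),
]

# Same expressions as the window versions, aggregated once for item 1.
QUERY = """
select count(distinct store)                                   as num_stores,
       count(distinct case when promo_flg = 1 then store end)  as num_dis_promo_stores,
       sum(case when promo_flg <> 0 then 1 else 0 end)         as num_sum_promo_stores,
       sum(promo_flg)                                          as sum_promo_flg
from t
where item = 1
"""


def promo_totals(rows):
    """Return (num_stores, distinct promo stores, non-zero flag rows, flag sum)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table t (store integer, item integer, promo_flg integer)")
        conn.executemany("insert into t values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchone()
    finally:
        conn.close()


totals = promo_totals(ROWS)
print(totals)
```

The distinct count ignores the duplicate rows for store 7, the conditional sum counts each non-zero row once, and the plain sum also picks up the value 2, which is why the three numbers differ.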