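We are tracking order statuses that our shipping partners send us through webhooks they trigger. Each time the webhook fires, a row is added, so every order has multiple rows associated with it.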
Table structure: (image not included)
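We are trying to write a SQL query that does the following:
Find the most recently received row for each 'awb' and read its current_status. If that current_status is one of 'PICKUP EXCEPTION', 'OUT FOR PICKUP', or 'PICKUP RESCHEDULED', find the row where one of these statuses first occurred for that particular 'awb'.
Compute the number of days between the first and the last occurrence of these statuses for the awb.
Output the awbs where the difference is more than 2 days.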
This is the query I was able to come up with.
WITH ranked_order_status AS (
    SELECT os.*,
           datediff(
               now(),
               first_value(recived_at) over (partition by awb order by recived_at asc)
           ) AS diff,
           ROW_NUMBER() OVER (PARTITION BY awb ORDER BY recived_at desc) AS rn
    FROM order_status AS os
    WHERE current_status IN ('PICKUP EXCEPTION', 'OUT FOR PICKUP', 'PICKUP RESCHEDULED')
)
SELECT * FROM ranked_order_status WHERE rn = 1 AND diff > 2
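Unfortunately, this shows me every awb that has any row with these statuses, not only the awbs whose most recently received current status is 'PICKUP EXCEPTION', 'OUT FOR PICKUP', or 'PICKUP RESCHEDULED'.
Any idea how to edit it?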
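Answer 0 (score: 0)
So if I understand correctly, this should be a clear case for an analytic function using RANK().
Here is how I would approach the constraints you mentioned: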
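WITH t1 AS (
    -- Keep only the pickup-related statuses and attach the earliest received_at per awb.
    -- Note: this filter runs before the ranking below, so rank 1 is the latest row
    -- among these statuses, not necessarily the latest row overall for the awb.
    SELECT
        os.*,
        FIRST_VALUE(os.received_at) OVER (PARTITION BY os.awb ORDER BY os.received_at) AS first_received_at
    FROM order_status AS os
    WHERE os.current_status IN ('PICKUP EXCEPTION', 'OUT FOR PICKUP', 'PICKUP RESCHEDULED')
),
t2 AS (
    SELECT
        RANK() OVER (PARTITION BY t1.awb ORDER BY t1.received_at DESC) AS reverse_event_sequence,
        -- DATE_DIFF(end, start, DAY) is BigQuery-style syntax; MySQL's equivalent is DATEDIFF(end, start).
        DATE_DIFF(t1.received_at, t1.first_received_at, DAY) AS day_diff,
        t1.*
    FROM t1
),
final AS (
    SELECT *
    FROM t2
    WHERE t2.day_diff > 2 AND t2.reverse_event_sequence = 1
)
SELECT * FROM final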
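Basically, you first get the earliest received_at value for every row, then rank all events of each awb in descending order so that the last event always has rank = 1, and finally apply the required constraint on the date difference :)
I have to mention that not having a data sample does not help. Any feedback on my approach is appreciated :)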
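For reference, here is a minimal sketch of the requirement as the question words it: rank every row of an awb before filtering, keep only awbs whose latest row has one of the three statuses, and measure the time from the first row with any of those statuses to that latest row. It runs on an in-memory SQLite database through Python so it can be tried without a server. The sample rows, the column name received_at, the function name stuck_awbs, and the use of julianday() in place of MySQL's DATEDIFF() are all assumptions made for illustration, and "first occurrence" is read here as the first such row ever recorded for the awb.

```python
import sqlite3

TARGET = ("PICKUP EXCEPTION", "OUT FOR PICKUP", "PICKUP RESCHEDULED")

# Hypothetical sample rows: (awb, current_status, received_at)
ROWS = [
    ("A1", "OUT FOR PICKUP",     "2021-03-01 10:00:00"),
    ("A1", "PICKUP EXCEPTION",   "2021-03-02 10:00:00"),
    ("A1", "PICKUP RESCHEDULED", "2021-03-05 10:00:00"),  # latest row, about 4 days after the first
    ("B2", "OUT FOR PICKUP",     "2021-03-01 09:00:00"),
    ("B2", "PICKED UP",          "2021-03-06 09:00:00"),  # latest row is not a target status
    ("C3", "OUT FOR PICKUP",     "2021-03-04 08:00:00"),
    ("C3", "PICKUP EXCEPTION",   "2021-03-05 08:00:00"),  # only about 1 day apart
]

QUERY = """
WITH ranked AS (
    SELECT os.*,
           -- rank over ALL rows of the awb, so rn = 1 is the true latest row
           ROW_NUMBER() OVER (PARTITION BY awb ORDER BY received_at DESC) AS rn,
           -- earliest row of the awb that carries one of the target statuses
           MIN(CASE WHEN current_status IN (?, ?, ?) THEN received_at END)
               OVER (PARTITION BY awb) AS first_target_at
    FROM order_status AS os
)
SELECT awb,
       current_status,
       julianday(received_at) - julianday(first_target_at) AS day_diff
FROM ranked
WHERE rn = 1
  AND current_status IN (?, ?, ?)
  AND julianday(received_at) - julianday(first_target_at) > 2
ORDER BY awb
"""


def stuck_awbs(rows):
    """Return (awb, latest_status, day_diff) for awbs matching the question's rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE order_status (awb TEXT, current_status TEXT, received_at TEXT)"
    )
    conn.executemany("INSERT INTO order_status VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY, TARGET + TARGET).fetchall()
    conn.close()
    return result


print(stuck_awbs(ROWS))
```

With these sample rows only awb A1 is returned: B2 is dropped because its latest status is not one of the three, and C3 because its rows are about one day apart. The key difference from filtering in the first CTE is that the status check is applied to the rn = 1 row after ranking, not before.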
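Answer 1 (score: 0)
You can enumerate the rows starting from the most recent one in two ways: once per awb, and once per awb and current_status.
For the last status of each awb, the difference between the two will be zero. You can select on this and then aggregate: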
select awb, current_status,
       min(received_at), max(received_at)
from (select os.*,
             -- position counted back from the latest row of the awb
             row_number() over (partition by awb order by received_at desc) as seqnum,
             -- position counted back from the latest row of the awb with the same status
             row_number() over (partition by awb, current_status order by received_at desc) as seqnum_2
      from order_status os
     ) os
where seqnum = seqnum_2 and  -- rows in the most recent run of the latest status
      current_status in ('PICKUP EXCEPTION', 'OUT FOR PICKUP', 'PICKUP RESCHEDULED')
group by awb, current_status
having max(received_at) > min(received_at) + interval 2 day;
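To see why comparing the two row numbers isolates the latest status, here is a small worked example on hypothetical rows for a single awb, again run on in-memory SQLite from Python. The sample data and the function name numbered_rows are assumptions for illustration only.

```python
import sqlite3

# Hypothetical events for one awb, listed oldest first.
ROWS = [
    ("A1", "OUT FOR PICKUP",   "2021-03-01 10:00:00"),
    ("A1", "PICKUP EXCEPTION", "2021-03-02 10:00:00"),
    ("A1", "OUT FOR PICKUP",   "2021-03-03 10:00:00"),
    ("A1", "OUT FOR PICKUP",   "2021-03-06 10:00:00"),
]

QUERY = """
SELECT received_at,
       current_status,
       ROW_NUMBER() OVER (PARTITION BY awb ORDER BY received_at DESC) AS seqnum,
       ROW_NUMBER() OVER (PARTITION BY awb, current_status
                          ORDER BY received_at DESC) AS seqnum_2
FROM order_status
ORDER BY received_at DESC
"""


def numbered_rows(rows):
    """Return (received_at, status, seqnum, seqnum_2) tuples, newest first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE order_status (awb TEXT, current_status TEXT, received_at TEXT)"
    )
    conn.executemany("INSERT INTO order_status VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for received_at, status, seqnum, seqnum_2 in numbered_rows(ROWS):
    marker = "match" if seqnum == seqnum_2 else "no match"
    print(received_at, status, seqnum, seqnum_2, marker)
```

The two newest rows (2021-03-06 and 2021-03-03, both 'OUT FOR PICKUP') get equal numbers, 1/1 and 2/2. The 'PICKUP EXCEPTION' row gets 3/1 and the oldest 'OUT FOR PICKUP' row gets 4/3, so neither matches. In other words, seqnum = seqnum_2 keeps the most recent uninterrupted run of the latest status, and min/max are then taken over that run rather than over every earlier occurrence.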