Summing over multiple levels of nested repeated fields

Asked: 2018-03-16 14:15:57

Tags: google-bigquery

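I have several order detail tables in my source database: Order Header -> Order Line -> Shipped Line -> Received Line

I created a BQ table that contains two levels of nested repeated fields. Here is what some sample data looks like: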

WITH stol AS (
  SELECT 1 AS stol_id, "stol-1.1" AS stol_number, 1 AS stol_transfer_order_line_id, 3 AS stol_quantity
  UNION ALL
  SELECT 2 AS stol_id, "stol-2.1" AS stol_number, 2 AS stol_transfer_order_line_id, 2 AS stol_quantity
  UNION ALL
  SELECT 3 AS stol_id, "stol-2.2" AS stol_number, 2 AS stol_transfer_order_line_id, 2 AS stol_quantity
  UNION ALL
  SELECT 4 AS stol_id, "stol-2.3" AS stol_number, 2 AS stol_transfer_order_line_id, 1 AS stol_quantity
),

rtol AS (
  SELECT 1 AS stol_id, "rtol-1.1" AS rtol_number, 2 AS rtol_quantity
  UNION ALL
  SELECT 1 AS stol_id, "rtol-1.2" AS rtol_number, 1 AS rtol_quantity
  UNION ALL
  SELECT 2 AS stol_id, "rtol-2.1" AS rtol_number, 2 AS rtol_quantity
  UNION ALL
  SELECT 3 AS stol_id, "rtol-2.2" AS rtol_number, 1 AS rtol_quantity
),

tol AS (
  SELECT 1 AS tol_id, "tol-1" AS tol_number, 3 AS tol_transfer_quantity
  UNION ALL
  SELECT 2 AS tol_id, "tol-2" AS tol_number, 5 AS tol_transfer_quantity
),

-- One row per shipped line, with its received lines collected into an array.
nest AS (
  SELECT s.stol_id,
         s.stol_number,
         s.stol_quantity,
         s.stol_transfer_order_line_id,
         ARRAY_AGG(STRUCT(r.rtol_number, r.rtol_quantity)) AS received
  FROM stol s
  LEFT JOIN rtol r ON s.stol_id = r.stol_id
  GROUP BY 1, 2, 3, 4
),

-- One row per order line, with its shipped lines (and their received arrays) nested.
final AS (
  SELECT t.tol_id
    ,t.tol_number
    ,t.tol_transfer_quantity
    ,ARRAY_AGG(STRUCT(n.stol_number, n.stol_quantity, n.received)) AS shipped
  FROM tol t
  LEFT JOIN nest n ON t.tol_id = n.stol_transfer_order_line_id
  GROUP BY 1, 2, 3
)

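I want to sum the shipped and received quantities for each order line. Continuing the WITH clause above (with a comma after final), I can get the correct result like this: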

shipped as (
SELECT tol_number
  ,SUM(stol_quantity) as shipped_q
FROM final t, t.shipped
GROUP BY 1
),

received as (
SELECT tol_number
  ,SUM(rtol_quantity) as received_q
FROM final t, t.shipped s, s.received
GROUP BY 1
)

SELECT t.tol_number
  ,t.tol_transfer_quantity
  ,s.shipped_q
  ,r.received_q
FROM final t
LEFT JOIN shipped s on t.tol_number = s.tol_number
LEFT JOIN received r ON t.tol_number = r.tol_number

Correct result:

Row tol_number  tol_transfer_quantity   shipped_q   received_q   
1     tol-1             3                  3            3    
2     tol-2             5                  5            3

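What I would like to know is whether there is a better way to do this. Something like the following over-counts the first level of nesting, but it feels and looks cleaner: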

SELECT tol_number
      ,tol_transfer_quantity
      ,SUM(stol_quantity) as shipped_q
      ,SUM(rtol_quantity) as shipped_r
FROM final t, t.shipped s, s.received
GROUP BY 1, 2

Incorrect result for shipped_q:

Row tol_number  tol_transfer_quantity   shipped_q   shipped_r    
1     tol-2              5                 5            3    
2     tol-1              3                 6            3
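
The doubled shipped_q for tol-1 comes from the second unnest: each shipped line is repeated once per element of its received array, so stol_quantity is summed once per received line. A minimal sketch to make that visible, using a hypothetical inline copy of the tol-1 row in place of the final CTE (the literal below is an assumption built from the sample data, not part of the original query):

```sql
#standardSQL
-- Hypothetical inline stand-in for the tol-1 row of `final`, for illustration only.
WITH final AS (
  SELECT
    'tol-1' AS tol_number,
    [STRUCT('stol-1.1' AS stol_number,
            3 AS stol_quantity,
            [STRUCT('rtol-1.1' AS rtol_number, 2 AS rtol_quantity),
             STRUCT('rtol-1.2' AS rtol_number, 1 AS rtol_quantity)] AS received)] AS shipped
)
-- List the flattened rows instead of aggregating them, to expose the repetition.
SELECT
  t.tol_number,
  s.stol_number,
  s.stol_quantity,
  r.rtol_number,
  r.rtol_quantity
FROM final t, t.shipped s, s.received r
```

This returns two rows for stol-1.1, one per received line, and both carry stol_quantity = 3, which is where the 6 comes from. In the sample data tol-2 comes out right only because none of its shipped lines has more than one element in received.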

Any ideas are much appreciated.

2 answers:

Answer 0 (score: 3)

   
#standardSQL
SELECT
  tol_id,
  tol_transfer_quantity,
  (SELECT SUM(stol_quantity) FROM final.shipped) shipped_q,
  (SELECT SUM(rtol_quantity) FROM final.shipped s, s.received) shipped_r
FROM final
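
As a quick check of this approach, here is a self-contained sketch that runs the same kind of correlated subqueries against a hypothetical inline copy of final. The array literals are an assumption reconstructed from the sample data in the question, and UNNEST is written out explicitly, which should be equivalent to the shorter path form used above:

```sql
#standardSQL
-- Hypothetical inline stand-in for the `final` CTE, built from the question's sample data.
WITH final AS (
  SELECT
    'tol-1' AS tol_number,
    3 AS tol_transfer_quantity,
    [STRUCT('stol-1.1' AS stol_number,
            3 AS stol_quantity,
            [STRUCT('rtol-1.1' AS rtol_number, 2 AS rtol_quantity),
             STRUCT('rtol-1.2' AS rtol_number, 1 AS rtol_quantity)] AS received)] AS shipped
  UNION ALL
  SELECT
    'tol-2' AS tol_number,
    5 AS tol_transfer_quantity,
    [STRUCT('stol-2.1' AS stol_number,
            2 AS stol_quantity,
            [STRUCT('rtol-2.1' AS rtol_number, 2 AS rtol_quantity)] AS received),
     STRUCT('stol-2.2' AS stol_number,
            2 AS stol_quantity,
            [STRUCT('rtol-2.2' AS rtol_number, 1 AS rtol_quantity)] AS received),
     -- stol-2.3 has no received line; the LEFT JOIN leaves a single all-NULL struct.
     STRUCT('stol-2.3' AS stol_number,
            1 AS stol_quantity,
            [STRUCT(CAST(NULL AS STRING) AS rtol_number,
                    CAST(NULL AS INT64) AS rtol_quantity)] AS received)] AS shipped
)
SELECT
  t.tol_number,
  t.tol_transfer_quantity,
  -- Each subquery is evaluated per row of `final`, so the two levels never multiply each other.
  (SELECT SUM(s.stol_quantity) FROM UNNEST(t.shipped) AS s) AS shipped_q,
  (SELECT SUM(r.rtol_quantity)
   FROM UNNEST(t.shipped) AS s, UNNEST(s.received) AS r) AS received_q
FROM final AS t
```

On this data the query gives shipped_q = 3 and received_q = 3 for tol-1, and shipped_q = 5 and received_q = 3 for tol-2, matching the expected result in the question.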

Answer 1 (score: 2)

I would suggest using subselects to work on the arrays as if they were tables:

SELECT
  tol_id,
  SUM(tol_transfer_quantity),
  SUM( (SELECT SUM(stol_quantity) FROM final.shipped) ) shipped_q,
  SUM( (SELECT SUM(rtol_quantity) FROM final.shipped s, s.received) ) shipped_r
FROM
  final
GROUP BY
  1

HTH!