Return only one result per data partition

Time: 2014-08-06 19:29:17

Tags: sql google-bigquery

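I'd like to do some calculations over a partition in BigQuery and then output only one row per partition (instead of one row for each row in the partition). For example, if I have a table like this: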

Category | Location | Count
A        | 'home'   | 20
A        | 'work'   | 10
A        | 'lab'    | 6
B        | 'home'   | 5
C        | 'lab'    | 15
C        | 'home'   | 25

I would like to end up with this result:

Category  | TopLocation     | TopCount | SecondLocation | SecondCount
A         | 'home'          | 20       | 'work'         | 10
B         | 'home'          | 5        | NULL           | NULL
C         | 'home'          | 25       | 'lab'          | 15

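I thought I could do this with a partition, but that ends up producing one row for every value rather than the single row I want, so I group by category and use FIRST. Is there a better way that avoids generating so many intermediate rows (and hopefully avoids the "big results problem with window functions")?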

SELECT
  category,
  FIRST(TopLocation) TopLocation,
  FIRST(TopCount) TopCount,
  FIRST(SecondLocation) SecondLocation,
  FIRST(SecondCount) SecondCount,
FROM
  (SELECT
      category,
      NTH_VALUE(Location, 1) OVER (PARTITION BY category ORDER BY count DESC) TopLocation,
      NTH_VALUE(Count, 1) OVER (PARTITION BY category ORDER BY count DESC) TopCount,
      NTH_VALUE(Location, 2) OVER (PARTITION BY category ORDER BY count DESC) SecondLocation,
      NTH_VALUE(Count, 2) OVER (PARTITION BY category ORDER BY count DESC) SecondCount
   FROM
      mytable
   )    
GROUP BY
  category
ORDER BY
  category DESC

2 answers:

Answer 0: (score: 0)

Update: there is a better solution using #standardSQL.

How about:

SELECT word, word_count, corpus, rank FROM (
  SELECT word, word_count, corpus,
         RANK() OVER (PARTITION BY corpus ORDER BY word_count DESC) rank
  FROM [publicdata:samples.shakespeare] 
  WHERE word_count > 6
)
WHERE rank<3
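
The #standardSQL route mentioned in the update could look roughly like the sketch below. It is an illustration under assumptions, not necessarily the solution the update refers to: it assumes standard SQL is enabled and uses ARRAY_AGG with ORDER BY and LIMIT to keep the top two rows per category, so no window function is involved. The inline mytable data is copied from the question, and the alias top2 is a made-up name.

```sql
#standardSQL
-- Inline copy of the question's sample table (a stand-in for the real mytable).
WITH mytable AS (
  SELECT * FROM UNNEST([
    STRUCT('A' AS category, 'home' AS location, 20 AS count),
    STRUCT('A', 'work', 10),
    STRUCT('A', 'lab', 6),
    STRUCT('B', 'home', 5),
    STRUCT('C', 'lab', 15),
    STRUCT('C', 'home', 25)
  ])
)
SELECT
  category,
  -- SAFE_OFFSET yields NULL when a category has fewer than two rows.
  top2[SAFE_OFFSET(0)].location AS TopLocation,
  top2[SAFE_OFFSET(0)].count    AS TopCount,
  top2[SAFE_OFFSET(1)].location AS SecondLocation,
  top2[SAFE_OFFSET(1)].count    AS SecondCount
FROM (
  SELECT
    category,
    -- Keep only the two highest-count rows per category, one array per group.
    ARRAY_AGG(STRUCT(location, count) ORDER BY count DESC LIMIT 2) AS top2
  FROM mytable
  GROUP BY category
)
ORDER BY category
```

Because the aggregation emits a single array per category, the query produces one row per partition directly instead of one row per input row that is collapsed afterwards.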

Answer 1: (score: 0)

This should do the job:

select category, 
    first(if(rank = 1, location, null)) as location_1, first(if(rank = 1, count, null)) as count_1,
    first(if(rank = 2, location, null)) as location_2, first(if(rank = 2, count, null)) as count_2,
    first(if(rank = 3, location, null)) as location_3, first(if(rank = 3, count, null)) as count_3
from
    (select row_number() over (partition by category order by count desc) as rank, *
     from
        (select 'A' as category, 'home' AS location, 20 as count),
        (select 'A' as category, 'work' AS location, 10 as count),
        (select 'A' as category, 'lab' AS location, 6 as count),
        (select 'B' as category, 'home' AS location, 5 as count),
        (select 'C' as category, 'lab' AS location, 15 as count),
        (select 'C' as category, 'home' AS location, 25 as count)
    )
group by category order by category

Result:

Row | category | location_1 | count_1 | location_2 | count_2 | location_3 | count_3
1   | A        | home       | 20      | work       | 10      | lab        | 6
2   | B        | home       | 5       | null       | null    | null       | null
3   | C        | home       | 25      | lab        | 15      | null       | null

But it probably won't solve the "large query results" problem with window functions.
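
For readers on standard SQL, where the legacy FIRST aggregate and comma-separated table unions are not available, the same conditional-aggregation idea might be written as in the sketch below. It assumes standard SQL is enabled; the alias rn and the inline mytable data are illustrative. It still relies on ROW_NUMBER, so the caveat above about window functions applies to it as well.

```sql
#standardSQL
-- Inline copy of the question's sample table (a stand-in for the real mytable).
WITH mytable AS (
  SELECT * FROM UNNEST([
    STRUCT('A' AS category, 'home' AS location, 20 AS count),
    STRUCT('A', 'work', 10),
    STRUCT('A', 'lab', 6),
    STRUCT('B', 'home', 5),
    STRUCT('C', 'lab', 15),
    STRUCT('C', 'home', 25)
  ])
)
SELECT
  category,
  -- Each rn value occurs at most once per category, so MAX simply
  -- picks the single non-NULL value (or NULL if that rank is absent).
  MAX(IF(rn = 1, location, NULL)) AS location_1,
  MAX(IF(rn = 1, count, NULL))    AS count_1,
  MAX(IF(rn = 2, location, NULL)) AS location_2,
  MAX(IF(rn = 2, count, NULL))    AS count_2,
  MAX(IF(rn = 3, location, NULL)) AS location_3,
  MAX(IF(rn = 3, count, NULL))    AS count_3
FROM (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY category ORDER BY count DESC) AS rn
  FROM mytable
)
GROUP BY category
ORDER BY category
```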