Distinct values of an array?

Time: 2016-11-25 17:59:06

Tags: sql arrays postgresql distinct

Given the following tables:

CREATE TEMPORARY TABLE guys ( guy_id integer primary key, guy text );
CREATE TEMPORARY TABLE sales ( log_date date, sales_guys integer[], sales smallint );
INSERT INTO guys VALUES(1,'john'),(2,'joe');
INSERT INTO sales VALUES('2016-01-01', '{1,2}', 2),('2016-01-02','{1,2}',4);

The following query nicely shows the names for a given date:

SELECT log_date, sales_guys, ARRAY_AGG(guy), sales 
FROM sales 
JOIN guys ON 
   guys.guy_id = ANY(sales.sales_guys) 
GROUP BY log_date, sales_guys, sales 
ORDER BY log_date ASC;

  log_date  | sales_guys | array_agg  | sales 
------------+------------+------------+-------
 2016-01-01 | {1,2}      | {john,joe} |     2
 2016-01-02 | {1,2}      | {john,joe} |     4

The following query is problematic: it returns the names once per guy per date (so here each name appears twice, and so on):

SELECT sales_guys, ARRAY_AGG(guy), SUM(sales) AS sales
FROM sales
JOIN guys ON guys.guy_id = ANY(sales.sales_guys)
GROUP BY sales_guys;

Yields:

 sales_guys |      array_agg      | sales 
------------+---------------------+-------
 {1,2}      | {john,joe,john,joe} |    12

Is there a way to reduce the ARRAY_AGG call so that it returns only the unique names?

2 answers:

Answer 0: (score: 0)

You can use DISTINCT inside the aggregate:

SELECT sales_guys, ARRAY_AGG(DISTINCT guy), SUM(sales) AS sales
FROM sales
JOIN guys ON guys.guy_id = ANY(sales.sales_guys)
GROUP BY sales_guys;

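Answer 1: (score: 0)

Without ORDER BY, there is no order you can trust. The one exception is that the elements of an array, when unnested, come out in array order. If your query does more with the result, it may be reordered, though.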
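To make the point about array order concrete, here is a minimal sketch. The literal array and the aliases t, elem, and ord are made up for illustration. It uses WITH ORDINALITY (available in PostgreSQL 9.4 and later) so that each element's position is available as an explicit column rather than relying on output order.

```sql
-- ord records each element's position within the array,
-- so the original order can be restored with ORDER BY ord.
SELECT t.elem, t.ord
FROM   unnest('{30,10,20}'::int[]) WITH ORDINALITY AS t(elem, ord)
ORDER  BY t.ord;
-- rows: (30,1), (10,2), (20,3)
```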

In Postgres you can simply add ORDER BY to any aggregate function:

SELECT s.sales_guys, ARRAY_AGG(DISTINCT g.guy ORDER BY g.guy) AS names, SUM(s.sales) AS sum_sales
FROM   sales s
JOIN   guys  g ON g.guy_id = ANY(s.sales_guys)
GROUP  BY s.sales_guys;

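But that is obviously not the original order of the array elements. The query has other issues as well. Neither IN nor = ANY() cares about the order of elements in the set, list, or array on the right-hand side.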
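The following sketch checks two of these points against the sample data. It assumes the temporary tables from the question exist in the current session. The column aliases are made up for illustration. The first statement shows that = ANY() only tests membership. The second shows that joining on = ANY() repeats each sales row once per matching guy, which inflates SUM(sales).

```sql
-- Both expressions are true: position in the array plays no role.
SELECT 1 = ANY('{1,2}'::int[]) AS in_12
     , 1 = ANY('{2,1}'::int[]) AS in_21;

-- Assumes the tables "sales" and "guys" from the question.
-- plain_sum is 2 + 4 = 6; joined_sum counts every row twice (two guys) = 12.
SELECT (SELECT sum(sales) FROM sales) AS plain_sum
     , (SELECT sum(s.sales)
        FROM   sales s
        JOIN   guys  g ON g.guy_id = ANY(s.sales_guys)) AS joined_sum;
```

This matches the 12 shown in the question's output, while the actual total of the two sample rows is 6.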

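Proper solution

For this task (pay attention to the details!):

Get the total sales per array sales_guys, where the order of elements matters (the arrays '{1,2}' and '{2,1}' are not the same) and sales_guys has neither duplicate nor NULL elements. Add an array of resolved names in matching order.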

Use unnest() together with WITH ORDINALITY. Aggregate the arrays before resolving names; that is cheaper and less error-prone.

SELECT s.*, g.*
FROM  (
   SELECT sales_guys, sum(sales) AS total_sales  -- aggregate first in subquery
   FROM   sales
   GROUP  BY 1
   ) s
,  LATERAL (
   SELECT array_agg(guy ORDER BY ord) AS names  -- order by original order
   FROM   unnest(s.sales_guys) WITH ORDINALITY sg(guy_id, ord)  -- with order of elements
   LEFT   JOIN guys g USING (guy_id)  -- LEFT JOIN to add NULL for missing guy_id
   ) g;

The LATERAL subquery can be joined with an unconditional CROSS JOIN (the comma (,) is shorthand notation) because the aggregate in the subquery guarantees a result for every row. Otherwise you would use LEFT JOIN LATERAL ... ON true.
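As a quick check of the "matching order" requirement, the sketch below adds one hypothetical row with the same two guys in reversed order and then runs the aggregate-first query. It assumes the temporary tables from the question exist in the current session. The extra row and the final ORDER BY are additions for illustration only.

```sql
-- Hypothetical extra row: same guys as before, but in reversed order.
INSERT INTO sales VALUES ('2016-01-03', '{2,1}', 1);

SELECT s.sales_guys, s.total_sales, g.names
FROM  (
   SELECT sales_guys, sum(sales) AS total_sales   -- aggregate before resolving names
   FROM   sales
   GROUP  BY sales_guys
   ) s
,  LATERAL (
   SELECT array_agg(g.guy ORDER BY sg.ord) AS names  -- keep array order
   FROM   unnest(s.sales_guys) WITH ORDINALITY sg(guy_id, ord)
   LEFT   JOIN guys g USING (guy_id)
   ) g
ORDER  BY s.sales_guys;
-- {1,2} | 6 | {john,joe}
-- {2,1} | 1 | {joe,john}
```

The two arrays form separate groups, the totals are not inflated by the name lookup, and each names array follows the element order of its sales_guys array.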

Detailed explanation: