How to remove duplicate results

Time: 2019-07-20 03:38:59

Tags: sql postgresql

Given the following schema:

CREATE TABLE IF NOT EXISTS companies (
  id serial,
  name text NOT NULL,

  PRIMARY KEY (id)
);

CREATE TABLE IF NOT EXISTS cars (
  id serial,
  make text NOT NULL,
  year integer NOT NULL,
  company_id INTEGER REFERENCES companies(id),

  PRIMARY KEY (id)
);


INSERT INTO companies (id, name) VALUES
  (1, 'toyota'),
  (2, 'chevy');

INSERT INTO cars (make, year, company_id) VALUES
  ('silverado', 1995, 2),
  ('malibu', 1999, 2),
  ('tacoma', 2017, 1),
  ('custom truck', 2010, null),
  ('van custom', 2005, null);

How do I select rows from cars so that only the most recent car for each company is shown?

For example, this query

select make, companies.name as model, year from cars 
left join companies
on companies.id = cars.company_id
order by make;

outputs

     make     | model  | year 
--------------+--------+------
 custom truck |        | 2010
 malibu       | chevy  | 1999
 silverado    | chevy  | 1995
 tacoma       | toyota | 2017
 van custom   |        | 2005

But I only want to show the most recent "chevy", like this:

     make     | model  | year 
--------------+--------+------
 custom truck |        | 2010
 malibu       | chevy  | 1999
 tacoma       | toyota | 2017
 van custom   |        | 2005

I also still want to be able to sort by "make", and to keep the cars that have a null company_id.

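Fiddle link: https://www.db-fiddle.com/f/5Vh1sFXvEvnbnUJsCYhCHf/0

3 Answers:

Answer 0 (score: 1)

With a common table expression and the row_number function we can get the desired output. The query below numbers the cars within each company from newest to oldest and keeps only the first one.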

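    WITH temp AS
    (SELECT
        make
        , companies.name AS model
        , year
        -- Number the cars within each company, newest first. Cars without a
        -- company fall back to their make, so each one forms its own group
        -- (this assumes such makes are unique and differ from company names).
        , row_number() OVER (PARTITION BY coalesce(companies.name, make) ORDER BY year DESC) AS rnk
    FROM
       cars
    LEFT JOIN
       companies
    ON
       companies.id = cars.company_id
    )
    SELECT
       make
       , model
       , year
    FROM
       temp
    WHERE
       rnk = 1
    ORDER BY
       make
    ;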

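Answer 1 (score: 1)

SQL can be reasoned about in terms of set math (discrete math). So you want the set of all cars minus the set of cars whose year is less than the maximum year for their company id.

The set of all cars: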

select * from cars

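The set of all cars whose year is less than the maximum year for their company id (that is, cars for which a newer car of the same company exists):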

select a.id
from cars a
join cars b on a.company_id = b.company_id
where a.year < b.year

One set minus the other:

select *
from cars
where id not in (select a.id
                 from cars a
                 join cars b on a.company_id = b.company_id
                 where a.year < b.year)

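The result includes the rows with a null company_id: a null never satisfies the equality comparison on company_id, so those cars never end up in the set being subtracted. Shown here with the same columns as in the question: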

     make     | model  | year 
--------------+--------+------
 custom truck |        | 2010
 malibu       | chevy  | 1999
 tacoma       | toyota | 2017
 van custom   |        | 2005
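
The query above selects from cars only, so on its own it returns id, make, year and company_id. A minimal sketch of how the same set-difference filter could be combined with the join from the question to produce the make | model | year columns, sorted by make (the aliases c and co are illustrative):

```sql
-- Sketch: the set-difference filter from this answer, joined to companies
-- so the output has the columns used in the question.
SELECT c.make, co.name AS model, c.year
FROM cars c
LEFT JOIN companies co ON co.id = c.company_id
WHERE c.id NOT IN (
    -- cars for which a newer car of the same company exists
    SELECT a.id
    FROM cars a
    JOIN cars b ON a.company_id = b.company_id
    WHERE a.year < b.year
)
ORDER BY c.make;
```

NOT IN is safe here because the subquery returns cars.id, a primary key that is never null. Note that if two cars of one company share the newest year, this approach keeps both of them.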

Answer 2 (score: 0)

In Postgres, this is best done with distinct on:

select distinct on (co.id) ca.*, co.name as model
from cars ca left join
     companies co
     on ca.company_id = co.id
order by co.id, ca.year desc;

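DISTINCT ON is a very handy Postgres extension. It keeps one row for each combination of the values in the parentheses. Which row is kept is determined by the ORDER BY clause.

However, you have a slight twist, because co.id can be null. In that case, you seem to want to keep all the cars that have no company.

So: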

select distinct on (co.id, case when co.id is null then ca.id end) ca.*, co.name
from cars ca left join
     companies co
     on ca.company_id = co.id
order by co.id, case when co.id is null then ca.id end, ca.year desc;
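
The ORDER BY of a DISTINCT ON query has to start with the DISTINCT ON expressions, so the query above cannot be sorted by make directly. One way to get the ordering asked for in the question is to wrap it in a subquery; the following is a sketch, and the alias latest is just an illustrative name:

```sql
-- Sketch: pick the newest car per company in the inner query,
-- then sort the surviving rows by make in the outer query.
SELECT make, model, year
FROM (
    SELECT DISTINCT ON (co.id, CASE WHEN co.id IS NULL THEN ca.id END)
           ca.make, co.name AS model, ca.year
    FROM cars ca
    LEFT JOIN companies co ON ca.company_id = co.id
    ORDER BY co.id, CASE WHEN co.id IS NULL THEN ca.id END, ca.year DESC
) latest
ORDER BY make;
```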

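Alternatively, use union all, which may be simpler:

-- get the latest car for each company; the parentheses give this
-- branch its own order by, which distinct on needs to pick the row
(select distinct on (co.id) ca.*, co.name as model
 from cars ca join
      companies co
      on ca.company_id = co.id
 order by co.id, ca.year desc
)
union all
-- get the ones with no company
select ca.*, null
from cars ca
where ca.company_id is null
-- this order by applies to the combined result
order by make;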

In other databases, you would typically use row_number():

select ca.*
from (select ca.*, co.name as model,
             row_number() over (partition by co.id,
                                             case when co.id is null then ca.id end
                                order by year desc
                               ) as seqnum
      from cars ca left join
           companies co
           on ca.company_id = co.id
     ) ca
where seqnum = 1
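
One caveat for the row_number() approach: if two cars of the same company share the newest year, ordering by year alone does not determine which of them gets seqnum = 1. A sketch that adds a tie-breaker, assuming the higher ca.id should win (that rule is an assumption, not something stated in the question), and also sorts the final result by make:

```sql
-- Sketch: same partitioning as above, with ca.id as an assumed tie-breaker.
SELECT make, model, year
FROM (SELECT ca.make, co.name AS model, ca.year,
             row_number() OVER (PARTITION BY co.id,
                                             CASE WHEN co.id IS NULL THEN ca.id END
                                ORDER BY ca.year DESC, ca.id DESC
                               ) AS seqnum
      FROM cars ca
      LEFT JOIN companies co ON ca.company_id = co.id
     ) ranked
WHERE seqnum = 1
ORDER BY make;
```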