Finding the value with the most common condition

Time: 2016-04-14 06:44:42

Tags: sql postgresql aggregate-functions

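I'm having a lot of trouble in PostgreSQL figuring out how to find the most common value that meets a certain condition. ID is the book's ID number, so a repeated number means there are multiple copies of that book.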

I have two tables here:

Table A:
=====+===================
ID   |   Condition
-------------------------
1    |   Taken
1    |   
1    |   Taken
1    |   
2    |   Taken
3    |   Taken
3    |   
3    |   Taken
3    |   Taken
4    |   
4    |   Taken  
etc.

Table B:
=====+===================
ID   |    Name
-------------------------
1    |    BookA
2    |    BookB
3    |    BookC
4    |    BookD
etc.

What I need is simply to find which book has the most copies taken and print that book's name. In this case, all I need is:

BookC

The problem is that I can't figure out how to count how many copies are taken for each ID. I tried using a temporary table like this:

CREATE TEMP TABLE MostCommon AS
    (SELECT ID
     FROM TableA
     WHERE SUM(CASE WHEN Condition>0 then 1 else 0 END)
    )
    SELECT NAME FROM TableB, MostCommon WHERE
    MostCommon.ID = TableB.ID;

But it either raises an error or simply doesn't give me what I need. Any help would be appreciated.

3 answers:

Answer 0: (score: 1)

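OK, first I'm assuming your column and table names are case-sensitive, which means you have to use double quotes. To print the name of the book with the most 'Taken' copies, along with the number of 'Taken' copies, you can use a simple count() aggregate, sort the output in descending order, and finally limit the output to 1 row, like this: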

SELECT
    b."ID",
    b."Name",
    count(*) as takenCount
FROM "TableA" a
    JOIN "TableB" b ON a."ID" = b."ID"
WHERE a."Condition" = 'Taken'
GROUP BY b."ID", b."Name"
ORDER BY 3 DESC
LIMIT 1;
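
The attempt in the question fails because an aggregate such as SUM() is not allowed in a WHERE clause; the per-ID counting has to go through GROUP BY instead. As a minimal sketch of the same idea, assuming the tables were created without double quotes (so PostgreSQL folds the names to lowercase, here tablea and tableb), the query could print only the name. The secondary sort key is an addition for illustration, so that ties resolve deterministically:

```sql
-- Assumes unquoted identifiers: tablea(id, condition), tableb(id, name).
SELECT b.name
FROM tablea AS a
    JOIN tableb AS b ON b.id = a.id
WHERE a.condition = 'Taken'          -- row-level filter belongs in WHERE
GROUP BY b.id, b.name                -- one group per book
ORDER BY count(*) DESC, b.id ASC     -- most 'Taken' copies first; lowest id wins ties
LIMIT 1;
```

For the rows shown in the question, ID 3 has the most 'Taken' copies, so this would return BookC. Note that LIMIT 1 returns a single book even when several share the top count.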

Answer 1: (score: 0)

CREATE TEMP TABLE MostCommon AS
(SELECT id, count(*) AS book_taken FROM tableA WHERE condition = 'Taken' GROUP BY id);

-- compare each book's count (not its id) with the maximum count
SELECT name FROM tableB t2 JOIN MostCommon mc ON mc.id = t2.id WHERE mc.book_taken = (SELECT max(book_taken) FROM MostCommon);
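
The same two-step idea can also be written without a temporary table. The following is a sketch using a common table expression, assuming the same unquoted tableA and tableB names as this answer; the CTE name most_common is made up for illustration. The outer filter compares the per-book count with the maximum count, so every book tied for first place is returned:

```sql
-- Step 1: count 'Taken' rows per book id.
WITH most_common AS (
    SELECT id, count(*) AS book_taken
    FROM tableA
    WHERE condition = 'Taken'
    GROUP BY id
)
-- Step 2: keep the book(s) whose count equals the largest count.
SELECT t2.name
FROM tableB AS t2
    JOIN most_common AS mc ON mc.id = t2.id
WHERE mc.book_taken = (SELECT max(book_taken) FROM most_common);
```

Compared with ORDER BY ... LIMIT 1, this form trades a second scan of the counts for tie-aware output.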

Answer 2: (score: 0)

To make the data sensible (i.e., no duplicate records), I had to change the schema a bit.

CREATE TABLE book_condition (
    created TIMESTAMP,
    book_id INTEGER,
    condition VARCHAR,
    PRIMARY KEY (created, book_id));

INSERT INTO book_condition (created, book_id, condition)
VALUES
    ('2016-01-01 08:30', 1, 'Taken'),
    ('2016-01-01 08:35', 1, ''),
    ('2016-01-01 08:40', 1, 'Taken'),
    ('2016-01-01 08:45', 1, ''),
    ('2016-01-01 08:50', 2, 'Taken'),
    ('2016-01-01 08:55', 3, 'Taken'),
    ('2016-01-01 09:00', 3, ''),
    ('2016-01-01 09:05', 3, 'Taken'),
    ('2016-01-01 09:10', 3, 'Taken'),
    ('2016-01-01 09:15', 4, ''),
    ('2016-01-01 09:20', 4, 'Taken');

CREATE TABLE book (
    book_id INTEGER,
    name VARCHAR,
    PRIMARY KEY (book_id));

INSERT INTO book (book_id, name)
VALUES
    (1, 'BookA'),
    (2, 'BookB'),
    (3, 'BookC'),
    (4, 'BookD');

The problem then breaks down into:

  • How many copies of each book are taken?
SELECT
    book_id,
    COUNT(book_id) AS total_taken
FROM book_condition
WHERE
    condition = 'Taken'
GROUP BY book_id
;
 book_id | total_taken 
---------+-------------
       1 |           2
       2 |           1
       3 |           3
       4 |           1
(4 rows)
  • How do you rank the records by their total_taken value?
SELECT
    book_id,
    total_taken,
    RANK() OVER (
        ORDER BY total_taken DESC
        ) AS total_taken_rank
FROM (
    SELECT
        book_id,
        COUNT(book_id) AS total_taken
    FROM book_condition
    WHERE
        condition = 'Taken'
    GROUP BY book_id
    ) AS bt
ORDER BY total_taken_rank ASC
;
 book_id | total_taken | total_taken_rank 
---------+-------------+------------------
       3 |           3 |                1
       1 |           2 |                2
       2 |           1 |                3
       4 |           1 |                3
(4 rows)
  • How do you get the book's name into a query result that contains its key (id) value?
SELECT
    b.book_id,
    b.name,
    bt.total_taken,
    RANK() OVER (
        ORDER BY bt.total_taken DESC
        ) AS total_taken_rank
FROM
    book AS b
    LEFT JOIN (
        SELECT
            book_id,
            COUNT(book_id) AS total_taken
        FROM book_condition
        WHERE
            condition = 'Taken'
        GROUP BY book_id
        ) AS bt
        USING (book_id)
ORDER BY
    total_taken_rank ASC,
    book_id ASC
;
 book_id | name  | total_taken | total_taken_rank 
---------+-------+-------------+------------------
       3 | BookC |           3 |                1
       1 | BookA |           2 |                2
       2 | BookB |           1 |                3
       4 | BookD |           1 |                3
(4 rows)
  • How do you get only the top-ranked record(s) from the result?
SELECT
    br.book_id,
    br.name,
    br.total_taken
FROM (
    SELECT
        b.book_id,
        b.name,
        bt.total_taken,
        RANK() OVER (
            ORDER BY bt.total_taken DESC
            ) AS total_taken_rank
    FROM
        book AS b
        LEFT JOIN (
            SELECT
                book_id,
                COUNT(book_id) AS total_taken
            FROM book_condition
            WHERE
                condition = 'Taken'
            GROUP BY book_id
            ) AS bt
            USING (book_id)
    ) AS br
WHERE
    total_taken_rank = 1
;
 book_id | name  | total_taken 
---------+-------+-------------
       3 | BookC |           3
(1 row)
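
One caveat about the LEFT JOIN used in the last two queries: a book with no 'Taken' rows gets a NULL total_taken, and PostgreSQL sorts NULLs first under ORDER BY ... DESC by default, so such a book would be ranked 1. The sample data has no such book, so the output above is unaffected. A sketch of one way to guard against this, assuming the book and book_condition tables defined above, is to coalesce the missing count to 0:

```sql
SELECT
    br.book_id,
    br.name,
    br.total_taken
FROM (
    SELECT
        b.book_id,
        b.name,
        -- Books without any 'Taken' row count as 0 instead of NULL.
        COALESCE(bt.total_taken, 0) AS total_taken,
        RANK() OVER (
            ORDER BY COALESCE(bt.total_taken, 0) DESC
            ) AS total_taken_rank
    FROM
        book AS b
        LEFT JOIN (
            SELECT
                book_id,
                COUNT(*) AS total_taken
            FROM book_condition
            WHERE
                condition = 'Taken'
            GROUP BY book_id
            ) AS bt
            USING (book_id)
    ) AS br
WHERE
    br.total_taken_rank = 1
;
```

Writing ORDER BY bt.total_taken DESC NULLS LAST inside the window would be an alternative that keeps the NULL in the output while still ranking it last.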