Question

I've run into a problem where my count(*) returns the number of rows before the distinct rows are filtered out.

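This is a simplified version of my query. Note that I'll be pulling a lot of other data from the tables, so group by won't return the same result, as I would have to group by around 10 columns. The way it works is that m is a mapping table that maps q, c and kl, so there can be several references to the same q.id. I only want one.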

SELECT distinct on (q.id) count(*) over() as full_count
from q, c, kl, m 
where c.id = m.chapter_id
    and q.id = m.question_id
    and q.active = 1
    and c.class_id = m.class_id
    and kl.id = m.class_id
order by q.id asc

If I run this, I get full_count = 11210, while it only returns 9137 rows. If I run it without the distinct on (q.id), full_count does match the number of rows returned.

So it seems the count function doesn't have access to the filtered rows. How can I solve this? Do I need to rethink my approach?

Answer 1

I'm not entirely sure what exactly you are trying to count, but this might get you started:

select id, 
       full_count,
       id_count
from (
    SELECT q.id, 
           count(*) over() as full_count,
           count(*) over (partition by q.id) as id_count,
           row_number() over (partition by q.id order by q.id) as rn
    from q
      join m on q.id = m.question_id
      join c on c.id = m.chapter_id and c.class_id = m.class_id
      join kl on kl.id = m.class_id
    where q.active = 1
) t
where rn = 1
order by id asc

If you need the count per id, then the id_count column is what you need. If you need the overall count but only one row per id, then full_count is probably what you want.

(Note that I rewrote the implicit join syntax to use explicit JOINs.)
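
To see how the window columns in the query above behave, here is a minimal self-contained sketch for PostgreSQL. The CTE m_demo and its values are made up for illustration and merely stand in for the joined result of q, m, c and kl.

```sql
-- Hypothetical sample data: question 1 is mapped twice, question 3 three times.
WITH m_demo(question_id, chapter_id) AS (
    VALUES (1, 10), (1, 11), (2, 10), (3, 12), (3, 13), (3, 14)
)
SELECT question_id, full_count, id_count
FROM (
    SELECT question_id,
           -- counts every inner row, before the outer rn = 1 filter runs
           count(*) OVER () AS full_count,
           -- counts the rows that share this question_id
           count(*) OVER (PARTITION BY question_id) AS id_count,
           -- numbers the rows per question so one of them can be kept
           row_number() OVER (PARTITION BY question_id
                              ORDER BY chapter_id) AS rn
    FROM m_demo
) t
WHERE rn = 1
ORDER BY question_id;
```

With this sample data the query returns three rows, and full_count is 6 in each of them, because the window is evaluated over the inner result before the outer WHERE removes the duplicates. If the number of distinct ids is wanted instead, one option is to compute count(*) over () in the outer select, where it is evaluated after rn = 1 has been applied.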

Answer 2

You can use a subquery:

select qid, count(*) over () as full_count
from (SELECT distinct q.id as qid
      from q, c, kl, m 
      where c.id = m.chapter_id
            and q.id = m.question_id
            and q.active = 1
            and c.class_id = m.class_id
            and kl.id = m.class_id
     ) t
order by qid asc

But group by is the right way to do this. The distinct keyword in select is really just syntactic sugar for doing a group by on all the non-aggregated columns.
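
As a rough illustration of the group by route, the sketch below relies on PostgreSQL evaluating window functions after GROUP BY, so count(*) over () counts groups rather than source rows. The CTE m_demo and its values are hypothetical sample data, not the asker's tables.

```sql
-- Hypothetical sample data: three distinct question ids spread over six rows.
WITH m_demo(question_id, chapter_id) AS (
    VALUES (1, 10), (1, 11), (2, 10), (3, 12), (3, 13), (3, 14)
)
SELECT question_id,
       -- window function over the grouped result: number of groups
       count(*) OVER () AS full_count,
       -- ordinary aggregate: number of source rows in each group
       count(*) AS mapping_rows
FROM m_demo
GROUP BY question_id
ORDER BY question_id;
```

With this sample data the query returns three rows, each with full_count = 3, which matches the number of rows returned.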

SELECT distinct gives the wrong count
