Question

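I need some statistics about the values in each column of a table, and I'd like to get them in a single query.

Suppose we have a table with columns A, B, and C: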

A     B      C
--------------------
Red   Red    Red
Red   Blue   Red
Blue  Green  Red
Blue  Green  Red

I want output that reports, as separate columns, how many unique values A, B, and C each have. So it would give

2, 3, 1

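2 unique values for A (Red and Blue)
3 unique values for B (Red, Blue, and Green)
1 unique value for C (Red)

Is there any way to get this in one call?

In addition, I'd like to get the frequency of the most common value in each column: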

2, 2, 4

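2 because there are 2 Reds (or 2 Blues, the count is the same),
2 because there are 2 Greens,
4 because there are 4 Reds

either in the same query or in a separate one.

I don't want to run a separate query for each column, because in theory there could be many columns.

Is there an efficient way to do this?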

Answer 1

The number of unique values per column, using aggregate functions and DISTINCT:

select
  count(distinct a) as cnt_a,
  count(distinct b) as cnt_b,
  count(distinct c) as cnt_c
from yourtable

Returns:

2,3,1
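
To try the queries in this answer, a table matching the question's sample data could be set up roughly as follows. The table name yourtable comes from the queries here; the text column type is an assumption, since the question does not state the column types.

```sql
-- Sample data from the question (column types are assumed to be text)
create table yourtable (a text, b text, c text);

insert into yourtable (a, b, c) values
  ('Red',  'Red',   'Red'),
  ('Red',  'Blue',  'Red'),
  ('Blue', 'Green', 'Red'),
  ('Blue', 'Green', 'Red');
```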

The frequency of the most common value, using window functions and aggregate functions:

select 
  max(cnt_a) as fr_a,
  max(cnt_b) as fr_b,
  max(cnt_c) as fr_c
from (
  select
    count(*) over (partition by a) as cnt_a,
    count(*) over (partition by b) as cnt_b,
    count(*) over (partition by c) as cnt_c
  from yourtable
) t

Returns:

2,2,4
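
To see why taking max works, it may help to run the inner subquery on its own. Each row is annotated with the size of the group it belongs to in each column, so the largest annotation in a column is the frequency of that column's most common value. This is only an illustration of the intermediate step, with the original columns added to the select list for readability.

```sql
-- Inner subquery alone: every row carries the row count of its own group
select
  a, b, c,
  count(*) over (partition by a) as cnt_a,  -- rows sharing this row's a
  count(*) over (partition by b) as cnt_b,  -- rows sharing this row's b
  count(*) over (partition by c) as cnt_c   -- rows sharing this row's c
from yourtable;
```

On the sample data, every row gets cnt_a = 2 and cnt_c = 4, while cnt_b is 1 for the Red and Blue rows and 2 for the two Green rows.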

Combined with UNION ALL:

select
  'unique values' as description,
  count(distinct a) as cnt_a,
  count(distinct b) as cnt_b,
  count(distinct c) as cnt_c
from yourtable
union all
select
  'freq of most common value',
  max(cnt_a),
  max(cnt_b),
  max(cnt_c)
from (
  select
    count(*) over (partition by a) as cnt_a,
    count(*) over (partition by b) as cnt_b,
    count(*) over (partition by c) as cnt_c
  from yourtable
) t

Returns:

        description        | cnt_a | cnt_b | cnt_c
---------------------------+-------+-------+-------
 unique values             |     2 |     3 |     1
 freq of most common value |     2 |     2 |     4
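
Since the question mentions that there could be many columns, one possible variation avoids listing them by hand: turn each row into jsonb and unpivot it into (column name, value) pairs. This is a sketch, not part of the original answer, and it assumes a PostgreSQL version that provides to_jsonb and jsonb_each_text. It returns one row per column rather than one column per statistic, and it compares all values by their text form.

```sql
-- Sketch: per-column statistics without naming the columns
select
  column_name,
  count(*)  as unique_values,  -- one inner row per distinct value
  max(freq) as top_freq        -- size of the largest value group
from (
  select kv.key as column_name, kv.value, count(*) as freq
  from yourtable t
  cross join lateral jsonb_each_text(to_jsonb(t)) as kv(key, value)
  where kv.value is not null   -- skip NULLs, as count(distinct ...) does
  group by kv.key, kv.value
) s
group by column_name
order by column_name;
```

On the sample data this should give (a, 2, 2), (b, 3, 2), and (c, 1, 4). Converting every row to jsonb adds overhead, so for a small, fixed set of columns the explicit queries above are likely the simpler choice.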

Find statistics of data in a PostgreSQL table: unique count and highest frequency for each column
