Comparing two sets of data

Time: 2016-11-02 22:37:51

Tags: sql postgresql

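My apologies if this has already been answered in some form. I have searched all over and can't figure it out.

I need to find a way to compare weekly data in PostgreSQL. All of the data lives in the same table, which has a week-number column. The data will not always overlap completely, but I need to compare the data within groups.

Say these are the data sets: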

Week 2
+--------+--------+------+---------+-------+
| group  |   num  | color|  ID     | week #|
+--------+--------+------+---------+-------+
|    a   |    1   | red  | a1red   |  2    |
|    a   |    2   | blue | a2blue  |  2    |
|    b   |    3   | blue | b3blue  |  2    |
|    c   |    7   | black| c7black |  2    |
|    d   |    8   | black| d8black |  2    |
|    d   |    9   | red  | d9red   |  2    |
|    d   |    10  | gray | d10gray |  2    |
+--------+--------+------+---------+-------+

Week 3
+--------+--------+------+---------+-------+
| group  |   num  | color|  ID     | week #|
+--------+--------+------+---------+-------+
|    a   |    1   | red  | a1red   |   3   |
|    a   |    2   | green| a2green |   3   |
|    b   |    3   | blue | b3blue  |   3   |
|    b   |    5   | green| b5green |   3   |
|    c   |    7   | black| c7black |   3   |
|    e   |    11  | blue | d11blue |   3   |
|    e   |    12  | other| d12other|   3   |
|    e   |    14  | brown| d14brown|   3   |
+--------+--------+------+---------+-------+

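Each row has an ID made up of its group, number, and color values.

I need a query that takes all of the groups from week 3 and then, for any group in week 3 that also exists in week 2:

  1. Flags the IDs that have changed within the group, as in group A.
  2. Flags whether any IDs were added to or removed from the group, as in group B.
  3. For groups that do not exist in week 2, compares week 3 with week 1 instead. This would be nice to have, but it is not required.

I have considered splitting the two weeks apart and using INTERSECT/EXCEPT to get the results, but I can't quite work out how to make that work. Any hints would be much appreciated.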

1 answer:

Answer 0: (score: 0)

With only two (known) weeks, you can do something like this:

select coalesce(w1.group_nr, w2.group_nr) as group_nr, 
       coalesce(w1.num, w2.num) as num, 
       case 
         when w1.group_nr is null then 'missing in first week'
         when w2.group_nr is null then 'missing in second week'
         when (w1.color, w1.id) is distinct from (w2.color, w2.id) then 'data has changed'
         else 'no change'
       end as status,
       case
          when 
                 w1.group_nr is not null 
             and w2.group_nr is not null 
             and w1.color is distinct from w2.color then 'color is different'
       end as color_change,
       case 
          when 
                 w1.group_nr is not null 
             and w2.group_nr is not null 
             and w1.id is distinct from w2.id then 'id is different'
       end as id_change
from (
  select group_nr, num, color, id
  from data
  where week = 2
) as w1
  full outer join (
    select group_nr, num, color, id
    from data
    where week = 3
  ) as w2 on (w1.group_nr, w1.num) = (w2.group_nr, w2.num);
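
The query above reports one row per (group_nr, num) pair, while the question asks for flags per group. A minimal sketch of one way to roll the comparison up to group level follows. It reuses the same full outer join; the CTE name `cmp` and the output column names are made up for illustration, and it assumes `id` is never null in `data`.

```sql
-- Sketch only: per-group flags for groups present in both week 2 and week 3.
-- Assumption: data.id is never null, so a null id here means "no matching row".
with cmp as (
  select coalesce(w1.group_nr, w2.group_nr) as group_nr,
         w1.id as id_week_2,
         w2.id as id_week_3
  from (select group_nr, num, id from data where week = 2) as w1
    full outer join
       (select group_nr, num, id from data where week = 3) as w2
    on (w1.group_nr, w1.num) = (w2.group_nr, w2.num)
)
select group_nr,
       -- a (group_nr, num) pair exists in both weeks but its id differs
       bool_or(id_week_2 is not null
               and id_week_3 is not null
               and id_week_2 <> id_week_3) as has_changed_ids,
       -- a row exists in week 3 with no counterpart in week 2
       bool_or(id_week_2 is null) as has_added_rows,
       -- a row exists in week 2 with no counterpart in week 3
       bool_or(id_week_3 is null) as has_removed_rows
from cmp
group by group_nr
-- keep only groups that have at least one row in each of the two weeks
having count(id_week_2) > 0
   and count(id_week_3) > 0
order by group_nr;
```

With the sample data from the question, this should flag group a as having a changed ID and group b as having an added row, and it should leave out groups d and e because each appears in only one of the two weeks.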

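Getting at the attributes that have changed is a bit clumsy. If you can live with a textual representation, you can use the hstore extension to display the differences: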

select coalesce(w1.group_nr, w2.group_nr) as group_nr, 
       coalesce(w1.num, w2.num) as num, 
       case 
         when w1.group_nr is null then 'missing in first week'
         when w2.group_nr is null then 'missing in second week'
         when (w1.color, w1.id) is distinct from (w2.color, w2.id) then 'data has changed'
         else 'no change'
       end as status,
       w2.attributes - w1.attributes as changed_attributes
from (
  select group_nr, num, color, id, hstore(data) - 'week'::text as attributes
  from data
  where week = 2
) as w1
  full outer join (
  select group_nr, num, color, id, hstore(data) - 'week'::text as attributes
    from data
    where week = 3
  ) w2 on (w1.group_nr, w1.num) = (w2.group_nr, w2.num);
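
To try the queries above, the sample data from the question can be loaded into a table first. The following is a sketch under assumptions: the table name `data` and the column names `group_nr`, `num`, `color`, `id`, and `week` are taken from the queries, but the column types are guesses, since the question does not state them. The rows are copied from the question as given.

```sql
-- Needed only for the hstore variant; creating an extension may require
-- elevated privileges, depending on the installation.
create extension if not exists hstore;

-- Assumed schema: names follow the queries above, types are guesses.
create table data (
  group_nr text,
  num      integer,
  color    text,
  id       text,
  week     integer
);

insert into data (group_nr, num, color, id, week) values
  -- week 2
  ('a',  1, 'red',   'a1red',    2),
  ('a',  2, 'blue',  'a2blue',   2),
  ('b',  3, 'blue',  'b3blue',   2),
  ('c',  7, 'black', 'c7black',  2),
  ('d',  8, 'black', 'd8black',  2),
  ('d',  9, 'red',   'd9red',    2),
  ('d', 10, 'gray',  'd10gray',  2),
  -- week 3
  ('a',  1, 'red',   'a1red',    3),
  ('a',  2, 'green', 'a2green',  3),
  ('b',  3, 'blue',  'b3blue',   3),
  ('b',  5, 'green', 'b5green',  3),
  ('c',  7, 'black', 'c7black',  3),
  ('e', 11, 'blue',  'd11blue',  3),
  ('e', 12, 'other', 'd12other', 3),
  ('e', 14, 'brown', 'd14brown', 3);
```

In the hstore query, `hstore(data)` turns each row into a set of key/value pairs, `- 'week'::text` drops the `week` key so that it is not reported as a difference, and `w2.attributes - w1.attributes` keeps only the week 3 pairs that do not appear unchanged in week 2.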