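SQL: getting separate counts for two groups across multiple tables?

Asked: 2011-02-08 18:09:35

Tags: sql count

We have a small randomized study whose numbers we are trying to report. In this database we have eight tables holding different randomization groups (treatment vs. control), each designed as follows: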

+--------+-------+----------------------+-----------------+
| caseID | patID | randomizedDate       | randomizedGroup |
+--------+-------+----------------------+-----------------+
|  1     | 5000  | 2/17/2010 5:12:00 PM |  T              |
|  2     | 5005  | 3/11/2010 1:45:00 PM |  C              |
|  3     | 5007  | 3/22/2010 7:16:00 AM |  C              |
|  4     | 5011  | 4/10/2010 3:34:00 PM |  T              |
|  5     | 5015  | 4/19/2010 5:41:00 PM |  C              |
|  6     | 5018  | 5/23/2010 4:06:00 PM |  T              |
|  7     | 5021  | 6/27/2010 5:28:00 PM |  T              |
|  8     | NULL  | NULL                 |  C              |
|  9     | NULL  | NULL                 |  T              |
|  10    | NULL  | NULL                 |  T              |
|  11    | NULL  | NULL                 |  C              |
|  12    | NULL  | NULL                 |  C              |
+--------+-------+----------------------+-----------------+

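These tables were pre-populated with the randomized Ts and Cs ahead of time, using a statistical program. So we have eight groups waiting to be filled, according to our project's pre-set criteria. No patID will exist in more than one table.

What we need is a breakdown of the counts in these tables based on the randomizedGroup column. For example: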

+--------------------+--------+--------+--------+----------+
| randomizationGroup | Table1 | Table2 | Table3 | So on... |
+--------------------+--------+--------+--------+----------+
|  C                 | 10     | 24     |  14    |          |
|  T                 | 11     | 16     |  21    |          |
+--------------------+--------+--------+--------+----------+

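Right now I get these numbers with the following query, but I'd like to know whether it is optimal or whether I should go about it another way. The more I use SQL the more I like it, so I'm always looking to improve my skills and learn.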

SELECT randomizationGroup, SUM(count1) AS Table1, SUM(count2) AS Table2, SUM(count3) AS Table3, SUM(count4) AS Table4, SUM(count5) AS Table5, SUM(count6) AS Table6, SUM(count7) AS Table7, SUM(count8) AS Table8
FROM (
    SELECT randomizationGroup, COUNT(*) AS count1, 0 AS count2, 0 AS count3, 0 AS count4, 0 AS count5, 0 AS count6, 0 AS count7, 0 AS count8 FROM Table1 WHERE patid IS NOT NULL GROUP BY randomizationGroup
    UNION ALL
    SELECT randomizationGroup, 0 AS count1, COUNT(*) AS count2, 0 AS count3, 0 AS count4, 0 AS count5, 0 AS count6, 0 AS count7, 0 AS count8 FROM Table2 WHERE patid IS NOT NULL GROUP BY randomizationGroup
    UNION ALL
    SELECT randomizationGroup, 0 AS count1, 0 AS count2, COUNT(*) AS count3, 0 AS count4, 0 AS count5, 0 AS count6, 0 AS count7, 0 AS count8 FROM Table3 WHERE patid IS NOT NULL GROUP BY randomizationGroup
    UNION ALL
    SELECT randomizationGroup, 0 AS count1, 0 AS count2, 0 AS count3, COUNT(*) AS count4, 0 AS count5, 0 AS count6, 0 AS count7, 0 AS count8 FROM Table4 WHERE patid IS NOT NULL GROUP BY randomizationGroup
    UNION ALL
    SELECT randomizationGroup, 0 AS count1, 0 AS count2, 0 AS count3, 0 AS count4, COUNT(*) AS count5, 0 AS count6, 0 AS count7, 0 AS count8 FROM Table5 WHERE patid IS NOT NULL GROUP BY randomizationGroup
    UNION ALL
    SELECT randomizationGroup, 0 AS count1, 0 AS count2, 0 AS count3, 0 AS count4, 0 AS count5, COUNT(*) AS count6, 0 AS count7, 0 AS count8 FROM Table6 WHERE patid IS NOT NULL GROUP BY randomizationGroup
    UNION ALL
    SELECT randomizationGroup, 0 AS count1, 0 AS count2, 0 AS count3, 0 AS count4, 0 AS count5, 0 AS count6, COUNT(*) AS count7, 0 AS count8 FROM Table7 WHERE patid IS NOT NULL GROUP BY randomizationGroup
    UNION ALL
    SELECT randomizationGroup, 0 AS count1, 0 AS count2, 0 AS count3, 0 AS count4, 0 AS count5, 0 AS count6, 0 AS count7, COUNT(*) AS count8 FROM Table8 WHERE patid IS NOT NULL GROUP BY randomizationGroup) all_groups
GROUP BY randomizationGroup

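Thanks!

4 answers:

Answer 0 (score: 1)

I would create a view over all the tables. If you later decide to consolidate the data, that view could become the structure of a single table.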

CREATE VIEW AllTables as
SELECT randomizationGroup, 'Table1' Source, COUNT(*) C FROM Table1 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 'Table2', COUNT(*) C FROM Table2 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 'Table3', COUNT(*) C FROM Table3 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 'Table4', COUNT(*) C FROM Table4 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 'Table5', COUNT(*) C FROM Table5 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 'Table6', COUNT(*) C FROM Table6 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 'Table7', COUNT(*) C FROM Table7 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 'Table8', COUNT(*) C FROM Table8 WHERE patid IS NOT NULL GROUP BY randomizationGroup
GO

Then use the PIVOT operator, which is available from SQL Server 2005 on.

SELECT randomizationGroup, Table1,Table2,Table3,Table4,Table5,Table6,Table7,Table8
FROM AllTables P
pivot (sum(C) for Source in (Table1,Table2,Table3,Table4,Table5,Table6,Table7,Table8)) V

I wouldn't say it's faster, but it is certainly an alternative to what you have.
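On an engine without PIVOT, the same reshaping can be approximated with one SUM(CASE ...) per source table over the view. The sketch below is only an illustration under assumptions: it uses Python's sqlite3 module with an in-memory database, cuts the view down to two tables, invents the patID values, and follows the column name randomizationGroup from the question's queries. The helper names build_demo_db and pivot_counts are hypothetical.

```python
import sqlite3


def build_demo_db():
    """In-memory database with two tiny, made-up randomization tables."""
    conn = sqlite3.connect(":memory:")
    for name in ("Table1", "Table2"):
        conn.execute(
            f"CREATE TABLE {name} ("
            "caseID INTEGER PRIMARY KEY, patID INTEGER, randomizationGroup TEXT)"
        )
    conn.executemany(
        "INSERT INTO Table1 (patID, randomizationGroup) VALUES (?, ?)",
        [(5000, "T"), (5005, "C"), (5007, "C"), (None, "T")],
    )
    conn.executemany(
        "INSERT INTO Table2 (patID, randomizationGroup) VALUES (?, ?)",
        [(6000, "T"), (6001, "T"), (None, "C")],
    )
    # Same shape as the AllTables view in this answer, reduced to two tables.
    conn.execute(
        """
        CREATE VIEW AllTables AS
        SELECT randomizationGroup, 'Table1' AS Source, COUNT(*) AS C
        FROM Table1 WHERE patID IS NOT NULL GROUP BY randomizationGroup
        UNION ALL
        SELECT randomizationGroup, 'Table2', COUNT(*)
        FROM Table2 WHERE patID IS NOT NULL GROUP BY randomizationGroup
        """
    )
    return conn


def pivot_counts(conn):
    """Turn the long-format view into one column per source table."""
    sql = """
        SELECT randomizationGroup,
               SUM(CASE WHEN Source = 'Table1' THEN C ELSE 0 END) AS Table1,
               SUM(CASE WHEN Source = 'Table2' THEN C ELSE 0 END) AS Table2
        FROM AllTables
        GROUP BY randomizationGroup
        ORDER BY randomizationGroup
    """
    return conn.execute(sql).fetchall()


print(pivot_counts(build_demo_db()))  # [('C', 2, 0), ('T', 1, 2)]
```

Because of the ELSE 0 branch, a group with no enrolled patients in a given table shows 0 in that column here, whereas PIVOT would leave it NULL.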

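Answer 1 (score: 0)

The following query should give you the separate counts of the randomization groups within a single table. I suspect it isn't quite what you're after, but it may help: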

SELECT randomizationGroup,
       COUNT(CASE WHEN randomizationGroup = 'C' THEN 1 END) AS countforC,
       COUNT(CASE WHEN randomizationGroup = 'T' THEN 1 END) AS countforT
FROM Table1
GROUP BY randomizationGroup
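To see what conditional counting returns, here is a small sketch run against the twelve sample rows from the question. It relies on the fact that a CASE without an ELSE yields NULL and that COUNT skips NULLs. Assumptions: Python's sqlite3 module stands in for the real database, the column is named randomizationGroup as in the question's queries (the sample table header spells it randomizedGroup), and the helper count_groups is hypothetical. Without a GROUP BY, both counts come back on a single row.

```python
import sqlite3

# Group letters copied from the question's sample table (caseID 1 to 12).
# The first seven rows have a patID assigned; the remaining five are NULL.
GROUPS = ["T", "C", "C", "T", "C", "T", "T", "C", "T", "T", "C", "C"]
PAT_IDS = [5000, 5005, 5007, 5011, 5015, 5018, 5021] + [None] * 5


def count_groups(only_enrolled):
    """Return (countforC, countforT) for the sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Table1 ("
        "caseID INTEGER PRIMARY KEY, patID INTEGER, randomizationGroup TEXT)"
    )
    conn.executemany(
        "INSERT INTO Table1 (patID, randomizationGroup) VALUES (?, ?)",
        zip(PAT_IDS, GROUPS),
    )
    sql = """
        SELECT COUNT(CASE WHEN randomizationGroup = 'C' THEN 1 END) AS countforC,
               COUNT(CASE WHEN randomizationGroup = 'T' THEN 1 END) AS countforT
        FROM Table1
    """
    if only_enrolled:
        # Count only the slots that already have a patient assigned.
        sql += " WHERE patID IS NOT NULL"
    return conn.execute(sql).fetchone()


print(count_groups(only_enrolled=False))  # (6, 6)
print(count_groups(only_enrolled=True))   # (3, 4)
```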

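Answer 2 (score: 0)

What you've got is basically as good as it gets. That is largely because the same kind of data is spread across several tables. If it all lived in one table with some kind of type field, you'd be in better shape.

Well, not really, because then you would need SUM over CASE expressions to pivot the data, unless your database has a built-in PIVOT.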
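As a rough illustration of that single-table layout, the sketch below stores every slot in one table with a type field and builds one SUM(CASE ...) column per distinct value of that field. Everything here is an assumption made for the example: the table name Randomization, the column name sourceTable, the helper pivot_single_table, the sample rows, and the use of Python's sqlite3 module in place of the real database.

```python
import sqlite3


def pivot_single_table(rows):
    """rows: iterable of (sourceTable, patID, randomizationGroup) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Randomization ("
        "caseID INTEGER PRIMARY KEY, sourceTable TEXT, "
        "patID INTEGER, randomizationGroup TEXT)"
    )
    conn.executemany(
        "INSERT INTO Randomization (sourceTable, patID, randomizationGroup) "
        "VALUES (?, ?, ?)",
        rows,
    )
    sources = [r[0] for r in conn.execute(
        "SELECT DISTINCT sourceTable FROM Randomization ORDER BY sourceTable")]
    # One SUM(CASE ...) column per source; the values are bound as parameters
    # rather than pasted into the SQL text.
    cases = ", ".join(
        "SUM(CASE WHEN sourceTable = ? THEN 1 ELSE 0 END)" for _ in sources)
    sql = (
        f"SELECT randomizationGroup, {cases} FROM Randomization "
        "WHERE patID IS NOT NULL GROUP BY randomizationGroup "
        "ORDER BY randomizationGroup"
    )
    return sources, conn.execute(sql, sources).fetchall()


demo_rows = [
    ("Table1", 5000, "T"), ("Table1", 5005, "C"), ("Table1", None, "C"),
    ("Table2", 6000, "T"), ("Table2", 6001, "T"), ("Table2", None, "C"),
]
print(pivot_single_table(demo_rows))
# (['Table1', 'Table2'], [('C', 1, 0), ('T', 1, 2)])
```

The count columns come back in the same order as the returned list of source names, so the caller can label them without hard-coding eight aliases.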

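Of course, if you really wanted to, you could create a view that exposes a UNION of the eight tables, should you find yourself needing it. That seems a bit over-engineered, though, given that you already have a working solution (unless I'm missing some requirement).

Answer 3 (score: 0)

A good old JOIN should do the trick.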

SELECT
  randomizationGroup = g.Grp,
  Table1 = t1.Cnt,
  Table2 = t2.Cnt,
  Table3 = t3.Cnt,
  Table4 = t4.Cnt,
  Table5 = t5.Cnt,
  Table6 = t6.Cnt,
  Table7 = t7.Cnt,
  Table8 = t8.Cnt
FROM (SELECT 'C' AS Grp UNION ALL SELECT 'T') g
  INNER JOIN (
    SELECT randomizationGroup, Cnt = COUNT(*)
    FROM Table1
    WHERE patid IS NOT NULL
    GROUP BY randomizationGroup) t1 ON g.Grp = t1.randomizationGroup
  INNER JOIN (
    SELECT randomizationGroup, Cnt = COUNT(*)
    FROM Table2
    WHERE patid IS NOT NULL
    GROUP BY randomizationGroup) t2 ON g.Grp = t2.randomizationGroup
  INNER JOIN (
    SELECT randomizationGroup, Cnt = COUNT(*)
    FROM Table3
    WHERE patid IS NOT NULL
    GROUP BY randomizationGroup) t3 ON g.Grp = t3.randomizationGroup
  INNER JOIN (
    SELECT randomizationGroup, Cnt = COUNT(*)
    FROM Table4
    WHERE patid IS NOT NULL
    GROUP BY randomizationGroup) t4 ON g.Grp = t4.randomizationGroup
  INNER JOIN (
    SELECT randomizationGroup, Cnt = COUNT(*)
    FROM Table5
    WHERE patid IS NOT NULL
    GROUP BY randomizationGroup) t5 ON g.Grp = t5.randomizationGroup
  INNER JOIN (
    SELECT randomizationGroup, Cnt = COUNT(*)
    FROM Table6
    WHERE patid IS NOT NULL
    GROUP BY randomizationGroup) t6 ON g.Grp = t6.randomizationGroup
  INNER JOIN (
    SELECT randomizationGroup, Cnt = COUNT(*)
    FROM Table7
    WHERE patid IS NOT NULL
    GROUP BY randomizationGroup) t7 ON g.Grp = t7.randomizationGroup
  INNER JOIN (
    SELECT randomizationGroup, Cnt = COUNT(*)
    FROM Table8
    WHERE patid IS NOT NULL
    GROUP BY randomizationGroup) t8 ON g.Grp = t8.randomizationGroup

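This solution is not as general as yours or the PIVOT one because, as you can see, the group identifiers have to be hard-coded. If that works for you, fine. It could be improved, though, by replacing the hard-coded subselect with one that retrieves the distinct randomizationGroup values from all the tables.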
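That last suggestion might look roughly like the sketch below, in which a UNION over the tables supplies the group list in place of the hard-coded 'C' and 'T'. It also swaps INNER JOIN for LEFT JOIN with COALESCE, which is a change from the query above: an inner join drops a group's row entirely whenever one table has no matching rows for it. Assumptions: two tables instead of eight, made-up patID values, Python's sqlite3 module as a stand-in for SQL Server (hence AS aliases rather than the alias = expression form), and a hypothetical helper named joined_counts.

```python
import sqlite3


def joined_counts():
    """Per-table counts of enrolled patients, one row per group."""
    conn = sqlite3.connect(":memory:")
    for name in ("Table1", "Table2"):
        conn.execute(
            f"CREATE TABLE {name} ("
            "caseID INTEGER PRIMARY KEY, patID INTEGER, randomizationGroup TEXT)"
        )
    conn.executemany(
        "INSERT INTO Table1 (patID, randomizationGroup) VALUES (?, ?)",
        [(5000, "T"), (5005, "C"), (5007, "C"), (None, "T")],
    )
    conn.executemany(
        "INSERT INTO Table2 (patID, randomizationGroup) VALUES (?, ?)",
        [(6000, "T"), (6001, "T"), (None, "C")],
    )
    sql = """
        SELECT g.Grp AS randomizationGroup,
               COALESCE(t1.Cnt, 0) AS Table1,
               COALESCE(t2.Cnt, 0) AS Table2
        FROM (SELECT randomizationGroup AS Grp FROM Table1
              UNION                      -- UNION removes duplicate groups
              SELECT randomizationGroup FROM Table2) g
        LEFT JOIN (SELECT randomizationGroup, COUNT(*) AS Cnt
                   FROM Table1 WHERE patID IS NOT NULL
                   GROUP BY randomizationGroup) t1
               ON g.Grp = t1.randomizationGroup
        LEFT JOIN (SELECT randomizationGroup, COUNT(*) AS Cnt
                   FROM Table2 WHERE patID IS NOT NULL
                   GROUP BY randomizationGroup) t2
               ON g.Grp = t2.randomizationGroup
        ORDER BY g.Grp
    """
    return conn.execute(sql).fetchall()


print(joined_counts())  # [('C', 2, 0), ('T', 1, 2)]
```

In this sample data Table2 has no enrolled patient in group C, so the C row survives only because of the LEFT JOIN.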
