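We have a small randomized study for which we are trying to report numbers. In this database, we have eight tables holding the different randomization groups (treatment vs. control), each designed as follows: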
+--------+-------+----------------------+-----------------+
| caseID | patID | randomizedDate       | randomizedGroup |
+--------+-------+----------------------+-----------------+
| 1      | 5000  | 2/17/2010 5:12:00 PM | T               |
| 2      | 5005  | 3/11/2010 1:45:00 PM | C               |
| 3      | 5007  | 3/22/2010 7:16:00 AM | C               |
| 4      | 5011  | 4/10/2010 3:34:00 PM | T               |
| 5      | 5015  | 4/19/2010 5:41:00 PM | C               |
| 6      | 5018  | 5/23/2010 4:06:00 PM | T               |
| 7      | 5021  | 6/27/2010 5:28:00 PM | T               |
| 8      | NULL  | NULL                 | C               |
| 9      | NULL  | NULL                 | T               |
| 10     | NULL  | NULL                 | T               |
| 11     | NULL  | NULL                 | C               |
| 12     | NULL  | NULL                 | C               |
+--------+-------+----------------------+-----------------+
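These tables were pre-populated with randomized Ts and Cs generated ahead of time by a statistical program. So, according to our project's preset criteria, we have eight groups waiting to be filled. No patID will exist in more than one table.
What we need is a breakdown of the counts in these tables by the randomizedGroup column. For example: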
+--------------------+--------+--------+--------+----------+
| randomizationGroup | Table1 | Table2 | Table3 | So on... |
+--------------------+--------+--------+--------+----------+
| C                  | 10     | 24     | 14     |          |
| T                  | 11     | 16     | 21     |          |
+--------------------+--------+--------+--------+----------+
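At the moment I am getting these numbers with the following query, but I'd like to know whether this is the best approach or whether I should go about it another way. The more I use SQL, the more I like it, so I'm always looking to improve my skills and learn.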
SELECT randomizationGroup, SUM(count1) AS Table1, SUM(count2) AS Table2, SUM(count3) AS Table3, SUM(count4) AS Table4, SUM(count5) AS Table5, SUM(count6) AS Table6, SUM(count7) AS Table7, SUM(count8) AS Table8
FROM (
SELECT randomizationGroup, COUNT(*) AS count1, 0 AS count2, 0 AS count3, 0 AS count4, 0 AS count5, 0 AS count6, 0 AS count7, 0 AS count8 FROM Table1 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 0 AS count1, COUNT(*) AS count2, 0 AS count3, 0 AS count4, 0 AS count5, 0 AS count6, 0 AS count7, 0 AS count8 FROM Table2 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 0 AS count1, 0 AS count2, COUNT(*) AS count3, 0 AS count4, 0 AS count5, 0 AS count6, 0 AS count7, 0 AS count8 FROM Table3 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 0 AS count1, 0 AS count2, 0 AS count3, COUNT(*) AS count4, 0 AS count5, 0 AS count6, 0 AS count7, 0 AS count8 FROM Table4 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 0 AS count1, 0 AS count2, 0 AS count3, 0 AS count4, COUNT(*) AS count5, 0 AS count6, 0 AS count7, 0 AS count8 FROM Table5 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 0 AS count1, 0 AS count2, 0 AS count3, 0 AS count4, 0 AS count5, COUNT(*) AS count6, 0 AS count7, 0 AS count8 FROM Table6 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 0 AS count1, 0 AS count2, 0 AS count3, 0 AS count4, 0 AS count5, 0 AS count6, COUNT(*) AS count7, 0 AS count8 FROM Table7 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 0 AS count1, 0 AS count2, 0 AS count3, 0 AS count4, 0 AS count5, 0 AS count6, 0 AS count7, COUNT(*) AS count8 FROM Table8 WHERE patid IS NOT NULL GROUP BY randomizationGroup) all_groups
GROUP BY randomizationGroup
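Thanks!
Answer 0 (score: 1)
I would create a view over all the tables. It could also serve as the structure of a single table in the future, if you decide to consolidate the data.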
CREATE VIEW AllTables as
SELECT randomizationGroup, 'Table1' Source, COUNT(*) C FROM Table1 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 'Table2', COUNT(*) C FROM Table2 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 'Table3', COUNT(*) C FROM Table3 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 'Table4', COUNT(*) C FROM Table4 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 'Table5', COUNT(*) C FROM Table5 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 'Table6', COUNT(*) C FROM Table6 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 'Table7', COUNT(*) C FROM Table7 WHERE patid IS NOT NULL GROUP BY randomizationGroup
UNION ALL
SELECT randomizationGroup, 'Table8', COUNT(*) C FROM Table8 WHERE patid IS NOT NULL GROUP BY randomizationGroup
GO
Then use the PIVOT operator, which is available in SQL Server 2005 and later.
SELECT randomizationGroup, Table1,Table2,Table3,Table4,Table5,Table6,Table7,Table8
FROM AllTables P
pivot (sum(C) for Source in (Table1,Table2,Table3,Table4,Table5,Table6,Table7,Table8)) V
I wouldn't say it's faster, but it is certainly an alternative to what you have.
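One thing to watch with the PIVOT query: if a group has no filled rows in one of the tables, the view has no row for that combination, and the pivoted column comes back as NULL rather than 0. A minimal sketch of one way to handle that, assuming the AllTables view defined above and T-SQL's ISNULL function:

```sql
-- Sketch only: same PIVOT as above, with NULL cells reported as 0.
-- Assumes the AllTables view (columns randomizationGroup, Source, C) exists.
SELECT randomizationGroup,
       ISNULL(Table1, 0) AS Table1,
       ISNULL(Table2, 0) AS Table2,
       ISNULL(Table3, 0) AS Table3,
       ISNULL(Table4, 0) AS Table4,
       ISNULL(Table5, 0) AS Table5,
       ISNULL(Table6, 0) AS Table6,
       ISNULL(Table7, 0) AS Table7,
       ISNULL(Table8, 0) AS Table8
FROM AllTables AS P
PIVOT (SUM(C) FOR Source IN (Table1, Table2, Table3, Table4,
                             Table5, Table6, Table7, Table8)) AS V;
```

Note that a group with no filled rows in any table still produces no output row at all, since the view never emits it.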
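Answer 1 (score: 0)
The following query should give you the distinct counts of the randomization groups within a single table. I guess it isn't quite what you want, but it may help: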
SELECT randomizationGroup,
  COUNT(CASE WHEN randomizationGroup = 'C' THEN 1 END) AS countforC,
  COUNT(CASE WHEN randomizationGroup = 'T' THEN 1 END) AS countforT
FROM Table1
GROUP BY randomizationGroup
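Because the query above groups by the same column it tests in the CASE expressions, each result row has a zero in one of the two count columns. If a single summary row per table is preferred, one possible variation is to drop the GROUP BY. This is only a sketch: it assumes the column is named randomizationGroup, as in the question's query, and it adds the question's patid filter so that only filled slots are counted.

```sql
-- Sketch only: one row for Table1 with both group counts side by side.
-- COUNT ignores the NULL that CASE yields when its condition is false.
SELECT COUNT(CASE WHEN randomizationGroup = 'C' THEN 1 END) AS countforC,
       COUNT(CASE WHEN randomizationGroup = 'T' THEN 1 END) AS countforT
FROM Table1
WHERE patid IS NOT NULL;  -- skip pre-generated slots not yet assigned
```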
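Answer 2 (score: 0)
What you've got is basically as good as it gets. That's largely because you have the same kind of data spread across several tables. If it were all in one table with some kind of type field, you'd be in better shape.
Well, not really, because then you would need to use SUM over CASE expressions to pivot the data, unless you're using a database with a built-in PIVOT.
Of course, if you really wanted to, you could create a view that presents the UNION of the eight tables, should you find yourself needing it. But that seems a bit over-engineered, given that you already have a working solution (unless I'm missing some requirement).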
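For reference, the SUM-over-CASE pivot mentioned above could look roughly like the sketch below. It assumes the column is named randomizationGroup, as in the question's query. The Source literal and the all_rows alias are made up for illustration and stand in for the "type field" a consolidated table would have.

```sql
-- Sketch only: tag each filled row with its source table, then pivot by hand.
SELECT randomizationGroup,
       SUM(CASE WHEN Source = 'Table1' THEN 1 ELSE 0 END) AS Table1,
       SUM(CASE WHEN Source = 'Table2' THEN 1 ELSE 0 END) AS Table2,
       SUM(CASE WHEN Source = 'Table3' THEN 1 ELSE 0 END) AS Table3,
       SUM(CASE WHEN Source = 'Table4' THEN 1 ELSE 0 END) AS Table4,
       SUM(CASE WHEN Source = 'Table5' THEN 1 ELSE 0 END) AS Table5,
       SUM(CASE WHEN Source = 'Table6' THEN 1 ELSE 0 END) AS Table6,
       SUM(CASE WHEN Source = 'Table7' THEN 1 ELSE 0 END) AS Table7,
       SUM(CASE WHEN Source = 'Table8' THEN 1 ELSE 0 END) AS Table8
FROM (
    SELECT randomizationGroup, 'Table1' AS Source FROM Table1 WHERE patid IS NOT NULL
    UNION ALL
    SELECT randomizationGroup, 'Table2' FROM Table2 WHERE patid IS NOT NULL
    UNION ALL
    SELECT randomizationGroup, 'Table3' FROM Table3 WHERE patid IS NOT NULL
    UNION ALL
    SELECT randomizationGroup, 'Table4' FROM Table4 WHERE patid IS NOT NULL
    UNION ALL
    SELECT randomizationGroup, 'Table5' FROM Table5 WHERE patid IS NOT NULL
    UNION ALL
    SELECT randomizationGroup, 'Table6' FROM Table6 WHERE patid IS NOT NULL
    UNION ALL
    SELECT randomizationGroup, 'Table7' FROM Table7 WHERE patid IS NOT NULL
    UNION ALL
    SELECT randomizationGroup, 'Table8' FROM Table8 WHERE patid IS NOT NULL
) AS all_rows
GROUP BY randomizationGroup;
```

Compared with the query in the question, this moves the counting out of the eight branches and into the outer SELECT, so each branch no longer has to carry seven zero-valued placeholder columns.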
Answer 3 (score: 0)
A good old JOIN should do the trick.
SELECT
  randomizationGroup = g.Grp,
  -- ISNULL reports 0 when a table has no filled rows for a group
  Table1 = ISNULL(t1.Cnt, 0),
  Table2 = ISNULL(t2.Cnt, 0),
  Table3 = ISNULL(t3.Cnt, 0),
  Table4 = ISNULL(t4.Cnt, 0),
  Table5 = ISNULL(t5.Cnt, 0),
  Table6 = ISNULL(t6.Cnt, 0),
  Table7 = ISNULL(t7.Cnt, 0),
  Table8 = ISNULL(t8.Cnt, 0)
FROM (SELECT 'C' AS Grp UNION ALL SELECT 'T') g
-- LEFT JOIN keeps both groups even if a table has no filled rows for one of them;
-- WHERE patid IS NOT NULL counts only slots already assigned, as in the question
LEFT JOIN (
  SELECT randomizationGroup, Cnt = COUNT(*)
  FROM Table1
  WHERE patid IS NOT NULL
  GROUP BY randomizationGroup) t1 ON g.Grp = t1.randomizationGroup
LEFT JOIN (
  SELECT randomizationGroup, Cnt = COUNT(*)
  FROM Table2
  WHERE patid IS NOT NULL
  GROUP BY randomizationGroup) t2 ON g.Grp = t2.randomizationGroup
LEFT JOIN (
  SELECT randomizationGroup, Cnt = COUNT(*)
  FROM Table3
  WHERE patid IS NOT NULL
  GROUP BY randomizationGroup) t3 ON g.Grp = t3.randomizationGroup
LEFT JOIN (
  SELECT randomizationGroup, Cnt = COUNT(*)
  FROM Table4
  WHERE patid IS NOT NULL
  GROUP BY randomizationGroup) t4 ON g.Grp = t4.randomizationGroup
LEFT JOIN (
  SELECT randomizationGroup, Cnt = COUNT(*)
  FROM Table5
  WHERE patid IS NOT NULL
  GROUP BY randomizationGroup) t5 ON g.Grp = t5.randomizationGroup
LEFT JOIN (
  SELECT randomizationGroup, Cnt = COUNT(*)
  FROM Table6
  WHERE patid IS NOT NULL
  GROUP BY randomizationGroup) t6 ON g.Grp = t6.randomizationGroup
LEFT JOIN (
  SELECT randomizationGroup, Cnt = COUNT(*)
  FROM Table7
  WHERE patid IS NOT NULL
  GROUP BY randomizationGroup) t7 ON g.Grp = t7.randomizationGroup
LEFT JOIN (
  SELECT randomizationGroup, Cnt = COUNT(*)
  FROM Table8
  WHERE patid IS NOT NULL
  GROUP BY randomizationGroup) t8 ON g.Grp = t8.randomizationGroup
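This solution is not as generic as yours or the one using PIVOT because, as you can see, the group identifiers have to be hard-coded. But if that works for you, fine. It might help, though, to replace the hard-coded subselect with one that retrieves the distinct randomizationGroup values from all the tables.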
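A subselect along those lines might look like the sketch below. It relies on UNION (as opposed to UNION ALL) removing duplicates, so each group identifier appears once. The Grp alias follows the query above, and the column name randomizationGroup is assumed from the question's query.

```sql
-- Sketch only: derive the list of group identifiers instead of hard-coding it.
-- UNION (without ALL) de-duplicates, so 'C' and 'T' each appear once.
SELECT g.Grp
FROM (
    SELECT randomizationGroup AS Grp FROM Table1
    UNION
    SELECT randomizationGroup FROM Table2
    UNION
    SELECT randomizationGroup FROM Table3
    UNION
    SELECT randomizationGroup FROM Table4
    UNION
    SELECT randomizationGroup FROM Table5
    UNION
    SELECT randomizationGroup FROM Table6
    UNION
    SELECT randomizationGroup FROM Table7
    UNION
    SELECT randomizationGroup FROM Table8
) g
ORDER BY g.Grp;
```

The parenthesized derived table can be dropped in where the hard-coded `(SELECT 'C' AS Grp UNION ALL SELECT 'T') g` appears in the JOIN query. The trade-off is that all eight tables are scanned once more just to build the list.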