Records from one table missing in SQL Server 2008 R2

Time: 2014-04-01 13:55:47

Tags: sql-server-2008 missing-data

Table 1:

Date           PlacementID     CampaignID    Impressions
04/01/2014     100             10            1000
04/01/2014     101             10            1500
04/01/2014     100             11            500

Table 2:

Date           PlacementID     CampaignID    Cost
04/01/2014     100             10            5000
04/01/2014     101             10            6000
04/01/2014     100             11            7000
04/01/2014     103             10            8000

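When I join these tables using FULL JOIN and LEFT JOIN, I cannot get the non-matching record: the last row of table2, with PlacementID 103, CampaignID 10 and Cost 8000. I have searched all the raw data and files, and these missing records are simply not common to both sources. I still want to include them in the final table. How can I do that? The two tables come from two different sources, and the result I get contains only the common records.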

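I also found that the missing values are exactly the amount needed to make the final numbers match, so I want to include everything. My SQL script is below: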

SELECT A.placementid, 
       A.campaignid, 
       A.date, 
       Sum(A.impressions) AS Impressions, 
       Sum(CASE 
             WHEN C.placement_count > 1 THEN ( B.cost / C.placement_count ) 
             ELSE B.cost 
           END)           AS Cost 
FROM   table1 A 
       FULL JOIN table2 B 
              ON A.placementid = B.placementid 
                 AND A.campaignid = B.campaignid 
                 AND A.date = B.date 
       LEFT JOIN (SELECT Count(placementid) AS Placement_Count, 
                         placementid, campaignid, 
                         date 
                  FROM   table1 
                  GROUP  BY placementid, 
                            campaignid, 
                            date) c 
              ON A.placementid = C.placementid 
                 AND A.campaignid = C.campaignid 
                 AND A.date = C.date 
GROUP  BY A.placementid, 
          A.campaignid, 
          A.date 

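I divide the cost by the number of placements because in the source the cost is assigned to only one placement. I have to split it, since in the actual table the same PlacementID appears more than once on the same date.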
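A minimal sketch of the split described above, using hypothetical values (a cost of 5000 shared by 3 repeated placement rows). It assumes the cost column is an integer type, which the question does not state; if it is, T-SQL integer division truncates each share, and the shares no longer add back up to the original cost. Casting to a decimal type first is one way to avoid that.

```sql
-- Hypothetical values for illustration only; not taken from the question's data.
DECLARE @cost int = 5000,
        @placement_count int = 3;

SELECT @cost / @placement_count                         AS IntegerShare, -- int / int truncates to 1666
       CAST(@cost AS decimal(18, 4)) / @placement_count AS DecimalShare; -- keeps the fractional part
```

If the real column is already decimal or money, the cast is unnecessary.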

1 Answer:

Answer 0: (score: 2)

Since you did not provide any expected output, I am guessing here, but if the result you want looks like this:

PlacementID CampaignID  Date                    Impressions Cost
----------- ----------- ----------------------- ----------- -----------
100         10          2014-04-01 02:00:00.000 1000        5000
100         11          2014-04-01 02:00:00.000 500         7000
101         10          2014-04-01 02:00:00.000 1500        6000
103         10          2014-04-01 02:00:00.000 NULL        8000

then the following query should do it:

SELECT COALESCE(A.PlacementID,b.placementid) AS PlacementID,
       COALESCE(A.campaignid, b.campaignid) AS CampaignID, 
       COALESCE(A.date, b.date) AS [Date],
       SUM(A.impressions) AS Impressions,
       SUM(CASE
             WHEN C.placement_count > 1 THEN ( B.cost / C.placement_count )
             ELSE B.cost
           END ) AS Cost
FROM   table1 A
       FULL JOIN table2 B
              ON A.[PlacementID] = B.placementid
                 AND A.campaignid = B.campaignid
                 AND A.date = B.date
       LEFT JOIN (SELECT COUNT(PlacementID) AS Placement_Count,
                         placementid, campaignid,
                         date
                  FROM   table1
                  GROUP  BY placementid,
                            campaignid,
                            date) c
              ON A.[PlacementID] = C.placementid
                 AND A.campaignid = C.campaignid
                 AND A.date = C.date
GROUP  BY COALESCE(A.PlacementID, B.PlacementID),
          COALESCE(A.campaignid, b.campaignid), 
          COALESCE(A.date, b.date)

Sample SQL Fiddle
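To see why the COALESCE calls matter, it can help to look at the raw FULL JOIN output before any grouping. The sketch below loads the question's sample rows into table variables (the variable names and column types are assumptions made for this illustration) and shows the key columns from each side next to the coalesced key.

```sql
-- Sample data from the question; table variable names and types are assumed.
DECLARE @table1 TABLE ([Date] date, PlacementID int, CampaignID int, Impressions int);
DECLARE @table2 TABLE ([Date] date, PlacementID int, CampaignID int, Cost int);

INSERT INTO @table1 VALUES
    ('20140401', 100, 10, 1000),
    ('20140401', 101, 10, 1500),
    ('20140401', 100, 11,  500);

INSERT INTO @table2 VALUES
    ('20140401', 100, 10, 5000),
    ('20140401', 101, 10, 6000),
    ('20140401', 100, 11, 7000),
    ('20140401', 103, 10, 8000);

-- Ungrouped FULL JOIN: rows that exist only in @table2 have NULL in every A column.
SELECT A.PlacementID                          AS A_PlacementID,
       B.PlacementID                          AS B_PlacementID,
       COALESCE(A.PlacementID, B.PlacementID) AS PlacementID,
       A.Impressions,
       B.Cost
FROM   @table1 A
       FULL JOIN @table2 B
              ON A.PlacementID = B.PlacementID
                 AND A.CampaignID = B.CampaignID
                 AND A.[Date] = B.[Date]
ORDER  BY COALESCE(A.PlacementID, B.PlacementID),
          COALESCE(A.CampaignID, B.CampaignID);
```

The row for placement 103 comes back with NULL in A_PlacementID and Impressions, because it has no match in table1. Selecting and grouping only on the A columns therefore reports that row under NULL keys, while COALESCE falls back to the values from table2.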