Select the most recent record for each distinct pair of two columns

Time: 2017-05-04 23:07:59

Tags: sql sql-server

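Requirements:

I need to select [Cost], [Retail], [SKU], and [Store] from [PriceChanges], one row for each distinct ([SKU], [Store]) combination, where [Date] is the most recent one (no later than 2017-04-25) and [Flag] = 0. I also only want [PriceChanges] records whose [Dept] = 100, which is determined by joining to [Items] on [SKU].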

Below is some obfuscated sample data from my tables. In practice I expect to get back roughly 2 million unique records in my result set.

[PriceChanges] (sample):

+--------+-------+--------+--------+------------+------+
|  SKU   | Store |  Cost  | Retail |    Date    | Flag |
+--------+-------+--------+--------+------------+------+
| 999999 |  1000 | 4.0850 | 4.09   | 2017-04-19 | 0    |
| 999998 |  1001 | 4.0850 | 4.09   | 2017-04-19 | 1    |
| 999999 |  1000 | 4.0650 | 4.07   | 2017-04-18 | 2    |
+--------+-------+--------+--------+------------+------+

[Items] (sample):

+--------+------+
|  SKU   | Dept |
+--------+------+
| 999999 |  100 |
| 999998 |  101 |
+--------+------+

My current solution:

SELECT s.[SKU],
     s.[Store],
     [Cost],
     [Retail]
FROM [PriceChanges]  s
    RIGHT JOIN
(
   SELECT [SKU],
        [Store],
        [MaxDate] = MAX([Date])
   FROM [PriceChanges]
       LEFT JOIN [Items] ON [PriceChanges].[SKU] = [Items].[SKU]
                                                 AND [Date] < '2017-04-25'
                                                 AND [Dept] = 100
                                                 AND [Flag] = 0
   GROUP BY [SKU],
          [Store]
) m ON m.[SKU] = s.[SKU]
     AND m.[Store] = s.[Store]
     AND m.[MaxDate] = s.[Date];

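The solution above does not seem to work: judging by the number of distinct SKUs and stores we have, it returns about 40% more rows than I expected. What is the most efficient way to write this query?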
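One plausible source of the extra rows is that the filters sit in the ON clause of a LEFT JOIN. A LEFT JOIN keeps every row of [PriceChanges] whether or not the ON condition matches, so MAX([Date]) is computed over unfiltered rows and every ([SKU], [Store]) pair produces a group. The sketch below reproduces this on the sample data. It runs the SQL through SQLite via Python's sqlite3 module purely as a stand-in for SQL Server (an assumption made for runnability), and the columns are alias-qualified, unlike the original query.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [PriceChanges] ([SKU] INTEGER, [Store] INTEGER, [Cost] REAL,
                             [Retail] REAL, [Date] TEXT, [Flag] INTEGER);
CREATE TABLE [Items] ([SKU] INTEGER, [Dept] INTEGER);
-- Sample rows copied from the question.
INSERT INTO [PriceChanges] VALUES
  (999999, 1000, 4.0850, 4.09, '2017-04-19', 0),
  (999998, 1001, 4.0850, 4.09, '2017-04-19', 1),
  (999999, 1000, 4.0650, 4.07, '2017-04-18', 2);
INSERT INTO [Items] VALUES (999999, 100), (999998, 101);
""")

# Filters in the ON clause of a LEFT JOIN: no [PriceChanges] row is removed.
on_filtered = conn.execute("""
SELECT p.[SKU], p.[Store], MAX(p.[Date]) AS MaxDate
FROM [PriceChanges] p
LEFT JOIN [Items] i
  ON p.[SKU] = i.[SKU]
 AND p.[Date] < '2017-04-25' AND i.[Dept] = 100 AND p.[Flag] = 0
GROUP BY p.[SKU], p.[Store]
ORDER BY p.[SKU], p.[Store]
""").fetchall()

# Same filters in WHERE with an inner join: only qualifying rows are grouped.
where_filtered = conn.execute("""
SELECT p.[SKU], p.[Store], MAX(p.[Date]) AS MaxDate
FROM [PriceChanges] p
JOIN [Items] i ON p.[SKU] = i.[SKU]
WHERE p.[Date] < '2017-04-25' AND i.[Dept] = 100 AND p.[Flag] = 0
GROUP BY p.[SKU], p.[Store]
ORDER BY p.[SKU], p.[Store]
""").fetchall()

print(on_filtered)     # both pairs appear, including the Dept 101 / Flag 1 one
print(where_filtered)  # only the pair that satisfies all three filters
```

On the sample data the ON-clause version returns a group for SKU 999998 even though that SKU is in Dept 101 and its only row has [Flag] = 1, while the WHERE version returns only the (999999, 1000) pair.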

2 answers:

Answer 0: (score: 2)

If you are sure you want exactly one row returned per SKU and Store, you can use the following query:

SELECT
   [SKU]
  ,[Store]
  ,[Cost]
  ,[Retail]
FROM (
  SELECT
     p.[SKU]
    ,p.[Store]
    ,p.[Cost]
    ,p.[Retail]
    ,ROW_NUMBER() OVER (PARTITION BY p.[SKU], p.[Store] ORDER BY p.[Date] DESC) as ranker
  FROM [PriceChanges] p
  JOIN [Items] i
    ON p.[SKU] = i.[SKU]
  WHERE 1=1
    AND i.[Dept] = 100
    AND p.[Flag] = 0
    AND p.[Date] < '2017-04-25'
) T
WHERE 1=1
  AND ranker = 1
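To illustrate why ROW_NUMBER() gives exactly one row per pair, the sketch below runs the query above against the question's sample data plus two hypothetical rows (not from the question): one that ties on the latest date for an existing pair, and one for a new store. It also swaps in RANK() for contrast, which keeps tied rows. SQLite via Python's sqlite3 is used as a stand-in for SQL Server, which is an assumption made only so the example is runnable.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [PriceChanges] ([SKU] INTEGER, [Store] INTEGER, [Cost] REAL,
                             [Retail] REAL, [Date] TEXT, [Flag] INTEGER);
CREATE TABLE [Items] ([SKU] INTEGER, [Dept] INTEGER);
INSERT INTO [PriceChanges] VALUES
  (999999, 1000, 4.0850, 4.09, '2017-04-19', 0),
  (999998, 1001, 4.0850, 4.09, '2017-04-19', 1),
  (999999, 1000, 4.0650, 4.07, '2017-04-18', 2),
  -- Hypothetical rows, not from the question: a tie on the latest date
  -- for pair (999999, 1000) and a second store for the same SKU.
  (999999, 1000, 4.1000, 4.11, '2017-04-19', 0),
  (999999, 1002, 3.9900, 4.00, '2017-04-20', 0);
INSERT INTO [Items] VALUES (999999, 100), (999998, 101);
""")

# The answer's query, with the ranking function left as a placeholder.
QUERY = """
SELECT [SKU], [Store], [Cost], [Retail]
FROM (
  SELECT p.[SKU], p.[Store], p.[Cost], p.[Retail],
         {func}() OVER (PARTITION BY p.[SKU], p.[Store]
                        ORDER BY p.[Date] DESC) AS ranker
  FROM [PriceChanges] p
  JOIN [Items] i ON p.[SKU] = i.[SKU]
  WHERE i.[Dept] = 100 AND p.[Flag] = 0 AND p.[Date] < '2017-04-25'
) T
WHERE ranker = 1
"""

# ROW_NUMBER numbers tied rows 1, 2, ... so only one of them survives.
one_per_pair = conn.execute(QUERY.format(func="ROW_NUMBER")).fetchall()
# RANK gives every tied row rank 1, so both tied rows survive.
with_ties = conn.execute(QUERY.format(func="RANK")).fetchall()

print(len(one_per_pair), len(with_ties))
```

Which of the two tied rows ROW_NUMBER() keeps is not defined by this ORDER BY. If that matters, adding a tie-breaking column to the ORDER BY makes the choice deterministic.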

Answer 1: (score: 1)

Try this:

SELECT [SKU],
     [Store],
     [Cost],
     [Retail]
FROM
(
   SELECT [SKU],
        [Store],
        [Cost],
        [Retail],
        ROW_NUMBER() OVER(PARTITION BY [SKU],
                                 [Store] ORDER BY [Date] DESC) rn
   FROM PriceChanges PC
   WHERE [Date] <= '2017-04-25'
        AND [Flag] = 0
        AND EXISTS
   (
      SELECT [SKU]
      FROM [Items] i
      WHERE PC.[SKU] = i.[SKU]
           AND i.[Dept] = 100
   )
) t4
WHERE rn = 1;
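This query uses `<=` on the date where the previous answer uses `<`, and the two differ for rows dated exactly 2017-04-25. The sketch below shows the difference using the question's sample data plus one hypothetical row on the boundary date (not from the question). SQLite via Python's sqlite3 stands in for SQL Server, and [Date] is assumed to hold date-only values; both are assumptions for the sake of a runnable example.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [PriceChanges] ([SKU] INTEGER, [Store] INTEGER, [Cost] REAL,
                             [Retail] REAL, [Date] TEXT, [Flag] INTEGER);
CREATE TABLE [Items] ([SKU] INTEGER, [Dept] INTEGER);
INSERT INTO [PriceChanges] VALUES
  (999999, 1000, 4.0850, 4.09, '2017-04-19', 0),
  (999998, 1001, 4.0850, 4.09, '2017-04-19', 1),
  (999999, 1000, 4.0650, 4.07, '2017-04-18', 2),
  -- Hypothetical row, not from the question: dated exactly on the cutoff.
  (999999, 1000, 4.1200, 4.13, '2017-04-25', 0);
INSERT INTO [Items] VALUES (999999, 100), (999998, 101);
""")

# The EXISTS-based query, with the date comparison left as a placeholder.
QUERY = """
SELECT [SKU], [Store], [Date]
FROM (
  SELECT [SKU], [Store], [Date],
         ROW_NUMBER() OVER (PARTITION BY [SKU], [Store]
                            ORDER BY [Date] DESC) AS rn
  FROM [PriceChanges] PC
  WHERE [Date] {op} '2017-04-25'
    AND [Flag] = 0
    AND EXISTS (SELECT 1 FROM [Items] i
                WHERE PC.[SKU] = i.[SKU] AND i.[Dept] = 100)
) t4
WHERE rn = 1
"""

inclusive = conn.execute(QUERY.format(op="<=")).fetchall()  # cutoff day counts
exclusive = conn.execute(QUERY.format(op="<")).fetchall()   # cutoff day excluded

print(inclusive)
print(exclusive)
```

With `<=` the boundary row is the latest one for the pair, and with `<` the 2017-04-19 row is. If [Date] is actually a datetime with a time component, `<= '2017-04-25'` in SQL Server matches only midnight of that day, so a half-open comparison such as `< '2017-04-26'` may express "no later than 2017-04-25" more reliably.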