Select and filter duplicates based on 2 columns

Time: 2018-07-06 10:39:46

Tags: sql tsql

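I have a query connected to a large CSV data source, and I am not happy with how it works: the DISTINCT I am using filters out some of the date and time values. Index Num. should be distinct in combination with the HC serial code and the formatted date and time. This is my current query: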

SELECT DISTINCT [Index Num#], [HC SERIAL CODE],
       MIN(Format([Date time], 'yyyy-MM-dd')) AS Startdate,
       MIN(Format([Date time], 'HH:mm')) AS Starttime
FROM combinedKPI.csv
WHERE Format([Date time], 'yyyy-MM-dd') BETWEEN DATE() AND DATE() - 111
GROUP BY [Index Num#], [HC SERIAL CODE], Format([Date time], 'yyyy-MM-dd')
ORDER BY [HC SERIAL CODE], Format([Date time], 'yyyy-MM-dd')


The output should look like this:

HC Serial Code   Index Num.      Start Date     Start time
    xx072               1        15/06/2018     17:29
    xx072               1        03/07/2018     17:02
    1401                1        12/12/2016     06:00

2 answers:

Answer 0 (score: 0)

You only need DISTINCT:

SELECT DISTINCT [HC SERIAL CODE], [Index Num#],
       CAST([Date time] AS date) AS StartDate, CAST([Date time] AS time(0)) AS Starttime
FROM combinedKPI.csv
-- Lower bound is the earlier date (111 days ago); upper bound is today.
WHERE CAST([Date time] AS date) >= DATEADD(DAY, -111, CAST(GETDATE() AS date)) AND
      CAST([Date time] AS date) <= CAST(GETDATE() AS date);
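
A range filter written as a pair of `>=` and `<=` comparisons only matches rows when the earlier date is the lower bound. The following is a minimal sketch of that point, not the answerer's code: it uses Python's built-in `sqlite3` with SQLite date functions rather than T-SQL, and the table `kpi`, the column `start_date`, the sample rows, and the fixed reference date are all assumptions made up for illustration.

```python
import sqlite3

# Hypothetical sample data; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kpi (start_date TEXT)")
conn.executemany(
    "INSERT INTO kpi VALUES (?)",
    [("2016-12-12",), ("2018-06-15",), ("2018-07-03",)],
)

today = "2018-07-06"  # fixed reference date so the result is reproducible

# Inverted bounds: "on or after today AND on or before 111 days ago".
inverted = conn.execute(
    """
    SELECT start_date FROM kpi
    WHERE start_date >= date(?) AND start_date <= date(?, '-111 days')
    ORDER BY start_date
    """,
    (today, today),
).fetchall()

# Ordered bounds: "on or after 111 days ago AND on or before today".
window = conn.execute(
    """
    SELECT start_date FROM kpi
    WHERE start_date >= date(?, '-111 days') AND start_date <= date(?)
    ORDER BY start_date
    """,
    (today, today),
).fetchall()

print(inverted)  # no row can satisfy both conditions
print(window)    # rows from the last 111 days
```

With these sample rows the inverted filter returns nothing, while the ordered filter returns the two 2018 dates.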

Answer 1 (score: 0)

The GROUP BY and min() are what cause some of the times not to be shown. Just use DISTINCT and drop the GROUP BY and min():

SELECT DISTINCT
       [hc serial code],
       [index num#],
       format([date time], 'yyyy-MM-dd') startdate,
       format([date time], 'HH:mm') starttime
FROM combinedKPI.csv
WHERE Format([date time], 'yyyy-MM-dd') BETWEEN date()
                                        AND date() - 111
ORDER BY [hc serial code],
         format([date time], 'yyyy-MM-dd');
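
To see the difference between the two approaches on concrete rows, here is a minimal sketch using Python's built-in `sqlite3`. It is not the asker's setup: SQLite's `date()` and `strftime()` stand in for `Format()`, and the table `kpi`, its column names, and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kpi (hc_serial_code TEXT, index_num INTEGER, date_time TEXT)"
)
conn.executemany(
    "INSERT INTO kpi VALUES (?, ?, ?)",
    [
        ("xx072", 1, "2018-06-15 17:29"),
        ("xx072", 1, "2018-06-15 17:29"),  # exact duplicate
        ("xx072", 1, "2018-06-15 18:05"),  # same day, later time
        ("xx072", 1, "2018-07-03 17:02"),
    ],
)

# GROUP BY per day with MIN(): one row per serial code, index and date,
# keeping only the earliest time of that day.
grouped_rows = conn.execute(
    """
    SELECT hc_serial_code, index_num,
           date(date_time) AS startdate,
           MIN(strftime('%H:%M', date_time)) AS starttime
    FROM kpi
    GROUP BY hc_serial_code, index_num, date(date_time)
    ORDER BY hc_serial_code, startdate
    """
).fetchall()

# DISTINCT without grouping: every distinct (date, time) pair is kept,
# and only exact duplicates collapse.
distinct_rows = conn.execute(
    """
    SELECT DISTINCT hc_serial_code, index_num,
           date(date_time) AS startdate,
           strftime('%H:%M', date_time) AS starttime
    FROM kpi
    ORDER BY hc_serial_code, startdate, starttime
    """
).fetchall()

print(grouped_rows)
print(distinct_rows)
```

On this sample the grouped query drops the 18:05 entry for 2018-06-15, while the DISTINCT query keeps it and removes only the exact duplicate. Which of the two is wanted depends on whether one row per day or one row per distinct time is the goal.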