Question

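Is there a non-brute-force, efficient way to determine the minimum value sustained for x minutes in a SQL table with the structure below? The table has about one record per second for each case_id and channel_index, each case can have 20 channels, and the table holds thousands of cases. I will need to run this query for every case and every channel. I need to find the lowest and highest values sustained for 3 consecutive minutes.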

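value_duration has already been calculated to make these kinds of queries a bit faster. It is in seconds and can be completely random. It represents the time between consecutive readings received from the channel.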

case_id     channel_index start_time              dms_value              value_duration
----------- ------------- ----------------------- ---------------------- --------------
2668        0             2011-09-28 10:24:39.000 69.5769729614258       2
2668        0             2011-09-28 10:24:41.000 69.7469329833984       2
2668        0             2011-09-28 10:24:43.000 69.8547210693359       1
2668        0             2011-09-28 10:24:44.000 69.8475494384766       1
2668        0             2011-09-28 10:24:45.000 69.9703216552734       2
2668        0             2011-09-28 10:24:47.000 69.9699172973633       1
2668        0             2011-09-28 10:24:48.000 70.0099258422852       2
2668        0             2011-09-28 10:24:50.000 70.2449035644531       1
2668        0             2011-09-28 10:24:51.000 70.0424575805664       2
2668        0             2011-09-28 10:24:53.000 70.1216125488281       1
2668        0             2011-09-28 10:24:54.000 69.5616912841797       1
2668        0             2011-09-28 10:24:55.000 69.5902786254883       2
2668        0             2011-09-28 10:24:57.000 70.0330581665039       1
2668        0             2011-09-28 10:24:58.000 70.4709854125977       1
2668        0             2011-09-28 10:24:59.000 70.7001647949219       2
2668        0             2011-09-28 10:25:01.000 70.274040222168        1
2668        0             2011-09-28 10:25:02.000 69.7524795532227       1
2668        0             2011-09-28 10:25:03.000 69.4606552124023       2
2668        0             2011-09-28 10:25:05.000 69.6096954345703       1
2668        0             2011-09-28 10:25:06.000 69.8238906860352       1

I was hoping not to have to loop: test one value, increment, test the next, and so on.

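For example, from the data set above, if I wanted the lowest value sustained for 5 consecutive seconds, it would be 69.8238906860352. If I needed it for 8 consecutive seconds, it would be 69.9703216552734.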
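To make the example above concrete, here is a minimal brute-force sketch, written in Python rather than SQL so it can be run on its own. It rests on assumptions that are not stated in the post: "lowest sustained value" is read as the smallest of the per-window maxima, a window ending at a reading at second t covers seconds t-(n-1) through t, and windows that would start before the first reading are ignored. The function name `lowest_sustained` is made up for illustration. Under that reading, both figures quoted above come out of the sample rows.

```python
# (seconds after 10:24:00, dms_value) for case 2668, channel 0,
# copied from the sample rows in the question
READINGS = [
    (39, 69.5769729614258), (41, 69.7469329833984), (43, 69.8547210693359),
    (44, 69.8475494384766), (45, 69.9703216552734), (47, 69.9699172973633),
    (48, 70.0099258422852), (50, 70.2449035644531), (51, 70.0424575805664),
    (53, 70.1216125488281), (54, 69.5616912841797), (55, 69.5902786254883),
    (57, 70.0330581665039), (58, 70.4709854125977), (59, 70.7001647949219),
    (61, 70.274040222168), (62, 69.7524795532227), (63, 69.4606552124023),
    (65, 69.6096954345703), (66, 69.8238906860352),
]


def lowest_sustained(readings, seconds):
    """Smallest window maximum over all full windows of `seconds` seconds.

    Assumed definition, not taken from the post: the signal stayed at or
    below the returned value for at least one full window.
    """
    first = readings[0][0]
    best = None
    for end, _ in readings:
        start = end - (seconds - 1)
        if start < first:
            continue  # the window would begin before the data does
        # Brute force: rescan every reading for each window (O(n^2) overall).
        peak = max(v for t, v in readings if start <= t <= end)
        if best is None or peak < best:
            best = peak
    return best


print(lowest_sustained(READINGS, 5))  # 69.8238906860352
print(lowest_sustained(READINGS, 8))  # 69.9703216552734
```

This rescans the data for every window, so it only serves to pin down the definition, not as something to run on millions of rows.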

Here is the full table structure:

CREATE TABLE [dbo].[continuous_data](
    [case_id] [int] NOT NULL,
    [channel_index] [smallint] NOT NULL,
    [start_time] [datetime] NOT NULL,
    [dms_type] [char](1) NOT NULL,
    [dms_value] [float] NOT NULL,
    [value_duration] [smallint] NOT NULL,
    [error_code] [int] NULL
) ON [PRIMARY]

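Edit 3-5-12: I implemented a brute-force way to calculate the lowest sustained value. It seems to work fine when a particular case has a few thousand records, but when I tested it on a case with 1.1 million records I ended up canceling it after 37 minutes. Here is the code I'm using. Does anyone have ideas for optimizing it?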

ALTER procedure [dbo].[GetSustainedValues]( 
  @case_id int,
  @time_limit int, 
  @bypass_only bit = NULL)
as 
begin

DECLARE @time DateTime, @channelindex int, @lastchannelindex int
DECLARE @tmin float, @tmax float, @min float, @max float, @caseid int

DECLARE @results TABLE(case_id int, channel_index int, max float null, min float null)
DECLARE CursorName CURSOR FAST_FORWARD
    FOR SELECT start_time, channel_index from continuous_data where case_id = @case_id order by channel_index, start_time
OPEN CursorName
FETCH NEXT FROM CursorName INTO @time, @channelindex
SET @lastchannelindex = @channelindex
WHILE @@FETCH_STATUS = 0
BEGIN
    --PRINT 'hello' --'Chennel:' + CONVERT (VARCHAR(50), @channelindex,128) + '  Time:' + CONVERT (VARCHAR(50), @time,128)
    IF @lastchannelindex != @channelindex
    BEGIN
        --PRINT 'Starting new channel:' + CONVERT (VARCHAR(50), @channelindex,128)
        -- we are starting on a new channel so insert that data into the results
        -- table and reset the min/max
        INSERT INTO @results(case_id, channel_index, max, min) VALUES(@case_id, @lastchannelindex, @max, @min)
        SET @max = null
        SET @min = null
        SET @lastchannelindex = @channelindex
    END

    Select @tmax = MAX(dms_value), @tmin = MIN(dms_value)
    from continuous_data
    where case_id = @case_id and channel_index = @channelindex and start_time between DATEADD(s, -(@time_limit-1), @time) and @time 
    HAVING SUM(value_duration) >= @time_limit
    IF @@ROWCOUNT > 0
    BEGIN
        IF @max IS null OR @tmin > @max
        BEGIN
            --PRINT 'Setting max:' + CONVERT (VARCHAR(50), @tmin,128) + ' for channel:' + CONVERT (VARCHAR(50), @channelindex,128)
            set @max = @tmin
        END

        IF @min IS null OR @tmax < @min
        BEGIN
            set @min = @tmax
        END
    END
    --PRINT 'Max:' + CONVERT (VARCHAR(50), @max,128) + '  Min:' + CONVERT (VARCHAR(50), @min,128)
    FETCH NEXT FROM CursorName INTO @time, @channelindex
END
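CLOSE CursorName
DEALLOCATE CursorName
-- the loop only writes a channel's row when the next channel starts,
-- so the last channel has to be written here
IF @lastchannelindex IS NOT NULL
    INSERT INTO @results(case_id, channel_index, max, min) VALUES(@case_id, @lastchannelindex, @max, @min)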
--PRINT 'Max:' + CONVERT (VARCHAR(50), @max,128) + '  Min:' + CONVERT (VARCHAR(50), @min,128)
SELECT * FROM @results
end

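Edit, March 7, 2012: Still haven't found an answer. Is there a more efficient way to do this using a .NET DLL that can be called from a stored procedure? Looking for any suggestions here. Thanks!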

Answer 1

I'm not sure I fully understand your question, but do you mean something like this?

select min(value_duration) as lowest, max(value_duration) as highest from mytable where case_id=2668 and start_time between Cast('2011-09-28 10:26:04' as datetime) and cast('2011-09-28 10:26:07' as datetime);

This query retrieves the lowest and highest value_duration for case 2668 between 10:26:04 and 10:26:07.

Hope this helps!

Answer 2

select value_duration from (select top 3 value_duration from mytable where case_id = 2668 and start_time between cast('2011-09-28 10:26:04' as datetime) and cast('2011-09-28 10:26:07' as datetime) order by value_duration) lowest
union
select value_duration from (select top 3 value_duration from mytable where case_id = 2668 and start_time between cast('2011-09-28 10:26:04' as datetime) and cast('2011-09-28 10:26:07' as datetime) order by value_duration desc) highest;

Answer 3

You may need to debug this, as I don't have access to a database server.

Conceptually, it should work like this (on Oracle):

select case_id, channel_index,
     min(su_min) as sustained_min,
     max(su_max) as sustained_max
from (
    select case_id, channel_index, start_time,
        min(dms_value) over (partition by case_id, channel_index order by start_time 
             range numtodsinterval(3, 'minute') preceding) as su_max,
        max(dms_value) over (partition by case_id, channel_index order by start_time 
             range numtodsinterval(3, 'minute') preceding) as su_min, 
        min(start_time) over (partition by case_id, channel_index order by start_time)
             as first_time
    from  data_table order by start_time 
    ) su_data
where  
    first_time + numtodsinterval(3, 'minute') <= start_time
group by
    case_id, channel_index

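This should give the minimum/maximum values that were not exceeded/undercut for three consecutive minutes (there may be some details to work out regarding values just outside the time interval).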

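Basically, you first compute the min/max of every time window, then pick the highest of the window minima and the lowest of the window maxima. That is why the windowed min is aliased su_max and the windowed max is aliased su_min. The first_time clause excludes windows at the beginning of the time series that do not yet span the full three minutes.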
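The same two-step idea (a min and a max per trailing window, then the best of those) can also be sketched procedurally, which may be relevant to the asker's interest in doing the work outside a set-based query. The following is a minimal sketch in Python, not the answer's Oracle query: it keeps two monotonic deques so each reading is pushed and popped at most once, giving a single pass per channel. The function name `sustained_min_max`, the `(second, value)` input shape and the window convention (seconds t-(n-1) through t, full windows only) are assumptions made for illustration.

```python
from collections import deque


def sustained_min_max(readings, seconds):
    """Return (lowest window max, highest window min) for one channel.

    `readings` is a list of (second, value) pairs sorted by second.
    Returns (None, None) when no full window fits in the data.
    """
    if not readings:
        return None, None
    first = readings[0][0]
    max_q = deque()  # values kept in decreasing order; front is the window max
    min_q = deque()  # values kept in increasing order; front is the window min
    lowest_peak = highest_trough = None
    for t, v in readings:
        start = t - (seconds - 1)
        # Drop queued readings that the new value makes irrelevant.
        while max_q and max_q[-1][1] <= v:
            max_q.pop()
        max_q.append((t, v))
        while min_q and min_q[-1][1] >= v:
            min_q.pop()
        min_q.append((t, v))
        # Drop readings that have slid out of the trailing window.
        while max_q[0][0] < start:
            max_q.popleft()
        while min_q[0][0] < start:
            min_q.popleft()
        if start < first:
            continue  # window not yet full, same role as first_time
        peak, trough = max_q[0][1], min_q[0][1]
        if lowest_peak is None or peak < lowest_peak:
            lowest_peak = peak
        if highest_trough is None or trough > highest_trough:
            highest_trough = trough
    return lowest_peak, highest_trough


# Last five sample rows from the question, as seconds after 10:24:00.
tail = [(61, 70.274040222168), (62, 69.7524795532227), (63, 69.4606552124023),
        (65, 69.6096954345703), (66, 69.8238906860352)]
print(sustained_min_max(tail, 5))  # (69.8238906860352, 69.4606552124023)
```

Running it once per (case_id, channel_index) over rows fetched in start_time order would avoid the per-row range query, at the cost of moving the logic out of SQL.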

Edit: The solution is actually for Oracle, but the important parts are standard SQL:2003.

Answer 4

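I tested this against your data, and it matches your results.

I do a self-join on the [Case] table and keep the rows where c1.start_time < c2.start_time.

Then I select the rows where end_time - start_time equals the desired time difference.

Then I use a subquery to get max(dms_value) within that range.

Then I group by case_id, channel_index and find min(max(dms_value)).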

declare @timeDiff int = 5  -- seconds

select case_id, channel_index, MIN(dms_max)
from 
(
    select t2.*, 
        (select MAX(dms_value) 
            from [Case] c 
            where
                c.case_id = t2.case_id and 
                c.channel_index = t2.channel_index and
                c.start_time between t2.start_time and t2.end_time) dms_max
    from
    (
        select * from (
            select c1.case_id, c1.channel_index, c1.start_time, c2.start_time end_time from [Case] c1
            inner join [Case] c2 on 
                c1.case_id = c2.case_id and 
                c1.channel_index = c2.channel_index and
                c1.start_time < c2.start_time
        )
        t
        where DATEDIFF(ss, start_time, end_time) 
            between @timeDiff - 1 and @timeDiff - 1 + 
            (select max(value_duration) from [Case] c where c.case_id = t.case_id and c.channel_index = t.channel_index)
    )
    t2
)
t3
group by case_id, channel_index
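For readers who find the nested query hard to follow, the steps it performs can be mirrored in a few lines of Python. This is only an illustrative sketch of the same pairing logic, not a replacement for the SQL: the name `min_of_range_max` and the `(second, dms_value, value_duration)` row shape are assumptions, and the rows are the last five sample readings from the question.

```python
from itertools import combinations

# (seconds after 10:24:00, dms_value, value_duration) for the last five sample rows
ROWS = [
    (61, 70.274040222168, 1),
    (62, 69.7524795532227, 1),
    (63, 69.4606552124023, 2),
    (65, 69.6096954345703, 1),
    (66, 69.8238906860352, 1),
]


def min_of_range_max(rows, time_diff):
    """Pair up readings, keep pairs about `time_diff` seconds apart,
    take the max inside each pair's range, and return the smallest max."""
    longest_gap = max(d for _, _, d in rows)
    lo, hi = time_diff - 1, time_diff - 1 + longest_gap
    peaks = []
    # rows are sorted by time, so every combination has t1 < t2 (the self-join)
    for (t1, _, _), (t2, _, _) in combinations(rows, 2):
        if lo <= t2 - t1 <= hi:  # the DATEDIFF ... BETWEEN filter
            peaks.append(max(v for t, v, _ in rows if t1 <= t <= t2))
    return min(peaks) if peaks else None


print(min_of_range_max(ROWS, 5))  # 69.8238906860352
```

Like the self-join, this compares every pair of readings, so the work grows quadratically with the number of rows per channel.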

How to find the lowest sustained value
