Optimize query: duplicate rows with the oldest date

Time: 2014-07-01 09:04:38

Tags: mysql sql performance query-optimization

I'd like to optimize a query, but I don't know how to go about it. This is the table I'm querying:

Device table:

Id     PushId   created
abc    aaa      10/10/13
def    aaa      10/12/13
efg    abb      9/9/12

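The query I want is this: find the duplicated PushIds and delete the older of the two entries from the table. This is what I have so far (a SELECT rather than a DELETE, because I'm still testing):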

select m.* from
(select pushId, created
 from Device
 group by pushId
 having count(*) > 1) as m
inner join Device mm on mm.pushId = m.pushId and mm.created = m.created;

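This correctly returns what should be deleted, but it is very, very slow. Is there a faster way? Is there a way to do this without a temporary table, i.e. in a single scan?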

Edit: this is MySQL; I put an MS-SQL tag on it by mistake. Sorry, guys.

7 answers:

Answer 0: (score: 0)

You can use row numbers:

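-- Needs window-function support (in MySQL, version 8.0 or later)
Select *
From (Select *,
             -- alias renamed from `row`, which is a reserved word in some SQL dialects
             Row_Number() over(Partition by PushId order by created) as rn
      From Device
) z
where z.rn = 1
-- rn = 1 is the oldest row for each PushId, including PushIds that occur only once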
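As a rough illustration of the row-number idea restricted to duplicated PushIds, the sketch below runs a similar query from Python against an in-memory SQLite database, which stands in for MySQL here (SQLite 3.25 or later is needed for window functions). The ISO-formatted dates, the extra `cnt` column and the helper name `oldest_duplicates` are assumptions made for this example, not part of the answer.

```python
import sqlite3

def oldest_duplicates(rows):
    """Return the oldest (Id, PushId, created) row for each PushId that occurs more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Device (Id TEXT, PushId TEXT, created TEXT)")
    conn.executemany("INSERT INTO Device VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT Id, PushId, created
        FROM (SELECT Id, PushId, created,
                     ROW_NUMBER() OVER (PARTITION BY PushId ORDER BY created) AS rn,
                     COUNT(*)     OVER (PARTITION BY PushId)                  AS cnt
              FROM Device) z
        WHERE z.rn = 1      -- oldest row in its PushId group
          AND z.cnt > 1     -- assumed extra filter: only groups that really have duplicates
        ORDER BY Id
        """
    ).fetchall()
    conn.close()
    return result

# Sample rows from the question, with dates rewritten in ISO form (an assumption)
SAMPLE = [
    ("abc", "aaa", "2013-10-10"),
    ("def", "aaa", "2013-10-12"),
    ("efg", "abb", "2012-09-09"),
]
print(oldest_duplicates(SAMPLE))
```

Without the `cnt > 1` condition the query would also return `efg`, whose PushId is not duplicated.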

Answer 1: (score: 0)

This uses @variables to provide the equivalent of row_number() in MySQL. Here it finds all rows except the 2 most recent for each PushId:

SELECT
      PushId
    , Id
    , created
FROM (
      SELECT
               @row_num :=IF(@prev_value = d.PushId,@row_num+1,1)AS RN
             , d.PushId
             , d.Id
             , d.created
             , @prev_value := d.PushId
      FROM tblDevices d
      CROSS JOIN(SELECT @row_num :=1, @prev_value :='') vars
      ORDER BY
               d.PushId
             , d.created DESC
      ) SQ
WHERE RN > 2
;

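You can change the result by changing the sort order (e.g. to ASC) so that it finds the oldest records instead. Note that the cross join is only there to "attach" the 2 @vars to every row; because it contributes just one row, it has no effect on the actual number of records. The variables are then set inside the query.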
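The variable trick is a single pass over rows sorted by PushId and date, resetting a counter whenever the PushId changes. A minimal Python sketch of that same logic, purely to show what the query computes, might look like this. The function name `rows_beyond_newest`, the `keep` parameter, the ISO dates and the third `aaa` row (`ghi`) are hypothetical and not taken from the answer.

```python
def rows_beyond_newest(rows, keep=2):
    """rows: iterable of (push_id, id, created).

    Mirrors the @row_num / @prev_value logic: number the rows of each
    push_id from newest to oldest and return those numbered above `keep`.
    """
    # Equivalent of ORDER BY PushId, created DESC (two stable sorts)
    ordered = sorted(rows, key=lambda r: r[2], reverse=True)
    ordered.sort(key=lambda r: r[0])

    row_num, prev_value = 1, None
    result = []
    for push_id, id_, created in ordered:
        # Same rule as IF(@prev_value = d.PushId, @row_num + 1, 1)
        row_num = row_num + 1 if push_id == prev_value else 1
        prev_value = push_id
        if row_num > keep:  # same filter as WHERE RN > 2
            result.append((push_id, id_, created))
    return result

ROWS = [
    ("aaa", "abc", "2013-10-10"),
    ("aaa", "def", "2013-10-12"),
    ("aaa", "ghi", "2013-10-14"),  # hypothetical extra row
    ("abb", "efg", "2012-09-09"),
]
print(rows_beyond_newest(ROWS))
```

With `keep=2` only `abc` is returned, because it is the third-newest row for `aaa`; with `keep=1` the function returns every row except the newest per PushId.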

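Answer 2: (score: 0)

This may need some tweaking to fit your delete statement, but try using the MIN function to find the lowest date + id combination wherever there are multiple entries. Then strip the date from the result, which leaves just the correct ID: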

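delete from Device where id in (
    select id from (
        -- assumes `created` is a DATE/DATETIME column and every id is exactly 3 characters;
        -- concat() is used because `+` is numeric addition in MySQL
        select right(min(concat(date_format(created, '%Y%m%d%H%i%s'), id)), 3) as id
        from Device
        group by pushid
        having count(*) > 1
    ) as oldest  -- derived table: MySQL cannot read Device directly in a subquery of this delete
)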

Answer 3: (score: 0)

Possibly use a self-join where the push IDs match and the created date is greater:

SELECT DISTINCT b.Id
FROM Device a
INNER JOIN Device b
ON a.PushId = b.PushId
AND a.created > b.created

The join generates duplicates, hence the DISTINCT.
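To see why the DISTINCT matters, here is a small sketch that runs the self-join from Python on an in-memory SQLite database (a stand-in for MySQL). The helper name `older_ids`, the ISO dates and the third `aaa` row (`ghi`) are assumptions added for illustration.

```python
import sqlite3

def older_ids(rows, distinct=True):
    """Ids of rows for which a newer row with the same PushId exists."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Device (Id TEXT, PushId TEXT, created TEXT)")
    conn.executemany("INSERT INTO Device VALUES (?, ?, ?)", rows)
    keyword = "DISTINCT" if distinct else ""
    ids = [r[0] for r in conn.execute(
        f"""
        SELECT {keyword} b.Id
        FROM Device a
        INNER JOIN Device b
                ON a.PushId = b.PushId
               AND a.created > b.created
        ORDER BY b.Id
        """
    )]
    conn.close()
    return ids

ROWS = [
    ("abc", "aaa", "2013-10-10"),
    ("def", "aaa", "2013-10-12"),
    ("ghi", "aaa", "2013-10-14"),  # hypothetical extra row
    ("efg", "abb", "2012-09-09"),
]
print(older_ids(ROWS, distinct=False))  # abc is matched by both def and ghi
print(older_ids(ROWS))
```

Note that with three rows per PushId the join returns every row except the newest, not only the single oldest one.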

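Answer 4: (score: 0)

If you have to delete a lot of rows (this depends on your data), it is better to create a new table containing the data you want to keep and then drop the old table. DELETE is the second most expensive operation, after UPDATE.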
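A sketch of that copy-and-swap approach, again using in-memory SQLite from Python as a stand-in for MySQL. The table name `Device_new`, the rule "keep the newest row per PushId" and the ISO dates are assumptions for the example; in MySQL the final step would use `RENAME TABLE` rather than SQLite's `ALTER TABLE ... RENAME TO`.

```python
import sqlite3

def rebuild_keeping_newest(conn):
    """Replace Device with a copy that holds only the newest row per PushId."""
    conn.executescript(
        """
        CREATE TABLE Device_new AS
            SELECT d.*
            FROM Device d
            WHERE NOT EXISTS (SELECT 1
                              FROM Device n
                              WHERE n.PushId = d.PushId
                                AND n.created > d.created);
        DROP TABLE Device;
        ALTER TABLE Device_new RENAME TO Device;
        """
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Device (Id TEXT, PushId TEXT, created TEXT)")
conn.executemany(
    "INSERT INTO Device VALUES (?, ?, ?)",
    [("abc", "aaa", "2013-10-10"),
     ("def", "aaa", "2013-10-12"),
     ("efg", "abb", "2012-09-09")],
)
conn.commit()

rebuild_keeping_newest(conn)
remaining = conn.execute(
    "SELECT Id, PushId, created FROM Device ORDER BY Id"
).fetchall()
print(remaining)
```

A table created from a SELECT does not carry over the original table's indexes or key constraints, so those would have to be recreated on the new table.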

Answer 5: (score: 0)

OK, given that this is MySQL:

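 delete from Device where (pushId, created) in
 (
   select pushId, created from
   (
     select 
        pushId, 
        min(created) as created
     from 
        Device 
     group by pushId 
     having count(*) >1
   ) as oldest  -- derived table: MySQL cannot read Device directly in a subquery of this delete
 )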
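For a quick check of what this delete removes, the sketch below runs the row-value `IN` form from Python against in-memory SQLite (3.15 or later for row values), which stands in for MySQL and does not have MySQL's restriction on reading the target table in a subquery. The helper name `delete_oldest_of_duplicates` and the ISO dates are assumptions.

```python
import sqlite3

def delete_oldest_of_duplicates(conn):
    """Delete the oldest row of every PushId that occurs more than once.

    Returns the number of deleted rows.
    """
    cur = conn.execute(
        """
        DELETE FROM Device
        WHERE (PushId, created) IN (SELECT PushId, MIN(created)
                                    FROM Device
                                    GROUP BY PushId
                                    HAVING COUNT(*) > 1)
        """
    )
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Device (Id TEXT, PushId TEXT, created TEXT)")
conn.executemany(
    "INSERT INTO Device VALUES (?, ?, ?)",
    [("abc", "aaa", "2013-10-10"),
     ("def", "aaa", "2013-10-12"),
     ("efg", "abb", "2012-09-09")],
)

deleted = delete_oldest_of_duplicates(conn)
remaining_ids = [r[0] for r in conn.execute("SELECT Id FROM Device ORDER BY Id")]
print(deleted, remaining_ids)
```

Only `abc` is removed: `efg` is the oldest row for `abb`, but that PushId is not duplicated, so the `HAVING` clause excludes it.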

Answer 6: (score: 0)

This is the fastest (on most database systems); no expensive "group by" or "order by" is needed:

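 -- Note: MySQL can reject a delete whose subquery reads the same table (error 1093);
 -- wrapping the inner select in a derived table is the usual workaround.
 delete from Device where (pushId, created) in
  (
   select 
     pushId, 
     created
   from 
     Device a1
   where 
     EXISTS (select 1
               from Device a2
                where a1.pushId=a2.pushId
                and a2.created > a1.created
             )
    )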
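The sketch below tries the EXISTS idea from Python on in-memory SQLite, a stand-in for MySQL. It applies the EXISTS test directly in the DELETE instead of going through the outer `IN`, which is a simplification of the answer and not its exact statement. The helper name `delete_rows_with_newer_sibling`, the ISO dates and the third `aaa` row (`ghi`) are assumptions.

```python
import sqlite3

def delete_rows_with_newer_sibling(conn):
    """Delete every row for which a newer row with the same PushId exists.

    Returns the number of deleted rows.
    """
    cur = conn.execute(
        """
        DELETE FROM Device
        WHERE EXISTS (SELECT 1
                      FROM Device a2
                      WHERE a2.PushId = Device.PushId
                        AND a2.created > Device.created)
        """
    )
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Device (Id TEXT, PushId TEXT, created TEXT)")
conn.executemany(
    "INSERT INTO Device VALUES (?, ?, ?)",
    [("abc", "aaa", "2013-10-10"),
     ("def", "aaa", "2013-10-12"),
     ("ghi", "aaa", "2013-10-14"),  # hypothetical extra row
     ("efg", "abb", "2012-09-09")],
)

deleted = delete_rows_with_newer_sibling(conn)
remaining_ids = [r[0] for r in conn.execute("SELECT Id FROM Device ORDER BY Id")]
print(deleted, remaining_ids)
```

With three rows for one PushId this removes both older rows and keeps only the newest, whereas a `MIN(created)` approach would remove just the single oldest row per pass.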