Removing duplicate rows from a table in Oracle

Time: 2009-02-09 17:34:44

Tags: sql oracle duplicates delete-row

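I'm testing something in Oracle and populated a table with some sample data, but in the process I accidentally loaded duplicate records, so now I can't create a primary key using some of the columns.

How can I delete all the duplicate rows and leave only one of them?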

23 answers:

Answer 0 (score: 259)

Use the rowid pseudocolumn.

DELETE FROM your_table
WHERE rowid not in
(SELECT MIN(rowid)
FROM your_table
GROUP BY column1, column2, column3);

Here column1, column2, and column3 make up the identifying key for each record. You may list all of your columns.
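Before running the delete, it may help to check which key combinations are actually duplicated. The following is only a sketch, reusing the placeholder names your_table and column1 to column3 from the statement above; the constraint name your_table_pk is made up for illustration.

```sql
-- List each duplicated key and how many copies of it exist.
SELECT column1, column2, column3, COUNT(*) AS copies
FROM your_table
GROUP BY column1, column2, column3
HAVING COUNT(*) > 1;

-- Once the query above returns no rows, the primary key can be created.
-- your_table_pk is a hypothetical constraint name.
ALTER TABLE your_table
  ADD CONSTRAINT your_table_pk PRIMARY KEY (column1, column2, column3);
```

Running the SELECT again after the delete is a cheap way to confirm the cleanup worked before attempting the ALTER TABLE.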

Answer 1 (score: 15)

From Ask Tom

delete from t
 where rowid IN ( select rid
                    from (select rowid rid, 
                                 row_number() over (partition by 
                         companyid, agentid, class , status, terminationdate
                                   order by rowid) rn
                            from t)
                   where rn <> 1);

(fixed the missing parenthesis)

Answer 2 (score: 11)

From DevX.com

DELETE FROM our_table
WHERE rowid not in
(SELECT MIN(rowid)
FROM our_table
GROUP BY column1, column2, column3...) ;

where column1, column2, etc. are the key you want to use.

Answer 3 (score: 11)

DELETE FROM tablename a
      WHERE a.ROWID > ANY (SELECT b.ROWID
                             FROM tablename b
                            WHERE a.fieldname = b.fieldname
                              AND a.fieldname2 = b.fieldname2)

Answer 4 (score: 7)

Solution 1)

delete from emp
where rowid not in
(select max(rowid) from emp group by empno);

Solution 2)

delete from emp where rowid in
               (
                 select rid from
                  (
                    select rowid rid,
                      row_number() over(partition by empno order by empno) rn
                      from emp
                  )
                where rn > 1
               );

Solution 3)

delete from emp e1
         where rowid not in
          (select max(rowid) from emp e2
           where e1.empno = e2.empno ); 

Answer 5 (score: 6)

create table t2 as select distinct * from t1;
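That one-liner only builds the clean copy; the original table still has to be swapped out. A rough sketch of the full sequence, assuming the table names t1 and t2 from above and a made-up backup name t1_old:

```sql
-- Build a de-duplicated copy; rows are compared on every column.
CREATE TABLE t2 AS SELECT DISTINCT * FROM t1;

-- Swap the copy in. Indexes, grants, triggers and most constraints on t1
-- are not carried over by CREATE TABLE ... AS SELECT and must be recreated.
ALTER TABLE t1 RENAME TO t1_old;   -- t1_old is a hypothetical name
ALTER TABLE t2 RENAME TO t1;
```

This only removes rows that are identical in every column, so it will not help if the duplicates differ in a non-key column.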

Answer 6 (score: 3)

To select the duplicates, the query has to take this form:

SELECT GroupFunction(column1), GroupFunction(column2),..., 
COUNT(column1), column1, column2...
FROM our_table
GROUP BY column1, column2, column3...
HAVING COUNT(column1) > 1

So the correct query, in line with the other suggestions, is:

DELETE FROM tablename a
      WHERE a.ROWID > ANY (SELECT b.ROWID
                             FROM tablename b
                            WHERE a.fieldname = b.fieldname
                              AND a.fieldname2 = b.fieldname2
                              AND ....so on.. to identify the duplicate rows....)

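For each set of rows matching the criteria chosen in the WHERE clause, this query keeps the row with the lowest ROWID, which is usually (though not necessarily) the oldest record in the table.

Oracle Certified Associate (2008)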
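One thing to watch with the equality-based version: a comparison such as a.fieldname2 = b.fieldname2 is never true when both sides are NULL, so rows that are duplicates only because a key column is NULL in both will survive. A possible variant, assuming fieldname2 is the nullable column (that assumption is mine, not the answer's):

```sql
-- Treat two NULLs in fieldname2 as equal when matching duplicates.
DELETE FROM tablename a
 WHERE a.ROWID > ANY (SELECT b.ROWID
                        FROM tablename b
                       WHERE a.fieldname = b.fieldname
                         AND (a.fieldname2 = b.fieldname2
                              OR (a.fieldname2 IS NULL
                                  AND b.fieldname2 IS NULL)));
```

The GROUP BY based deletes in the other answers do not need this, because GROUP BY already places NULLs in the same group.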

Answer 7 (score: 3)

You should write a small PL/SQL block using a cursor FOR loop and delete the rows you don't want to keep. For example:

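declare
  prev_var my_table.var1%TYPE;  -- starts as NULL, so the first row is never deleted
begin
  -- walk the rows in key order so that duplicates are adjacent
  for t in (select rowid rid, var1 from my_table order by var1) loop
    if t.var1 = prev_var then
      -- same value as the previous row: delete this duplicate
      delete from my_table where rowid = t.rid;
    else
      -- new value: remember it and keep the row
      prev_var := t.var1;
    end if;
  end loop;
end;
/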

Answer 8 (score: 2)

Using rowid -

delete from emp
 where rowid not in
 (select max(rowid) from emp group by empno);

Using a self-join -

delete from emp e1
 where rowid not in
 (select max(rowid) from emp e2
 where e1.empno = e2.empno );

Answer 9 (score: 2)

Solution 4)

 delete from emp where rowid in
            (
             select rid from
                (
                  select rowid rid,
                  dense_rank() over(partition by empno order by rowid
                ) rn
             from emp
            )
 where rn > 1
);

Answer 10 (score: 2)

1. Solution

delete from emp
    where rowid not in
    (select max(rowid) from emp group by empno);

2. Solution

delete from emp where rowid in
               (
                 select rid from
                  (
                    select rowid rid,
                      row_number() over(partition by empno order by empno) rn
                      from emp
                  )
                where rn > 1
               );

3. Solution

delete from emp e1
         where rowid not in
          (select max(rowid) from emp e2
           where e1.empno = e2.empno ); 

4. Solution

 delete from emp where rowid in
            (
             select rid from
                (
                  select rowid rid,
                  dense_rank() over(partition by empno order by rowid
                ) rn
             from emp
            )
 where rn > 1
);

Answer 11 (score: 2)

5. Solution

delete from emp where rowid in 
    (
      select  rid from
       (
         select rowid rid,rank() over (partition by emp_id order by rowid)rn from emp     
       )
     where rn > 1
    );

Answer 12 (score: 2)

DELETE from table_name where rowid not in (select min(rowid) FROM table_name group by column_name);

You can also delete the duplicate records another way:

DELETE from table_name a where rowid > (select min(rowid) FROM table_name b where a.column=b.column);

Answer 13 (score: 1)

DELETE FROM tableName WHERE ROWID NOT IN (SELECT MIN(ROWID) FROM tableName GROUP BY columnname);

Answer 14 (score: 1)

delete from dept
where rowid in (
     select rowid
     from dept
     minus
     select max(rowid)
     from dept
     group by DEPTNO, DNAME, LOC
);

Answer 15 (score: 1)

The fastest way for really big tables

  1. Create an exceptions table with the following structure: exceptions_table

    ROW_ID ROWID
    OWNER VARCHAR2(30)
    TABLE_NAME VARCHAR2(30)
    CONSTRAINT VARCHAR2(30)
    
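  2. Try to create a unique constraint or primary key that the duplicates will violate. You will get an error message because you have duplicates. The exceptions table will then contain the rowids of the duplicate rows.

    alter table original_dups add constraint constraint_name  -- constraint_name is a placeholder
    unique --or primary key
    (dupfield1,dupfield2) exceptions into exceptions_table;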
    
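  3. Join your table with exceptions_table by rowid and delete the duplicates. Note that exceptions_table lists every row in each duplicate group, not only the extra copies, so this delete removes all copies of a duplicated row.

    delete original_dups where rowid in (select ROW_ID from exceptions_table);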
    
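  4. If the number of rows to delete is large, create a new table (with all grants and indexes) by anti-joining with exceptions_table on rowid, then rename the original table to original_dups and rename new_table_with_no_dups to the original table's name.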

    create table new_table_with_no_dups AS (
        select field1, field2 ........ 
        from original_dups t1
        where not exists ( select null from exceptions_table T2 where t1.rowid = t2.row_id )
    )
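For step 1, the exceptions table could be created along these lines. This is just a sketch that turns the column list given above into DDL; the name exceptions_table is the one the answer uses.

```sql
-- Exceptions table with the column layout listed in step 1.
CREATE TABLE exceptions_table (
    row_id     ROWID,
    owner      VARCHAR2(30),
    table_name VARCHAR2(30),
    constraint VARCHAR2(30)
);
```

If the constraint attempt is repeated, it is probably worth emptying this table first so that rowids from an earlier run do not get mixed in.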
    

Answer 16 (score: 1)

create table abcd(id number(10),name varchar2(20))

insert into abcd values(1,'abc')

insert into abcd values(2,'pqr')


insert into abcd values(3,'xyz')

insert into abcd values(1,'abc')

insert into abcd values(2,'pqr')

insert into abcd values(3,'xyz')


select * from abcd
id  Name
1   abc
2   pqr
3   xyz
1   abc
2   pqr
3   xyz

Delete the duplicate records but keep one distinct record per id in the table:

DELETE 
FROM abcd a
WHERE ROWID > (SELECT MIN(ROWID) FROM abcd b
WHERE b.id=a.id
);

Running the above query deletes 3 rows.

select * from abcd

id  Name 
1   abc
2   pqr
3   xyz

Answer 17 (score: 1)

Check the following script -

1.

Create table test(id int,sal int); 

2.

    insert into test values(1,100);    
    insert into test values(1,100);    
    insert into test values(2,200);    
    insert into test values(2,200);    
    insert into test values(3,300);    
    insert into test values(3,300);    
    commit;

3.

 select * from test;    

You will see 6 records here.

4. Run the query below -

delete from 
   test
where rowid in
 (select rid from 
   (select 
     rowid rid,
     row_number()
    over 
     (partition by id order by sal) dup
    from test)
  where dup > 1);

5. select * from test;

6. You will see that the duplicate records have been deleted. Hope this resolves your query. Thanks :)

Answer 18 (score: 1)

I don't see any answers that use common table expressions and window functions. This is what I find easiest to work with.

DELETE FROM
 YourTable
WHERE
 ROWID IN
    (WITH Duplicates
          AS (SELECT
               ROWID RID, 
               ROW_NUMBER() 
               OVER(
               PARTITION BY First_Name, Last_Name, Birth_Date
               ORDER BY ROWID)
                  AS RN,
               SUM(1)
               OVER(
               PARTITION BY First_Name, Last_Name, Birth_Date
               ORDER BY ROWID ROWS BETWEEN UNBOUNDED PRECEDING 
                                       AND UNBOUNDED FOLLOWING)
                   AS CNT
              FROM
               YourTable
              WHERE
               Load_Date IS NULL)
     SELECT
      RID
     FROM
      duplicates
     WHERE
      RN > 1);

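Things to note:

1) We are only checking for duplication on the fields in the partition clause.

2) If you have some reason to pick one duplicate over the others, you can use an order by clause to make that row have row_number() = 1.

3) You can change the number of duplicates preserved by changing the final where clause to "WHERE RN > N" with N >= 1 (I was thinking N = 0 would delete all rows that have duplicates, but it would just delete all rows).

4) Added the SUM partition field to the CTE query, which tags each row with the number of rows in its group. So to select the rows that have duplicates, including the first item, use "WHERE cnt > 1".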
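To illustrate note 4, here is one way to look at every row that belongs to a duplicate group before deleting anything. It is a sketch under the same assumed table and column names as the delete above, and it swaps SUM(1) for the equivalent COUNT(*) window function.

```sql
-- Show every row that belongs to a duplicate group, including the one
-- that the delete would keep. cnt is the number of rows in the group.
SELECT *
  FROM (SELECT t.*,
               COUNT(*) OVER (PARTITION BY First_Name, Last_Name, Birth_Date) AS cnt
          FROM YourTable t
         WHERE Load_Date IS NULL)
 WHERE cnt > 1
 ORDER BY First_Name, Last_Name, Birth_Date;
```

Sorting by the partition columns puts the copies next to each other, which makes it easier to decide which one deserves to be kept.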

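Answer 19 (score: 0)

This blog post was really helpful for general cases:

"If the rows are fully duplicated (all values in all columns can have copies) there are no columns to use! But to keep one you still need a unique identifier for each row in each group. Fortunately, Oracle already has something you can use. The rowid. All rows in Oracle have a rowid. This is a physical locator. That is, it states where on disk Oracle stores the row. This is unique to each row. So you can use this value to identify and remove copies. To do this, replace min() with min(rowid) in the uncorrelated delete:"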

delete films
where  rowid not in (
  select min(rowid)
  from   films
  group  by title, uk_release_date
)

Answer 20 (score: 0)

Solution:

delete from emp where rowid in
(
    select rid from
    (
        select rowid rid,
        row_number() over(partition by empno order by empno) rn
        from emp
    )
    where rn > 1
);

Answer 21 (score: 0)

For best performance, here is what I wrote:
(see the execution plan)

DELETE FROM your_table
WHERE rowid IN 
  (select t1.rowid from your_table  t1
      LEFT OUTER JOIN (
      -- rowid is a reserved word, so the kept rowid needs its own alias
      SELECT MIN(rowid) as min_rid, column1,column2, column3
      FROM your_table 
      GROUP BY column1, column2, column3
  )  co1 ON (t1.rowid = co1.min_rid)
  WHERE co1.min_rid IS NULL
);

Answer 22 (score: 0)

create or replace procedure delete_duplicate_enq as
    cursor c1 is
    select *
    from enquiry;
begin
    for z in c1 loop
        delete enquiry
        where enquiry.enquiryno = z.enquiryno
        and rowid > any
        (select rowid
        from enquiry
        where enquiry.enquiryno = z.enquiryno);
    end loop;
 end delete_duplicate_enq;