当前位置:Gxlcms > mysql > Oracle删除重复记录的几种方式

Oracle删除重复记录的几种方式

时间:2021-07-01 10:21:17 帮助过:32人阅读

如果把一个文件多次导入数据库,可能会引入重复记录,那么有哪些方法可以删除重复记录呢?

Oracle删除重复记录的几种方式

如果把一个文件多次导入数据库,可能会引入重复记录,那么有哪些方法可以删除重复记录呢?

REATE TABLE tbl_test(
SER_NO NUMBER,
FST_NM VARCHAR2(30),
DEPTID NUMBER,
CMNT VARCHAR2(30));

INSERT INTO tbl_test VALUES(1, 'aaaaa', 2004, 'xxx');
INSERT INTO tbl_test VALUES(2, 'bbbbb', 2005, 'yyy');
INSERT INTO tbl_test VALUES(1, 'aaaaa', 2004, 'xxx');
INSERT INTO tbl_test VALUES(1, 'aaaaa', 2004, 'xxx');
INSERT INTO tbl_test VALUES(3, 'ccccc', 2005, 'zzz');
INSERT INTO tbl_test VALUES(2, 'bbbbb', 2005, 'yyy');

1.Using MIN(rowid) 最常用的方法,但是数据量大的话执行会很长时间

DELETE FROM tbl_test
WHERE ROWID NOT IN (SELECT MIN (ROWID)
FROM tbl_test
GROUP BY ser_no, fst_nm, deptid, cmnt);

2.Using MIN(rowid) & Join 跟第一条差不多

DELETE FROM tbl_test t
WHERE t.ROWID NOT IN (SELECT MIN (b.ROWID)
FROM tbl_test b
WHERE b.ser_no = t.ser_no
AND b.fst_nm = t.fst_nm
AND b.deptid = t.deptid
AND b.cmnt = t.cmnt);

3.Using Subquery

DELETE FROM tbl_test
WHERE ser_no IN (SELECT ser_no FROM tbl_test GROUP BY ser_no, fst_nm, deptid, cmnt HAVING COUNT (*) > 1)
AND fst_nm IN (SELECT fst_nm FROM tbl_test GROUP BY ser_no, fst_nm, deptid, cmnt HAVING COUNT (*) > 1)
AND deptid IN (SELECT deptid FROM tbl_test GROUP BY ser_no, fst_nm, deptid, cmnt HAVING COUNT (*) > 1)
AND cmnt IN (SELECT cmnt FROM tbl_test GROUP BY ser_no, fst_nm, deptid, cmnt HAVING COUNT (*) > 1)
AND ROWID NOT IN (SELECT MIN (ROWID)
FROM tbl_test
GROUP BY ser_no, fst_nm, deptid, cmnt
HAVING COUNT (*) > 1)

4. Using Nested Subqueries

DELETE FROM tbl_test a WHERE (a.ser_no, a.fst_nm, a.deptid, a.cmnt) IN (SELECT b.ser_no, b.fst_nm, b.deptid, b.cmnt
FROM tbl_test b WHERE a.ser_no = b.ser_no AND a.fst_nm = b.fst_nm AND a.deptid = b.deptid AND a.cmnt = b.cmnt AND
a.ROWID > b.ROWID);

5. Using Analytic Fucntions: 对于大表这是最有效的方法

DELETE FROM tbl_test WHERE ROWID IN (SELECT rid FROM (SELECT ROWID rid,
ROW_NUMBER () OVER (PARTITION BY ser_no, fst_nm, deptid, cmnt ORDER BY ROWID) rn FROM tbl_test)WHERE rn <> 1);

6. CREATE-DROP-RENAME 对资源使用比较合理,,特别对于大表。但是如果需要回滚则会产生大量undo日志信息。

CREATE TABLE tbl_test1 NOLOGGING AS SELECT tbl_test .*
FROM tbl_test tbl_test WHERE ROWID IN (SELECT rid
FROM (SELECT ROWID rid, ROW_NUMBER() OVER (PARTITION BY ser_no, fst_nm, deptid, cmnt ORDER BY ROWID) rn
FROM tbl_test) WHERE rn=1);

DROP TABLE tbl_test; --drop the original table with lots of duplicate

RENAME tbl_test1 TO tbl_test; -- your original table without duplicates.

linux

人气教程排行