当前位置:Gxlcms > 数据库问题 > 再战mysql 数据去重

再战mysql 数据去重

时间:2021-07-01 10:21:17 帮助过:3人阅读

年初时,写过一篇去重的,在小表中还能用用,在大表中真的是效率低下,现在给了一次优化
https://www.cnblogs.com/jarjune/p/8328013.html

继上一篇文章

方法三:

DELIMITER //

DROP PROCEDURE IF EXISTS delete_rows_2;

CREATE PROCEDURE delete_rows_2(IN TABLENAME VARCHAR(50), IN FIELDNAMES VARCHAR(100), IN AUTOFIELD VARCHAR(50))
BEGIN

DECLARE DELETE_TABLE_ROWS_SQL VARCHAR(1000);

SET DELETE_TABLE_ROWS_SQL = CONCAT('
        DELETE 
        FROM 
            ', TABLENAME ,' 
        WHERE 
            (', FIELDNAMES ,') IN ( 
                SELECT ', FIELDNAMES ,' 
                FROM (
                    SELECT 
                        ', FIELDNAMES ,' 
                    FROM 
                        ', TABLENAME ,' 
                    GROUP BY 
                        ', FIELDNAMES ,' 
                    HAVING 
                        COUNT(1) > 1 
                ) t1
            ) 
        AND ', AUTOFIELD ,' NOT IN ( 
            SELECT ', AUTOFIELD ,' 
            FROM (
                SELECT 
                    MAX(', AUTOFIELD ,') ', AUTOFIELD ,' 
                FROM 
                    ', TABLENAME ,'
                GROUP BY 
                    ', FIELDNAMES ,' 
                HAVING 
                    COUNT(1) > 1 
                ) t2
            )
');

SET @DELETE_TABLE_ROWS_SQL = DELETE_TABLE_ROWS_SQL;

PREPARE DELETE_TABLE_ROWS_SQL_PRE FROM @DELETE_TABLE_ROWS_SQL;
EXECUTE DELETE_TABLE_ROWS_SQL_PRE;

END//

DELIMITER ;

CALL delete_rows_1('表名', '字段1,字段2,字段3...', '主键(唯一)字段');

之后发现删除的效率还是挺低,又优化成

方法三(优化):

DELIMITER //

DROP PROCEDURE IF EXISTS delete_rows_2;

CREATE PROCEDURE delete_rows_2(IN TABLENAME VARCHAR(50), IN FIELDNAMES VARCHAR(100), IN AUTOFIELD VARCHAR(50))
BEGIN

DECLARE DELETE_TABLE_ROWS_SQL VARCHAR(1000);

SET DELETE_TABLE_ROWS_SQL = CONCAT('
        DELETE 
        FROM 
            ', TABLENAME ,' 
        WHERE 
            ', AUTOFIELD ,' IN ( 
                SELECT 
                    ', AUTOFIELD ,' 
                FROM
                    (
                    SELECT 
                        ', AUTOFIELD ,' 
                    FROM 
                        ', TABLENAME ,' 
                    WHERE 
                        (', FIELDNAMES ,') IN ( 
                            SELECT 
                                ', FIELDNAMES ,' 
                            FROM 
                                ', TABLENAME ,' 
                            GROUP BY 
                                ', FIELDNAMES ,' 
                            HAVING 
                                COUNT(1) > 1 
                        ) 
                    AND ', AUTOFIELD ,' NOT IN ( 
                        SELECT 
                            MAX(', AUTOFIELD ,') 
                        FROM 
                            ', TABLENAME ,'
                        GROUP BY 
                            ', FIELDNAMES ,' 
                        HAVING 
                            COUNT(1) > 1 
                    ) 
                ) t2 
            ) 
    ');

SET @DELETE_TABLE_ROWS_SQL = DELETE_TABLE_ROWS_SQL;

PREPARE DELETE_TABLE_ROWS_SQL_PRE FROM @DELETE_TABLE_ROWS_SQL;
EXECUTE DELETE_TABLE_ROWS_SQL_PRE;

END//

DELIMITER ;

CALL delete_rows_2('表名', '字段1,字段2,字段3...', '主键字段');

由于上述都要group by 两次,又换了一种思路

方法四

delete t1
FROM
    l_weijij_47 t1,
    (
        SELECT
            f01,
            f02,
            f03,
            MAX(seq_value) seq_value
        FROM
            l_weijij_47
        GROUP BY
            f01,
            f02,
            f03
        HAVING
            COUNT(1) > 1
        ORDER BY NULL
    ) t2
where
    t1.f01 = t2.f01
AND t1.f02 = t2.f02
AND t1.f03 = t2.f03
and t1.seq_value < t2.seq_value

注:group by默认会进行排序,所以要加上order by NULL就避免了排序

综上,方法四是目前在用的去重。

再战mysql 数据去重

标签:RoCE   sel   方法   mit   iter   lock   let   pre   com   

人气教程排行