时间:2021-07-01 10:21:17 帮助过:11人阅读
1. 查询出4份数据的created_time边界值
查询结果如图一所示。
- select date_add(‘2017-01-01‘,interval 125000 second) dt1,
- date_add(‘2017-01-01‘,interval 2*125000 second) dt2,
- date_add(‘2017-01-01‘,interval 3*125000 second) dt3,
- max(created_time) dt4
- from t_source;
图一
2. 查看每份数据的记录数,确认数据平均分布
- select case when created_time >= ‘2017-01-01‘
- and created_time < ‘2017-01-02 10:43:20‘
- then ‘2017-01-01‘
- when created_time >= ‘2017-01-02 10:43:20‘
- and created_time < ‘2017-01-03 21:26:40‘
- then ‘2017-01-02 10:43:20‘
- when created_time >= ‘2017-01-03 21:26:40‘
- and created_time < ‘2017-01-05 08:10:00‘
- then ‘2017-01-03 21:26:40‘
- else ‘2017-01-05 08:10:00‘
- end min_dt,
- case when created_time >= ‘2017-01-01‘
- and created_time < ‘2017-01-02 10:43:20‘
- then ‘2017-01-02 10:43:20‘
- when created_time >= ‘2017-01-02 10:43:20‘
- and created_time < ‘2017-01-03 21:26:40‘
- then ‘2017-01-03 21:26:40‘
- when created_time >= ‘2017-01-03 21:26:40‘
- and created_time < ‘2017-01-05 08:10:00‘
- then ‘2017-01-05 08:10:00‘
- else ‘2017-01-06 18:53:20‘
- end max_dt,
- count(*)
- from t_source
- group by case when created_time >= ‘2017-01-01‘
- and created_time < ‘2017-01-02 10:43:20‘
- then ‘2017-01-01‘
- when created_time >= ‘2017-01-02 10:43:20‘
- and created_time < ‘2017-01-03 21:26:40‘
- then ‘2017-01-02 10:43:20‘
- when created_time >= ‘2017-01-03 21:26:40‘
- and created_time < ‘2017-01-05 08:10:00‘
- then ‘2017-01-03 21:26:40‘
- else ‘2017-01-05 08:10:00‘
- end,
- case when created_time >= ‘2017-01-01‘
- and created_time < ‘2017-01-02 10:43:20‘
- then ‘2017-01-02 10:43:20‘
- when created_time >= ‘2017-01-02 10:43:20‘
- and created_time < ‘2017-01-03 21:26:40‘
- then ‘2017-01-03 21:26:40‘
- when created_time >= ‘2017-01-03 21:26:40‘
- and created_time < ‘2017-01-05 08:10:00‘
- then ‘2017-01-05 08:10:00‘
- else ‘2017-01-06 18:53:20‘
- end;
查询结果如图二所示。
图二
- delimiter //
- create procedure sp_unique(i smallint)
- begin
- set @a:=‘0000-00-00 00:00:00‘;
- set @b:=‘ ‘;
- if (i<4) then
- insert into t_target
- select * from t_source force index (idx_sort)
- where created_time >= date_add(‘2017-01-01‘,interval (i-1)*125000 second)
- and created_time < date_add(‘2017-01-01‘,interval i*125000 second)
- and (@a!=created_time or @b!=item_name)
- and (@a:=created_time) is not null
- and (@b:=item_name) is not null
- order by created_time,item_name;
- commit;
- else
- insert into t_target
- select * from t_source force index (idx_sort)
- where created_time >= date_add(‘2017-01-01‘,interval (i-1)*125000 second)
- and created_time <= date_add(‘2017-01-01‘,interval i*125000 second)
- and (@a!=created_time or @b!=item_name)
- and (@a:=created_time) is not null
- and (@b:=item_name) is not null
- order by created_time,item_name;
- commit;
- end if;
- end
- //
- delimiter ;
查询的执行计划都如图三所示。
图三
1. shell后台进程
(1)建立duplicate_removal.sh文件,内容如下。
- #!/bin/bash
- mysql -vvv -u root -p123456 test -e "truncate t_target" &>/dev/null
- date ‘+%H:%M.%N‘
- for y in {1..4}
- do
- sql="call sp_unique($y)"
- mysql -vvv -u root -p123456 test -e "$sql" &>par_sql1_$y.log &
- done
- wait
- date ‘+%H:%M.%N‘
(2)执行脚本文件
- chmod 755 duplicate_removal.sh
- ./duplicate_removal.sh
执行输出入图四所示。
图四
这种方法用时3.4秒,并行执行的4个过程调用分别用时如图五所示。
图五
可以看到,每个过程的执行时间均不到3.4秒,因为是并行执行,总的过程执行时间也小于3.4秒,比单线程sql速度提高了近3倍。(1)建立事件历史日志表
- -- 用于查看事件执行时间等信息
- create table t_event_history (
- dbname varchar(128) not null default ‘‘,
- eventname varchar(128) not null default ‘‘,
- starttime datetime(3) not null default ‘0000-00-00 00:00:00‘,
- endtime datetime(3) default null,
- issuccess int(11) default null,
- duration int(11) default null,
- errormessage varchar(512) default null,
- randno int(11) default null
- );
(2)修改event_scheduler参数
- set global event_scheduler = 1;
(3)为每个并发线程创建一个事件
- delimiter //
- create event ev1 on schedule at current_timestamp + interval 1 hour on completion preserve disable do
- begin
- declare r_code char(5) default ‘00000‘;
- declare r_msg text;
- declare v_error integer;
- declare v_starttime datetime default now(3);
- declare v_randno integer default floor(rand()*100001);
- insert into t_event_history (dbname,eventname,starttime,randno)
- #作业名
- values(database(),‘ev1‘, v_starttime,v_randno);
- begin
- #异常处理段
- declare continue handler for sqlexception
- begin
- set v_error = 1;
- get diagnostics condition 1 r_code = returned_sqlstate , r_msg = message_text;
- end;
- #此处为实际调用的用户程序过程
- call sp_unique(1);
- end;
- update t_event_history set endtime=now(3),issuccess=isnull(v_error),duration=timestampdiff(microsecond,starttime,now(3)), errormessage=concat(‘error=‘,r_code,‘, message=‘,r_msg),randno=null where starttime=v_starttime and randno=v_randno;
- end
- //
- create event ev2 on schedule at current_timestamp + interval 1 hour on completion preserve disable do
- begin
- declare r_code char(5) default ‘00000‘;
- declare r_msg text;
- declare v_error integer;
- declare v_starttime datetime default now(3);
- declare v_randno integer default floor(rand()*100001);
- insert into t_event_history (dbname,eventname,starttime,randno)
- #作业名
- values(database(),‘ev2‘, v_starttime,v_randno);
- begin
- #异常处理段
- declare continue handler for sqlexception
- begin
- set v_error = 1;
- get diagnostics condition 1 r_code = returned_sqlstate , r_msg = message_text;
- end;
- #此处为实际调用的用户程序过程
- call sp_unique(2);
- end;
- update t_event_history set endtime=now(3),issuccess=isnull(v_error),duration=timestampdiff(microsecond,starttime,now(3)), errormessage=concat(‘error=‘,r_code,‘, message=‘,r_msg),randno=null where starttime=v_starttime and randno=v_randno;
- end
- //
- create event ev3 on schedule at current_timestamp + interval 1 hour on completion preserve disable do
- begin
- declare r_code char(5) default ‘00000‘;
- declare r_msg text;
- declare v_error integer;
- declare v_starttime datetime default now(3);
- declare v_randno integer default floor(rand()*100001);
- insert into t_event_history (dbname,eventname,starttime,randno)
- #作业名
- values(database(),‘ev3‘, v_starttime,v_randno);
- begin
- #异常处理段
- declare continue handler for sqlexception
- begin
- set v_error = 1;
- get diagnostics condition 1 r_code = returned_sqlstate , r_msg = message_text;
- end;
- #此处为实际调用的用户程序过程
- call sp_unique(3);
- end;
- update t_event_history set endtime=now(3),issuccess=isnull(v_error),duration=timestampdiff(microsecond,starttime,now(3)), errormessage=concat(‘error=‘,r_code,‘, message=‘,r_msg),randno=null where starttime=v_starttime and randno=v_randno;
- end
- //
- create event ev4 on schedule at current_timestamp + interval 1 hour on completion preserve disable do
- begin
- declare r_code char(5) default ‘00000‘;
- declare r_msg text;
- declare v_error integer;
- declare v_starttime datetime default now(3);
- declare v_randno integer default floor(rand()*100001);
- insert into t_event_history (dbname,eventname,starttime,randno)
- #作业名
- values(database(),‘ev4‘, v_starttime,v_randno);
- begin
- #异常处理段
- declare continue handler for sqlexception
- begin
- set v_error = 1;
- get diagnostics condition 1 r_code = returned_sqlstate , r_msg = message_text;
- end;
- #此处为实际调用的用户程序过程
- call sp_unique(4);
- end;
- update t_event_history set endtime=now(3),issuccess=isnull(v_error),duration=timestampdiff(microsecond,starttime,now(3)), errormessage=concat(‘error=‘,r_code,‘, message=‘,r_msg),randno=null where starttime=v_starttime and randno=v_randno;
- end
- //
- delimiter ;
说明:为了记录每个事件执行的时间,在事件定义中增加了操作日志表的逻辑,因为每个事件中只多执行了一条insert,一条update,4个事件总共多执行8条很简单的语句,对测试的影响可以忽略不计。执行时间精确到毫秒。
(4)触发事件执行
- mysql -vvv -u root -p123456 test -e "truncate t_target;alter event ev1 on schedule at current_timestamp enable;alter event ev2 on schedule at current_timestamp enable;alter event ev3 on schedule at current_timestamp enable;alter event ev4 on schedule at current_timestamp enable;"
说明:该命令行顺序触发了4个事件,但不会等前一个执行完才执行下一个,而是立即向下执行。从图六的输出也可以清楚地看到这一点。因此四次过程调用是并行执行的。
- select * from t_event_history;
查询结果如图7所示。
图七
将MySQL去重操作优化到极致之三弹连发(二):多线程并行执行
标签:分布 执行 where 存储过程 concat bin lte 分区表 日志