时间:2021-07-01 10:21:17 帮助过:23人阅读
程序错误日志大量的报死锁错误,去数据库错误日志查看确实有很多死锁(应在数据库实例启动时执行dbcc traceon(1222,-1)开启死锁跟踪): 04/29/2016 14:07:51,spid33s,δ?,waiter id=process71da6bb88 mode=IX requestType=wait04/29/2016 14:07:51,spid33s,δ?
程序错误日志大量的报死锁错误,去数据库错误日志查看确实有很多死锁(应在数据库实例启动时执行dbcc traceon(1222,-1)开启死锁跟踪):
04/29/2016 14:07:51,spid33s,δ?,waiter id=process71da6bb88 mode=IX requestType=wait 04/29/2016 14:07:51,spid33s,δ?,waiter-list 04/29/2016 14:07:51,spid33s,δ?,owner id=process5b534c8 mode=S 04/29/2016 14:07:51,spid33s,δ?,owner-list 04/29/2016 14:07:51,spid33s,δ?,pagelock fileid=1 pageid=22056721 dbid=5 objectname=ADCDB.dbo.NA_TrafficMemberOrderRelation id=lock1263c7180 mode=S associatedObjectId=72057595181989888 04/29/2016 14:07:51,spid33s,δ?,waiter id=process5b534c8 mode=S requestType=wait 04/29/2016 14:07:51,spid33s,δ?,waiter-list 04/29/2016 14:07:51,spid33s,δ?,owner id=process71da6bb88 mode=IX 04/29/2016 14:07:51,spid33s,δ?,owner-list 04/29/2016 14:07:51,spid33s,δ?,pagelock fileid=1 pageid=22075507 dbid=5 objectname=ADCDB.dbo.NA_TrafficMemberOrderRelation id=lock115c85a80 mode=IX associatedObjectId=72057595181989888 04/29/2016 14:07:51,spid33s,δ?,resource-list 04/29/2016 14:07:51,spid33s,δ?,(@p0 int@p1 int @p2 int @p3 int @p4 nvarchar(4000) @p5 nvarchar(4000) @p6 nvarchar(4000) @p7 nvarchar(4000) @p8 nvarchar(4000) @p9 nvarchar(4000) @p10 decimal(28 5) @p11 decimal(28 5) @p12 nvarchar(4000) @p13 nvarchar(4000) @p14 datetime @p15 datetime @p16 int @p17 datetime @p18 datetime @p19 datetime @p20 int)INSERT INTO NA_TrafficMemberOrderRelation (ApplySource OrderMode OrderSource OrderState ECCode ECPrdCode UserMobile MemberOrderNum MainProductCode TrafficPackageCode BillPrice FlowValue MemberApplyCode CrmApplyCode EffectTime ExprieTime UserId CreateTime OrderTime ModifyTime DiscountRate) VALUES (@p0 @p1 @p2 @p3 @p4 @p5 @p6 @p7 @p8 @p9 @p10 @p11 @p12 @p13 @p14 @p15 @p16 @p17 @p18 @p19 @p20); select SCOPE_IDENTITY() 04/29/2016 14:07:51,spid33s,δ?,inputbuf 04/29/2016 14:07:51,spid33s,δ?,unknown 04/29/2016 14:07:51,spid33s,δ?,frame procname=unknown line=1 sqlhandle=0x000000000000000000000000000000000000000000000000 04/29/2016 14:07:51,spid33s,δ?,INSERT INTO NA_TrafficMemberOrderRelation (ApplySource OrderMode OrderSource OrderState ECCode ECPrdCode UserMobile MemberOrderNum MainProductCode TrafficPackageCode BillPrice FlowValue MemberApplyCode CrmApplyCode EffectTime ExprieTime UserId CreateTime OrderTime ModifyTime DiscountRate) VALUES (@p0 @p1 @p2 @p3 @p4 @p5 @p6 @p7 @p8 @p9 @p10 @p11 @p12 @p13 @p14 @p15 @p16 @p17 @p18 @p19 @p20); 04/29/2016 14:07:51,spid33s,δ?,frame procname=adhoc line=1 stmtstart=626 stmtend=1494 sqlhandle=0x020000003e44d0396c586db1098e79558ceb28c112e75ea8 04/29/2016 14:07:51,spid33s,δ?,executionStack 04/29/2016 14:07:51,spid33s,δ?,process id=process71da6bb88 taskpriority=0 logused=2964 waitresource=PAGE: 5:1:22056721 waittime=3741 ownerId=43666495205 transactionname=user_transaction lasttranstarted=2016-04-29T14:07:45.920 XDES=0x3023d7950 lockMode=IX schedulerid=15 kpid=21872 status=suspended spid=518 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2016-04-29T14:07:47.827 lastbatchcompleted=2016-04-29T14:07:47.823 clientapp=.Net SqlClient Data Provider hostname=ADC-INTFE hostpid=10192 loginname=daemon_adcuser isolationlevel=read committed (2) xactid=43666495205 currentdb=5 lockTimeout=4294967295 clientoption1=671088672 clientoption2=128056 04/29/2016 14:07:51,spid33s,δ?,(@p0 int @p1 nvarchar(4000) @p2 nvarchar(4000) @p3 int @p4 nvarchar(4000))select TOP (@p0) trafficmem0_.TrafficMemberOrderRelationId as TrafficM1_433_ trafficmem0_.ApplySource as ApplySou2_433_ trafficmem0_.OrderMode as OrderMode433_ trafficmem0_.OrderSource as OrderSou4_433_ trafficmem0_.OrderState as OrderState433_ trafficmem0_.ECCode as ECCode433_ trafficmem0_.ECPrdCode as ECPrdCode433_ trafficmem0_.UserMobile as UserMobile433_ trafficmem0_.MemberOrderNum as MemberOr9_433_ trafficmem0_.MainProductCode as MainPro10_433_ trafficmem0_.TrafficPackageCode as Traffic11_433_ trafficmem0_.BillPrice as BillPrice433_ trafficmem0_.FlowValue as FlowValue433_ trafficmem0_.MemberApplyCode as MemberA14_433_ trafficmem0_.CrmApplyCode as CrmAppl15_433_ trafficmem0_.EffectTime as EffectTime433_ trafficmem0_.ExprieTime as ExprieTime433_ trafficmem0_.UserId as UserId433_ trafficmem0_.CreateTime as CreateTime433_ trafficmem0_.OrderTime as OrderTime433_ trafficmem0_.ModifyTime as ModifyTime433_ trafficmem0 04/29/2016 14:07:51,spid33s,δ?,inputbuf 04/29/2016 14:07:51,spid33s,δ?,unknown 04/29/2016 14:07:51,spid33s,δ?,frame procname=unknown line=1 sqlhandle=0x000000000000000000000000000000000000000000000000 04/29/2016 14:07:51,spid33s,δ?,select TOP (@p0) trafficmem0_.TrafficMemberOrderRelationId as TrafficM1_433_ trafficmem0_.ApplySource as ApplySou2_433_ trafficmem0_.OrderMode as OrderMode433_ trafficmem0_.OrderSource as OrderSou4_433_ trafficmem0_.OrderState as OrderState433_ trafficmem0_.ECCode as ECCode433_ trafficmem0_.ECPrdCode as ECPrdCode433_ trafficmem0_.UserMobile as UserMobile433_ trafficmem0_.MemberOrderNum as MemberOr9_433_ trafficmem0_.MainProductCode as MainPro10_433_ trafficmem0_.TrafficPackageCode as Traffic11_433_ trafficmem0_.BillPrice as BillPrice433_ trafficmem0_.FlowValue as FlowValue433_ trafficmem0_.MemberApplyCode as MemberA14_433_ trafficmem0_.CrmApplyCode as CrmAppl15_433_ trafficmem0_.EffectTime as EffectTime433_ trafficmem0_.ExprieTime as ExprieTime433_ trafficmem0_.UserId as UserId433_ trafficmem0_.CreateTime as CreateTime433_ trafficmem0_.OrderTime as OrderTime433_ trafficmem0_.ModifyTime as ModifyTime433_ trafficmem0_.DiscountRate as Discoun22_433_ from NA_TrafficMemberOrderRelation traffic 04/29/2016 14:07:51,spid33s,δ?,frame procname=adhoc line=1 stmtstart=148 sqlhandle=0x0200000043e4d6202dd7a9505f76a2897ed4c6cada411021 04/29/2016 14:07:51,spid33s,δ?,executionStack 04/29/2016 14:07:51,spid33s,δ?,process id=process5b534c8 taskpriority=0 logused=0 waitresource=PAGE: 5:1:22075507 waittime=3659 ownerId=43666495407 transactionname=SELECT lasttranstarted=2016-04-29T14:07:46.007 XDES=0x58b16cc90 lockMode=S schedulerid=16 kpid=18860 status=suspended spid=653 sbid=0 ecid=0 priority=0 trancount=0 lastbatchstarted=2016-04-29T14:07:46.007 lastbatchcompleted=2016-04-29T14:07:46.007 clientapp=.Net SqlClient Data Provider hostname=ADC-INTFL hostpid=8820 loginname=daemon_adcuser isolationlevel=read committed (2) xactid=43666495407 currentdb=5 lockTimeout=4294967295 clientoption1=671088672 clientoption2=128056 04/29/2016 14:07:51,spid33s,δ?,process-list 04/29/2016 14:07:51,spid33s,δ?,deadlock victim=process5b534c8 04/29/2016 14:07:51,spid33s,δ?,deadlock-list
从死锁日志的process-list部份分析,这里有两个进程 :
process id=process5b534c8正在执行对表NA_TrafficMemberOrderRelation的查询
process id=process71da6bb88正在执行对表NA_TrafficMemberOrderRelation的插入
由于死锁日志只记录当前线程开启事务正在执行的最后一句SQL,所以要想获取事务的完整SQL语句列表,必须通过Sql Sever Profiler跟踪或阅读程序代码。获得的答案是:
查询语句是单独的事务
select TOP (1) * from NA_TrafficMemberOrderRelation trafficmem0_ where trafficmem0_.UserMobile=@p1 and trafficmem0_.TrafficPackageCode=@p2 and trafficmem0_.OrderState=@p3 and trafficmem0_.ECPrdCode=@p4
插入操作的事务大概是
begin tran INSERT INTO NA_TrafficMemberOrderRelation values(...) INSERT INTO NA_TrafficMemberOrderRelation values(...) commit接着往下看resource-list部份,死锁的过程大概是
1、插入线程process71da6bb88已经获得页面(Page)22075507上的意向排他锁IX(第一个插入语句执行成功后,在页面上加的IX锁),
2、查询线程process5b534c8获得页面22056721上的共享锁
3、插入线程process71da6bb88需要获得页面22056721上的IX以执行第二条插入语句,但查询进程process5b534c8已经获得了页面22056721上的S锁,而IX和S锁是不兼容的,所以插入线程在执行第二句插入语句时,被查询进程阻塞
4、查询线程process5b534c8需要获得页面22075507上的S以继续执行,但插入线程已经获得了该页面上的IX锁,此时结合第3步,两个进程进入死锁等待
那么问题来了,表NA_TrafficMemberOrderRelation在UserMobile字段是有索引的,按正常的索引查找,只需要在索引键上加S锁,所在的页面加IS锁就可以了,而IS锁和IX锁是兼容的,就不会产生死锁。而这里查询语句确在PAGE上加了锁,那只能说明这个查询使用了索引扫描而不是索引查找(select语句会申请哪些锁,请参考(点击打开链接).
于是通过dbcc page查看页面22056721的内容发现,些页面属于表NA_TrafficMemberOrderRelation上ECPrdCode字段的索引,因此断定这个查询语句的执行计划是通过扫描ECPrdCode索引来获取结果,执行计划是根据统计信息来生成的,所以事情的经过大概是这样的:
1、表NA_TrafficMemberOrderRelation插入了大量数据,但还没有达到自动更新统计信息的阙值(表NA_TrafficMemberOrderRelation现在6500万条记录).
2、Sql server查询优化器为查询语句选择了扫描ECPrdCode索引的执行计划(此处Sql Server应有BUG)
3、通过Sql Server Profiler跟踪该语句锁的申请和释放过程发现,是先申请第一个页面的S锁,再申请第二个页面的S锁,再释放第一个页面的S锁(不知微软是出于何考虑),如此循环,导致查询语句持有一个页面的S锁,又去申请下一个页面(上一个页面的S锁还没有释放)的S锁,最终形成死锁
知道了来龙去脉,解决起来就简单了,为查询语句加上ForceSeek提示:
select TOP (1) * from NA_TrafficMemberOrderRelation trafficmem0_ with(forceseek) where trafficmem0_.UserMobile=@p1 and trafficmem0_.TrafficPackageCode=@p2 and trafficmem0_.OrderState=@p3 and trafficmem0_.ECPrdCode=@p4