Time: 2021-07-01 10:21:17
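A friend's database recently failed and he asked me for help. Analyzing the logs showed that the staff on site, without understanding raw devices on AIX or the Oracle database, had used OEM to create a new tablespace directly. This crashed the database and left it unable to start.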
Thread 1 advanced to log sequence 4395
  Current log# 1 seq# 4395 mem# 0: /dev/rorcl_redo01
Thu Jun 12 19:28:38 2014
/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo04' SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT AUTO
ORA-1119 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo04' SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT AUTO ...
Thu Jun 12 19:36:23 2014
/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo03' SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT AUTO
Thu Jun 12 19:43:56 2014
ORA-604 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo03' SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT AUTO ...
Thu Jun 12 19:48:11 2014
/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo03' SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT AUTO
Thu Jun 12 19:48:11 2014
ORA-1537 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo03' SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT AUTO ...
Thu Jun 12 19:48:20 2014
/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo04' SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT AUTO
ORA-1537 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo04' SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT AUTO ...
Fri Jun 13 00:50:37 2014
Trace dumping is performing id=[cdmp_20140613005032]
Fri Jun 13 00:50:40 2014
Reconfiguration started (old inc 4, new inc 6)
List of nodes:
 0
 Global Resource Directory frozen
 * dead instance detected - domain 0 invalid = TRUE
…………
Fri Jun 13 00:50:40 2014
Beginning instance recovery of 1 threads
Reconfiguration complete
Fri Jun 13 00:50:41 2014
 parallel recovery started with 7 processes
Fri Jun 13 00:50:43 2014
Started redo scan
Fri Jun 13 00:50:43 2014
Errors in file /oracle/admin/orcl/bdump/orcl1_smon_213438.trc:
ORA-00316: log 3 of thread 2, type 0 in header is not log file
ORA-00312: online log 3 thread 2: '/dev/rorcl_redo03'
Fri Jun 13 00:50:43 2014
Errors in file /oracle/admin/orcl/bdump/orcl1_smon_213438.trc:
ORA-00316: log 3 of thread 2, type 0 in header is not log file
ORA-00312: online log 3 thread 2: '/dev/rorcl_redo03'
SMON: terminating instance due to error 316
Fri Jun 13 00:50:43 2014
Errors in file /oracle/admin/orcl/bdump/orcl1_lgwr_335980.trc:
ORA-00316: log  of thread , type  in header is not log file
Instance terminated by SMON, pid = 213438
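The log shows that two mistakes were made while creating the tablespace through OEM:
1. The operator did not distinguish between the AIX naming conventions for block devices and character devices.
2. The operator treated the current redo log in use by node 2 as an unused device and tried to create the new tablespace on it.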
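Both mistakes can be caught by a check before any raw device is handed to CREATE TABLESPACE. The following is a minimal sketch, not part of the original incident handling. It assumes the usual AIX convention that /dev/NAME is the block device and /dev/rNAME the character device of the same logical volume, and that the list of in-use paths has already been collected from views such as v$logfile (member), v$datafile (name), v$tempfile (name) and v$controlfile (name). The function names `device_aliases` and `find_conflicts` are hypothetical, and matching purely on names is a heuristic.

```python
import posixpath


def device_aliases(path):
    """Return the names that may refer to the same AIX logical volume as `path`.

    Assumption: /dev/NAME is the block device and /dev/rNAME is the
    character (raw) device. By name alone we cannot tell which form a
    path is, so both directions are included.
    """
    directory, name = posixpath.split(path)
    aliases = {path, posixpath.join(directory, "r" + name)}
    if name.startswith("r") and len(name) > 1:
        aliases.add(posixpath.join(directory, name[1:]))
    return aliases


def find_conflicts(candidate, in_use_paths):
    """List in-use database files that may share a device with `candidate`."""
    return sorted(p for p in in_use_paths if candidate in device_aliases(p))


# Redo members taken from the alert log excerpt above; a real check would
# use the complete file list queried from the database.
in_use = ["/dev/rorcl_redo01", "/dev/rorcl_redo03"]

print(find_conflicts("/dev/orcl_redo03", in_use))
```

With the paths from this case, the block-device name /dev/orcl_redo03 is reported as conflicting with the redo member /dev/rorcl_redo03, which is the collision that Oracle's own ORA-1537 check could not see because the two names differ.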
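Because the tablespace creation used the wrong file on the wrong device, node 2's current redo log (/dev/rorcl_redo03) was corrupted. The redo header is read first, which is why the first error the database reported was ORA-00316: log of thread , type in header is not log file. Node 2 crashed first. Node 1 then started instance recovery, but since node 2's current redo was already damaged, the recovery could not complete and both nodes went down. With the current redo of one RAC node corrupted, the database can no longer be opened normally.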
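When reading an alert log like the one above, the damaged member is the path named by the ORA-00312 line that directly follows ORA-00316. As a rough illustration only, the sketch below pulls those pairs out of alert-log text. The function name `damaged_log_members` is hypothetical, and the sample text is copied from the excerpt in this case.

```python
import re

# Sample lines copied from the alert log excerpt in this case.
ALERT_EXCERPT = """\
Errors in file /oracle/admin/orcl/bdump/orcl1_smon_213438.trc:
ORA-00316: log 3 of thread 2, type 0 in header is not log file
ORA-00312: online log 3 thread 2: '/dev/rorcl_redo03'
SMON: terminating instance due to error 316
"""

ORA_00312 = re.compile(r"ORA-00312: online log (\d+) thread (\d+): '([^']+)'")


def damaged_log_members(alert_text):
    """Return unique (group, thread, member) tuples for ORA-00312 lines
    that immediately follow an ORA-00316 line."""
    found = []
    previous = ""
    for line in alert_text.splitlines():
        match = ORA_00312.search(line)
        if match and "ORA-00316" in previous:
            entry = (int(match.group(1)), int(match.group(2)), match.group(3))
            if entry not in found:
                found.append(entry)
        previous = line
    return found


print(damaged_log_members(ALERT_EXCERPT))
```

For this excerpt the result identifies log group 3 of thread 2 on /dev/rorcl_redo03, matching the analysis that node 2's current redo is the damaged file.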
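If a backup exists, the database can be restored and recovered from it. Without a backup, the only option is to force the database open in order to salvage the data. I hope this does not turn into a major data-loss tragedy.
I share this case as a warning: be careful with any operation on the raw devices behind a database, and never operate on them when you are unsure what they hold, because the consequences can be severe.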
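Original article: Oracle安全警示录:加错裸设备导致redo异常 (Oracle Safety Warnings: adding the wrong raw device corrupts redo). Thanks to the original author for sharing.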