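Environment: Linux + Oracle 11.2.0.1 ADG
Symptom: the standby database is not applying logs
1. Query the current state of the standby database
The standby is currently not applying logs; the apply lag shows that it has not applied redo for more than 3 days and 21 hours.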
SQL> set linesize 1200
SQL> SELECT OPEN_MODE, DATABASE_ROLE, SWITCHOVER_STATUS, FORCE_LOGGING, DATAGUARD_BROKER, GUARD_STATUS FROM V$DATABASE;
OPEN_MODE            DATABASE_ROLE    SWITCHOVER_STATUS    FOR DATAGUAR GUARD_S
-------------------- ---------------- -------------------- --- -------- -------
READ ONLY            PHYSICAL STANDBY NOT ALLOWED          YES DISABLED NONE
SQL> select * from v$dataguard_stats;
NAME                             VALUE                                                            UNIT                           TIME_COMPUTED                  DATUM_TIME
-------------------------------- ---------------------------------------------------------------- ------------------------------ ------------------------------ ------------------------------
transport lag                    +00 00:00:00                                                     day(2) to second(0) interval   01/17/2017 16:07:12            01/17/2017 16:07:12
apply lag                        +03 21:34:49                                                     day(2) to second(0) interval   01/17/2017 16:07:12            01/17/2017 16:07:12
apply finish time                +00 03:10:34.000                                                 day(2) to second(3) interval   01/17/2017 16:07:12
estimated startup time           15                                                               second                         01/17/2017 16:07:12
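To confirm directly that no recovery process is running, rather than inferring it from the apply lag alone, the standby's V$MANAGED_STANDBY view can be queried. This is a minimal sketch and was not part of the original troubleshooting session; the choice of columns is only a suggestion. If the query returns no MRP0 row, managed recovery is not running.

```sql
-- Run on the standby.
-- Lists the managed recovery (MRP0) and redo receive (RFS) processes.
-- No MRP0 row in the result means Redo Apply is stopped.
SELECT process, status, thread#, sequence#, block#
  FROM v$managed_standby
 WHERE process LIKE 'MRP%'
    OR process LIKE 'RFS%';
```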
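2. Check the alert log
The alert log pinpoints the moment the ADG problem began: an ORA-00600 error was raised, which in turn caused the MRP process to terminate. The detailed log follows: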
Fri Jan 13 18:32:25 2017
Errors in file /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_pr03_22555.trc (incident=67480):
ORA-00600: internal error code, arguments: [kcbr_apply_change_11], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/incident/incdir_67480/orcl_pr03_22555_i67480.trc
Slave exiting with ORA-600 exception
Errors in file /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_pr03_22555.trc:
ORA-00600: internal error code, arguments: [kcbr_apply_change_11], [], [], [], [], [], [], [], [], [], [], []
Fri Jan 13 18:32:26 2017
Errors in file /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_mrp0_22547.trc (incident=67448):
ORA-00600: internal error code, arguments: [kcbr_apply_change_11], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/incident/incdir_67448/orcl_mrp0_22547_i67448.trc
Fri Jan 13 18:32:26 2017
Trace dumping is performing id=[cdmp_20170113183226]
Recovery Slave PR03 previously exited with exception 600
Fri Jan 13 18:32:27 2017
MRP0: Background Media Recovery terminated with error 448
Errors in file /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_pr00_22549.trc:
ORA-00448: normal completion of background process
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Fri Jan 13 18:32:27 2017
Sweep [inc][67480]: completed
Sweep [inc][67480]: completed
Recovered data files to a consistent state at change 2010287982
Errors in file /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_pr00_22549.trc:
ORA-00448: normal completion of background process
Errors in file /home/oracle/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_mrp0_22547.trc:
ORA-00600: internal error code, arguments: [kcbr_apply_change_11], [], [], [], [], [], [], [], [], [], [], []
MRP0: Background Media Recovery process shutdown (orcl)
Sweep [inc][67448]: completed
Sweep [inc2][67480]: completed
Sweep [inc2][67448]: completed
Trace dumping is performing id=[cdmp_20170113183227]
Fri Jan 13 18:33:04 2017
Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST
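3. Try starting the standby MRP recovery process manually
After the standby MRP recovery process was started manually, the alert log still reported the same ORA-600 [kcbr_apply_change_11] error.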
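The original does not show the command used for this manual restart. Presumably it was the same real-time apply statement that appears at the end of step 4, issued while the standby was still open read-only; the sketch below rests on that assumption.

```sql
-- Assumption: the standby is still open READ ONLY, as shown in step 1.
-- Restart Redo Apply with real-time apply, running in the background.
alter database recover managed standby database using current logfile disconnect from session;
```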
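4. Try starting the MRP recovery process in mount state
In mount state the MRP recovery process starts normally. Once recovery has completed, re-enable ADG real-time apply; everything then works as expected.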
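-- Restart the standby in mount state instead of open read-only
shutdown immediate
startup mount
-- Start managed recovery in the background
alter database recover managed standby database disconnect from session;
-- Wait here for recovery to complete...
-- Then stop recovery, open the database read-only, and restart real-time apply
alter database recover managed standby database cancel;
alter database open;
alter database recover managed standby database using current logfile disconnect from session;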
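The wait for recovery to complete in the mount-state procedure above does not say how to tell when recovery has caught up. One possible check, offered here as a sketch and not taken from the original session, is to watch the apply lag in V$DATAGUARD_STATS (the view already queried in step 1) together with the highest applied sequence in V$ARCHIVED_LOG. Both queries can be run while the standby is mounted and recovery is running.

```sql
-- The apply lag should shrink toward +00 00:00:00 as recovery catches up.
SELECT name, value, time_computed
  FROM v$dataguard_stats
 WHERE name IN ('transport lag', 'apply lag');

-- Highest archived log sequence already applied, per redo thread.
SELECT thread#, MAX(sequence#) AS last_applied
  FROM v$archived_log
 WHERE applied = 'YES'
 GROUP BY thread#;
```

Once the apply lag is close to zero, it is reasonable to cancel recovery and open the database as shown in the steps above.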
Query the standby status to confirm that everything is back to normal:
SQL> SELECT OPEN_MODE, DATABASE_ROLE, SWITCHOVER_STATUS, FORCE_LOGGING, DATAGUARD_BROKER, GUARD_STATUS FROM V$DATABASE;
OPEN_MODE            DATABASE_ROLE    SWITCHOVER_STATUS    FOR DATAGUAR GUARD_S
-------------------- ---------------- -------------------- --- -------- -------
READ ONLY WITH APPLY PHYSICAL STANDBY NOT ALLOWED          YES DISABLED NONE
SQL> select * from v$dataguard_stats;
NAME                             VALUE                                                            UNIT                           TIME_COMPUTED                  DATUM_TIME
-------------------------------- ---------------------------------------------------------------- ------------------------------ ------------------------------ ------------------------------
transport lag                    +00 00:00:00                                                     day(2) to second(0) interval   01/17/2017 17:42:26            01/17/2017 17:42:26
apply lag                        +00 00:00:00                                                     day(2) to second(0) interval   01/17/2017 17:42:26            01/17/2017 17:42:26
apply finish time                +00 00:00:00.000                                                 day(2) to second(3) interval   01/17/2017 17:42:26
estimated startup time           18                                                               second                         01/17/2017 17:42:26
5. Search MOS to identify the root cause
A search of MOS (My Oracle Support) shows that this symptom matches bug 10419984:
Bug 10419984 : ACTIVE DATA GUARD STANDBY GIVES ORA-600 [KCBR_APPLY_CHANGE_11]
Applying the patch for this bug is recommended to prevent the problem from being triggered again.
Permanent link to this article: http://www.linuxidc.com/Linux/2017-01/139862.htm