ORA-00600 [kcratr_nab_less_than_odr] [4194]错误解决办法

243次阅读

共计 6463 个字符，预计需要花费 17 分钟才能阅读完成。

业务场景：公司电缆被挖断，突然断电导致的宕机。

[Oracle@slave1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on Mon Apr 17 11:09:59 2017

Copyright (c) 1982, 2009, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> startup mount – 加载到 mount 状态

ORACLE instance started.

Total System Global Area 1603411968 bytes
Fixed Size 2213776 bytes
Variable Size 1023412336 bytes
Database Buffers 570425344 bytes
Redo Buffers 7360512 bytes
Database mounted.

SQL> alter database open;

alter database open
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1],
[199], [48589], [48614], [], [], [], [], [], [], []

[oracle@slave1 trace]$ cd /u01/app/oracle/diag/rdbms/zxdb/zxdb/trace

– 查看警报日志，具体错误看日志里写的 trc 文件。服务器异常断电，导致 LGWR 写联机日志文件时报错，下次重启数据库时，需要实例级恢复，而又无法从联机日志文件中获取到 redo 信息，因为上次断电，写日志报错。

Trace file /u01/app/oracle/diag/rdbms/chuzhouqyhx/chuzhouqyhx/trace/chuzhouqyhx_ora_6857.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1
System name: Linux
Node name: slave1
Release: 2.6.32-431.el6.x86_64
Version: #1 SMP Fri Nov 22 03:15:09 UTC 2013
Machine: x86_64
Instance name: chuzhouqyhx
Redo thread mounted by this instance: 1
Oracle process number: 19
Unix process pid: 6857, image: oracle@slave1 (TNS V1-V3)

*** 2017-04-13 09:38:01.921
*** SESSION ID:(191.3) 2017-04-13 09:38:01.921
*** CLIENT ID:() 2017-04-13 09:38:01.921
*** SERVICE NAME:() 2017-04-13 09:38:01.921
*** MODULE NAME:(sqlplus@slave1 (TNS V1-V3)) 2017-04-13 09:38:01.921
*** ACTION NAME:() 2017-04-13 09:38:01.921

Successfully allocated 3 recovery slaves
Using 45 overflow buffers per recovery slave
Thread 1 checkpoint: logseq 199, block 2, scn 4644236
cache-low rba: logseq 199, block 48589
on-disk rba: logseq 199, block 48614, scn 4653640
start recovery at logseq 199, block 48589, scn 0

*** 2017-04-13 09:38:01.930
Started writing zeroblks thread 1 seq 199 blocks 48589-48596

*** 2017-04-13 09:38:01.932
Completed writing zeroblks thread 1 seq 199
==== Redo read statistics for thread 1 ====
Total physical reads (from disk and memory): 4096Kb
— Redo read_disk statistics —
Read rate (ASYNC): 0Kb in 0.01s => 0.00 Mb/sec
———————————————-
—– Recovery Hash Table Statistics ———
Hash table buckets = 32768
Longest hash chain = 0
Average hash chain = 0/0 = 0.0
Max compares per lookup = 0
Avg compares per lookup = 0/0 = 0.0
———————————————-
WARNING! Crash recovery of thread 1 seq 199 is
ending at redo block 48589 but should not have ended before
redo block 48614
Incident 6154 created, dump file: /u01/app/oracle/diag/rdbms/chuzhouqyhx/chuzhouqyhx/incident/incdir_6154/chuzhouqyhx_ora_6857_i6154.trc
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [199], [48589], [48614], [], [], [], [], [], [], []

ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [199], [48589], [48614], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [199], [48589], [48614], [], [], [], [], [], [], []

SQL> recover database until cancel using backup controlfile; – 文件选择的时候，日志要选对，在 alert 日志里有写编号。

ORA-00279: change 4644236 generated at 04/12/2017 22:03:56 needed for thread 1
ORA-00289: suggestion :
/oradata/flash_recovery_area/CHUZHOUQYHX/archivelog/2017_04_13/o1_mf_1_199_%u_.a
rc
ORA-00280: change 4644236 for thread 1 is in sequence #199

Specify log: {<ret>=suggested | filename | AUTO | CANCEL}
/oradata/chuzhouqyhx/redo01.log
Log applied.
Media recovery complete.
</ret>

SQL> alter database open; —

alter database open
*
ERROR at line 1:
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open

alter database open RESETLOGS; —

– 解决方案

从这里可以看到出现了 ORA-600[4194]/[4193]，第一感觉就是 undo 出现问题。
4193：表示 undo 和 redo 不一致（Arg [a] Undo record seq number，Arg [b] Redo record seq number）；
4194：表示也是 undo 和 redo 不一致（Arg [a] Maximum Undo record number in Undo block，Arg [b] Undo record number from Redo block）

所以出现 4193 错误，解决方法跟 4194 一样。

Sat Apr 15 10:08:17 2017
Sweep [inc][11037]: completed
Sweep [inc2][11037]: completed
Sat Apr 15 14:07:16 2017
Errors in file /u01/app/oracle/diag/rdbms/chuzhouqyhx/chuzhouqyhx/trace/chuzhouqyhx_j000_29485.trc (incident=11028):
ORA-01578: ORACLE data block corrupted (file # 2, block # 62520)
ORA-01110: data file 2: ‘/oradata/chuzhouqyhx/sysaux01.dbf’
ORA-26040: Data block was loaded using the NOLOGGING option

[oracle@slave1 trace]$ tail -100f alert_chuzhouqyhx.log

Database altered.

Thu Apr 13 10:02:58 2017
Doing block recovery for file 3 block 1661
Resuming block recovery (PMON) for file 3 block 1661
Block recovery from logseq 1, block 53 to scn 4653766
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1 Reading mem 0
Mem# 0: /oradata/chuzhouqyhx/redo01.log
Block recovery completed at rba 1.63.16, scn 0.4653770
Errors in file /u01/app/oracle/diag/rdbms/chuzhouqyhx/chuzhouqyhx/trace/chuzhouqyhx_pmon_6817.trc (incident=6018):
ORA-00600: internal error code, arguments: [4194], [0x000000004], [0x000000000], [], [], [], [], [], [], [], [], []
Errors in file /u01/app/oracle/diag/rdbms/chuzhouqyhx/chuzhouqyhx/trace/chuzhouqyhx_pmon_6817.trc:
ORA-00600: internal error code, arguments: [4194], [0x000000004], [0x000000000], [], [], [], [], [], [], [], [], []
PMON (ospid: 6817): terminating the instance due to error 472
Instance terminated by PMON, pid = 6817

– 新建个临时表空间，修改默认 spflie 里的 undo_tablespace 参数。记得手动修改下 pflie 的默认表空间参数。

SQL>CREATE UNDO TABLESPACE UNDOTBS2 DATAFILE ‘/oradata/chuzhouqyhx/undotbs02.dbf’ SIZE 200M REUSE AUTOEXTEND ON NEXT 10M MAXSIZE 8G;

SQL>alter system set undo_tablespace=’UNDOTBS2′ scope=both;

[oracle@slave1 dbs]$ cat spfilechuzhouqyhx.ora – 查看下 undo_tablespace 参数

chuzhouqyhx.__Java_pool_size=16777216
chuzhouqyhx.__large_pool_size=16777216
chuzhouqyhx.__oracle_base=’/u01/app/oracle’#ORACLE_BASE set from environment
chuzhouqyhx.__pga_aggregate_target=536870912
chuzhouqyhx.__sga_target=1610612736
chuzhouqyhx.__shared_io_pool_size=0
chuzhouqyhx.__shared_pool_size=956301312
chuzhouqyhx.__streams_pool_size=33554432
*.audit_file_dest=’/u01/app/oracle/admin/chuzhouqyhx/adump’
*.audit_trail=’db’
*.compatible=’11.2CC”|0.0.0′
*.control_files=’/oradata/chuzhouqyhx/control01.ctl’,’/oradata/flash_recovery_area/chuzhouqyhx/control02.ctl’
*.db_block_size=8192
*.db_domain=”
*.db_name=’chuzhouq’
*.db_recovery_file_dest=’/oradata/flash_recovery_area’
*.db_recovery_file_dest_size=2936012800
*.db_unique_name=’chuzhouqyhx’
*.diagnostic_dest=’/u01/app/oracle’
*.dispatchers='(PROTOCOL=TCP) (SERVICE=chuzhouqyhxXDB)’
*.open_cursors=300
*.pga_aggregate_target=536870912
*.processes=150
*.remote_login_passwordfileCC”5f=’EXCLUSIVE’
*.sga_target=1610612736
*.undo_tablespace=’UNDOTBS2′