前言:

        在通过备份进行数据恢复时,经常会遇到由于备份的不完整,特别是归档日志的缺失,导致虽然数据文件已经顺利恢复,但是数据文件的恢复时间点没有完全一致或者数据文件的fuzzy的状态不一致,从而数据库无法正常的打开,面对这种情况除了继续追增量日志将数据文件恢复到时间点一致之外,我们还可以通过bbed工具修改数据文件强制打开数据库,这种非常规方式在故障恢复时可以派上用场

测试场景:

        在Oracle11.2.0.4通过备份在异地恢复数据文件,恢复出来的数据文件checkpoint_change#并不完全一致,fuzzy的状态也不一致,此时,我们无法将数据库正常的打开,即使是resetlogs也不行

        通过open resetlogs打开数据库,至少要满足以下条件

        1 每个数据文件的resetlog_change#在相同,即同一个incarnation

        2 每个数据文件的checkpoint_change#相同

        3 每个数据文件的fuzzy状态相同,并fhsta状态为0

        而当前的恢复场景,除了条件1满足之外,其他并不满足,需要通过bbed修改数据文件

bbed测试:

        首先配置好bbed的配置文件以确保识别到数据文件

$ cat /tmp/bbed.par 
blocksize=8192
listfile=/tmp/datafile.list
mode=edit
$ cat /tmp/datafile.list 
1 /u01/app/oracle/oradata/dbocs/datafile/system 2306867200
2 /u01/app/oracle/oradata/dbocs/datafile/sysaux 660602880
3 /u01/app/oracle/oradata/dbocs/datafile/undotbs1 508559360
4 /u01/app/oracle/oradata/dbocs/datafile/users 543948800
5 /u01/app/oracle/oradata/dbocs/datafile/undotbs2 104857600
6 /u01/app/oracle/oradata/dbocs/datafile/test1 209715200

        开始修改数据文件头里面的checkpoint_change#,这个在数据文件block 1所在的结构kcvfh.kcvfhckp.kcvcpscn.kscnbas(offset 484)里面,当前值为16进制0x0049cfb9换算成10进制就是4837305

        通过bbed修改该值,通过使用assign或者modify命令进行修改,assign的优势在于可以使用10进制进行修改,不用转化成16进制,并且不需要考虑操作系统的字节序,因为对于Linux平台字节序是litter,修改值是需要从小端开始

--将scn改成比原来大,要确保数据文件头的scn大于块里面的scn
--原来是4837305,改成10000000
BBED> assign dba 1,1 kcvfh.kcvfhckp.kcvcpscn.kscnbas  = 10000000
--修改之后,要重新计算修改后的数据块的 checksum 值
BBED> sum apply 
Check value for File 1, Block 1:
current = 0x1a4c, required = 0x1a4c
--验证数据块是否正常
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/app/oracle/oradata/dbocs/datafile/system
BLOCK = 1
​
​
DBVERIFY - Verification complete
​
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED
--其他的数据文件2,3,4,5,6重复以上操作

        通过bbed修改之后,checkpoint_change#统一为10000000

select file#,name,CHECKPOINT_CHANGE#,RESETLOGS_CHANGE#,FUZZY,CHECKPOINT_COUNT
from v$datafile_header
​
     FILE# NAME                                                         CHECKPOINT_CHANGE# RESETLOGS_CHANGE# FUZ CHECKPOINT_COUNT
---------- ------------------------------------------------------------ ------------------ ----------------- --- ----------------
         1 /u01/app/oracle/oradata/dbocs/datafile/system                          10000000            925702 NO               286
         2 /u01/app/oracle/oradata/dbocs/datafile/sysaux                          10000000            925702 NO               286
         3 /u01/app/oracle/oradata/dbocs/datafile/undotbs1                        10000000            925702 YES              207
         4 /u01/app/oracle/oradata/dbocs/datafile/users                           10000000            925702 NO               285
         5 /u01/app/oracle/oradata/dbocs/datafile/undotbs2                        10000000            925702 NO                 7
         6 /u01/app/oracle/oradata/dbocs/datafile/test1                           10000000            925702 YES                7

        接下来修改数据文件的fuzzy 状态,fuzzy的状态存储在数据文件block 1的kcvfhsta(offset 138)

--使用另一种方式modify修改,将kcvfhsta的值修改为0x0000
BBED>  m /x 0000 file 1 block 1 offset 138
--修改之后,要重新计算修改后的数据块的checksum 值
BBED> sum apply
--验证数据块是否正常 
BBED> verify
--其他的数据文件2,3,4,5,6重复以上操作

        修改完之后,确认fuzzy状态为0

select HXFNM,fhsta from x$kcvfh
​
HXFNM                                                             FHSTA
------------------------------------------------------------ ----------
/u01/app/oracle/oradata/dbocs/datafile/system                         0
/u01/app/oracle/oradata/dbocs/datafile/sysaux                         0
/u01/app/oracle/oradata/dbocs/datafile/undotbs1                       0
/u01/app/oracle/oradata/dbocs/datafile/users                          0
/u01/app/oracle/oradata/dbocs/datafile/undotbs2                       0
/u01/app/oracle/oradata/dbocs/datafile/test1                          0
​
 select file#,name,CHECKPOINT_CHANGE#,RESETLOGS_CHANGE#,FUZZY,CHECKPOINT_COUNT
 from v$datafile_header
​
​
     FILE# NAME                                                         CHECKPOINT_CHANGE# RESETLOGS_CHANGE# FUZ CHECKPOINT_COUNT
---------- ------------------------------------------------------------ ------------------ ----------------- --- ----------------
         1 /u01/app/oracle/oradata/dbocs/datafile/system                          10000000            925702 NO               286
         2 /u01/app/oracle/oradata/dbocs/datafile/sysaux                          10000000            925702 NO               286
         3 /u01/app/oracle/oradata/dbocs/datafile/undotbs1                        10000000            925702 NO               207
         4 /u01/app/oracle/oradata/dbocs/datafile/users                           10000000            925702 NO               285
         5 /u01/app/oracle/oradata/dbocs/datafile/undotbs2                        10000000            925702 NO                 7
         6 /u01/app/oracle/oradata/dbocs/datafile/test1                           10000000            925702 NO                 7

        checkpoint_change#以及fuzzy状态修改完之后,已经满足了打开数据库的条件,接下来需要重建一下控制文件

--重建控制文件
STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "DBOCS" RESETLOGS  ARCHIVELOG
    MAXLOGFILES 192
    MAXLOGMEMBERS 3
    MAXDATAFILES 1024
    MAXINSTANCES 32
    MAXLOGHISTORY 633
LOGFILE
  GROUP 11 '/u01/app/oracle/oradata/dbocs/onlinelog/group_11.256.1036344439'  SIZE 500M BLOCKSIZE 512,
  GROUP 14 '/u01/app/oracle/oradata/dbocs/onlinelog/group_14.291.1036352287'  SIZE 500M BLOCKSIZE 512
-- STANDBY LOGFILE
DATAFILE
  '/u01/app/oracle/oradata/dbocs/datafile/system',
  '/u01/app/oracle/oradata/dbocs/datafile/sysaux',
  '/u01/app/oracle/oradata/dbocs/datafile/undotbs1',
  '/u01/app/oracle/oradata/dbocs/datafile/users',
  '/u01/app/oracle/oradata/dbocs/datafile/undotbs2',
  '/u01/app/oracle/oradata/dbocs/datafile/test1'
CHARACTER SET ZHS16GBK
;

        使用open resetlogs打开数据库

--recover database
RECOVER DATABASE USING BACKUP CONTROLFILE;
--open database
ALTER DATABASE OPEN RESETLOGS;
--添加临时文件
ALTER TABLESPACE TEMP ADD TEMPFILE '/u01/app/oracle/oradata/dbocs/tempfile/temp.257.1036073905' size 100M;

        打开数据库之后,大概率会出现ORA-00600 [4194]的报错,因为在添加一条新的undo记录的时候,会去比较undo块里面的最大记录数跟redo日志里面的undo最大记录数,如果出现不一致,就会触发ORA-00600[4194]的错误

Block recovery completed at rba 1.73.16, scn 0.10000067
Errors in file /u01/app/oracle/diag/rdbms/dbocs/dbocs/trace/dbocs_mmon_40326.trc  (incident=14524):
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/dbocs/dbocs/incident/incdir_14524/dbocs_mmon_40326_i14524.trc
Arg [a] - Maximum Undo record number in Undo block
Arg [b] - Undo record number from Redo block
Since we are adding a new undo record to our undo block, we would expect that the new record number is equal to the maximum record number in the undo block plus one. 
Before Oracle can add a new undo record to the undo block it validates that this is correct. 
If this validation fails, then an ORA-600 [4194] will be triggered.

​        数据库打开之后才出现的ORA-00600 [4194]的报错,一般都是undo表空间里面的回滚段问题,不是system的回滚段问题,可以通过重建undo表空间解决

--关闭自动undo段管理
alter system set undo_management = manual scope=spfile;
--10513禁用事务的回滚
 alter system set    event = '10513 trace name context forever, level 2' scope=spfile;
--重启数据库,以严格模式启动
shutdown immediate
startup restrict
--检查非offline的回滚段,system的回滚段一直都是online的可以不用管,需要确保其他回滚段都是offline,不能出现'PARTLY AVAILABLE' or 'NEEDS RECOVERY'
SQL> select tablespace_name, status, segment_name from dba_rollback_segs where status != 'OFFLINE';
​
TABLESPACE_NAME                STATUS           SEGMENT_NAME
------------------------------ ---------------- ------------------------------
SYSTEM                         ONLINE           SYSTEM
​
--重建undo表空间
SQL> create undo tablespace undo1 datafile size 100M;
SQL> drop tablespace undotbs1 including contents and datafiles;
​
--设置默认表空间为新的undo表空间以及开启自动undo段管理,删除event参数
alter system set undo_tablespace=undo1 scope=spfile;
alter system set undo_management =auto scope=spfile;
alter system reset event scope=spfile;
--重启
shutdown immediate;
startup

Logo

腾讯云面向开发者汇聚海量精品云计算使用和开发经验,营造开放的云计算技术生态圈。

更多推荐