In Oracle, physical standby database is exact block by block replica of primary database and it is synchronized by replaying the modifications done to primary through the archive log files which contains this recorded information. Hence the sequence of applying the archive log files at standby database plays an important role to keep data consistent. One of the most common reason for standby database to go out of sync or break is due to 'open resetlogs' operation on primary database which will change the database incarnation and creates new branch of redo data.
But wait !!! It's not so straight forward.
According to documentation :
It's quiet confusing, so to make the concepts clear and concise let's walk through an example. The demo is conducted on 2 node 12c RAC primary having 2 node RAC standby database, when 'open resetlogs' is performed on primary which is having standby database completely in sync then standy database will break down. So let's try to simulate the scenario what has been stated in the manuals by stopping MRP process before performing 'open resetlogs' in primary. Current status of online redo log files in primary is as shown below.
SQL> select THREAD#,GROUP#,SEQUENCE#,ARCHIVED,STATUS from v$log order by thread#,group#; THREAD# GROUP# SEQUENCE# ARC STATUS ---------- ---------- ---------- --- ---------------- 1 1 1089 YES ACTIVE 2 1090 YES ACTIVE 5 1091 NO CURRENT 6 1088 YES ACTIVE 2 3 1086 YES ACTIVE 4 1087 YES ACTIVE 7 1088 NO CURRENT 8 1085 YES ACTIVE
On standby database MRP has been stopped to ensure that redo past the resetlogs SCN is not applied and the status of archive logs applied till now is as shown below.
SELECT thread#, sequence#, applied, first_change#, next_change# FROM ( SELECT thread#, sequence#, applied, first_change#, next_change#, rank( ) over ( partition by thread# ORDER BY sequence# DESC) rnk FROM V$ARCHIVED_LOG WHERE resetlogs_id = 913784186 ) WHERE rnk <= 10 ORDER BY thread#,sequence# / THREAD# SEQUENCE# APPLIED FIRST_CHANGE# NEXT_CHANGE# ---------- ---------- --------- ------------- ------------ 1 1083 YES 120289255 120296499 1084 YES 120296499 120302045 1085 NO 120302045 120302700 1086 NO 120302700 120307546 1087 NO 120307546 120307588 1087 NO 120307546 120307588 1088 NO 120307588 120307860 1089 NO 120307860 120307918 1090 NO 120307918 120307951 1091 NO 120307951 120310419 2 1079 YES 120283397 120289240 1080 YES 120289240 120296484 1081 YES 120296484 120302026 1082 NO 120302026 120302715 1083 NO 120302715 120307553 1084 NO 120307553 120307581 1085 NO 120307581 120307867 1086 NO 120307867 120307925 1087 NO 120307925 120307944 1088 NO 120307944 120310425
Since standby database is few logs behind the primary database let's proceed with 'open resetlogs' of primary database by restoring and recovering it to achieve incomplete recovery of the database.
{/home/oracle}: rman target / Recovery Manager: Release 12.1.0.2.0 - Production on Tue Jun 28 04:23:59 2016 Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved. connected to target database: DEMODB (DBID=1278101452, not open) RMAN> run { restore database; set until sequence 1087 thread 1; recover database; alter database open resetlogs; } ... ..... ...... Finished restore at 28-JUN-16 executing command: SET until clause Starting recover at 28-JUN-16 using channel ORA_DISK_1 starting media recovery archived log for thread 1 with sequence 1083 is already on disk as file /oraarch1/1_1083_913784186.dbf archived log for thread 1 with sequence 1084 is already on disk as file /oraarch1/1_1084_913784186.dbf archived log for thread 1 with sequence 1085 is already on disk as file /oraarch1/1_1085_913784186.dbf archived log for thread 1 with sequence 1086 is already on disk as file /oraarch1/1_1086_913784186.dbf archived log for thread 2 with sequence 1080 is already on disk as file /oraarch1/2_1080_913784186.dbf archived log for thread 2 with sequence 1081 is already on disk as file /oraarch1/2_1081_913784186.dbf archived log for thread 2 with sequence 1082 is already on disk as file /oraarch1/2_1082_913784186.dbf archived log for thread 2 with sequence 1083 is already on disk as file /oraarch1/2_1083_913784186.dbf archived log file name=/oraarch1/1_1083_913784186.dbf thread=1 sequence=1083 archived log file name=/oraarch1/2_1080_913784186.dbf thread=2 sequence=1080 archived log file name=/oraarch1/2_1081_913784186.dbf thread=2 sequence=1081 archived log file name=/oraarch1/1_1084_913784186.dbf thread=1 sequence=1084 archived log file name=/oraarch1/2_1082_913784186.dbf thread=2 sequence=1082 archived log file name=/oraarch1/1_1085_913784186.dbf thread=1 sequence=1085 archived log file name=/oraarch1/1_1086_913784186.dbf thread=1 sequence=1086 archived log file name=/oraarch1/2_1083_913784186.dbf thread=2 sequence=1083 media recovery complete, elapsed time: 00:00:02 Finished recover at 28-JUN-16 Statement processed RMAN>
Note that in RMAN block we have set the archive log sequence number of 1087 for thread 1 to limit the recovery until this point in time and then after recovering the database we perform 'open resetlogs'. But as per the log file it has recovered until 1086 but not 1087 which is expected as we specified 1087 as the limit.
At the same time if we check the alert log file of standby database...
Tue Jun 28 04:46:34 2016 RFS: Assigned to RFS process (PID:32731) RFS: New Archival REDO Branch: 915684100 Current: 913784186 RFS: Selected log 13 for thread 1 sequence 1 dbid 1278101452 branch 915684100 A new recovery destination branch has been registered RFS: Standby in the future of new recovery destinationBranch(resetlogs_id) 915684100 Incomplete Recovery SCN: 120302435 Resetlogs SCN: 120307547 Standby Became Primary SCN: 112947489 Flashback database to SCN 112947489 to follow new branch Flashback database to SCN 112947489 to follow new branch RFS: New Archival REDO Branch(resetlogs_id): 915684100 Prior: 913784186 RFS: Archival Activation ID: 0x4dae17b1 Current: 0x4d90d989 RFS: Effect of primary database OPEN RESETLOGS Tue Jun 28 04:46:34 2016 RFS: Incarnation entry added for Branch(resetlogs_id): 915684100 (demodb) Tue Jun 28 04:46:34 2016 Setting recovery target incarnation to 2 Tue Jun 28 04:46:35 2016
We can see that it has detected resetlog operation on the primary along with the new redo branch and has also recieved the archive log file of new incarnation. Though MRP is down all these communication have been performed by RFS process and from there after standby sets the recovery target incarnation to the one currently primary is running. At this point if we check the status of archive logs it shows all the archive logs are shipped to standby including archive logs of different incarnation caused after 'open resetlogs' operation on primary but they are not applied. If we map back the previous result of V$ARCHIVED_LOG we can say that 'Resetlogs SCN: 120307547' is the FIRST_CHANGE#(120307547 - 1 = 120307546) of log sequence 1087 of thread# 1 and thus RMAN has recovered only upto 1086 log sequence when we set until sequence 1087. Similarly 'Incomplete Recovery SCN: 120302435' is the SCN at which the actual incomplete recovery was done and it belongs to log sequence 1085 of thread# 1 as SCN 120302435 falls within the range of FIRST_CHANGE#(120302045) and NEXT_CHANGE#(120302700). Also one of the main thing to consider from alert log file is the change of database incarnation from 913784186 to 915684100 which is the current incarnation of primary database.
Now start the recovery process MRP and check the status of archive logs which are applied for both the current and old resetlog id - (915684100,913784186)
SQL> SELECT resetlogs_id, thread#, sequence#, applied, first_change#, next_change# 2 FROM ( SELECT resetlogs_id,thread#, 3 sequence#, 4 applied, 5 first_change#, 6 next_change#, 7 rank( ) 8 over ( 9 partition by thread#,resetlogs_id ORDER BY sequence# DESC) rnk 10 FROM V$ARCHIVED_LOG 11 WHERE resetlogs_id in (913784186,915684100) 12 ) 13 WHERE rnk <= 10 14 ORDER BY thread#,sequence# 15 / RESETLOGS_ID THREAD# SEQUENCE# APPLIED FIRST_CHANGE# NEXT_CHANGE# ------------ ---------- ---------- --------- ------------- ------------ 915684100 1 1 YES 120307547 120307550 1 2 YES 120307550 120307564 1 3 YES 120307564 120307634 1 4 YES 120307634 120314322 913784186 1 1085 NO 120302045 120302700 1 1085 YES 120302045 120302700 1 1086 NO 120302700 120307546 1 1086 YES 120302700 120307546 1 1087 NO 120307546 120307588 1 1087 NO 120307546 120307588 1 1088 NO 120307588 120307860 1 1089 NO 120307860 120307918 1 1090 NO 120307918 120307951 1 1091 NO 120307951 120310419 1 1092 NO 120310419 120310421 915684100 2 1 YES 120307547 120307555 2 2 YES 120307555 120307631 2 3 YES 120371705 120371711 2 4 IN-MEMORY 120371711 120382308 913784186 2 1082 YES 120302026 120302715 2 1082 NO 120302026 120302715 2 1083 NO 120302715 120307553 2 1083 YES 120302715 120307553 2 1084 NO 120307553 120307581 2 1085 NO 120307581 120307867 2 1086 NO 120307867 120307925 2 1087 NO 120307925 120307944 2 1088 NO 120307944 120310425 2 1089 NO 120310425 120310472
It's interesting to see that MRP has applied all the archive logs including redo related to new incarnation but in between it has skipped some archive logs, because when we had performed incomplete recovery/point-in-time recovery of primary database we specified RMAN to recover until log sequence number 1087 and thus RMAN recovered till 1086 but not 1087 and other archive log sequences greater than 1087. On standby side all this was considered and it recovered all the archive logs but skipped 1087 and then continued with the redo data pertaining to new incarnation. Also for some archive logs pertaining to old resetlog id it has APPLIED status as both YES and NO because when these archives were shipped to standby MRP was down and hence the status NO, but later when we start MRP it recovered and hence the second record status is YES for the same archive log sequence. There is one more kind of applied status IN-MEMORY which represents that the log file has been applied in memory, but the data files have not yet been updated.
If archive log sequence get reset after every 'open resetlogs' operation then how does MRP performs recovery across the different incarnations which can have same archive log sequence number but belongs to different incarnations ? All this is possible only from 10g release in which they introduced new option %r for formatting archive log names meant for resetlogs ID that ensures unique names are constructed for the archived log files across multiple incarnations of the database. Thus %r is mandatory option to be specified in log_archive_format parameter, else database will hang when archiving is attempted. So 10g and later releases has made great progress in recovery footprints by introducing Flashback and possibility of recovering through multiple incarnations.
This is the reason in Oracle manuals it clearly specifies that MRP will automatically re-synchronizes the standby database with the new branch of redo data if "Standby has not applied redo data past the new resetlogs SCN (past the start of the new branch of redo data) and the new redo branch from OPEN RESETLOGS has been registered at the standby". And if redo data past the new resetlogs SCN has already been applied then we can leverage Flashback(Should be enabled before OPEN RESETLOGS) to revert back the standby database before resetlogs SCN and just start the MRP so that it will automatically take care of the circumstance.
There are few Organizations who prefer to enable Flashback and also have Delayed Standby database to deal with different new incarnation situation. This article has gone through the details of how we can avoid rebuilding complete standby after 'open resetlogs' operation on primary database. For MRP to automatically deal with different incarnations when applying archive logs it should ensure that redo data past the new resetlogs SCN is not applied, and the incarnation information is pass down to standby database through RFS process so that recovery target incarnation will be set to current primary database incarnation.