Written By Nassyam Basha
We've seen many articles from me on troubleshooting and optimization of physical standby database and also logical standby database. This article is purely for Snapshot standby database and how to troubleshoot the key problem and fix them.
Flashback vs. Snapshot
From 11g, the snapshot concept ruled completely over the flashback technique which is the long procedure to do. Snapshot standby database conversion is so simple and not much efforts required like traditional method. In the flashback technique we can create guarantee restore point and then we can create specific restore point and later the physical standby can failover performed, after all the testing's we can convert the Read Write mode database to Physical standby database by using the restore point we have created. If in case the restore point was deleted or few flash logs missed then we cannot perform back to physical standby and that situation will lead to rebuild whole standby database. Of course snapshot standby database will works in same passion but it will handle very professionally.
For both the techniques we should have enabled flashback and Fast Recovery area configured so that flash logs will uses the FRA location and we need to have proper size of FRA over the estimate of flash logs generating per hour or day as you wish. Apart from that there are no such high configuration required in order to play with Flashback database or the Snapshot standby database. To enable flashback on standby database we must cancel the media recovery and we need to enabled it, of course again we can start the recovery process.
SQL> recover managed standby database cancel;Media recovery complete.SQL> alter database flashback on;
SQL> recover managed standby database disconnect from session;Media recovery complete.SQL>
How Snapshot Standby works?
This article is not stating the architecture or the introduction of the standby database. To ensure everyone are aware on how it works. This article is focused on understanding the configuration and troubleshoot the most common error with the standby database.
Apart from that, as the Snapshot standby database is in read write mode , if any users are connected and they might have performed by creating new tables or new tablespaces so on, for thus such operations the snapshot standby will have the redo from online redo logs and same time it receives redo from the primary database and it keeps with physical standby for the terminal(cascade) standby databases use.
More about Snapshot Standby Database
In this environment we have primary database and only one standby database with the broker configured. Performing role transitions through broker is made easy and easily accessible. Below is the overall configuration, Canada is the primary database and india is the standby database under maximum performance mode.
DGMGRL> show configuration
Configuration - ckpt12c
Protection Mode: MaxPerformance
canada - Primary database
india - Physical standby database
Fast-Start Failover: DISABLED
There are no such things to convert it to snapshot standby database except the FRA is enabled. By the below command we have performed physical standby to the snapshot standby database.
DGMGRL> convert database india to snapshot standby;
Converting database "india" to a Snapshot Standby database, please wait...
Database "india" converted successfully
We can check the latest status of the configuration after the conversion to snapshot standby database.
india - Snapshot standby database
There are no issues reported and the conversation went successful. We can check the alert log as the converstaion was successful.
Sat Nov 28 17:10:36 2015
Managed Standby Recovery Canceled (drmcdb)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
alter database convert to snapshot standby
Created guaranteed restore point SNAPSHOT_STANDBY_REQUIRED_11/28/2015 17:10:36
Killing 3 processes (PIDS:14815,4481,4420) (all RFS) in order to disallow current and future RFS connections. Requested by OS process 4345
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
RESETLOGS after incomplete recovery UNTIL CHANGE 2057477
Sat Nov 28 17:10:37 2015
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Resetting resetlogs activation ID 3793820920 (0xe2211cf8)
Online log /u02/app/oracle/fast_recovery_area/INDIA/onlinelog/o1_mf_1_bm4gf193_.log: Thread 1 Group 1 was previously cleared
Online log /u02/app/oracle/fast_recovery_area/INDIA/onlinelog/o1_mf_2_bm4gf28y_.log: Thread 1 Group 2 was previously cleared
Online log /u02/app/oracle/fast_recovery_area/INDIA/onlinelog/o1_mf_3_bm4gf2tn_.log: Thread 1 Group 3 was previously cleared
Standby became primary SCN: 2057475
Setting recovery target incarnation to 3
WARNING: STANDBY_FILE_MANAGEMENT initialization parameter is
not set to the value "AUTO".
This may cause recovery of the standby database to terminate
prior to applying all available redo data.
It may be necessary to use the ALTER DATABASE CREATE DATAFILE
command to add datafiles created on the primary database.
CONVERT TO SNAPSHOT STANDBY: Complete - Database mounted as snapshot standby
Completed: alter database convert to snapshot standby
now to execute the data load scripts we have connected to normal user and it states as the archiver error.
-bash-3.2$ sqlplus c##nassyam/oracle
SQL*Plus: Release 184.108.40.206.0 Production on Sat Nov 28 17:24:27 2015
Copyright (c) 1982, 2013, Oracle. All rights reserved.
ORA-00257: archiver error. Connect internal only, until freed.
From the alert log we can see that no online redo log files were unable to archive further.
Sat Nov 28 17:18:58 2015
Thread 1 cannot allocate new log, sequence 4
All online logs need archiving
Examine archive trace files for archiving errors
Current log# 3 seq# 3 mem# 0: /u02/app/oracle/fast_recovery_area/INDIA/onlinelog/o1_mf_3_bm4gf2tn_.log
Sat Nov 28 17:24:18 2015
ARCH: Archival stopped, error occurred. Will continue retrying
ORACLE Instance drmcdb - Archival Error
ORA-16014: log 1 sequence# 1 not archived, no available destinations
ORA-00312: online log 1 thread 1: '/u02/app/oracle/fast_recovery_area/INDIA/onlinelog/o1_mf_1_bm4gf193_.log'
Tried to perform the log switches at snapshot standby database to ensure the archiving is working properly or not, but it completely hangs and no response either and at the end i had to cancel by my end. So that mean all of the new log switches are pending.
SQL> alter system switch logfile;
alter system switch logfile
ERROR at line 1:
ORA-01013: user requested cancel of current operation
Now performed database bounce and tested by loading data by using few queries, but unable to complete the transactions even though after waiting for the much time.
SQL> shut immediate
ORACLE instance shut down.
ORACLE instance started.
Total System Global Area 730714112 bytes
Fixed Size 2292672 bytes
Variable Size 553649216 bytes
Database Buffers 171966464 bytes
Redo Buffers 2805760 bytes
-- hang (again enter initiated)
-- waited for few seconds
-- No response
Now we are clear that issue is from the snapshot standby database destination or any other parameters of the configuration, That's why prior to snapshot conversion we should have good plan to know the environment first.
SQL> select name,value from v$parameter where name in ('log_archive_dest_1','log_archive_dest_2','log_archive_dest_state_2');
log_archive_dest_1 location="USE_DB_RECOVERY_FILE_DEST", valid_for=(STANDBY_LOGFILE,STANDBY_ROLE)
Ran the above query from the snapshot standby database and we can see the local destination for example log_archive_dest_1, it shows with the standby logfiles and standby role, which is perfectly fine and conveys that local destination can send redo to the terminal databases like cascade and the destination log_archive_dest_2 is again pointing to local FRA and the status was deferred.
Back to the basics, that after the conversion to snapshot standby database it has mechanism to supply the redo stream to the other destination and same time it can produce the archive log files like the primary database. Unfortunately there are no other destination is configured.
Which we can say as huge misconfiguration and thus called to halt to the database. So it is very important to understand the parameters even though it is physical or logical or snapshot standby database. There are two things that every DBA must learn which transport attributes have to use in which situations.
On top of that oracle provided more options to use ALL_ROLES, ALL_LOGFILES which is applicable to all the types of destinations. But from the present configuration the destination type as "STANDBY_LOGFILE, STANDBY_ROLE) which valid only for the other standby databases (cascaded) and again it is same problem if we perform switchover to the primary database. In order to avoid such halt condition we can either configure two destinations. One for the standby logfile destination type and the other for the online log file destination (or) we can configure common destination type ALL_LOGFILES, ALL ROLES like below.
SQL> alter system set log_archive_dest_1='location="USE_DB_RECOVERY_FILE_DEST", valid_for=(ALL_LOGFILES,ALL_ROLES)';
SQL> alter system set log_archive_dest_1='location="USE_DB_RECOVERY_FILE_DEST", valid_for=(STANDBY_LOGFILE,STANDBY_ROLE)';
SQL> alter system set log_archive_dest_2='location="USE_DB_RECOVERY_FILE_DEST", valid_for=(ONLINE_LOGFILES, STANDBY_ROLE)';
SQL> alter system set log_archive_dest_state_2='enable';
So basically we have changed as one destination for the standby logfile support and the other destination for the snapshot standby(online logfile) purpose and so that there will be no chance for the database halt. After the changes of the destinations from the snapshot we can continue our earlier test which is failed.
log_archive_dest_1 location="USE_DB_RECOVERY_FILE_DEST", valid_for=(STANDBY_LOGFILE,STANDBY_ROLE)
log_archive_dest_2 location="USE_DB_RECOVERY_FILE_DEST", valid_for=(ONLINE_LOGFILES, STANDBY_ROLE)
SQL> conn c##nassyam/oracle
SQL> show user
USER is "C##NASSYAM"
PL/SQL procedure successfully completed.
The script now executed successfully and we can see below as the archiving of the online redo logs without any issues.
Sat Nov 28 17:40:08 2015
Thread 1 advanced to log sequence 6 (LGWR switch)
Current log# 3 seq# 6 mem# 0: /u02/app/oracle/fast_recovery_area/INDIA/onlinelog/o1_mf_3_bm4gf2tn_.log
Sat Nov 28 17:40:09 2015
Archived Log entry 28 added for thread 1 sequence 5 ID 0xe4a98daf dest 2:
Sat Nov 28 17:40:17 2015
Thread 1 advanced to log sequence 7 (LGWR switch)
Current log# 1 seq# 7 mem# 0: /u02/app/oracle/fast_recovery_area/INDIA/onlinelog/o1_mf_1_bm4gf193_.log
Sat Nov 28 17:40:18 2015
Trying to expand controlfile section 11 for Oracle Managed Files
Expanded controlfile section 11 from 28 to 56 records
Requested to grow by 28 records; added 1 blocks of records
Archived Log entry 29 added for thread 1 sequence 6 ID 0xe4a98daf dest 2:
Now even after changes we can see that the overall Data Guard status is in success status and hence the current and new settings will be valid even after the conversion back to Physical standby database.
We've seen how the snapshot standby database was irresponsive after the conversion from the physical standby database and how to resolve the situation after dealing the configuration changes and clear understanding when and how to use the destination types. Usually the snapshot standby database is used for any testing purpose if the standby server is going to be live and it is great scope to perform Real Application Testing on Snapshot standby by many iterations until it met the requirements.