As part of ongoing engagement with a client, very recently I was configuring an Oracle 12c Active Data Guard (ADG) for their one of business critical RAC databases on Exadata. Although I have successfully setup several Oracle Active Data Guard environments for many business critical databases in the past, this was the first attempt on Oracle 12c. Before start of the configuration, what I was pretty sure was there isn’t much change in the procedure configuring ADG in 12c. I did follow the same procedure that I used it in the past to configure Active data guard in pre Oracle 12c versions.
There was an interesting problem I have encountered right after completion of the ADG setup and the Oracle error messages in the respective logs were much useful diagnosing the problem. In this article, I shall be sharing what are the errors messages appeared in the logs and how do they helped me diagnosing the issue and what workaround I have applied to resolve the problem.
Note: This article doesn’t demonstrate the typical procedure to configure an Oracle Active Data Guard. There are plenty of white papers exists on how-to procedure, so, just google it.
When the managed recovery process started on the Active Standby Database, just right after completion of Active Data Guard configuration setup, I have tailed the alert.log on primary and standby database to ensure everything is working smoothly. Well, this time around, it was not just a plain vanilla for me, I have noticed an issue in the alert.log that there is a problem while shipping the redo information from Primary to the DR site.
Reviewed the primary and standby database alert.log files and found the following details about the problem (database names are modified here as PRIMARY1 and STDBYDB to maintain the privacy):
Primary database alert.log
TT01: Setting 'active' archival for destination LOG_ARCHIVE_DEST_2
Error 1033 received logging on to the standby
Mon Jun 27 14:55:16 2016
Errors in file /u01/app/oracle/diag/rdbms/primary/PRIMARY1/trace/PRIMARY1_arc1_302102.trc:
ORA-01033: ORACLE initialization or shutdown in progress
PING[ARC1]: Heartbeat failed to connect to standby 'STDBYDB'. Error is 1033.
Errors in file
The Heartbeat failed to connect to standby and Error 1033 combination of warning messages typical represents the following scenarios in any data guard configuration:
Wrong settings of remote_login_passwordfile parameter
No identical SYS passwords in the passwords files, between Primary and Standby.
Quickly I made sure the password file has the same password in the PRIMARY and STDBYDB password files, and verified the parameter settings too. Still the problem persists, and started to investigate further by reviewing the following trace file:
ORA-01033: ORACLE initialization or shutdown in progress:
*** 2016-06-26 16:44:25.500
OCISessionBegin failed. Error -1
.. Detailed OCI error val is 1017 and errmsg is 'ORA-01017: invalid username/password; logon denied
*** 2016-06-26 16:44:26.535
2016-06-26 16:44:26.617: [ GPNP]clsgpnp_dbmsGetItem_profile: [at clsgpnp_dbms.c:345] Result: (0) CLSGPNP_OK. (:GPNP00401:)got ASM-Profile.Mode='legacy'
*** 2016-06-26 16:44:27.745
OCISessionBegin failed -1
.. Detailed OCI error val is 1033 and errmsg is 'ORA-01033: ORACLE initialization or shutdown in progress
*** 2016-06-26 16:44:27.791716 4929 krsh.c
*** 2016-06-26 16:44:27.791734 2968 krsu.c
Error 1033 connecting to destination LOG_ARCHIVE_DEST_2 standby host 'STDBYDB'
krsi_verify_network: Error 1033 attaching to destination LOG_ARCHIVE_DEST_2 standby host 'STDBYDB'
*** 2016-06-26 16:44:27.792060 4929 krsh.c
Error 1033 for archive log file 2 to 'STDBYDB'
*** 2016-06-26 16:44:27.792084 6959 krsi.c
krsi_dst_fail_caller: dest:2 err:1033 force:0 blast:1
The particulars in the trace file clearly indicate the password issues. I have then looked at the STANDBY database alert.log and trace files to get the more insights about the problem.
Standy database alert.log
FAL[client, USER]: Error 16191 connecting to PRIMARY for fetching gap sequence
Mon Jun 27 15:20:25 2016
Errors in file /u01/app/oracle/diag/rdbms/stdbydb/STDBYDB1/trace/STDBYDB1_pr00_15533.trc:
ORA-16191: Primary log shipping client not logged on standby
Mon Jun 27 15:20:40 2016
Error 1017 received logging on to the standby
Check that the primary and standby are using a password file
and remote_login_passwordfile is set to SHARED or EXCLUSIVE,
and that the SYS password is same in the password files.
returning error ORA-16191
I then looked at the following trace file on standby database:
*** 2016-06-27 15:21:08.098
*** 2016-06-27 15:21:09.125
kgfpInitEnv2: hdl=0x7f34507750c8 env=0x7f34507750e0
2016-06-27 15:21:09.203: [ GPNP]clsgpnp_dbmsGetItem_profile: [at clsgpnp_dbms.c:345] Result: (0) CLSGPNP_OK. (:GPNP00401:)got ASM-Profile.Mode='legacy'
kgfn_get_beqinfo: publen=84, vlen=16
kgfpOpenPwFile: start (hdl=0x7f34507750c8 name=+DATAC1/STDBYDB/PASSWORD/pwdstdbydb)
kgfpOpenPwFile: end (hdl=0x7f34507750c8)
*** 2016-06-27 15:21:10.401
After some initial research over the support.oracle.com, I come across a document (ID 1984091.1) matching my requirements. As you probably knew that, when you create a RAC database in 12c with DBCA, the password file will be stored in an ASM. This document demonstrated how to refresh/copy the password file from PRIMARY ASM to the STANDBY ASM.
I have used the workaround mentioned in the document, as demonstrated below, and happy it worked for me.
“During the creation of a standby database or after having altered the password for any user granted the SYSDBA,SYSOPER or SYSDG privileges the password file must be copied from the primary site to the standby site. The Oracle Database Enterprise Edition 220.127.116.11 and above can now store password files in ASM diskgroups. By default if dbca is used to create a RAC database, the password file is created in an ASM diskgroup on the Primary site.”
Connect to ASM instance on the local node, and copy the password file to a temporary location:
$ . oraenv
$ ORACLE_SID = [XXX] ? +ASM1
$ asmcmd –p
ASMCMD> pwcopy +DATA1/PRIMARY/PASSWORD/pwdprimary.1758.915635849 /home/oracle/orapwstdbydb
Transfer the password file to the standby destination scp /home/oracle/orapwstdbydb standbyhost:/home/oracle
Connect to ASM instance on the local node, and copy the password file from the temporary location to ASM:
$ . oraenv $ ORACLE_SID = [XXX] ? +ASM1 $ asmcmd –p ASMCMD> pwcopy /home/oracle/orapwstdbydb +DATA1/STDBYDB/orapwstdbdy ASMCMD> ls –ltr +DATA1/STDBYDB/orapwstddby Type Redund Striped Time Sys Name PASSWORD HIGH COARSE JUN 27 15:00:00 N pwdorabisdb=> +DATA1/ASM/PASSWORD/pwdasm.280.987265849
After the password file is copied over ASM, subsequently, update the details in the cluster register, using the following command:
$ srvctl modify database –d STDBYDB –pwfile +DATA1/STDBYDB/orapwstdbdy
$ srvctl config database -d STDBYDB
Database unique name: STDBYDB
Oracle home: /u01/app/oracle/product/18.104.22.168/dbhome_1
Oracle user: oracle
Password file: +DATA1/STDBYDB/orapwstdbdy
Start options: open
After applying the above workaround, redo information started to ship over the standby site and MRP was doing its tasks.
This article demonstrated the procedure and need of copying/refreshing the password file from Primary database ASM to the Standby database ASM location, as part of active data guard configuration of a RAC database. This is was something new I have learnt while working with Oracle 12c RAC databases on Exadata. Hopefully my experience/knowledge would help some of you to resolve your similar problems.
12c: Data Guard Physical Standby - Managing password files in a RAC Physical Standby (Doc ID 1984091.1)
!! Stay tuned for next article where I will be sharing another interesting story/scenario and how we resolved it. !!
Good one, Syed.