Database performance degradation due to multipath issues

To put it in a bit of an Indian context, the database is not your daughter-in-law that you can blame for every performance issue that occurs in the environment. But it does happen. Most of the time it is the database that gets blamed for all such issues, while many times the problem is in some other layer like the OS, network or storage. I faced this recently at one of the customer sites, where performance of one of the databases went down suddenly. It was a 2-node RAC on 12.1.0.2 running on Linux 7 using some kind of Hitachi SSD storage array. There were no changes as per the DBA, application, OS and storage teams. But something must have changed somewhere. Otherwise, why would performance degrade just like that? My colleague and I checked some details and found that something had happened in the morning a day before. Starting from that point in time, the execution time for all the commonly run queries shot up. Generally speaking, when all the queries are performing badly and you are sure that nothing has been changed on the database side, the reasons could be outside the database. But being a DBA, it is not easy to prove that. We took AWRs from good and bad times and the wait events section looked like this: ...

March 22, 2021 at 12:54 PM · 3 min · 578 words · Amardeep Sidhu

[FATAL] [INS-44000] Passwordless SSH connectivity is not setup

I faced this while running the installer for a 2-node RAC setup (version 19.8) on an Oracle SuperCluster. The error reported in the log is: [FATAL] [INS-44000] Passwordless SSH connectivity is not setup from the local node node1 to the following nodes: [node2] [INS-06006] Passwordless SSH connectivity not set up between the following node(s): [node2] From the error it appears that SSH is not set up between the two nodes, but actually that is not the case. The error message here is a bit misleading. It turned out to be an issue with scp in OpenSSH version 8.x. Running the setup with the -debug option gives the clue: ...

March 10, 2021 at 6:17 PM · 1 min · 212 words · Amardeep Sidhu

PRVF-4657 : Name resolution setup check for “db-scan” (IP address: x.x.x.101) failed

A quick note about an error I faced while running root.sh on an Exadata machine. The configuration tools failed with the following error: Error is PRVF-4657 : Name resolution setup check for "db-scan" (IP address: x.x.x.101) failed I did an nslookup on the SCAN name and it all seemed good. So why the error? After spending another 5 minutes, I looked at /etc/hosts and there it was. Someone had populated /etc/hosts on the DB nodes with all the hostname entries, including the SCAN name. Something like: ...

September 25, 2020 at 7:41 PM · 1 min · 145 words · Amardeep Sidhu

root.sh fails with CRS-2101:The OLR was formatted using version 3

Got this while trying to install 11.2.0.4 RAC on Red Hat Linux 7.2. root.sh fails with a message like: ohasd failed to start Failed to start the Clusterware. Last 20 lines of the alert log follow: 2017-11-09 15:43:37.883: [client(37246)]CRS-2101:The OLR was formatted using version 3. This is bug 18370031. The patch for it needs to be applied before running root.sh.

November 18, 2017 at 4:03 PM · 1 min · 56 words · Amardeep Sidhu

Failed to create voting files on disk group RECOC1

Long story short, I faced this issue while running OneCommand for an Exadata system. The root.sh step (Initialize Cluster Software) was failing with the following error on the screen: Checking file root_dm01dbadm02.in.oracle.com_2017-04-27_18-13-27.log on node dm01dbadm02.somedomain.com Error: Error running root scripts, please investigate… Collecting diagnostics… Errors occurred. Send /u01/onecommand/linux-x64/WorkDir/Diag-170427_181710.zip to Oracle to receive assistance. That doesn’t make much sense. So let us check the log file of this step: 2017-04-27 18:17:10,463 [INFO][ OCMDThread][ ClusterUtils:413] Checking file root_dm01dbadm02.somedomain.com_2017-04-27_18-13-27.log on node inx321dbadm02.somedomain.com 2017-04-27 18:17:10,464 [INFO][ OCMDThread][ OcmdException:62] Error: Error running root scripts, please investigate… 2017-04-27 18:17:10,464 [FINE][ OCMDThread][ OcmdException:63] Throwing OcmdException… message:Error running root scripts, please investigate… ...

April 28, 2017 at 2:31 PM · 4 min · 810 words · Amardeep Sidhu

Oracle RAC 12.1 – lsnodes exited with code 9

I was trying to do a 2-node RAC setup on Solaris 11.3 where Oracle Solaris Cluster 4.3 was already configured. The installer was running, but the Cluster Node Information screen was appearing like this. The install log shows this: INFO: Checking cluster configuration details INFO: Found Vendor Clusterware. Fetching Cluster Configuration INFO: Executing [/tmp/OraInstall2017-03-28_12-50-48PM/ext/bin/lsnodes] with environment variables {TERM=xterm, LC_COLLATE=, SHLVL=3, JAVA_HOME=, XFILESEARCHPATH=/usr/dt/app-defaults/%L/Dt, SSH_CLIENT=172.16.64.55 56370 22, LC_NUMERIC=, LC_MESSAGES=, MAIL=/var/mail/oracle, PWD=/export/software/grid/grid, XTERM_VERSION=XTerm(320), WINDOWID=2097165, LOGNAME=oracle, _=*50727*/export/software/grid/grid/install/.oui, NLSPATH=/usr/dt/lib/nls/msg/%L/%N.cat, SSH_CONNECTION=172.16.64.55 56370 172.16.72.18 22, OLDPWD=/export/oracle, LC_CTYPE=, CLASSPATH=, PATH=/usr/bin:/usr/ccs/bin:/usr/bin:/bin:/export/software/grid/grid/install, LC_ALL=, DISPLAY=localhost:10.0, LC_MONETARY=, USER=oracle, HOME=/export/oracle, XTERM_SHELL=/bin/bash, XAUTHORITY=/tmp/ssh-xauth-mlq21a/xauthfile, A__z="*SHLVL, XTERM_LOCALE=en_US.UTF-8, TZ=localtime, LC_TIME=, LANG=en_US.UTF-8} ...

March 28, 2017 at 10:01 PM · 2 min · 237 words · Amardeep Sidhu

addNode.sh, failed root.sh and IB listener

So this customer has an Exadata quarter rack with an IB listener configured on both DB nodes (for DB connections from a multi-rack Exalogic system). We were adding a new DB node to this rack, so we just followed the standard procedure of creating users, directories, etc. on the new node, setting up SSH equivalence and running addNode.sh. All went fine, but root.sh failed. A little digging into the logs revealed that it failed while running srvctl start listener -n <node_name> ...

September 13, 2016 at 7:44 PM · 2 min · 360 words · Amardeep Sidhu