Amardeep Sidhu's Tech blog

Exadata Virtualized DB node restore

There are two common scenarios where we may need this: (1) An existing DB node has crashed and is unrecoverable (due to some failure and no backups being available, though some of these steps may be needed even when backups are available). (2) We have an existing Exadata rack that is virtualized, a new DB node has been added, and the existing clusters need to be extended to include the VMs on this new node. I recently faced the first scenario, where a virtualized DB node crashed and wasn’t recoverable. A bare metal DB node restore is a relatively simple procedure: we just have to reimage the node, create the needed directories, users etc. and add it to the RAC cluster. With virtualization, creating the VMs is an additional step, which makes the procedure slightly more complex. ...

dbnodeupdate.sh appears to be stuck

I was patching an Exadata DB node from 18.1.5.0.0.180506 to 19.3.2.0.0.191119. It had been more than an hour and dbnodeupdate.sh appeared to be stuck. Trying to ssh to the node was giving “connection refused” and the console had this output (some output removed for brevity):

[ 458.006444] upgrade[8876]: [642/676] (72%) installing exadata-sun-computenode-19.3.2.0.0.191119-1...
<>
[ 459.991449] upgrade[8876]: Created symlink /etc/systemd/system/multi-user.target.wants/exadata-iscsi-reconcile.service, pointing to /etc/systemd/system/exadata-iscsi-reconcile.service.
[ 460.011466] upgrade[8876]: Looking for unit files in (higher priority first):
[ 460.021436] upgrade[8876]: /etc/systemd/system
[ 460.028479] upgrade[8876]: /run/systemd/system
[ 460.035431] upgrade[8876]: /usr/local/lib/systemd/system
[ 460.042429] upgrade[8876]: /usr/lib/systemd/system
[ 460.049457] upgrade[8876]: Looking for SysV init scripts in:
[ 460.057474] upgrade[8876]: /etc/rc.d/init.d
[ 460.064430] upgrade[8876]: Looking for SysV rcN.d links in:
[ 460.071445] upgrade[8876]: /etc/rc.d
[ 460.076454] upgrade[8876]: Looking for unit files in (higher priority first):
[ 460.086461] upgrade[8876]: /etc/systemd/system
[ 460.093435] upgrade[8876]: /run/systemd/system
[ 460.100433] upgrade[8876]: /usr/local/lib/systemd/system
[ 460.107474] upgrade[8876]: /usr/lib/systemd/system
[ 460.114432] upgrade[8876]: Looking for SysV init scripts in:
[ 460.122455] upgrade[8876]: /etc/rc.d/init.d
[ 460.129458] upgrade[8876]: Looking for SysV rcN.d links in:
[ 460.136468] upgrade[8876]: /etc/rc.d
[ 460.141451] upgrade[8876]: Created symlink /etc/systemd/system/multi-user.target.wants/exadata-multipathmon.service, pointing to /etc/systemd/system/exadata-multipathmon.service.

There was not much that I could do, so I just waited. I also created an SR with Oracle Support, and they also suggested waiting. It started moving after some time and completed successfully.
Finally, when the node came up, I found that there was an NFS mount entry in /etc/rc.local, and that was what had caused the problem. For the second node, we commented this out and it all went smoothly. It is important to comment out all NFS entries during patching to avoid such issues. I had commented out the ones in /etc/fstab, but the one in rc.local was unexpected. ...
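A quick pre-patching check for leftover NFS entries could look something like the sketch below. This is not part of dbnodeupdate.sh or any Oracle tooling; the function name find_nfs_entries is made up for illustration, and the match on the word "nfs" (or "nfs4") is only a heuristic, so a mount command that doesn't name the filesystem type would be missed.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle tool): print uncommented lines that
# look like NFS mounts in the files given as arguments.
# Assumption: any non-comment line containing the word "nfs" or "nfs4"
# is worth a manual look before patching.
find_nfs_entries() {
    for f in "$@"; do
        # Skip files that don't exist or can't be read.
        [ -r "$f" ] || continue
        # First grep: file name (-H), line number (-n), case-insensitive (-i),
        # whole-word (-w) match. Second grep: drop lines whose content
        # starts with '#', i.e. entries that are already commented out.
        grep -EHniw 'nfs4?' "$f" | grep -Ev '^[^:]*:[0-9]+:[[:space:]]*#' || true
    done
    return 0
}
```

Running `find_nfs_entries /etc/fstab /etc/rc.local` before starting the patch prints each suspicious line as file:line:content, and prints nothing if everything is already commented out.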

AVDF installation error

I was installing Database Firewall version 12.2.0.11.0 on a Dell x86 machine (with 5 * 500 GB local HDDs configured in RAID 10) and it installed successfully. Later on, I came to know that this version doesn’t support host monitor functionality on Windows hosts. The latest version that supports it is 12.2.0.10.0. So it was time to download and install 12.2.0.10.0. The installation started fine but then failed with an error: ...

OGB Appreciation Day : Thank you community ! (#ThanksOGB)

It was started by Tim Hall in 2016. This is a Thank you community post. There are so many experts posting on Oracle related forums, doing blog posts, sharing their scripts with everyone. All of you are doing a great job. I would like to mention three names especially: Tim Hall : Tim is a legend ! I don’t consider something a new feature until Tim writes about it :D Jonathan Lewis : I don’t think there is anyone on this planet who has even once worked on a performance problem and hasn’t gained something from the knowledge shared by him on forums or in one of the blog posts. ...

Understanding grid disks in Exadata

The use of Exadata storage cells seems to be a very poorly understood concept. A lot of people are confused about how exactly ASM makes use of disks from the storage cells. Many folks assume there is some sort of RAID configured in the storage layer, whereas there is nothing like that. I will try to explain some of the concepts in this post. Let’s take the example of an Exadata quarter rack that has 2 DB and 3 storage nodes (node means a server here). A few things to note: ...

ORA-04080: trigger ‘PRICE_HISTORY_TRIGGERV1’ does not exist

It is actually a dumb one. I was disabling triggers in a schema and ran this SQL to generate the disable statements. (Example from here)

HR@test> select 'alter trigger '||trigger_name|| ' disable;' from user_triggers where table_name='PRODUCT';

'ALTERTRIGGER'||TRIGGER_NAME||'DISABLE;'
--------------------------------------------------------------------------------
alter trigger PRICE_HISTORY_TRIGGERv1 disable;

HR@test> alter trigger PRICE_HISTORY_TRIGGERv1 disable;
alter trigger PRICE_HISTORY_TRIGGERv1 disable
*
ERROR at line 1:
ORA-04080: trigger 'PRICE_HISTORY_TRIGGERV1' does not exist

HR@test>

WTF ? It is there but the disable didn’t work. I was in a hurry, tried to connect through SQL Developer and disable it, and it worked ! Double WTF ! Then I spotted the problem. Someone had created it with one lowercase letter in the name. Unquoted identifiers are folded to uppercase, so to make it work we need to enclose the name in double quotes. ...

Error while running ggsci

This was another issue that I faced while trying to configure GoldenGate in HA mode. ggsci was working fine after a normal installation, but after configuring it in HA mode, running ggsci resulted in this:

[oragg@node2 product]$ ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 12.3.0.1.4 OGGCORE_12.3.0.1.0_PLATFORMS_180415.0359_FBO
Linux, x64, 64bit (optimized), Oracle 12c on Apr 16 2018 00:53:30
Operating system character set identified as UTF-8.
Copyright (C) 1995, 2018, Oracle and/or its affiliates. All rights reserved.
2019-01-08 16:28:37.913
CLSD: An error occurred while attempting to generate a full name. Logging may not be active for this process
Additional diagnostics: CLSU-00100: operating system function: sclsdgcwd failed with error data: -1
CLSU-00103: error location: sclsdgcwd2
(:CLSD00183:)
GGSCI (node2) 1>

There were no obvious clues in the error message, but a little searching revealed that it had something to do with permissions. It was on Exadata, so I tried an strace of ggsci to see if it could give some clues. There we go: ...

Failed to execute the command “”/u01/app/xag/bin/clsecho”

I was configuring GoldenGate in HA mode by following this document. Everything worked OK, but at the end, while running agctl config goldengate to view the configuration of the GoldenGate resource, it failed with the following error:

[oracle@exadatadb02 ~]$ agctl config goldengate GG_TARGET
Failed to execute the command ""/u01/app/xag/bin/clsecho" -p xag -f xag -m 5080 "GG_TARGET"" (rc=134), with the message:
Oracle Clusterware infrastructure fatal error in clsecho.bin (OS PID 126367_140570897783808): Internal error (ID (:CLSB00107:)) - Error -1 (ORA-08275) determining Oracle base
/u01/app/xag/bin/clsecho: line 45: 126367 Aborted (core dumped) ${CRS_HOME}/bin/clsecho.bin "$@"
Failed to execute the command ""/u01/app/xag/bin/clsecho" -p xag -f xag -m 5081 "/u01/app/oragg/product"" (rc=134), with the message:

If you look at the error in bold, it sounds kinda obvious that it is not able to figure out where the ORACLE_BASE is. But somehow it didn’t strike me at that moment, so I started looking around. If we look at the command it is running, it runs clsecho. This is simply a shell script which in turn calls $CRS_HOME/bin/clsecho.bin. The script sets various environment variables, and that is where the problem was. There are lines like: ...

dbca doesn’t list diskgroups

This is an Exadata machine running GI version 18.3.0.0.180717 and DB version 12.1.0.2.180717. On one of the DB nodes, dbca doesn’t list the diskgroups when run. It works fine on the other node. I checked the dbca trace and found that the kfod command was failing. I tried to run it manually and got the same error:

[oracle@exadb01 ~]$ /u01/app/18.0.0.0/grid/bin/kfod op=groups verbose=true
KFOD-00300: OCI error [-1] [OCI error] [Could not fetch details] [-105777048]
KFOD-00105: Could not open pfile 'init@.ora'
[oracle@exadb01 ~]$

I then ran it with strace: ...

New web based OEDA for Exadata

It started with an xls sheet (that was called dbm configurator). Then OEDA (Oracle Exadata Deployment Assistant) was introduced, a Java based GUI tool for entering all the information needed to configure an Exadata machine. Now, with the latest patch released in Oct, OEDA has changed again to become a web based tool. It is deployed on WebLogic and comes with some new features as well. SuperCluster deployments will continue to use the Java based OEDA tool. The new interface has support for Exadata, ZDLRA and ExaCC. It is backward compatible and can import the XMLs generated by older versions of OEDA. Some of the new features include the ability to configure single instance homes, create more than 2 diskgroups, create more than one database home and database, allow ILOMs to have a different subnet etc. ...