dbnodeupdate.sh appears to be stuck

I was patching an Exadata db node from 18.1.5.0.0.180506 to 19.3.2.0.0.191119. More than an hour had passed and dbnodeupdate.sh appeared to be stuck. Trying to ssh to the node gave “connection refused”, and the console showed this output (some output removed for brevity):

[  458.006444] upgrade[8876]: [642/676] (72%) installing exadata-sun-computenode-19.3.2.0.0.191119-1...
<>
[  459.991449] upgrade[8876]: Created symlink /etc/systemd/system/multi-user.target.wants/exadata-iscsi-reconcile.service, pointing to /etc/systemd/system/exadata-iscsi-reconcile.service.
[  460.011466] upgrade[8876]: Looking for unit files in (higher priority first):
[  460.021436] upgrade[8876]: /etc/systemd/system
[  460.028479] upgrade[8876]: /run/systemd/system
[  460.035431] upgrade[8876]: /usr/local/lib/systemd/system
[  460.042429] upgrade[8876]: /usr/lib/systemd/system
[  460.049457] upgrade[8876]: Looking for SysV init scripts in:
[  460.057474] upgrade[8876]: /etc/rc.d/init.d
[  460.064430] upgrade[8876]: Looking for SysV rcN.d links in:
[  460.071445] upgrade[8876]: /etc/rc.d
[  460.076454] upgrade[8876]: Looking for unit files in (higher priority first):
[  460.086461] upgrade[8876]: /etc/systemd/system
[  460.093435] upgrade[8876]: /run/systemd/system
[  460.100433] upgrade[8876]: /usr/local/lib/systemd/system
[  460.107474] upgrade[8876]: /usr/lib/systemd/system
[  460.114432] upgrade[8876]: Looking for SysV init scripts in:
[  460.122455] upgrade[8876]: /etc/rc.d/init.d
[  460.129458] upgrade[8876]: Looking for SysV rcN.d links in:
[  460.136468] upgrade[8876]: /etc/rc.d
[  460.141451] upgrade[8876]: Created symlink /etc/systemd/system/multi-user.target.wants/exadata-multipathmon.service, pointing to /etc/systemd/system/exadata-multipathmon.service.

There was not much I could do, so I just waited. I also created an SR with Oracle Support, and they suggested the same thing: wait. After some time it started moving again and eventually completed successfully. When the node finally came up, I found an NFS mount entry in /etc/rc.local, and that was what had caused the problem. For the second node we commented it out, and everything went smoothly. It is important to comment out all NFS entries before patching to avoid such issues. I had commented out the ones in /etc/fstab, but the one in rc.local was an unexpected one.
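
Here is a minimal pre-patching check, a sketch that assumes the NFS mounts are defined only in /etc/fstab and /etc/rc.local (adjust if yours live elsewhere):

# Look for NFS references in both files; anything found should be
# unmounted and commented out for the duration of the patching.
grep -inE 'nfs' /etc/fstab /etc/rc.local

# Unmount any NFS filesystems that are currently mounted.
umount -a -t nfs,nfs4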

AVDF installation error

I was installing Database Firewall version 12.2.0.11.0 on a Dell x86 machine (with 5 * 500 GB local HDDs configured in RAID 10) and the installation completed successfully. Later on, I came to know that this version doesn’t support the host monitor functionality on Windows hosts; the latest version that supports it is 12.2.0.10.0. So I had to download and install 12.2.0.10.0 instead. The installation started fine, but it failed with an error:

Exception occured

anaconda 13.21.263 exception report

File "/usr/lib/anaconda/storage/devices.py",

OSError: [Errno 2] No such file or directory:
'/dev/sr0'

From the script it was calling, i.e. devices.py, I guessed it had something to do with the storage. Maybe it was not able to handle something that had been created by the newer version’s installation. So I removed the RAID configuration and created it again. After this, the installation went through without any issues.

OGB Appreciation Day : Thank you community ! (#ThanksOGB)

OGB Appreciation Day was started by Tim Hall in 2016, and this is my thank-you-community post. There are so many experts posting on Oracle-related forums, writing blog posts and sharing their scripts with everyone. All of you are doing a great job. I would like to mention three names especially:

Tim Hall : Tim is a legend ! I don’t consider something a new feature until Tim writes about it 😀

Jonathan Lewis : I don’t think there is anyone on this planet who has even once worked on a performance problem and hasn’t gained something from the knowledge shared by him on forums or in one of the blog posts.

Tanel Poder : His hacking sessions, blog posts and scripts are awesome. And ashtop is amazing man !

Thank you, all of you. We learn from every one of you. Keep rocking !

Understanding grid disks in Exadata

The use of Exadata storage cells seems to be a very poorly understood concept. A lot of people are confused about how exactly ASM makes use of the disks from the storage cells. Many folks assume there is some sort of RAID configured in the storage layer, whereas there is nothing like that. I will try to explain some of these concepts in this post.

Let’s take the example of an Exadata quarter rack, which has 2 db nodes and 3 storage nodes (node means a server here). A few things to note:

  • The space for installing binaries on the db nodes comes from the local disks installed in the db nodes (4 * 600 GB, expandable to 8, configured in RAID 5). In case you are using OVM, the same disks are used for keeping configuration files, virtual disks for the VMs, etc.
  • All of the ASM space comes from the storage cells. The minimum configuration is 3 storage cells.

So let’s try to understand what makes up a storage cell. There are 12 disks in each storage cell (the latest X7 cells come with 10 TB disks). As I mentioned above, there are 3 storage cells in the minimum configuration, so we have a total of 36 disks. There is no RAID configured in the storage layer; all the redundancy is handled at the ASM level. So, to create a disk group:

  • First of all, cell disks are created on each storage cell. One physical disk makes one cell disk, so a quarter rack has 36 cell disks.
  • To divide the space into the various disk groups (by default only two disk groups are created, DATA and RECO; you can choose how much space to give to each of them), grid disks are created. A grid disk is a partition on a cell disk, a slice of the disk in other words. A slice from each cell disk must be part of both disk groups; something like DATA having 18 of the 36 disks and RECO having the other 18 is not supported. Let’s say you decide to allocate 5 TB to the DATA grid disks and 4 TB to the RECO grid disks (out of 10 TB on each disk, approximately 9 TB is what you get as usable). So you will divide each cell disk into two parts of 5 TB and 4 TB, and you end up with 36 slices of 5 TB each and 36 slices of 4 TB each (see the sketch after this list).
  • The DATA disk group will be created using the 36 slices of 5 TB, where the grid disks from each storage cell constitute one failgroup.
  • Similarly, the RECO disk group will be created using the 36 slices of 4 TB.
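
Here is a minimal sketch of those steps, with illustrative prefixes and sizes. The cellcli commands run on each storage cell (shown here via dcli from a db node, assuming a cell_group file listing the cells), and the disk groups are then created from an ASM instance on a db node.

# Create one cell disk per physical disk, then carve each cell disk into
# a DATA grid disk and a RECO grid disk (RECO takes the remaining space).
dcli -g cell_group -l celladmin cellcli -e "create celldisk all harddisk"
dcli -g cell_group -l celladmin cellcli -e "create griddisk all harddisk prefix=DATA, size=5T"
dcli -g cell_group -l celladmin cellcli -e "create griddisk all harddisk prefix=RECO"

# From an ASM instance on a db node (SQL*Plus as sysasm), the disk string
# matches the grid disks from all three cells, and ASM automatically places
# each cell's grid disks into their own failgroup:
#   create diskgroup DATA normal redundancy disk 'o/*/DATA*'
#     attribute 'cell.smart_scan_capable'='TRUE';
#   create diskgroup RECO normal redundancy disk 'o/*/RECO*'
#     attribute 'cell.smart_scan_capable'='TRUE';

With normal redundancy, ASM then mirrors every extent across grid disks in two different cells, which is why no RAID is needed at the storage layer.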

What we have discussed above is a quarter rack scenario with High Capacity (HC) disks. There can be somewhat different configurations too:

  • Instead of HC disks, you can have the Extreme Flash (EF) configuration, which uses flash cards in place of disks. Everything remains the same except the count: instead of 12 HC disks there are 8 flash cards per cell.
  • With X3, I think, Oracle introduced an eighth rack configuration. In an eighth rack configuration the db nodes come with half the cores (of the quarter rack db nodes) and the storage cells come with 6 disks in each cell, so here you would have only 18 disks in total. Everything else works in the same way.

I hope this clarified some of the doubts about grid disks.


ORA-04080: trigger ‘PRICE_HISTORY_TRIGGERV1’ does not exist

This is actually a dumb one. I was disabling triggers in a schema and ran this SQL to generate the disable statements (example from here):

HR@test> select 'alter trigger '||trigger_name|| ' disable;' from user_triggers where table_name='PRODUCT';

'ALTERTRIGGER'||TRIGGER_NAME||'DISABLE;'
--------------------------------------------------------------------------------
alter trigger PRICE_HISTORY_TRIGGERv1 disable;

HR@test> alter trigger PRICE_HISTORY_TRIGGERv1 disable;
alter trigger PRICE_HISTORY_TRIGGERv1 disable
*
ERROR at line 1:
ORA-04080: trigger 'PRICE_HISTORY_TRIGGERV1' does not exist


HR@test>

WTF ? It is there, but the disable didn’t work. I was in a hurry, tried to connect through SQL Developer and disable it there, and it worked ! Double WTF ! Then I spotted the problem: someone had created the trigger with one letter of the name in lowercase. So to make it work, we need to use double quotes.

HR@test> alter trigger "PRICE_HISTORY_TRIGGERv1" disable;

Trigger altered.

HR@test>
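
To avoid hitting this in the future, the generator query itself can wrap the trigger name in double quotes; this is just a small variation on the query above:

-- Quoting the name in the generated statement makes it work regardless of
-- how the trigger name was cased when it was created.
select 'alter trigger "' || trigger_name || '" disable;'
from   user_triggers
where  table_name = 'PRODUCT';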

This is one of the reasons why you shouldn’t use case-sensitive names in Oracle: it is a stupid thing to do.