Adding a new cell to Exadata with ASM scoped security enabled

A customer is using an Exadata X8M-2 machine with multiple VMs (hence multiple clusters). I was working on adding a new storage cell to the configuration. After creating griddisks on the new cell and updating cellip.ora on all the VMs, I noticed that none of the clusters could see the new griddisks. I checked the usual suspects: whether asm_diskstring was set properly and whether the private network subnet mask on the new cell matched the old ones. All looked good. I started searching about the issue and stumbled upon some references mentioning ASM scoped security. I checked one of the existing cells and that was indeed the issue: the existing cells had it enabled while the new one didn’t. Running this command on an existing cell ...

May 9, 2022 at 1:26 PM · 3 min · 555 words · Amardeep Sidhu

Smokescreen detects traffic from an Exadata VM

A customer using an Exadata X8M-2 with multiple VMs recently deployed Smokescreen in their company and reported that one of the Smokescreen decoy servers in their DC was seeing traffic from one of the Exadata VMs on a certain port. That was rather confusing, as that port was the database listener port on that VM, and why would a VM with Oracle RAC deployed try to access a random IP on the listener port? Also, it was happening only for this VM; none of the many other VMs showed it. ...

March 21, 2022 at 7:30 PM · 1 min · 182 words · Amardeep Sidhu

File system already present at specified mount point /dbfs_direct

It was actually funny, so I thought about posting it as a reminder of how we can sometimes miss the absolute basics. This customer is using a virtualized Exadata with multiple VMs. One VM hosts the database meant to be used for dbfs and another VM connects to this DB over IB to mount the dbfs file system using dbfs_client. One day the VMs were rebooted and for some reason the dbfs file system didn’t mount on startup. It went on for a few days and they couldn’t mount it. One day I got a chance to look at it and the error they were facing was: ...

June 5, 2021 at 8:19 PM · 2 min · 248 words · Amardeep Sidhu

Doing an Exadata mixed cells config with OEDA

Earlier versions of OEDA didn’t allow you to have mixed cells in the configuration, i.e. High Capacity (HC) and Extreme Flash (EF). The way to deal with that configuration was to deploy the system with either HC or EF cells and then manually configure the remaining cells. I am not sure when it changed, but the newer versions allow you to have mixed types of cells in a single OEDA configuration. Once you select the hardware, there is an additional option called Enable Additional Storage, where you can select the other type of cells. The minimum number of cells has to be three to use this option. Also, the cells that are physically at the bottom of the rack should be selected as main storage and the other cells should be added as additional storage, as that is how OEDA builds the configuration files. ...

October 27, 2020 at 6:53 PM · 2 min · 285 words · Amardeep Sidhu

PRVF-4657 : Name resolution setup check for “db-scan” (IP address: x.x.x.101) failed

A quick note about an error I faced while running root.sh on an Exadata machine. The configuration tools failed with the following error:

Error is PRVF-4657 : Name resolution setup check for "db-scan" (IP address: x.x.x.101) failed

I did nslookup on the scan name and it all seemed good. So why the error? After spending another 5 minutes, I looked at /etc/hosts and there it was. Someone had populated /etc/hosts of the DB nodes with all the hostname entries, including the scan name. Something like: ...

September 25, 2020 at 7:41 PM · 1 min · 145 words · Amardeep Sidhu

Using Secure Fabric for network isolation in KVM environments on Exadata

Exadata storage software version 20.1 introduces a new feature called “Secure Fabric” for KVM based multi cluster deployments (Exadata X8M). It enables network isolation between multiple tenants (i.e. KVM VM based RAC clusters). This feature aligns with Infiniband Partitioning on OVM based systems. Some customers in such scenarios want the VMs of one RAC cluster to be unable to see the traffic of another RAC cluster’s VMs. This feature achieves that. Similar to Pkeys in IB switches, here it uses a double VLAN tagging system where the first tag identifies the network partition and the second tag is used to denote the membership level of the VM. Exadata documentation has more details. ...

July 17, 2020 at 9:21 PM · 2 min · 215 words · Amardeep Sidhu

Exadata Virtualized DB node restore

There are two common scenarios when we may need this:

1. An existing DB node has crashed and is unrecoverable (due to some failure and non-availability of any backups. Though some of the things may need to be done even if the backups were available).
2. We have an existing Exadata rack that is virtualized. Now there is a new DB node and the existing clusters need to be extended to include the VMs on this new node.

I recently faced the first scenario where a virtualized DB node crashed and wasn’t recoverable. A bare metal DB node restore is a relatively simple procedure where we just have to reimage the node, create the needed directories, users etc. and add it to the RAC cluster. In case of virtualization, the creation of VMs is an additional step that needs to be done. That makes it slightly more complex. ...

May 11, 2020 at 9:31 PM · 5 min · 918 words · Amardeep Sidhu

dbnodeupdate.sh appears to be stuck

I was patching an Exadata db node from 18.1.5.0.0.180506 to 19.3.2.0.0.191119. It had been more than an hour and dbnodeupdate.sh appeared to be stuck. Trying to ssh to the node was giving “connection refused” and the console had this output (some output removed for brevity):

[ 458.006444] upgrade[8876]: [642/676] (72%) installing exadata-sun-computenode-19.3.2.0.0.191119-1...
<>
[ 459.991449] upgrade[8876]: Created symlink /etc/systemd/system/multi-user.target.wants/exadata-iscsi-reconcile.service, pointing to /etc/systemd/system/exadata-iscsi-reconcile.service.
[ 460.011466] upgrade[8876]: Looking for unit files in (higher priority first):
[ 460.021436] upgrade[8876]: /etc/systemd/system
[ 460.028479] upgrade[8876]: /run/systemd/system
[ 460.035431] upgrade[8876]: /usr/local/lib/systemd/system
[ 460.042429] upgrade[8876]: /usr/lib/systemd/system
[ 460.049457] upgrade[8876]: Looking for SysV init scripts in:
[ 460.057474] upgrade[8876]: /etc/rc.d/init.d
[ 460.064430] upgrade[8876]: Looking for SysV rcN.d links in:
[ 460.071445] upgrade[8876]: /etc/rc.d
[ 460.076454] upgrade[8876]: Looking for unit files in (higher priority first):
[ 460.086461] upgrade[8876]: /etc/systemd/system
[ 460.093435] upgrade[8876]: /run/systemd/system
[ 460.100433] upgrade[8876]: /usr/local/lib/systemd/system
[ 460.107474] upgrade[8876]: /usr/lib/systemd/system
[ 460.114432] upgrade[8876]: Looking for SysV init scripts in:
[ 460.122455] upgrade[8876]: /etc/rc.d/init.d
[ 460.129458] upgrade[8876]: Looking for SysV rcN.d links in:
[ 460.136468] upgrade[8876]: /etc/rc.d
[ 460.141451] upgrade[8876]: Created symlink /etc/systemd/system/multi-user.target.wants/exadata-multipathmon.service, pointing to /etc/systemd/system/exadata-multipathmon.service.

There was not much that I could do, so I just waited. I also created an SR with Oracle Support and they suggested waiting as well. It started moving after some time and completed successfully.
Finally, when the node came up, I checked and found an NFS mount entry in /etc/rc.local, and that was what had caused the problem. For the second node, we commented this out and it was all smooth. It is important to comment out all NFS entries during patching to avoid such issues. I had commented out the ones in /etc/fstab but the one in rc.local was unexpected. ...
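A pre-patching check for leftover NFS entries could be sketched in Python along the following lines. This is only a rough sketch and not part of dbnodeupdate.sh or any Oracle tooling: the function name active_nfs_lines is made up for illustration, and the match is a simple heuristic that looks for the word nfs (or nfs4) on lines that are not commented out, so a mount command that does not name the filesystem type would be missed.

```python
import re


def active_nfs_lines(text):
    """Return the non-comment lines of `text` that mention nfs or nfs4.

    Hypothetical helper: a plain word match, not a full fstab or shell parser.
    """
    hits = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and lines that are already commented out.
        if not line or line.startswith("#"):
            continue
        if re.search(r"\bnfs4?\b", line):
            hits.append(line)
    return hits


if __name__ == "__main__":
    # The two files mentioned in the post; others may matter on a given system.
    for path in ("/etc/fstab", "/etc/rc.local"):
        try:
            with open(path) as fh:
                content = fh.read()
        except OSError:
            continue  # file missing or unreadable: nothing to report
        for line in active_nfs_lines(content):
            print(f"{path}: {line}")
```

Any line it prints is a candidate to comment out before patching and to restore afterwards.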

December 21, 2019 at 6:37 AM · 2 min · 277 words · Amardeep Sidhu

Understanding grid disks in Exadata

Use of Exadata storage cells seems to be a very poorly understood concept. A lot of people are confused about how exactly ASM makes use of disks from storage cells. Many folks assume there is some sort of RAID configured in the storage layer, whereas there is nothing like that. I will try to explain some of the concepts in this post. Let’s take an example of an Exadata quarter rack that has 2 db and 3 storage nodes (node means a server here). A few things to note: ...

February 18, 2019 at 6:37 PM · 3 min · 564 words · Amardeep Sidhu

dbca doesn’t list diskgroups

This is an Exadata machine running GI version 18.3.0.0.180717 and DB version 12.1.0.2.180717. On one of the DB nodes, dbca doesn’t list the diskgroups; it works fine on the other node. I checked the dbca trace and found that the kfod command was failing. I tried to run it manually and got the same error:

[oracle@exadb01 ~]$ /u01/app/18.0.0.0/grid/bin/kfod op=groups verbose=true
KFOD-00300: OCI error [-1] [OCI error] [Could not fetch details] [-105777048]
KFOD-00105: Could not open pfile 'init@.ora'
[oracle@exadb01 ~]$

I ran it with strace then: ...

December 26, 2018 at 9:01 PM · 2 min · 410 words · Amardeep Sidhu