Tag Archives: Exadata

addNode.sh, failed root.sh and IB listener

So this customer has an Exadata quarter rack and they have an IB listener configured on both DB nodes (for DB connections from a multi-racked Exalogic system). We were adding a new DB node to this rack. So just followed the standard procedure of creating users, directories etc on the new node, setting up ssh equivalence and running addNode.sh. All went fine but root.sh failed. Little looking into the logs revealed that it failed while running srvctl start listener –n <node_name>

If we manually run this command, it will immediately reveal what the problem is. It is not able to start IB listener on the new node as the IB VIP doesn’t yet exist. It could happen for any of the additional networks added.

There is a MOS note that describes this exact situation but the solution that it gives is to remove the additional listener, complete addNode.sh & root.sh and add the additional listener back. That wasn’t possible in this case. After little bit of googling I stumbled upon this post by Jeremy Schneider. His colleague solved this problem with a very simple and clever workaround. Before root.sh prepares to run srvctl start listener command, run the add VIP command from another Window Winking smile. Additional network would have already got added when root.sh runs on the new node.

To be able to perform this trick, you have to have the hosts file updated with the new VIP name and IP and be ready with the command to add the VIP. While root.sh is running, it will show a message like “there is already an active cluster, restarting to join”, immediately start trying to run srvctl add vip command in another window. The moment CRS, comes up the command will succeed. Immediately after that root.sh is going to run srvctl start listener command, and this time it shouldn’t fail as the VIP is already added.

Another small mistake we made was not updating the cellip.ora on the new node before running root.sh. That caused the root.sh to fail as it couldn’t talk to ASM running on existing cell nodes. Updating cellip.ora with the existing storage node IPs fixed the problem.

OEDA–Things to keep an eye on

So if you are filling an OEDA for Exadata deployment there are few things you should take care of. Most of the screens are self explanatory but there are some bits where one should focus little more. I am running the Aug version of it and the screenshots below are from that version.

  1. On the Define customer networks screen, the client network is the actual network where your data is going to flow. So typically it is going to be bonded (for high availability) and depending upon the network in your data center you have to select one out of 1/10 G copper and 10 G optical.

    image

  2. If you are going to use trunk VLANs for your client network, remember to enabled it by clicking the Advanced button and then entering the relevant VLAN id.
  3. image

    Also if it is going to be an OVM configuration, you may want to have different VMs in different VLAN segments. It will allow you to change VLAN ids for individual VMs on the respective cluster screens like below

    image

  4. If all the cores aren’t licensed remember to enable Capacity on Demand (COD) on the Identify Compute node OS screen.
  5. image

  6. On the Define clusters screen make sure that you enter a unique (across your environment) cluster name.

    image

  7. The cluster details screen captures some of the most important details like
    1. Whether you want to have flash cache in WriteBack mode instead of WriteThrough
    2. Whether you want to have a role separated install or want to install both GI and Oracle binaries with oracle user itself.
    3. GI & Database versions and home for binaries. Always good to leave it at the Oracle recommended values as that makes the future maintenance easy and less painful.
    4. Disk Group names, redundancy and the space allocation.
    5. Default database name and type (OLTP or DW).

      image

Of course it is important to carefully fill the information in all the screens but the above ones are some of them which should be filled very carefully after capturing the required information from other teams, if needed.

ORA-56841: Master Diskmon cannot connect to a CELL

Faced this error while querying v$asm_disk after adding new storage cell IPs to cellip.ora on DB nodes of an existing cluster on Exadata. Query ends with ORA-03113 end-of-file on communication channel and ORA-56841 is reported in $ORA_CRS_HOME/log/<hostname>/diskmon/diskmon.log. Reason in my case was that the new cell was using different subnet for IB. It was pingable from the db nodes but querying v$asm_disk wasn’t working. Changing the subnet for IB on new cell to the one on existing cells fixed the issue.

 

 

Want to learn Exadata ?

Many people have asked me this question that how they can learn Exadata ? It starts sounding even more difficult as a lot of people don’t have access to Exadata environments. So thought about writing a small post on the same.

It actually is not as difficult as it sounds. There are a lot of really good resources available from where you can learn about Exadata architecture and the things that work differently from any non-Exadata platform. You might be able to do lot more RnD if you have got access to an Exadata environment but don’t worry if you haven’t. Without that also there is a lot that you can explore. So here we go:

  1. I think the best reference that one can start with is Expert Oracle Exadata book by Tanel Poder, Kerry Osborne and Randy Johnson. As a traditional book covers the subject topic by topic from ground up so it makes a fun read. This book is also no different. It will teach you a lot. They are already working on the second edition. (See here).
  2. Next you can jump to whitepapers on Oracle website Exadata page, blog posts (keep an eye on OraNA.info) and whitepapers written by other folks. There is a lot of useful material out there. You just need to Google a bit.
  3. Exadata documentation (not public yet) should be your next stop if you have got access to it. Patch 10386736 on MOS if you have got the access.
  4. Try to attend an Oracle Users Group conference if there is one happening in your area. Most likely someone would be presenting on Exadata so you can use that opportunity to learn about it. Also you will get a chance to ask him questions.
  5. Lastly if you have an Exadata machine available do all the RnD you can.

Happy New Year and Happy Learning !

Updating to Exadata 11.2.3.1.1

Just a quick note about change in the way the compute nodes are patched starting from version 11.2.3.1.1. For earlier versions Oracle provided the minimal pack for patching the compute nodes. Starting with version 11.2.3.1.1 Oracle has discontinued the minimal pack and the updates to compute nodes are done via Unbreakable Linux Network (ULN).

Now there are three ways to update the compute nodes:

1) You have internet access on the Compute nodes. In this case you can download patch 13741363, complete the one time setup and start the update.

2) In case you don’t have internet access on the Compute nodes you can choose some intermediate system (that has internet access) to create a local repository and then point the Compute nodes to this system to install the updates.

3) Oracle will also provide all the future updates via an downloadable ISO image file (patch 14245540 for 11.2.3.1.1). You can download that ISO image file, mount it on some local system and point the compute nodes to this system for updating the rpms (the readme has all the details on how to do this).

Some useful links:

https://blogs.oracle.com/XPSONHA/entry/updating_exadata_compute_nodes_using

https://blogs.oracle.com/XPSONHA/entry/new_channels_for_exadata_11

Metalink note 1466459.1