I’ve been meaning to write this up for a while now, but just haven’t found the time. Anyway, this is a little “gotcha” for those installing 11.2 Grid Infrastructure that care about consistency of naming… Maybe you don’t? Maybe I shouldn’t?
While building a 4 node RAC system I got to the point of:
You must run the root.sh script on the first node and wait for it to finish. If your cluster has four or more nodes, then root.sh can be run concurrently on all nodes but the first and last. As with the first node, the root.sh script on the last node must be run separately.
So, I merrily run root.sh and afterwards find that my ASM instances are named in a way I didn’t like or expect. My 4 servers were named: ora11-2-1, ora11-2-2, ora11-2-3, ora11-2-4; and I ended up with ASM instances: +ASM1, +ASM2, +ASM3, +ASM4. All as you’d expect. However, +ASM2 was running on ora11-2-3 and +ASM3 was running on ora11-2-2!
Q1: Does it really matter?
A1: No. At least I can’t see a reason why it would matter, but if you can think of any then please comment.
Q2: Did I want to understand why it happened and how to avoid it?
A2: Of course.
So, a little digging and experimentation later I found what I believe to be the cause of the “problem”. In the rootcrs_`hostname`.log files I found the start time and the point where the ASM instance is created.
Note: There wasn’t anything specifically stating that the ASM instance was being created, but while running root.sh during later tests I watched for the creation of the ASM record in /etc/oratab and correlated that with the log file.
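For anyone wanting to repeat that check, here is a minimal sketch of how the oratab watch could be scripted. The function name wait_for_asm_entry is my own invention, and it assumes the ASM record is a line beginning with +ASM; it simply polls the file once a second and prints a timestamp when the entry shows up, so the time can be compared against the rootcrs log.

```shell
# Hypothetical helper (not part of any Oracle tooling): poll an oratab-style
# file until a line starting with "+ASM" appears, then print the time it was
# first seen together with the matching line.
wait_for_asm_entry() {
    oratab=${1:-/etc/oratab}
    # Keep checking once a second until the ASM record exists.
    while ! grep -q '^[+]ASM' "$oratab" 2>/dev/null; do
        sleep 1
    done
    entry=$(grep '^[+]ASM' "$oratab" | head -n 1)
    printf '%s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$entry"
}

# Usage, in a second terminal while root.sh is running:
#   wait_for_asm_entry /etc/oratab
```

The timestamp uses the same date and time layout as the rootcrs log, which makes it easier to line the two up by eye.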
Start of the root.sh on nodes 2 and 3:
[root@ora11-2-2 ~]# grep "The configuration" $ORACLE_HOME/cfgtoollogs/crsconfig/rootcrs_ora11-2-*.log
2011-01-08 00:48:48: The configuration parameter file /u01/app/18.104.22.168/grid/crs/install/crsconfig_params is valid
[root@ora11-2-3 ~]# grep "The configuration" $ORACLE_HOME/cfgtoollogs/crsconfig/rootcrs_ora11-2-*.log
2011-01-08 00:48:54: The configuration parameter file /u01/app/22.214.171.124/grid/crs/install/crsconfig_params is valid
Creation of ASM instance on nodes 2 and 3:
[root@ora11-2-2 ~]# grep "Start of resource \"ora.cluster_interconnect.haip\" Succeeded" $ORACLE_HOME/cfgtoollogs/crsconfig/rootcrs_ora11-2-*.log
2011-01-08 00:56:50: Start of resource "ora.cluster_interconnect.haip" Succeeded
[root@ora11-2-3 ~]# grep "Start of resource \"ora.cluster_interconnect.haip\" Succeeded" $ORACLE_HOME/cfgtoollogs/crsconfig/rootcrs_ora11-2-*.log
2011-01-08 00:56:34: Start of resource "ora.cluster_interconnect.haip" Succeeded
The key thing to note is the times. The running of root.sh on ora11-2-2 started before ora11-2-3, but for whatever reason it got to the creation of the ASM instance on ora11-2-3 before it did on ora11-2-2.
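To make the theory concrete, here is a small sketch that orders nodes by the time they reached the HAIP message and hands out ASM numbers in that order. It only models my guess that numbers are allocated first come, first served; the function name predict_asm_order and its input format (hostname, date, time) are made up for illustration, and the starting number is passed in because the first node has already taken +ASM1.

```shell
# Hypothetical helper modelling the "first to get there gets the next number"
# theory. Input on stdin: one line per node, "<hostname> <YYYY-MM-DD> <HH:MM:SS>".
# Argument 1: the first ASM number to hand out (defaults to 1).
predict_asm_order() {
    # Dates and times in this layout sort correctly as plain text.
    sort -k2,2 -k3,3 |
        awk -v n="${1:-1}" '{ printf "+ASM%d %s\n", n + NR - 1, $1 }'
}

# The HAIP timestamps from nodes 2 and 3, numbering from 2:
predict_asm_order 2 <<'EOF'
ora11-2-2 2011-01-08 00:56:50
ora11-2-3 2011-01-08 00:56:34
EOF
# prints:
#   +ASM2 ora11-2-3
#   +ASM3 ora11-2-2
```

That matches what I ended up with, which is consistent with the theory but obviously doesn't prove it.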
I found it impossible to leave the system with the naming mismatch, so I used rootcrs.pl to deconfigure Clusterware and re-ran root.sh, this time allowing it to finish on each node before starting the next. I ended with the ASM instance names matching the hostnames and got on with creating databases.
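If you would rather script the one-node-at-a-time approach than babysit four terminals, something like the following sketch would do it. It is not what I actually ran: it assumes root can ssh to every node and that the Grid home path in $ORACLE_HOME is the same everywhere, and the function names run_on_node and run_root_serially are my own.

```shell
# Assumption: root ssh access to each node, and $ORACLE_HOME set locally to
# the Grid Infrastructure home, with the same path on every node.
run_on_node() {
    ssh "root@$1" "$ORACLE_HOME/root.sh"
}

# Run root.sh on each node in the order given, waiting for each run to finish
# and stopping at the first failure so later nodes are not started.
run_root_serially() {
    for node in "$@"; do
        echo "running root.sh on $node"
        run_on_node "$node" || {
            echo "root.sh failed on $node" >&2
            return 1
        }
    done
}

# Usage:
#   run_root_serially ora11-2-1 ora11-2-2 ora11-2-3 ora11-2-4
```

Because each ssh call blocks until root.sh returns, the nodes are configured strictly in the order listed.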
I haven’t tested this or dug deep enough into the code to be 100% sure of the above explanation, so if anyone has alternative suggestions then please share them.