Deterministic Service Failover

Until some testing about 6 months ago I was under the impression that if a database service was defined as being “preferred” on 1 instance of a RAC database and “available” on N other instances (where N is greater than 1); and the preferred instance is killed then the instance that the service moves to is not deterministic, i.e., Oracle picks one of the available instances to start the service on in a unpredictable way. I cannot work out what gave me this impression, so will just have to put it down to an assumption I made at some point in the past and never validated. From conversations with other DBAs I was not alone.

Anyway, I now believe that the failover it deterministic as demonstrated below.

Creating some test services

$ srvctl add service -d DB_TST -s DB_TST_SVC1 -r "DB_TST1" -a "DB_TST2,DB_TST3,DB_TST4"
$ srvctl add service -d DB_TST -s DB_TST_SVC2 -r "DB_TST1" -a "DB_TST3,DB_TST4,DB_TST2"
$ srvctl add service -d DB_TST -s DB_TST_SVC3 -r "DB_TST1" -a "DB_TST4,DB_TST2,DB_TST3"

Verifying the state of the services

$ srvctl status service -d DB_TST
Service DB_TST_SVC1 is running on instance(s) DB_TST1
Service DB_TST_SVC2 is running on instance(s) DB_TST1
Service DB_TST_SVC3 is running on instance(s) DB_TST1

Killing instance 1

At this point I killed instance 1 using “shutdown abort” in SQL*Plus.

Checking the status of the services

$ srvctl status service -d DB_TST
Service DB_TST_SVC1 is running on instance(s) DB_TST2
Service DB_TST_SVC2 is running on instance(s) DB_TST3
Service DB_TST_SVC3 is running on instance(s) DB_TST4

As you can see the service failed over to the instance based on the order the instances are defined in the “available” list. I did a few more tests to convince myself that this wasn’t just a one off, but won’t fill this post with the output.

If you’re reading this and thinking, “Yes, that was obvious.” Sorry!

During the testing I ran into something else that appears very trivial, but worth being aware of… When is shutdown abort not an immediate abort.

Oracle 6 VM Ignoring “shutdown” from Host

This seems like a great candidate for a MEEK (maybe everyone else knows) post…

When I started creating Oracle Linux 6 guest VMs in my KVM base environment it wasn’t long before I noticed that I was not able to shutdown the VMs via virt-manager. This seemed a bit odd as it worked fine for my Oracle Linux 5 guests.

A consequence of libvirt, via virt-manager or otherwise, not being able to stop my guests was that I could no longer just issue a shutdown in my host machine when I’d done with it for the day in order to shutdown everything in one go… The solution to that soon became to issue the shutdown for each of my Oracle Linux 6 guests via ssh before stopping the host. It wasn’t pretty, but it worked.

Well, today, in the process of working on something unrelated, I ran into something that looked to be a solution:

acpid

It’s completely obvious now I think about it, but it had never occurred to me that it would not be installed and running by default on Oracle Linux 6.

I ran yum install acpid in one of my VMs and started the service via service acpid start. Hey presto, I’m now able to stop that guest VM via virt-manager.

Incredibly simple, but will make my life just a little bit easier.

Missing file#

Another in my MEEK (maybe everyone else knows) series…

Without going into too much detail I found myself looking at parameters for “buffer busy waits” after getting a report that a single statement had spent over 5 hours waiting on this event!

The output below shows the parameters for the “buffer busy waits” as they occurred:

SQL> select p1, p2, p3, count(*) from gv$session
  2   where event = 'buffer busy waits' group by p1, p2, p3;

        P1         P2         P3   COUNT(*)
---------- ---------- ---------- ----------
      5002          2         13         42

SQL>

Finding the file and block is easy right? For the file you just take P1 (for file#) and query v$datafile on file#, as covered in the documentation here and to find the segment use both P1 (for file#) and P2 (for block#) with the query found here.

All cool. Well, until you get this:

SQL> select name from v$datafile where file# = 5002;

no rows selected

SQL>

My next thought was, “OK, what about v$tempfile?”

SQL> select name from v$tempfile where file# = 5002;

no rows selected

SQL>

Working on this was not a priority to the project, but it had me confused so kept rattling about in my head. I asked a few people, but that didn’t turn up any ideas regarding this apparently missing file.

I kept coming back to it when I had a few spare minutes and the person with the original issue would periodically ask me if I’d worked it out yet, just to make sure I had not forgotten him. He was convinced it was temporary tablespace related and I was happy to believe that he was right, but I needed proof. I needed to understand where the wait event parameters were coming from…

One thing that made me question the temporary tablespace theory was that the trace file I’d generated from one of the parallel slaves during a re-run of the statement, to confirm the problem was not intermittent, did not show any “temp” specific wait events. All I had in the trace for file#=5002 were:

  • ‘buffer busy waits’
  • ‘db file single write’
  • ‘gc buffer busy acquire’
  • ‘gc buffer busy release’
  • ‘gc current multi block request’
  • ‘local write wait’

I shouldn’t have let that sway me. It wasn’t a complete trace.

I’d done a fair bit of digging on MOS, but hadn’t been able to find anything applicable so I turned to Google. I guess I hit upon the right search term as I found this thread on Oracle-L which points out that the file# for tempfiles in wait events and trace files is db_files (parameter) + v$tempfile.file#.

This tied up nicely. db_files was set to 5000 and we had a couple of temporary tablespaces…

I passed this information back and said I’d look for official documentation that confirms this. After a bit of searching in MOS and the documentation I gave up, having realised that I could easily verify the “theory” on my local XE installation. After all, empirical evidence is what counts… The information may be there in MOS or the documentation, I just failed to find it!

For anyone interested in the test, this is what I did:

SQL> col name for a60
SQL> select file#, name from v$tempfile

     FILE# NAME
---------- ------------------------------------------------------------
         1 C:\ORACLEXE\APP\ORACLE\ORADATA\XE\TEMP.DBF

SQL> show parameter db_files

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_files                             integer     200
SQL> create global temporary table gtt (c1 number);

Table created.

SQL> exec dbms_session.session_trace_enable

PL/SQL procedure successfully completed.

SQL> insert into gtt select rownum from dual connect by rownum <= 10000;

10000 rows created.

SQL> exec dbms_session.session_trace_disable

PL/SQL procedure successfully completed.

SQL> conn / as sysdba
Connected.
SQL> alter system set db_files = 100 scope = spfile;

System altered.

SQL> startup force
.
.
.
SQL> conn mnash
Enter password:
Connected.
SQL> exec dbms_session.session_trace_enable

PL/SQL procedure successfully completed.

SQL> insert into gtt select rownum from dual connect by rownum <= 10000;

10000 rows created.

SQL> exec dbms_session.session_trace_disable

PL/SQL procedure successfully completed.

SQL>

As you’d expect writing rows to a global temporary table will write to the temporary tablespace and inspecting the two trace files shows that file# (actually “fileno” on Windows) changes by the change in the db_files parameter.

From trace file 1

WAIT #255303800: nam='Disk file operations I/O' ela= 21086 FileOperation=2 fileno=201 filetype=2 obj#=-1 tim=283987684761

From trace file 2

WAIT #1086755940: nam='Disk file operations I/O' ela= 3527 FileOperation=2 fileno=101 filetype=2 obj#=-1 tim=285287446239

METHOD_OPT of NULL

This is the start of a series of MEEK (maybe everyone else knows) posts as I seem to keep running into things about Oracle that I feel I should have known.

Until very recently I did not know what using METHOD_OPT => NULL in a DBMS_STATS call does. If you do then stop reading here 🙂

Creation of a test table…

MNASH@orcl> create table t as select * from dba_objects;

Table created.

MNASH@orcl> select count(*) from t;

  COUNT(*)
----------
     72563

MNASH@orcl> select table_name, num_rows from user_tables where table_name = 'T';

TABLE_NAME   NUM_ROWS
---------- ----------
T

MNASH@orcl> select table_name, column_name, num_distinct from user_tab_columns where table_name = 'T';

TABLE_NAME COLUMN_NAME                    NUM_DISTINCT
---------- ------------------------------ ------------
T          OWNER
T          OBJECT_NAME
T          SUBOBJECT_NAME
T          OBJECT_ID
T          DATA_OBJECT_ID
T          OBJECT_TYPE
T          CREATED
T          LAST_DDL_TIME
T          TIMESTAMP
T          STATUS
T          TEMPORARY
T          GENERATED
T          SECONDARY
T          NAMESPACE
T          EDITION_NAME

15 rows selected.

MNASH@orcl>

Gathering statistics using DBMS_STATS.GATHER_TABLE_STATS with METHOD_OPT => NULL

MNASH@orcl> exec dbms_stats.gather_table_stats(null,'T',method_opt=>null)

PL/SQL procedure successfully completed.

MNASH@orcl> select table_name, num_rows from user_tables where table_name = 'T';

TABLE_NAME   NUM_ROWS
---------- ----------
T               72563

MNASH@orcl> select table_name, column_name, num_distinct from user_tab_columns where table_name = 'T';

TABLE_NAME COLUMN_NAME                    NUM_DISTINCT
---------- ------------------------------ ------------
T          OWNER
T          OBJECT_NAME
T          SUBOBJECT_NAME
T          OBJECT_ID
T          DATA_OBJECT_ID
T          OBJECT_TYPE
T          CREATED
T          LAST_DDL_TIME
T          TIMESTAMP
T          STATUS
T          TEMPORARY
T          GENERATED
T          SECONDARY
T          NAMESPACE
T          EDITION_NAME

15 rows selected.

MNASH@orcl>

Statistics are gathered for the table, but not for the columns of the table.

I found out about this through a friend after puzzling over how some of the tables in a database had statistics at a table level, but none for the columns.

Just for completeness, the following shows what happens when METHOD_OPT is allowed to default:

MNASH@orcl> exec dbms_stats.gather_table_stats(null,'T')

PL/SQL procedure successfully completed.

MNASH@orcl> select table_name, num_rows from user_tables where table_name = 'T';

TABLE_NAME   NUM_ROWS
---------- ----------
T               72563

MNASH@orcl> select table_name, column_name, num_distinct from user_tab_columns where table_name = 'T';

TABLE_NAME COLUMN_NAME                    NUM_DISTINCT
---------- ------------------------------ ------------
T          OWNER                                    32
T          OBJECT_NAME                           43784
T          SUBOBJECT_NAME                          108
T          OBJECT_ID                             72563
T          DATA_OBJECT_ID                         7705
T          OBJECT_TYPE                              44
T          CREATED                                1174
T          LAST_DDL_TIME                          1272
T          TIMESTAMP                              1318
T          STATUS                                    1
T          TEMPORARY                                 2
T          GENERATED                                 2
T          SECONDARY                                 2
T          NAMESPACE                                21
T          EDITION_NAME                              0

15 rows selected.

MNASH@orcl>

Whether or not you want to use the default, which is to allow Oracle to create histograms as it sees appropriate, is a separate point which has received plenty of attention elsewhere.

In the case I was looking into there was actually a bug in the code used as a wrapper to DBMS_STATS and it was not intended that no column statistics were gathered, but I started to wonder if there would be a situation where using this undocumented option would be attractive. I did some fairly simple testing to establish the overhead of collecting column statistics…

The test was:

1 – Increase the number of rows in table T to around 1 million
2 – Delete table statistics for table T
3 – Flush the buffer cache
4 – Gather table statistics using compute with method_opt => null
5 – Repeat (2) to (4) 5 times
6 – Delete table statistics for table T
7 – Flush the buffer cache
8 – Gather table statistics using compute with method_opt => ‘for all columns size 1’
9 – Repeat (6) to (8) 5 times

Looking at the elapsed time for the statistics gathering I got averages of:

null: 2.3 seconds
for all columns size 1: 14.5 seconds

That’s a pretty big overhead, but a lot of information you are robbing the CBO of if you don’t gather them!

I then ran the same test, but with AUTO_SAMPLE_SIZE as this is a more realistic case and got averages of:

null: 1.5 seconds
for all columns size 1: 3.4 seconds

Note: The tests were performed on a 11.2.0.1 database.