OSWatcher Startup/Restart On Exadata

When the question of what starts OSWatcher (OSW) on Exadata was raised at a client site I thought I’d take a quick look. It took me a little longer than I expected to work out the details, so it seems worth sharing.

If you’re simply looking to change the “snapshot interval”, “archive retention” or “compression command” then /opt/oracle.cellos/validations/init.d/oswatcher is the file to modify: it contains a line of the form ./startOSW.sh X Y Z, where X is the snapshot interval, Y is the archive retention and Z is the compression command used to compress the output files.
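
For example, a line like the following (the values here are illustrative, not necessarily what ships on your system) would take a snapshot every 15 seconds, keep 168 hours of archive and compress the output files with gzip:

./startOSW.sh 15 168 gzip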

If you’re curious to know the details of what starts and restarts OSWatcher then read on.

The following is applicable to the X2-2 I regularly get my hands on, which is running 11.2.2.4.2. I don’t know if things change with later versions, so apologies if this isn’t applicable to your Exadata environment.

Startup of OSWatcher on boot is indirectly handled by /etc/init.d/rc.local, which includes:

########### BEGIN DO NOT REMOVE Added by Oracle Exadata ###########
if [ -x /etc/rc.d/rc.Oracle.Exadata ]; then
  . /etc/rc.d/rc.Oracle.Exadata
fi
########### END DO NOT REMOVE Added by Oracle Exadata ###########

/etc/rc.d/rc.Oracle.Exadata includes:

# Perform validations step
/opt/oracle.cellos/vldrun -all

The main purpose of /opt/oracle.cellos/vldrun and the Perl script /opt/oracle.cellos/validations/bin/vldrun.pl appears to be ensuring configuration changes are made on initial boot and after upgrades, although I haven’t looked into all the detail yet. The part of /opt/oracle.cellos/vldrun that is relevant in the context of starting OSWatcher on every boot is:

$VLDRUN_PL -quiet "$@"

This executes /opt/oracle.cellos/validations/bin/vldrun.pl with the -quiet and -all arguments (-all having been passed through from /opt/oracle.cellos/vldrun).

The “quiet” argument is pretty obvious and a little reading reveals that “all” simply means that all scripts in /opt/oracle.cellos/validations/init.d/ should be executed.

So off to /opt/oracle.cellos/validations/init.d/ we go:

[root@my-host ~]# ls -1 /opt/oracle.cellos/validations/init.d/
beginfirstboot
biosbootorder
cellpreconfig
checkconfigs
checkdeveachboot
checklsi
diskhealth
ipmisettings
misceachboot
misczeroboot
oswatcher
postinstall
sosreport
syscheck
[root@my-host ~]#

… and in oswatcher, as mentioned in the second paragraph of the post, you’ll find the ./startOSW.sh X Y Z line that controls the snapshot interval, archive retention and compression command.

OK, so that’s what starts OSWatcher on boot, but you should also know that OSWatcher is restarted daily by /etc/cron.daily/cellos, which includes:

/opt/oracle.cellos/validations/bin/vldrun.pl -script oswatcher > /dev/null 2>&1
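
If you want a quick, unscientific check that OSWatcher has come back after one of these restarts, something like this does the job:

ps -ef | grep -i osw | grep -v grep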

The only bit of all this that doesn’t really sit right with me is that OSWatcher is included with “validations”. That doesn’t seem like an appropriate description to me.

Trivial as it may be, I hope that later versions of the Exadata software move from what is described above to the “service” based approach used on non-Exadata platforms and documented in How To Start OSWatcher Black Box Every System Boot [ID 580513.1]. This feels like a much more standard approach and allows control of the service using the /sbin/service and /sbin/chkconfig commands.
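
For illustration, and assuming the init script from the note were installed as /etc/init.d/oswbb (the service name here is my assumption, not something taken from the note), control would then look like:

# register the service and start it on boot, then check its status
/sbin/chkconfig --add oswbb
/sbin/chkconfig oswbb on
/sbin/service oswbb status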

RAC/Exadata Shell Aliases

I’ve been working on Oracle RAC systems for a while now and the subject of how best to set up shell environment variables and aliases seems to be something on which there is a fair range of opinion.

  • Do you have entries in oratab for both the database and the instance on that server?
  • Do you use oraenv or set the variables required for each environment via another method?
  • If you use oraenv, do you customise it?
  • How do you switch ORACLE_SID between the db_name and instance_name, if you do?

Please feel free to answer these questions via a comment below. However, the main point of this post is to share a few aliases that I’m finding very useful at the moment, and to encourage you to share aliases you use in your RAC environments. Actually, the non-RAC specific, Oracle related, aliases you create would be interesting to me too, but more on that later…

As I mentioned in a much earlier post I feel that in the world of RAC databases there is scope for an environment variable of ORACLE_DB or ORA_DB_NAME… But, I also feel that there’s a lot to be said for not introducing more variables than are necessary. To this end I have taken to creating the following aliases in RAC environments that I’m working in[1].

alias inst='ORACLE_SID=${ORACLE_SID}$($(grep "^+ASM[1-8]" /etc/oratab | cut -d":" -f2)/bin/olsnodes -l -n | cut -f2)'

alias db='ORACLE_SID=${ORACLE_SID%$($(grep "^+ASM[1-8]" /etc/oratab | cut -d":" -f2)/bin/olsnodes -l -n | cut -f2)}'

These aliases allow me to quickly switch ORACLE_SID between the relevant value for the database and the relevant value for the instance on the given host. Obviously they don’t stop you setting ORACLE_SID to something that is not valid, but at least it is easy to undo a mistake you might have made.
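
To unpick the aliases: the grep/cut finds the Grid Infrastructure home from the local ASM entry in /etc/oratab, and olsnodes -l -n prints the local node name and number, from which cut -f2 takes the number that is appended to (or stripped from the end of) ORACLE_SID. Run separately, the two pieces look something like this (the GI home path is illustrative):

[ORCL@rac-02 ~]$ grep "^+ASM[1-8]" /etc/oratab | cut -d":" -f2
/u01/app/11.2.0/grid
[ORCL@rac-02 ~]$ /u01/app/11.2.0/grid/bin/olsnodes -l -n
rac-02	2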

Here’s a quick example:

[ORCL@rac-02 ~]$ inst
[ORCL2@rac-02 ~]$ inst
[ORCL22@rac-02 ~]$ inst
[ORCL222@rac-02 ~]$ inst
[ORCL2222@rac-02 ~]$ db
[ORCL222@rac-02 ~]$ db
[ORCL22@rac-02 ~]$ db
[ORCL2@rac-02 ~]$ db
[ORCL@rac-02 ~]$ 

And, for a little more alias sharing… In my time working on Exadata environments I’ve found myself using dcli a lot, and I created the aliases below to save typing: almost always I want dcli to talk either to all the “compute nodes” or to all the “storage cells”, so why not shorten the commands:

alias ddcli='dcli -g ~/dbs_group -l oracle'

alias cdcli='dcli -g ~/cell_group -l cellmonitor'
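
For example (the commands themselves are just illustrations; cellmonitor has read-only access to CellCLI):

ddcli uptime
cdcli cellcli -e list cell detail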

I’d be keen to get a page going for useful aliases so we can all benefit from each other’s ideas… If you’ve got something to share please comment, and if the collection builds up we’ll start a page for alias sharing.
__________
[1] – If I’m working in an environment where a shared “oracle” account is in use then I define my aliases in a separate file and source this when I login so that my preferences are not inflicted on others :-)
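
For what it’s worth, that arrangement can be as simple as sourcing the file by hand after login (the file name here is just an example):

. ~/my_aliases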

cell smart file creation

One of the offload capabilities you get with Exadata that I’ve been appreciating recently is “cell smart file creation”. In a nutshell, the fact that formatting of blocks is offloaded to the cells rather than being performed by a process on the database server, or “compute node” if you want to use Exadata speak, makes the creation of datafiles faster.

As a quick demonstration, look at the elapsed time for the creation of a 100G datafile: first with _cell_fast_file_create at its default value of TRUE, and then with it set to FALSE in order to disable the offloading of the datafile creation:

SQL> create bigfile tablespace martin01 datafile '+DATA01' size 100G;

Tablespace created.

Elapsed: 00:01:40.39
SQL> alter session set "_cell_fast_file_create" = false;

Session altered.

Elapsed: 00:00:00.00
SQL> create bigfile tablespace martin02 datafile '+DATA01' size 100G;

Tablespace created.

Elapsed: 00:04:15.17

101 secs with offload vs. 255 secs without represents a saving of 60%. Examining the wait events associated with the two datafile creations we see “cell smart file creation” for the offloaded creation and “Data file init write” for the non-offloaded one.
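
If you want to see that for yourself, here is a minimal sketch (the diskgroup and tablespace name are illustrative) that creates a file and then reports the relevant wait events for the current session:

sqlplus -s / as sysdba <<'EOF'
-- create a datafile, then check this session's wait events:
-- "cell smart file creation" indicates the formatting was offloaded,
-- "Data file init write" indicates it was done on the database server
create tablespace t_offload datafile '+DATA01' size 10G;
select event, total_waits, round(time_waited/100,1) seconds
  from v$session_event
 where sid = sys_context('userenv','sid')
   and event in ('cell smart file creation', 'Data file init write');
drop tablespace t_offload including contents and datafiles;
EOF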

Great, creating datafiles is faster on Exadata… But what about redo logs?

OK, I take the point that redo log files aren’t generally created with the same regularity as datafiles, but one of the things I’ve been working on recently has involved a “roll your own” DUPLICATE FROM ACTIVE DATABASE which itself involves creating a reasonable number of redo logs (4G log files, 4 redo groups, 8 threads = waiting). I don’t like waiting, especially when I’ve been asked to make something go as fast as I can, so I thought I’d have a little look and confirm that creation of redo logs is not offloaded…

Tracing the creation of 2 x 4G redo logs revealed the following:

SQL ID: 3r832sm0yj1gv Plan Hash: 0

alter database add logfile thread 1 group 99 ('+DATA01','+RECO') size 4096m
  blocksize 512

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      3.81      37.96          0          0          2           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      3.81      37.96          0          0          2           0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 36

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  enq: FS - contention                            2        0.00          0.00
  control file sequential read                   58        0.05          0.13
  KSV master wait                                20        1.71          3.27
  ASM file metadata operation                     6        0.00          0.00
  kfk: async disk IO                              4        0.00          0.00
  Disk file operations I/O                        4        0.00          0.00
  Log file init write                          4381        0.13         31.35
  enq: CF - contention                            1        0.00          0.00
  control file parallel write                    14        0.01          0.03
  log file single write                           2        0.00          0.00
  ges inquiry response                            1        0.00          0.00
  log file sync                                   1        0.00          0.00
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1        0.00          0.00
********************************************************************************

38 seconds might not be long, but when you consider the number of redo log groups and number of redo threads it soon adds up. And when you have 7 databases to do this for, the opportunity to save 60% of the time in this phase starts to look attractive. Add to this that ALTER DATABASE OPEN RESETLOGS spends the vast majority of its time waiting on “Log file init write” and I can’t help feeling that “cell smart log creation” would be a nicer event to be waiting on :-)
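
To put rough numbers on that, using the ~38 seconds per two-member 4G group from the trace above:

# 8 threads x 4 groups  = 32 ADD LOGFILE statements per database
# 32 x ~38s            ~= 20 minutes per database
# 7 databases          ~= 2.4 hours in total
# a 60% saving, as seen for datafiles, would give back ~1.4 hours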

I realise that it’s probably way down on the list of enhancements, but I for one would appreciate it.

Fun with _SERIAL_DIRECT_READ

Serial Direct Reads are a subject that has been getting a reasonable amount of attention since 11g was released. Doug Burns, Dion Cho (and here), Kerry Osborne and Miladin Modrakovic have all written on the subject. Tanel Põder also covered it in his presentation at the UKOUG Exadata Special Event.

Kerry and Tanel both cover the importance of direct reads in the context of Smart Scan when running on Exadata. I won’t reproduce their material here, you can read it direct from them, but to summarise: seeing “storage” in an execution plan simply means that Smart Scan is possible; it does not mean that you’ll get it. You must have direct reads.
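
A minimal way to check whether a given statement really was offloaded is to look at the session statistics after running it. For example, a sketch run as SYSDBA against the test table created later in this post:

sqlplus -s / as sysdba <<'EOF'
-- ask for direct reads, run a serial full scan, then check this
-- session's statistics: a non-zero "returned by smart scan" value
-- confirms that offload actually happened at run time
alter session set "_serial_direct_read"=true;
select /* offload check */ count(*) from sdr_test.a;
select n.name, s.value
  from v$statname n, v$mystat s
 where s.statistic# = n.statistic#
   and n.name in ('cell physical IO bytes eligible for predicate offload'
                , 'cell physical IO interconnect bytes returned by smart scan');
EOF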

There are a number of ways to ensure your statement executes using direct reads and the one I’ll cover here is the (hidden) parameter _serial_direct_read.

In the process of using _serial_direct_read to ensure a serial delete used Smart Scan I ran into something that I find both interesting and slightly concerning. Let me demonstrate.

Create an environment to test in:

SQL> create tablespace sdr_test datafile '+DATA01' size 1G autoextend on maxsize 5G
  2  /

Tablespace created.

Elapsed: 00:00:01.41
SQL> create user sdr_test identified by martin
  2    default tablespace sdr_test
  3    quota unlimited on sdr_test
  4  /

User created.

Elapsed: 00:00:00.05
SQL> grant create session
  2  	 , create table
  3  	 , alter session
  4  	 , alter system
  5  	to sdr_test
  6  /

Grant succeeded.

Elapsed: 00:00:00.01

Connect as the test user and create a test table:

SQL> connect sdr_test/martin
Connected.
SDR_TEST - SESSION 1> create table a as
  2  (
  3  select rownum id
  4  	  , rpad(rownum,200,'X') c1
  5    from dual
  6  connect by rownum <= 10000000
  7  )
  8  /

Table created.

Elapsed: 00:00:25.43

I’m sure it’s obvious, but I’ll state it for clarity: with no index on the table all access will be via full table scan.
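
If you want to confirm that, a quick sketch using EXPLAIN PLAN (which doesn’t execute the statement, so the data is untouched) should report the full scan, shown as TABLE ACCESS STORAGE FULL on Exadata:

sqlplus -s sdr_test/martin <<'EOF'
-- no execution, just the plan
explain plan for delete from a where id = 1;
select * from table(dbms_xplan.display(format => 'BASIC'));
EOF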

Flush the buffer cache, set a trace file identifier, start tracing, run DELETE statement and rollback:

SDR_TEST - SESSION 1> alter system flush buffer_cache
  2  /

System altered.

Elapsed: 00:00:02.65
SDR_TEST - SESSION 1> alter session set tracefile_identifier='sdr_1a'
  2  /

Session altered.

Elapsed: 00:00:00.00
SDR_TEST - SESSION 1> exec dbms_session.session_trace_enable(true,true,'ALL_EXECUTIONS')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.02
SDR_TEST - SESSION 1> var id number
SDR_TEST - SESSION 1> exec :id := 1

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.00
SDR_TEST - SESSION 1> delete from a where id = :id
  2  /

1 row deleted.

Elapsed: 00:00:19.93
SDR_TEST - SESSION 1> rollback
  2  /

Rollback complete.

Elapsed: 00:00:00.01

Check the trace file to see what we get:

WAIT #47706004018352: nam='gc cr multi block request' ela= 324 file#=5382 block#=2584 class#=1 obj#=761740 tim=1308902615410001
WAIT #47706004018352: nam='gc cr multi block request' ela= 806 file#=5382 block#=2680 class#=1 obj#=761740 tim=1308902615411418
WAIT #47706004018352: nam='cell multiblock physical read' ela= 12689 cellhash#=764630511 diskhash#=3671767124 bytes=1032192 obj#=761740 tim=1308902615424291
WAIT #47706004018352: nam='gc cr multi block request' ela= 408 file#=5382 block#=2744 class#=1 obj#=761740 tim=1308902615426480
WAIT #47706004018352: nam='gc cr multi block request' ela= 1037 file#=5382 block#=2808 class#=1 obj#=761740 tim=1308902615428066
WAIT #47706004018352: nam='cell multiblock physical read' ela= 5475 cellhash#=764630511 diskhash#=3671767124 bytes=1032192 obj#=761740 tim=1308902615433706
WAIT #47706004018352: nam='gc cr multi block request' ela= 696 file#=5382 block#=2881 class#=1 obj#=761740 tim=1308902615436139
WAIT #47706004018352: nam='gc cr multi block request' ela= 895 file#=5382 block#=2943 class#=1 obj#=761740 tim=1308902615437558
WAIT #47706004018352: nam='cell multiblock physical read' ela= 5300 cellhash#=764630511 diskhash#=3671767124 bytes=1032192 obj#=761740 tim=1308902615443082
WAIT #47706004018352: nam='gc cr multi block request' ela= 916 file#=5382 block#=3009 class#=1 obj#=761740 tim=1308902615445733

… That’s not a Smart Scan.

Set _serial_direct_read to TRUE, set new trace file identifier, flush buffer cache, run DELETE and rollback:

SDR_TEST - SESSION 1> alter session set "_serial_direct_read"=true
  2  /

Session altered.

Elapsed: 00:00:00.00
SDR_TEST - SESSION 1> alter session set tracefile_identifier='sdr_1b'
  2  /

Session altered.

Elapsed: 00:00:00.00
SDR_TEST - SESSION 1> alter system flush buffer_cache
  2  /

System altered.

Elapsed: 00:00:00.45
SDR_TEST - SESSION 1> delete from a where id = :id
  2  /

1 row deleted.

Elapsed: 00:00:14.54
SDR_TEST - SESSION 1> rollback
  2  /

Rollback complete.

Elapsed: 00:00:00.00

Check the trace file to see what we get:

WAIT #47706540316360: nam='cell multiblock physical read' ela= 8168 cellhash#=764630511 diskhash#=1422992472 bytes=1048576 obj#=761740 tim=1308902638029103
WAIT #47706540316360: nam='cell multiblock physical read' ela= 5468 cellhash#=764630511 diskhash#=1422992472 bytes=1048576 obj#=761740 tim=1308902638035744
WAIT #47706540316360: nam='cell multiblock physical read' ela= 5190 cellhash#=764630511 diskhash#=1422992472 bytes=1015808 obj#=761740 tim=1308902638042035
WAIT #47706540316360: nam='cell multiblock physical read' ela= 13723 cellhash#=2714602255 diskhash#=266393524 bytes=1048576 obj#=761740 tim=1308902638056851
WAIT #47706540316360: nam='cell multiblock physical read' ela= 7911 cellhash#=2714602255 diskhash#=266393524 bytes=1048576 obj#=761740 tim=1308902638066297
WAIT #47706540316360: nam='cell multiblock physical read' ela= 5505 cellhash#=2714602255 diskhash#=266393524 bytes=1048576 obj#=761740 tim=1308902638072991
WAIT #47706540316360: nam='cell multiblock physical read' ela= 5218 cellhash#=2714602255 diskhash#=266393524 bytes=32768 obj#=761740 tim=1308902638079404
WAIT #47706540316360: nam='cell multiblock physical read' ela= 1424 cellhash#=1730663586 diskhash#=755155853 bytes=1048576 obj#=761740 tim=1308902638082024
WAIT #47706540316360: nam='cell multiblock physical read' ela= 1493 cellhash#=1730663586 diskhash#=755155853 bytes=1048576 obj#=761740 tim=1308902638084697
WAIT #47706540316360: nam='cell multiblock physical read' ela= 1314 cellhash#=1730663586 diskhash#=755155853 bytes=1048576 obj#=761740 tim=1308902638087102

… That’s still not a Smart Scan :-(

Set new trace file identifier, flush buffer cache, run DELETE (with an embedded comment to force hard parse) and rollback:

SDR_TEST - SESSION 1> alter session set tracefile_identifier='sdr_1c'
  2  /

Session altered.

Elapsed: 00:00:00.00
SDR_TEST - SESSION 1> alter system flush buffer_cache
  2  /

System altered.

Elapsed: 00:00:00.23
SDR_TEST - SESSION 1> delete /* SESSION 1 */ from a where id = :id
  2  /

1 row deleted.

Elapsed: 00:00:00.29
SDR_TEST - SESSION 1> rollback
  2  /

Rollback complete.

Elapsed: 00:00:00.01

Check the trace file to see what we get:

WAIT #47706540290640: nam='cell smart table scan' ela= 26 cellhash#=3018218633 p2=0 p3=0 obj#=761740 tim=1308902650538344
WAIT #47706540290640: nam='cell smart table scan' ela= 8 cellhash#=3018218633 p2=0 p3=0 obj#=761740 tim=1308902650538400
WAIT #47706540290640: nam='cell smart table scan' ela= 310 cellhash#=3018218633 p2=0 p3=0 obj#=761740 tim=1308902650538741
WAIT #47706540290640: nam='cell smart table scan' ela= 9 cellhash#=3018218633 p2=0 p3=0 obj#=761740 tim=1308902650538821
WAIT #47706540290640: nam='cell smart table scan' ela= 26 cellhash#=3018218633 p2=0 p3=0 obj#=761740 tim=1308902650538887
WAIT #47706540290640: nam='cell smart table scan' ela= 25 cellhash#=3018218633 p2=0 p3=0 obj#=761740 tim=1308902650538961
WAIT #47706540290640: nam='cell smart table scan' ela= 25 cellhash#=1078241440 p2=0 p3=0 obj#=761740 tim=1308902650539712
WAIT #47706540290640: nam='cell smart table scan' ela= 35 cellhash#=1078241440 p2=0 p3=0 obj#=761740 tim=1308902650539801
WAIT #47706540290640: nam='cell smart table scan' ela= 7 cellhash#=1078241440 p2=0 p3=0 obj#=761740 tim=1308902650539856
WAIT #47706540290640: nam='cell smart table scan' ela= 564 cellhash#=1078241440 p2=0 p3=0 obj#=761740 tim=1308902650540456

… Smart Scan, Smart Scan! :-)

Start a new session, set variable, set new trace file identifier, start tracing, run DELETE and rollback:

SQL> connect sdr_test/martin
Connected.
SDR_TEST - SESSION 2> var id number
SDR_TEST - SESSION 2> exec :id := 1

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.00
SDR_TEST - SESSION 2> alter session set tracefile_identifier='sdr_2a'
  2  /

Session altered.

Elapsed: 00:00:00.00
SDR_TEST - SESSION 2> exec dbms_session.session_trace_enable(true,true,'ALL_EXECUTIONS')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.00
SDR_TEST - SESSION 2> alter system flush buffer_cache
  2  /

System altered.

Elapsed: 00:00:00.12
SDR_TEST - SESSION 2> delete /* SESSION 2 */ from a where id = :id
  2  /

1 row deleted.

Elapsed: 00:00:14.07
SDR_TEST - SESSION 2> 
SDR_TEST - SESSION 2> rollback
  2  /

Rollback complete.

Elapsed: 00:00:00.01

Check the trace file to see what we get:

WAIT #47830984391856: nam='cell multiblock physical read' ela= 5305 cellhash#=2714602255 diskhash#=1846921898 bytes=1032192 obj#=761740 tim=1308902651164153
WAIT #47830984391856: nam='cell multiblock physical read' ela= 1356 cellhash#=2714602255 diskhash#=1846921898 bytes=1032192 obj#=761740 tim=1308902651168319
WAIT #47830984391856: nam='cell multiblock physical read' ela= 1411 cellhash#=1857444352 diskhash#=3010233083 bytes=1032192 obj#=761740 tim=1308902651174085
WAIT #47830984391856: nam='cell multiblock physical read' ela= 1346 cellhash#=1857444352 diskhash#=3010233083 bytes=1032192 obj#=761740 tim=1308902651178202
WAIT #47830984391856: nam='cell multiblock physical read' ela= 1286 cellhash#=1857444352 diskhash#=3010233083 bytes=1032192 obj#=761740 tim=1308902651182195
WAIT #47830984391856: nam='cell multiblock physical read' ela= 1259 cellhash#=1857444352 diskhash#=3010233083 bytes=1032192 obj#=761740 tim=1308902651186208
WAIT #47830984391856: nam='cell multiblock physical read' ela= 15622 cellhash#=281363246 diskhash#=3630261201 bytes=1032192 obj#=761740 tim=1308902651206140
WAIT #47830984391856: nam='cell multiblock physical read' ela= 5431 cellhash#=281363246 diskhash#=3630261201 bytes=1032192 obj#=761740 tim=1308902651214612
WAIT #47830984391856: nam='cell multiblock physical read' ela= 5315 cellhash#=281363246 diskhash#=3630261201 bytes=1032192 obj#=761740 tim=1308902651222641
WAIT #47830984391856: nam='cell multiblock physical read' ela= 5303 cellhash#=281363246 diskhash#=3630261201 bytes=1032192 obj#=761740 tim=1308902651230702

… Not a Smart Scan, but that’s expected.

Set a new trace file identifier, flush buffer cache and run DELETE statement from session 1 in session 2:

SDR_TEST - SESSION 2> alter session set tracefile_identifier='sdr_2b'
  2  /

Session altered.

Elapsed: 00:00:00.00
SDR_TEST - SESSION 2> alter system flush buffer_cache
  2  /

System altered.

Elapsed: 00:00:00.21
SDR_TEST - SESSION 2> delete /* SESSION 1 */ from a where id = :id
  2  /

1 row deleted.

Elapsed: 00:00:00.25
SDR_TEST - SESSION 2> rollback
  2  /

Rollback complete.

Elapsed: 00:00:00.01

Check the trace file to see what we get:

WAIT #47830986253416: nam='cell smart table scan' ela= 48 cellhash#=281363246 p2=0 p3=0 obj#=761740 tim=1308902665561263
WAIT #47830986253416: nam='cell smart table scan' ela= 8 cellhash#=281363246 p2=0 p3=0 obj#=761740 tim=1308902665561338
WAIT #47830986253416: nam='cell smart table scan' ela= 3944 cellhash#=281363246 p2=0 p3=0 obj#=761740 tim=1308902665565316
WAIT #47830986253416: nam='cell smart table scan' ela= 10 cellhash#=764630511 p2=0 p3=0 obj#=761740 tim=1308902665565394
WAIT #47830986253416: nam='cell smart table scan' ela= 3869 cellhash#=764630511 p2=0 p3=0 obj#=761740 tim=1308902665569302
WAIT #47830986253416: nam='cell smart table scan' ela= 14 cellhash#=764630511 p2=0 p3=0 obj#=761740 tim=1308902665569389
WAIT #47830986253416: nam='cell smart table scan' ela= 6876 cellhash#=764630511 p2=0 p3=0 obj#=761740 tim=1308902665576298
WAIT #47830986253416: nam='cell smart table scan' ela= 26 cellhash#=3970173858 p2=0 p3=0 obj#=761740 tim=1308902665576745
WAIT #47830986253416: nam='cell smart table scan' ela= 25 cellhash#=3970173858 p2=0 p3=0 obj#=761740 tim=1308902665576816
WAIT #47830986253416: nam='cell smart table scan' ela= 25 cellhash#=281363246 p2=0 p3=0 obj#=761740 tim=1308902665576988

… Smart Scan! But, hold on a minute, that means it’s using direct reads.

Set _serial_direct_read to NEVER, set new trace file identifier, flush buffer cache, run delete from session 1 in session 2 and rollback:

SDR_TEST - SESSION 2> alter session set "_serial_direct_read"=never
  2  /

Session altered.

Elapsed: 00:00:00.00
SDR_TEST - SESSION 2> alter session set tracefile_identifier='sdr_2c'
  2  /

Session altered.

Elapsed: 00:00:00.00
SDR_TEST - SESSION 2> alter system flush buffer_cache
  2  /

System altered.

Elapsed: 00:00:00.13
SDR_TEST - SESSION 2> delete /* SESSION 1 */ from a where id = :id
  2  /

1 row deleted.

Elapsed: 00:00:00.20
SDR_TEST - SESSION 2> rollback
  2  /

Rollback complete.

Elapsed: 00:00:00.00

Check the trace file to see what we get:

WAIT #47830984369624: nam='cell smart table scan' ela= 201 cellhash#=3108293528 p2=0 p3=0 obj#=761740 tim=1308902665858678
WAIT #47830984369624: nam='cell smart table scan' ela= 8 cellhash#=3108293528 p2=0 p3=0 obj#=761740 tim=1308902665858742
WAIT #47830984369624: nam='cell smart table scan' ela= 201 cellhash#=3108293528 p2=0 p3=0 obj#=761740 tim=1308902665858973
WAIT #47830984369624: nam='cell smart table scan' ela= 8 cellhash#=3108293528 p2=0 p3=0 obj#=761740 tim=1308902665859048
WAIT #47830984369624: nam='cell smart table scan' ela= 288 cellhash#=3108293528 p2=0 p3=0 obj#=761740 tim=1308902665859367
WAIT #47830984369624: nam='cell smart table scan' ela= 7 cellhash#=3108293528 p2=0 p3=0 obj#=761740 tim=1308902665859415
WAIT #47830984369624: nam='cell smart table scan' ela= 23 cellhash#=3108293528 p2=0 p3=0 obj#=761740 tim=1308902665859471
WAIT #47830984369624: nam='cell smart table scan' ela= 7 cellhash#=3108293528 p2=0 p3=0 obj#=761740 tim=1308902665859521
WAIT #47830984369624: nam='cell smart table scan' ela= 868 cellhash#=3108293528 p2=0 p3=0 obj#=761740 tim=1308902665860420
WAIT #47830984369624: nam='cell smart table scan' ela= 7 cellhash#=3108293528 p2=0 p3=0 obj#=761740 tim=1308902665860485

… Still doing Smart Scan!

During his UKOUG presentation Tanel emphasised that the CBO knows nothing about Smart Scan, which is confirmed by execution plans and CBO traces, but it seems that information about how the execution plan will actually be executed must be stored in memory and associated with the cursor… I don’t know where that is yet, but I’m guessing Tanel does.

Anyway, the key point is that the value of _serial_direct_read is not considered by the CBO to be part of the “optimizer environment”, but it appears (to me) that it should be. After all, wouldn’t you expect ALTER SESSION to do just that, rather than having the ability to affect other sessions? I appreciate that this is a hidden parameter, which shouldn’t be used without agreement from Oracle Support, but it seems like something has been overlooked. In Kerry’s PDF Tuning Exadata he mentions that _serial_direct_read is not compatible with the OPT_PARAM hint, which is a more succinct way of saying all the above, but the implications weren’t clear to me when I first read that.
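
One way to see this for yourself would be to check the child cursors for the statement after running it from both sessions: if _serial_direct_read were part of the optimizer environment you’d expect a second child with a different optimizer environment hash, rather than a single shared child. A sketch, using the SQL text from the example above:

sqlplus -s / as sysdba <<'EOF'
-- one shared child cursor, reused by both sessions, would explain
-- the behaviour seen above; a mismatch would have cut a new child
select sql_id, child_number, executions, optimizer_env_hash_value
  from v$sql
 where sql_text like 'delete /* SESSION 1 */%';
EOF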

Maybe at some point the parameter will lose its underscore and become part of the optimizer environment… Time will tell.

UK OUG conference: RAC & HA special interest group round table

The RAC and HA round table gives users a chance to share experiences and gotchas. Often the discussion points that generate most interest are those where something unusual or unexplained has happened, with the participants attempting to drill down to the root of the problem by firing questions at the DBA telling the tale. This time there was such a discussion around a RAC stretch cluster where the storage connection had been deliberately broken during a test (but not the RAC interconnect between the two sites). The end result was a failure of the cluster, but after a restart the database had been rolled back to a point in time several minutes before the failure. This is discussed here: http://davidburnham.wordpress.com/category/oracle-rac-stretched-cluster-testing/

Some participants suggested that Active Data Guard was a better option than a stretch RAC cluster, with one saying that Active Data Guard should be the default position, with RAC only deployed if Active Data Guard didn’t fit the bill. Unfortunately, as a presentation on Data Guard was taking place at the same time, most people using it were presumably elsewhere.

Another discussion point was whether RAC actually reduces the volume of outages, ensuring high availability. Participants generally agreed that it is great for rolling updates and other planned maintenance work without disruption, and for dealing with some other modes of failure. But sometimes the added complexity can cause unplanned outages that might never have happened on a single-instance system. One participant argued that some of the outages cited should not have happened, since they were down to human error.

Talk moved on to whether an Oracle Exadata box represents good value for money. One participant mentioned that he could easily build a similar setup from commodity hardware for a fraction of the cost, but it would not include a high-speed interconnect or the Exadata software. He cited some data-warehouse-type queries that were sped up by an order of magnitude when implemented on Exadata. This had transformed the attitude to these queries in the company concerned: because they were quick to run, users were more likely to use them. So the answer is: it depends.

A somewhat inconclusive discussion took place on how best to deploy ASM when there are multiple databases: should all databases share the same ASM instance or diskgroup? One participant suggested that ASM was effectively a single point of failure which could cause all databases to fail in the event of a software bug. Another responded that he had been using ASM for four years without any such disruption. I guess, from this point of view, ASM should just be viewed as another filesystem, and what are the chances of GPFS or even raw devices failing?

The number of RAC nodes in a cluster was another talking point. With a two-node cluster, in the event of a failure you are left with just a single node, which has to be able to manage the full system load. Many participants seemed to prefer a three or four node cluster, as going from 4 to 3 or 3 to 2 nodes isn’t as dramatic a change as 2 to 1.

So, it was an interesting discussion and definitely a refreshing change from sitting passively through presentations.