cell smart file creation
One of the offload capabilities you get with Exadata that I’ve been appreciating recently is “cell smart file creation”. In a nutshell, the fact that formatting of blocks is offloaded to the cells rather than being performed by a process on the database server, or “compute node” if you want to use Exadata speak, makes the creation of datafiles faster.
As as quick demonstration look at the elapsed time for the creation of a 100G datafile. First with _cell_fast_file_create at the default value of TRUE and then set to FALSE in order to disable the offloading of the datafile creation:
SQL> create bigfile tablespace martin01 datafile '+DATA01' size 100G; Tablespace created. Elapsed: 00:01:40.39 SQL> alter session set "_cell_fast_file_create" = false; Session altered. Elapsed: 00:00:00.00 SQL> create bigfile tablespace martin02 datafile '+DATA01' size 100G; Tablespace created. Elapsed: 00:04:15.17
101 secs with offload vs. 255 secs without represents a saving of 60%. Examining the wait events associated with the 2 different datafile creations we see “cell smart file creation” for the offloaded file creation and “Data file init write” for the non-offloaded file creation.
Great, creating datafiles is faster on Exadata… But what about redo logs?
OK, I take the point that redo log files aren’t generally created with the same regularity as datafiles, but one of the things I’ve been working on recently has involved a “roll your own” DUPLICATE FROM ACTIVE DATABASE which itself involves creating a reasonable number of redo logs (4G log files, 4 redo groups, 8 threads = waiting). I don’t like waiting, especially when I’ve been asked to make something go as fast as I can, so I thought I’d have a little look and confirm that creation of redo logs is not offloaded…
Tracing the creation of 2 x 4G redo logs revealed the following:
SQL ID: 3r832sm0yj1gv Plan Hash: 0
alter database add logfile thread 1 group 99 ('+DATA01','+RECO') size 4096m
blocksize 512
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 3.81 37.96 0 0 2 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 3.81 37.96 0 0 2 0
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 36
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
enq: FS - contention 2 0.00 0.00
control file sequential read 58 0.05 0.13
KSV master wait 20 1.71 3.27
ASM file metadata operation 6 0.00 0.00
kfk: async disk IO 4 0.00 0.00
Disk file operations I/O 4 0.00 0.00
Log file init write 4381 0.13 31.35
enq: CF - contention 1 0.00 0.00
control file parallel write 14 0.01 0.03
log file single write 2 0.00 0.00
ges inquiry response 1 0.00 0.00
log file sync 1 0.00 0.00
SQL*Net message to client 1 0.00 0.00
SQL*Net message from client 1 0.00 0.00
********************************************************************************
38 seconds might not be long, but when you consider the number of redo log groups and number of redo threads it soon adds up. And when you have 7 databases to do this for, the opportunity to save 60% of the time in this phase starts to look attractive. Add to this that ALTER DATABASE OPEN RESETLOGS spends the vast majority of its time waiting on “Log file init write” and I can’t help feeling that ”cell smart log creation” would be a nicer event to be waiting on
I realise that it’s probably way down on the list of enhancements, but I for one would appreciate it.
A Session Busy Doing What? – Work of Course!
I was recently called out to look at a batch job exceeding it’s usual execution time. Ordinarily a quick look at the wait events would tell me what I needed to know, however due to a poor execution plan (caused by incorrect statistics) the query nicely highlighted how not all work can be classified as a wait. Below is a reproduction of the scenario I’m describing.
Create some test tables.
CREATE TABLE tab1 AS SELECT ROWNUM rn, 'xxxxx' pad1, 'yyyyy' pad2 FROM dual CONNECT BY LEVEL <= 25000; CREATE TABLE tab2 AS SELECT ROWNUM rn, 'xxxxx' pad1, 'yyyyy' pad2 FROM dual CONNECT BY LEVEL <= 25000; exec dbms_stats.gather_table_stats(null,'tab1'); exec dbms_stats.gather_table_stats(null,'tab2');
A test query is below, it has been hinted to ensure it repeatedly scans the same blocks in memory.
SELECT /*+ USE_NL(tab1 tab2) ORDERED */ COUNT(*) FROM tab1 INNER JOIN tab2 ON tab1.rn = tab2.rn; ------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 10 | | 477K (6)| 01:35:30 | | 1 | SORT AGGREGATE | | 1 | 10 | | | | |* 2 | NESTED LOOPS | | 24859 | 242K| 477K| 477K (6)| 01:35:30 | | 3 | TABLE ACCESS FULL| TAB1 | 24918 | 121K| | 21 (5)| 00:00:01 | | 4 | TABLE ACCESS FULL| TAB2 | 1 | 5 | | 19 (6)| 00:00:01 | ------------------------------------------------------------------------------------
First off run the query to get it parsed and charge the buffer cache and then reset the session and run it again.
-- may want to ctrl^c once this has run for a few seconds SELECT /*+ USE_NL(tab1 tab2) ORDERED */ COUNT(*) FROM tab1 INNER JOIN tab2 ON tab1.rn = tab2.rn; -- conn fresh session conn / SELECT sid FROM v$mystat WHERE ROWNUM = 1; SELECT /*+ USE_NL(tab1 tab2) ORDERED */ COUNT(*) FROM tab1 INNER JOIN tab2 ON tab1.rn = tab2.rn;
From another session check what the original session is up to:
SELECT sid, status, event, state
FROM v$session
WHERE sid = &sid_of_other_session;
SID STATUS EVENT STATE
---------- -------- ----------------------------------- -------------------
149 ACTIVE SQL*Net message to client WAITED SHORT TIME
SELECT event, time_waited
FROM v$session_event
WHERE sid = &sid_of_other_session;
EVENT TIME_WAITED
----------------------------------- -----------
SQL*Net message to client 0
SQL*Net message from client 1111
No wait events are being recorded yet “top” shows the session to be the number 1 session on the box with 78% of CPU resources.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 28008 oracle 25 0 344m 20m 18m R 78.3 2.7 5:57.28 oracle
The missing link is that an Oracle instance does not count logical IO as a wait – it is work. V$SESSTAT is where we have visibility of this work.
SELECT n.name, s.value FROM v$sesstat s, v$statname n WHERE s.sid = &sid_of_other_session AND n.statistic# = s.statistic# AND n.name = 'consistent gets' ORDER BY s.value; NAME VALUE ---------------------------------------------------------------- ---------- consistent gets 1975123
leave a comment