Sometimes we don’t need to pack data in, we need to spread it out

I’ve gotten away with doing a presentation called “Contentious Small Tables” (download here) for the UKOUG three times now so I think it’s time to retire it from duty and serialise it here.

The planned instalments are below, I’ll change the items to links as the posts are published. The methods discussed in these posts are not exhaustive but hopefully cover most of the useful or interesting options.

  1. Sometimes we don’t need to pack data in, we need to spread it out (this post)
  2. Spreading out data – old school methods
  3. Spreading out data – with minimize records per block (also old school)
  4. Spreading out data – some partitioning methods
  5. Spreading out data – some modern methods

This post is the first instalment.

Sometimes we don’t need to pack data in, we need to spread it out

As data sets continue to grow there is more and more focus on packing data as tightly as we can. For performance reasons there are cases where we DBAs & developers need to turn this on its head and try to spread data out. An example is a small table with frequently updated/locked rows. There is no TX blocking but instead contention on the buffers containing the rows. This contention can manifest itself as “buffer busy waits”, one of the “gc” variants of “buffer busy waits” such as “gc buffer busy acquire” or “gc buffer busy release” or maybe on “latch: cache buffers chains”.

You’ll probably come at a problem of this nature from a session perspective via OEM, ASH data, v$session_event or SQL*Trace data. Taking ASH data as an example you can see below that for my exaggerated test, global cache busy waits dominate the time taken and, key for this series of posts, the “P1” and “P2” columns contain a small number of values.

column event format a35
select * from (
 select NVL(event,'CPU') event
 , count(*) waits
 , p1, p2
 from gv$active_session_history
 where sample_time between 
       to_date(to_char(sysdate,'DD-MON-YYYY')||' &from_time.','DD-MON-YYYY HH24:MI')
   and to_date(to_char(sysdate,'DD-MON-YYYY')||' &to_time.','DD-MON-YYYY HH24:MI')
 and module = 'NJTEST'
 group by NVL(event,'CPU'), p1, p2
 order by 2 desc
) where rownum <= 5;

EVENT                   WAITS    P1      P2
---------------------- ------ ----- -------
gc buffer busy acquire   1012     5  239004
gc buffer busy release    755     5  239004
gc current block busy     373     5  239004
CPU                        65     0       0

For the wait events above “P1” and “P2” have the following meanings:

select distinct parameter1, parameter2
from v$event_name
where name in ('gc buffer busy acquire'
              ,'gc buffer busy release'
              ,'gc current block busy');

PARAMETER1     PARAMETER2
-------------- --------------
file#          block#

So above we have contention on a single buffer. From ASH data we can also get the “SQL_ID” or “CURRENT_OBJ#” in order to identify the problem table. Below we show that all rows in the test table are in a single block – 239004.

select dbms_rowid.rowid_block_number(rowid) blockno
, count(*)
from procstate
group by dbms_rowid.rowid_block_number(rowid);

   BLOCKNO   COUNT(*)
---------- ----------
    239004         12

After discovering this information the first port of call is to look at the SQL statement, execution plan and supporting code but assuming the application is written as best it can be then perhaps it’s time to look at the segment and data organisation.

In order to help illustrate the point of this series here is a chart showing contention in ASH data from a crude test case. The huge grey stripe being “gc” busy waits and the thin red stripe being standard “buffer busy waits”.

Contention  11g normal heap table

The next post in this series will look at some “Old School” methods of reducing this contention.

Advertisements