Ad-hoc Users and Undo Usage

Here is a quote from the Oracle Documentation about V$UNDOSTAT.MAXQUERYLEN:

MAXQUERYLEN NUMBER Identifies the length of the longest query (in seconds) executed in the instance during the period. You can use this statistic to estimate the proper setting of the UNDO_RETENTION initialization parameter. The length of a query is measured from the cursor open time to the last fetch/execute time of the cursor. Only the length of those cursors that have been fetched/executed during the period are reflected in the view.

And here is a quote from My Oracle Support note “V$UNDOSTAT MAXQUERYLEN Suddenly Appears With a High Value For a Simple Query [ID 1307600.1]”

the MAXQUERYLEN value is defined as “the length of a query measured from the cursor open time to the last fetch/execute time of the cursor”. Hence if the cursor is suspended (e.g. dbms_lock.sleep) or the session is switched out (session switching) then the MAXQUERYLEN time will continue to increase but will not be show as no current execution/fetch is occurring on the cursor.

This is expected behaviour

Nothing wrong with any of this but it is all relevant to the test that follows. In order to set up the test I change my local database to have a small undo tablespace and generate some workload. I’ve collapsed the code in case you want to jump straight to the story, in summary it generates some transactions and random queries with a duration between 1 and 10 minutes.

alter system set undo_tablespace = undo scope=both sid='*';

create undo tablespace undo_small 
datafile 'C:\ORACLEXE\ORADATA\XE\UNDO_SMALL.DBF' size 30m reuse autoextend on maxsize 500m;

alter system set undo_tablespace = undo_small scope=both sid='*';

create or replace function slow_query(i_seconds_to_wait in number)
	return number
as
	l_n number;
	l_pad2 bigtab.pad2%TYPE;

	cursor c_bigtab is
		select pad2 from bigtab; 
begin
	open c_bigtab;
	fetch c_bigtab into l_pad2;

	sys.dbms_lock.sleep(i_seconds_to_wait);
	
	fetch c_bigtab into l_pad2;
	close c_bigtab;

	return i_seconds_to_wait;
end slow_query;
/

-- session #1 - generate trxns

declare
	L_rno number;
	L_i number;
	L_pad1 bigtab.pad1%TYPE;
begin
	while true loop
		L_i := L_i + 1;
		L_rno := round(dbms_random.value(1,1000000));

		update bigtab
		set pad1 = pad1 where rno = L_rno;

		commit;

		dbms_lock.sleep(0.01);
	end loop;
end;
/

-- session #2 - generate queries of random length betwen 1 min and 10 mins

declare
	l_sleep number;
	l_slept number;
begin
	while true loop
		l_sleep := round(dbms_random.value(60,600));

		select slow_query(l_sleep) into l_slept from dual;
	end loop;
end;
/

Output from V$UNDOSTAT after the background workload has been running for a while is below along with the current size of the undo tablespace.

END_TIME_CHAR          TXNCOUNT MAXQUERYLEN MAXQUERYID    ACTIVEBLKS UNEXPIREDBLKS EXPIREDBLKS TUNED_UNDORETENTION
-------------------- ---------- ----------- ------------- ---------- ------------- ----------- -------------------
28-SEP-2011 19:45:50      38118         584 200k2np23z57q        160           912         496             1425
28-SEP-2011 19:55:50      38404         452 200k2np23z57q        160          1120        1280             1292
28-SEP-2011 20:05:50      38507         479 200k2np23z57q        160          1704        1464             1320
28-SEP-2011 20:15:50      38340         383 200k2np23z57q        160          2656         496             1224

   FILE_ID FILE_NAME                                             MB
---------- --------------------------------------------- ----------
         5 C:\ORACLEXE\ORADATA\XE\UNDO_SMALL.DBF                 30

Now a simple query is executed from SQL Developer and the first set of rows is fetched. V$SESSION output below.

SQL> select username,program,status,state,event,sql_id,seconds_in_wait siw
  2  from v$session
  3  where username = 'ADHOC';

USERNAME PROGRAM         STATUS   STATE               EVENT                       SQL_ID               SIW
-------- --------------- -------- ------------------- --------------------------- ------------- ----------
ADHOC    SQL Developer   INACTIVE WAITING             SQL*Net message from client 6fwqzurbc8y7k       7

SQL> select sql_text from v$sql where sql_id = '6fwqzurbc8y7k';

SQL_TEXT
-----------------------------------------------------------------------
select * from bigtab

Another quote from the Oracle Documentation

SQL_ID VARCHAR2(13) SQL identifier of the SQL statement that is currently being executed

You can see from the V$SESSION output above that we have a SQL ID and therefore must be executing some SQL. But we also have a session status of “INACTIVE” and are waiting on event “SQL*Net message from client”. That is a nice way of spotting sessions that are similar to the focus of this post.

Around 15 minutes later the session is still waiting on the user to act and V$UNDOSTAT looks consistent with my steady state.

USERNAME PROGRAM         STATUS   STATE               EVENT                       SQL_ID               SIW
-------- --------------- -------- ------------------- --------------------------- ------------- ----------
ADHOC    SQL Developer   INACTIVE WAITING             SQL*Net message from client 6fwqzurbc8y7k        912

END_TIME_CHAR          TXNCOUNT MAXQUERYLEN MAXQUERYID    ACTIVEBLKS UNEXPIREDBLKS EXPIREDBLKS TUNED_UNDORETENTION
-------------------- ---------- ----------- ------------- ---------- ------------- ----------- -------------------
28-SEP-2011 19:45:50      38118         584 200k2np23z57q        160           912         496             1425
28-SEP-2011 19:55:50      38404         452 200k2np23z57q        160          1120        1280             1292
28-SEP-2011 20:05:50      38507         479 200k2np23z57q        160          1704        1464             1320
28-SEP-2011 20:15:50      38340         383 200k2np23z57q        160          2656         496             1224
28-SEP-2011 20:25:50      38557         985 200k2np23z57q        160          2592         816             1825
28-SEP-2011 20:31:28      20492         866 200k2np23z57q        160          2592         816             1706

We now fetch the next set of records and check V$SESSION and V$UNDOSTAT again.

USERNAME PROGRAM         STATUS   STATE               EVENT                       SQL_ID               SIW
-------- --------------- -------- ------------------- --------------------------- ------------- ----------
ADHOC    SQL Developer   INACTIVE WAITING             SQL*Net message from client 6fwqzurbc8y7k      35

END_TIME_CHAR          TXNCOUNT MAXQUERYLEN MAXQUERYID    ACTIVEBLKS UNEXPIREDBLKS EXPIREDBLKS TUNED_UNDORETENTION
-------------------- ---------- ----------- ------------- ---------- ------------- ----------- -------------------
28-SEP-2011 19:45:50      38118         584 200k2np23z57q        160           912         496             1425
28-SEP-2011 19:55:50      38404         452 200k2np23z57q        160          1120        1280             1292
28-SEP-2011 20:05:50      38507         479 200k2np23z57q        160          1704        1464             1320
28-SEP-2011 20:15:50      38340         383 200k2np23z57q        160          2656         496             1224
28-SEP-2011 20:25:50      38557         985 200k2np23z57q        160          2592         816             1825
28-SEP-2011 20:32:49      26370         992 6fwqzurbc8y7k        160          2592         816             1766

V$UNDOSTAT.MAXQUERYID is now that of the SQL Developer session. What if an extra 10 minutes had elapsed?

10 minutes later…

END_TIME_CHAR          TXNCOUNT MAXQUERYLEN MAXQUERYID    ACTIVEBLKS UNEXPIREDBLKS EXPIREDBLKS TUNED_UNDORETENTION
-------------------- ---------- ----------- ------------- ---------- ------------- ----------- -------------------
28-SEP-2011 19:45:50      38118         584 200k2np23z57q        160           912         496             1425
28-SEP-2011 19:55:50      38404         452 200k2np23z57q        160          1120        1280             1292
28-SEP-2011 20:05:50      38507         479 200k2np23z57q        160          1704        1464             1320
28-SEP-2011 20:15:50      38340         383 200k2np23z57q        160          2656         496             1224
28-SEP-2011 20:25:50      38557         985 200k2np23z57q        160          2592         816             1825
28-SEP-2011 20:35:50      41787         992 6fwqzurbc8y7k        160          3440         688             1833
28-SEP-2011 20:41:31      20088        1293 6fwqzurbc8y7k        160          3440         688             2134

The TUNED_UNDORETENTION is starting to climb but the number of expired blocks is unchanged because our UNDO tablespace is extending – it is now at 42M.

   FILE_ID FILE_NAME                                             MB
---------- --------------------------------------------- ----------
         5 C:\ORACLEXE\ORADATA\XE\UNDO_SMALL.DBF                 42

So let’s prevent it from extending and wait another 10 minutes before fetching the next set of rows.

alter database datafile 5 autoextend off;

END_TIME_CHAR          TXNCOUNT MAXQUERYLEN MAXQUERYID    ACTIVEBLKS UNEXPIREDBLKS EXPIREDBLKS TUNED_UNDORETENTION
-------------------- ---------- ----------- ------------- ---------- ------------- ----------- -------------------
28-SEP-2011 20:05:50      38507         479 200k2np23z57q        160          1704        1464             1320
28-SEP-2011 20:15:50      38340         383 200k2np23z57q        160          2656         496             1224
28-SEP-2011 20:25:50      38557         985 200k2np23z57q        160          2592         816             1825
28-SEP-2011 20:35:50      41787         992 6fwqzurbc8y7k        160          3440         688             1833
28-SEP-2011 20:45:50      41534        1594 6fwqzurbc8y7k        160          4608         576             2752
28-SEP-2011 20:55:50      44669        2196 6fwqzurbc8y7k        160          5200         264             2766

We can see that the number of expired blocks is falling and the number of un-expired blocks is rising. 10 minutes later I click to fetch the next set of records and…


ORA-01555: snapshot too old: rollback segment number 17 with name "_SYSSMU17$" too small
01555. 00000 -  "snapshot too old: rollback segment number %s with name \"%s\" too small"
*Cause:    rollback records needed by a reader for consistent read are
           overwritten by other writers
*Action:   If in Automatic Undo Management mode, increase undo_retention
           setting. Otherwise, use larger rollback segments

That was only after 35-40 minutes – well within reach of a normal trip to the canteen (admittedly my undo tablespace is sized modestly in order to help me prove a point).

So, here, in reverse order, are my top 3 undo related threats when giving read only access via a tool such as SQL Developer or Toad to application support staff (even though they have the best of intentions).

#3 Undo tablespace pressure.
Unbeknownst to the ad-hoc user he can quite easily cause pressure on the undo tablespace causing an AUTOEXTEND file to grow purely to satisfy their next page of data.

#2 Undo tablespace alerts
For undo tablespaces with no AUTOEXTEND files Grid Control or some other monitoring tool will alert as the number of unexpired blocks starts to fall causing unnecessary (albeit low priority) work for your first line DBAs.

#1 ORA-1555
For undo tablespaces with no AUTOEXTEND files and either a sufficiently high transaction rate or sufficiently high gap between page 1 and page X of our ad-hoc user’s query we may find that the undo required to construct page X has been overwritten. The user gets ORA-1555, thinks “no harm done” and re-starts the query. However the alert log now contains something like below and an alert is fired shortly afterwards.

Wed Sep 28 21:14:48 2011
ORA-01555 caused by SQL statement below (SQL ID: 6fwqzurbc8y7k, Query Duration=3538 sec, SCN: 0x0000.001aad4e):

If the hour is late and the company is pretty twitchy then this could involve someone getting out of bed to investigate a potentially failed batch run.

Have I missed any undo related threats?

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s