Ad-hoc Users and Undo Usage
Here is a quote from the Oracle Documentation about V$UNDOSTAT.MAXQUERYLEN:
MAXQUERYLEN NUMBER Identifies the length of the longest query (in seconds) executed in the instance during the period. You can use this statistic to estimate the proper setting of the UNDO_RETENTION initialization parameter. The length of a query is measured from the cursor open time to the last fetch/execute time of the cursor. Only the length of those cursors that have been fetched/executed during the period are reflected in the view.
And here is a quote from My Oracle Support note “V$UNDOSTAT MAXQUERYLEN Suddenly Appears With a High Value For a Simple Query [ID 1307600.1]“
the MAXQUERYLEN value is defined as “the length of a query measured from the cursor open time to the last fetch/execute time of the cursor”. Hence if the cursor is suspended (e.g. dbms_lock.sleep) or the session is switched out (session switching) then the MAXQUERYLEN time will continue to increase but will not be show as no current execution/fetch is occurring on the cursor.
…
This is expected behaviour
Nothing wrong with any of this but it is all relevant to the test that follows. In order to set up the test I change my local database to have a small undo tablespace and generate some workload. I’ve collapsed the code in case you want to jump straight to the story, in summary it generates some transactions and random queries with a duration between 1 and 10 minutes.
alter system set undo_tablespace = undo scope=both sid='*'; create undo tablespace undo_small datafile 'C:\ORACLEXE\ORADATA\XE\UNDO_SMALL.DBF' size 30m reuse autoextend on maxsize 500m; alter system set undo_tablespace = undo_small scope=both sid='*'; create or replace function slow_query(i_seconds_to_wait in number) return number as l_n number; l_pad2 bigtab.pad2%TYPE; cursor c_bigtab is select pad2 from bigtab; begin open c_bigtab; fetch c_bigtab into l_pad2; sys.dbms_lock.sleep(i_seconds_to_wait); fetch c_bigtab into l_pad2; close c_bigtab; return i_seconds_to_wait; end slow_query; / -- session #1 - generate trxns declare L_rno number; L_i number; L_pad1 bigtab.pad1%TYPE; begin while true loop L_i := L_i + 1; L_rno := round(dbms_random.value(1,1000000)); update bigtab set pad1 = pad1 where rno = L_rno; commit; dbms_lock.sleep(0.01); end loop; end; / -- session #2 - generate queries of random length betwen 1 min and 10 mins declare l_sleep number; l_slept number; begin while true loop l_sleep := round(dbms_random.value(60,600)); select slow_query(l_sleep) into l_slept from dual; end loop; end; /
Output from V$UNDOSTAT after the background workload has been running for a while is below along with the current size of the undo tablespace.
END_TIME_CHAR TXNCOUNT MAXQUERYLEN MAXQUERYID ACTIVEBLKS UNEXPIREDBLKS EXPIREDBLKS TUNED_UNDORETENTION
-------------------- ---------- ----------- ------------- ---------- ------------- ----------- -------------------
28-SEP-2011 19:45:50 38118 584 200k2np23z57q 160 912 496 1425
28-SEP-2011 19:55:50 38404 452 200k2np23z57q 160 1120 1280 1292
28-SEP-2011 20:05:50 38507 479 200k2np23z57q 160 1704 1464 1320
28-SEP-2011 20:15:50 38340 383 200k2np23z57q 160 2656 496 1224
FILE_ID FILE_NAME MB
---------- --------------------------------------------- ----------
5 C:\ORACLEXE\ORADATA\XE\UNDO_SMALL.DBF 30
Now a simple query is executed from SQL Developer and the first set of rows is fetched. V$SESSION output below.
SQL> select username,program,status,state,event,sql_id,seconds_in_wait siw 2 from v$session 3 where username = 'ADHOC'; USERNAME PROGRAM STATUS STATE EVENT SQL_ID SIW -------- --------------- -------- ------------------- --------------------------- ------------- ---------- ADHOC SQL Developer INACTIVE WAITING SQL*Net message from client 6fwqzurbc8y7k 7 SQL> select sql_text from v$sql where sql_id = '6fwqzurbc8y7k'; SQL_TEXT ----------------------------------------------------------------------- select * from bigtab
Another quote from the Oracle Documentation
SQL_ID VARCHAR2(13) SQL identifier of the SQL statement that is currently being executed
You can see from the V$SESSION output above that we have a SQL ID and therefore must be executing some SQL. But we also have a session status of “INACTIVE” and are waiting on event “SQL*Net message from client”. That is a nice way of spotting sessions that are similar to the focus of this post.
Around 15 minutes later the session is still waiting on the user to act and V$UNDOSTAT looks consistent with my steady state.
USERNAME PROGRAM STATUS STATE EVENT SQL_ID SIW -------- --------------- -------- ------------------- --------------------------- ------------- ---------- ADHOC SQL Developer INACTIVE WAITING SQL*Net message from client 6fwqzurbc8y7k 912 END_TIME_CHAR TXNCOUNT MAXQUERYLEN MAXQUERYID ACTIVEBLKS UNEXPIREDBLKS EXPIREDBLKS TUNED_UNDORETENTION -------------------- ---------- ----------- ------------- ---------- ------------- ----------- ------------------- 28-SEP-2011 19:45:50 38118 584 200k2np23z57q 160 912 496 1425 28-SEP-2011 19:55:50 38404 452 200k2np23z57q 160 1120 1280 1292 28-SEP-2011 20:05:50 38507 479 200k2np23z57q 160 1704 1464 1320 28-SEP-2011 20:15:50 38340 383 200k2np23z57q 160 2656 496 1224 28-SEP-2011 20:25:50 38557 985 200k2np23z57q 160 2592 816 1825 28-SEP-2011 20:31:28 20492 866 200k2np23z57q 160 2592 816 1706
We now fetch the next set of records and check V$SESSION and V$UNDOSTAT again.
USERNAME PROGRAM STATUS STATE EVENT SQL_ID SIW -------- --------------- -------- ------------------- --------------------------- ------------- ---------- ADHOC SQL Developer INACTIVE WAITING SQL*Net message from client 6fwqzurbc8y7k 35 END_TIME_CHAR TXNCOUNT MAXQUERYLEN MAXQUERYID ACTIVEBLKS UNEXPIREDBLKS EXPIREDBLKS TUNED_UNDORETENTION -------------------- ---------- ----------- ------------- ---------- ------------- ----------- ------------------- 28-SEP-2011 19:45:50 38118 584 200k2np23z57q 160 912 496 1425 28-SEP-2011 19:55:50 38404 452 200k2np23z57q 160 1120 1280 1292 28-SEP-2011 20:05:50 38507 479 200k2np23z57q 160 1704 1464 1320 28-SEP-2011 20:15:50 38340 383 200k2np23z57q 160 2656 496 1224 28-SEP-2011 20:25:50 38557 985 200k2np23z57q 160 2592 816 1825 28-SEP-2011 20:32:49 26370 992 6fwqzurbc8y7k 160 2592 816 1766
V$UNDOSTAT.MAXQUERYID is now that of the SQL Developer session. What if an extra 10 minutes had elapsed?
10 minutes later…
END_TIME_CHAR TXNCOUNT MAXQUERYLEN MAXQUERYID ACTIVEBLKS UNEXPIREDBLKS EXPIREDBLKS TUNED_UNDORETENTION -------------------- ---------- ----------- ------------- ---------- ------------- ----------- ------------------- 28-SEP-2011 19:45:50 38118 584 200k2np23z57q 160 912 496 1425 28-SEP-2011 19:55:50 38404 452 200k2np23z57q 160 1120 1280 1292 28-SEP-2011 20:05:50 38507 479 200k2np23z57q 160 1704 1464 1320 28-SEP-2011 20:15:50 38340 383 200k2np23z57q 160 2656 496 1224 28-SEP-2011 20:25:50 38557 985 200k2np23z57q 160 2592 816 1825 28-SEP-2011 20:35:50 41787 992 6fwqzurbc8y7k 160 3440 688 1833 28-SEP-2011 20:41:31 20088 1293 6fwqzurbc8y7k 160 3440 688 2134
The TUNED_UNDORETENTION is starting to climb but the number of expired blocks is unchanged because our UNDO tablespace is extending – it is now at 42M.
FILE_ID FILE_NAME MB
---------- --------------------------------------------- ----------
5 C:\ORACLEXE\ORADATA\XE\UNDO_SMALL.DBF 42
So let’s prevent it from extending and wait another 10 minutes before fetching the next set of rows.
alter database datafile 5 autoextend off; END_TIME_CHAR TXNCOUNT MAXQUERYLEN MAXQUERYID ACTIVEBLKS UNEXPIREDBLKS EXPIREDBLKS TUNED_UNDORETENTION -------------------- ---------- ----------- ------------- ---------- ------------- ----------- ------------------- 28-SEP-2011 20:05:50 38507 479 200k2np23z57q 160 1704 1464 1320 28-SEP-2011 20:15:50 38340 383 200k2np23z57q 160 2656 496 1224 28-SEP-2011 20:25:50 38557 985 200k2np23z57q 160 2592 816 1825 28-SEP-2011 20:35:50 41787 992 6fwqzurbc8y7k 160 3440 688 1833 28-SEP-2011 20:45:50 41534 1594 6fwqzurbc8y7k 160 4608 576 2752 28-SEP-2011 20:55:50 44669 2196 6fwqzurbc8y7k 160 5200 264 2766
We can see that the number of expired blocks is falling and the number of un-expired blocks is rising. 10 minutes later I click to fetch the next set of records and…
ORA-01555: snapshot too old: rollback segment number 17 with name "_SYSSMU17$" too small
01555. 00000 - "snapshot too old: rollback segment number %s with name \"%s\" too small"
*Cause: rollback records needed by a reader for consistent read are
overwritten by other writers
*Action: If in Automatic Undo Management mode, increase undo_retention
setting. Otherwise, use larger rollback segments
That was only after 35-40 minutes – well within reach of a normal trip to the canteen (admittedly my undo tablespace is sized modestly in order to help me prove a point).
So, here, in reverse order, are my top 3 undo related threats when giving read only access via a tool such as SQL Developer or Toad to application support staff (even though they have the best of intentions).
#3 Undo tablespace pressure.
Unbeknownst to the ad-hoc user he can quite easily cause pressure on the undo tablespace causing an AUTOEXTEND file to grow purely to satisfy their next page of data.
#2 Undo tablespace alerts
For undo tablespaces with no AUTOEXTEND files Grid Control or some other monitoring tool will alert as the number of unexpired blocks starts to fall causing unnecessary (albeit low priority) work for your first line DBAs.
#1 ORA-1555
For undo tablespaces with no AUTOEXTEND files and either a sufficiently high transaction rate or sufficiently high gap between page 1 and page X of our ad-hoc user’s query we may find that the undo required to construct page X has been overwritten. The user gets ORA-1555, thinks “no harm done” and re-starts the query. However the alert log now contains something like below and an alert is fired shortly afterwards.
Wed Sep 28 21:14:48 2011 ORA-01555 caused by SQL statement below (SQL ID: 6fwqzurbc8y7k, Query Duration=3538 sec, SCN: 0x0000.001aad4e):
If the hour is late and the company is pretty twitchy then this could involve someone getting out of bed to investigate a potentially failed batch run.
Have I missed any undo related threats?
Impact of Updating a Row Using a PL/SQL Record Type
Before I talk about the topic of this post I’d like to demonstrate a bit of obvious information just to get it in the forefront of your mind. First off we create a test table holding 5000 records.
create table redo_test as select rownum col1 , rpad(rownum,100) col2 , rpad(rownum,100) col3 from dual connect by rownum <= 5000; create index redo_test_i on redo_test(col1); exec dbms_stats.gather_table_stats(user,'redo_test',cascade=>true)
And if we now update a 100 character column in each of the 5000 records we see some undo/redo generated:
conn / update redo_test set col3 = col3; commit; NAME VALUE ---------------------------------------------------------------- ---------- undo change vector size 700132 redo size 1432060
And if we update the same 5000 records but more columns we see an increased amount of undo/redo generated.
conn / update redo_test set col2 = col2 , col3 = col3; commit; NAME VALUE ---------------------------------------------------------------- ---------- undo change vector size 2026732 redo size 4943892
All as expected, though what I didn’t expect was the undo/redo statistics to be so erratic – the values above are the most frequently output (further testing required to understand that oddity).
So, now we have all been taught how to suck eggs on to “Updating the Database with PL/SQL Record Values” – a quote and example from the Oracle documentation below:
A PL/SQL-only extension of the UPDATE statement lets you update database rows using a single variable of type RECORD or %ROWTYPE on the right side of the SET clause, instead of a list of fields.
UPDATE departments SET ROW = dept_info WHERE department_id = 300;
So let’s see what happens behind the scenes when we update a row using a record type:
conn / alter session set sql_trace = true; declare redo_rec redo_test%ROWTYPE; begin select * into redo_rec from redo_test where col1 = 1; update redo_test set row = redo_rec where col1 = 1; commit; end; /
In the TKPROF output we see the SQL used by the update. It updates all columns in the table.
UPDATE REDO_TEST SET COL1 = :B1 ,COL2 = :B2 ,COL3 = :B3 WHERE COL1 = 1
So – if you are using record types to update records frequently or to update tables with large record sizes then you should maybe pause for thought. Likewise if you see SQL updating every column on a table this could be the reason. See an example below changing the value of only 1 column in our 5000 row table and the undo/redo generation jumping up by 50% due to updating extra columns we don’t need to change.
conn /
begin
for i in (select * from redo_test) loop
update redo_test
set col2 = rpad('x',100)
where col1 = i.col1;
end loop;
commit;
end;
/
NAME VALUE
---------------------------------------------------------------- ----------
undo change vector size 1082416
redo size 2537112
conn /
declare
redo_rec redo_test%ROWTYPE;
begin
for i in (select * from redo_test) loop
redo_rec.col1 := i.col1;
redo_rec.col2 := rpad('x',100);
redo_rec.col3 := i.col3;
update redo_test
set row = redo_rec
where col1 = i.col1;
end loop;
commit;
end;
/
NAME VALUE
---------------------------------------------------------------- ----------
undo change vector size 1462416
redo size 3299680
leave a comment