MULTISET experiments

Recently I was asked to investigate a PL/SQL procedure that was running slowly. The procedure contained no SQL and no complicated math, yet for some reason suffered unexpected delays in processing. After a small investigation I localised the problem to a statement joining two nested tables – a “multiset union”.

I could not explain it immediately, so I created a test case to get to the bottom of the issue.

CREATE OR REPLACE PACKAGE epani.multicast_pkg
  IS
     TYPE t_number_nt IS TABLE OF NUMBER; -- test nested table type
       v_new_number  t_number_nt; --second nested table
       v_number t_number_nt; --first nested table
     PROCEDURE init (num_init1 number
                    , num_init2 number); -- procedure that populates both nested tables
     PROCEDURE add_number_old; -- procedure that joins the nested tables the old-fashioned way
     PROCEDURE add_number_new; -- procedure that joins the nested tables with MULTISET UNION
END;
/
CREATE OR REPLACE PACKAGE BODY epani.multicast_pkg
 IS
    PROCEDURE  init (num_init1 number  -- number of elements in the first table
                                  , num_init2 number) -- number of elements in the second table
    is
    begin
        SELECT num BULK COLLECT INTO v_number   FROM (
                              SELECT  LEVEL num  FROM DUAL CONNECT BY LEVEL <= num_init1);
        SELECT num BULK COLLECT INTO v_new_number  FROM (
                              SELECT  LEVEL num  FROM DUAL CONNECT BY LEVEL <= num_init2);
    end;

    PROCEDURE  add_number_old
     IS
        v_offset PLS_INTEGER := v_number.COUNT; -- size of the first table before extending
     BEGIN
        v_number.EXTEND(v_new_number.COUNT); -- allocate room for the new elements
        FOR i IN v_new_number.FIRST .. v_new_number.LAST
        LOOP
          v_number(v_offset + i) := v_new_number(i); -- append element by element
        END LOOP;
     END add_number_old;

    PROCEDURE  add_number_new
     IS
     BEGIN
       v_number:=v_number multiset union v_new_number;
     END add_number_new;
END;
/

I have prepared a few test cases.
First: we join a small table to a big one

---We initialise the first nested table with a significant number of rows and the second with a small number
SQL> exec EPANI.multicast_pkg.init(356000,5);
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.33
-- Run the old fashion join procedure
SQL> exec EPANI.multicast_pkg.add_number_old;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.01
-- the procedure really does not take long at all
SQL> exec EPANI.multicast_pkg.add_number_new;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.41
-- the procedure takes a noticeable fraction of the time it took to generate the table initially

Second: we join two relatively small tables, but do it repeatedly, 1000 times, to get meaningful timings

---We initialise the nested tables with a small number of rows
SQL> exec EPANI.multicast_pkg.init(1000,100);
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.05
-- We run a PL/SQL block using the old-fashioned method
SQL> begin
  for i in 1..1000 LOOP
    EPANI.multicast_pkg.add_number_old;
  END LOOP;
end;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.15
-- We run a PL/SQL block using the new method
SQL> begin
  for i in 1..1000 LOOP
     EPANI.multicast_pkg.add_number_new;
  END LOOP;
end;
/
PL/SQL procedure successfully completed.
Elapsed: 00:04:29.75

Third: we join two big tables

---We initialise the nested tables with a significant number of rows
SQL> exec EPANI.multicast_pkg.init(100000, 100000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.97
-- Run the old fashion join procedure
SQL> exec EPANI.multicast_pkg.add_number_old;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.15
-- the procedure takes very little time
---We reinitialise the nested tables with a significant number of rows
SQL> exec EPANI.multicast_pkg.init(100000, 100000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:03.11
SQL> exec EPANI.multicast_pkg.add_number_new;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.28
-- the new method takes almost twice as long as the old one

Fourth: we join a big table to a small one

---We initialise the first nested table with a small number of rows and the second with a significant number
SQL> exec EPANI.multicast_pkg.init(5, 356000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.08
-- Run the old fashion join procedure
SQL> exec EPANI.multicast_pkg.add_number_old;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.62
-- the procedure does take measurable time
---We reinitialise the nested tables in the same fashion
SQL> exec EPANI.multicast_pkg.init(5, 356000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.27
SQL> exec EPANI.multicast_pkg.add_number_new;
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.07
-- the procedure takes almost extra 50% above the old method

We have shown that the old-fashioned method performed better in every case. But why? What is behind this degradation? The answer can be found if we rewrite our procedures in a slightly different way.

CREATE OR REPLACE PACKAGE BODY epani.multicast_pkg
 IS
    PROCEDURE  init (num_init1 number  -- number of elements in the first table
                                  , num_init2 number) -- number of elements in the second table
    is
    begin
        SELECT num BULK COLLECT INTO v_number   FROM (
                              SELECT  LEVEL num  FROM DUAL CONNECT BY LEVEL <= num_init1);
        SELECT num BULK COLLECT INTO v_new_number  FROM (
                              SELECT  LEVEL num  FROM DUAL CONNECT BY LEVEL <= num_init2);
    end;

    PROCEDURE  add_number_old
     IS
	v_result t_number_nt := t_number_nt(); -- resulting nested table
     BEGIN
        v_result.EXTEND(v_number.COUNT); -- allocate room for the first table
        FOR i IN v_number.FIRST .. v_number.LAST
        LOOP
          v_result(i) := v_number(i); -- copy the first table
        END LOOP;
        v_result.EXTEND(v_new_number.COUNT); -- allocate room for the second table
        FOR i IN v_new_number.FIRST .. v_new_number.LAST
        LOOP
          v_result(v_number.COUNT + i) := v_new_number(i); -- append the second table
        END LOOP;
     END add_number_old;

    PROCEDURE  add_number_new
     IS
	v_result t_number_nt:= t_number_nt(); --resulting nested table
     BEGIN
       v_result:=v_number multiset union v_new_number;
     END add_number_new;
END;
/

And repeat our tests

First: we join a small table to a big one

---We initialise the first nested table with a significant number of rows and the second with a small number
SQL> exec EPANI.multicast_pkg.init(356000,5);
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.16
-- Run the old fashion join procedure
SQL> exec EPANI.multicast_pkg.add_number_old;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.93
-- the procedure now takes significantly longer
SQL> exec EPANI.multicast_pkg.add_number_new;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.64
-- faster with the new method

Second: we join two relatively small tables, but do it repeatedly, 1000 times, to get meaningful timings

---We initialise the nested tables with a small number of rows
SQL> exec EPANI.multicast_pkg.init(1000,100);
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.04
-- We run a PL/SQL block using the old-fashioned method
SQL> begin
  for i in 1..1000 LOOP
    EPANI.multicast_pkg.add_number_old;
  END LOOP;
end;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.30
-- We run a PL/SQL block using the new method
SQL> begin
  for i in 1..1000 LOOP
     EPANI.multicast_pkg.add_number_new;
  END LOOP;
end;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.96
-- faster with the new method

Third: we join two big tables

---We initialise the nested tables with a significant number of rows
SQL> exec EPANI.multicast_pkg.init(100000, 100000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.56
-- Run the old fashion join procedure
SQL> exec EPANI.multicast_pkg.add_number_old;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.48
-- the procedure takes very little time
---We reinitialise the nested tables with a significant number of rows
SQL> exec EPANI.multicast_pkg.init(100000, 100000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.08
SQL> exec EPANI.multicast_pkg.add_number_new;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.40
-- slightly faster with the new method

Fourth: we join a big table to a small one

---We initialise the first nested table with a small number of rows and the second with a significant number
SQL> exec EPANI.multicast_pkg.init(5, 356000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.31
-- Run the old fashion join procedure
SQL> exec EPANI.multicast_pkg.add_number_old;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.58
-- the procedure does take measurable time
---We reinitialise the nested tables in the same fashion
SQL> exec EPANI.multicast_pkg.init(5, 356000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.38
SQL> exec EPANI.multicast_pkg.add_number_new;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.50
-- slightly faster with the new method

It looks like, for the new set of tests, MULTISET UNION shows better performance than the handmade method. The answer lies in the scope of the operation. When we append one collection to an already existing one, the old method works perfectly; but when we create a new collection as the union of two existing ones, MULTISET UNION shows better efficiency. MULTISET UNION can also address a wider range of tasks. Developers should clearly see the difference between operating on two collections and operating on three, and choose the appropriate method based on application requirements.
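To make the distinction concrete, here is a minimal sketch (hypothetical values and a local nested table type, not part of the tests above) contrasting the two patterns:

DECLARE
   TYPE t_nt IS TABLE OF NUMBER;
   v_a t_nt := t_nt(1, 2, 3);
   v_b t_nt := t_nt(4, 5);
   v_c t_nt;
BEGIN
   -- two collections: append v_b to v_a in place
   -- (here the hand-coded EXTEND/assign loop wins)
   v_a := v_a MULTISET UNION v_b;
   -- three collections: build a new result from two sources
   -- (here MULTISET UNION wins)
   v_c := v_a MULTISET UNION v_b;
END;
/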

Application Code Level script

With patches and fixes delivered to different sites and applied to different testing, verification and production regions, there is a need for a simple way to compare database objects in an application schema without having a database link between the databases. This can be challenging, as the PL/SQL code is usually encrypted and database segments have different storage attributes. Here I show a method to achieve this comparison.
The reason I think it would be interesting to the public is that the problem is very common and current, yet there is no comprehensive white paper that solves it. The only note that claims to cover the topic is 781719.1, but it deals only with PL/SQL objects, which is not enough in the real world.
The attached script uses the MD5 hashing algorithm to derive a hash value associated with the version of a database object. The CLOB to be hashed can be obtained from the DBMS_METADATA package, or from the DBA_SOURCE view for PL/SQL objects. The principles are very simple, but a few points were raised during development:
1. Tables, indexes, etc. with different storage configurations should generate the same hash. This issue was solved with DBMS_METADATA.SET_TRANSFORM_PARAM.
2. Tables with different orderings of columns and constraints have to produce the same hash too. The solution was to use the simple XML (SXML) format of the DBMS_METADATA output and to sort the columns and constraints alphabetically using XML functions.
3. Nested tables and index-organized service tables are listed in DBA_OBJECTS as TABLES, but DBMS_METADATA includes their descriptions in the master objects and fails on the dependent items. My solution was to use the DBA_TABLES view to select only the top-level objects.
4. Java objects offer no option to generate a description in XML format, so we generate it in plain DDL format.
5. A role description should include the grants that have been granted to the role. DBMS_METADATA does not include grants in the generated description, so again XML functions were used to build the proper CLOB.
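Before the full script, a minimal sketch of the core idea – hashing a piece of DDL text with DBMS_CRYPTO (assuming EXECUTE on DBMS_CRYPTO; the literal below is just a stand-in for the metadata CLOB):

declare
  code_source clob := 'CREATE TABLE t (x NUMBER)'; -- stand-in for DBMS_METADATA output
  md5hash varchar2(32);
begin
  md5hash := rawtohex(dbms_crypto.hash(typ => dbms_crypto.hash_md5,
                                       src => code_source));
  dbms_output.put_line(md5hash); -- identical source text yields an identical hash
end;
/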

The following privileges are required for the user who runs the script:

-- GRANT CREATE PROCEDURE
-- GRANT EXECUTE ON DBMS_CRYPTO
-- GRANT EXECUTE ON DBMS_METADATA
-- GRANT SELECT ON DBA_SOURCE
-- GRANT SELECT ON DBA_SYS_PRIVS
-- GRANT SELECT ON DBA_TAB_PRIVS
-- GRANT SELECT ON DBA_ROLE_PRIVS

The following script takes one parameter – the application schema name – and outputs a list of database objects with their corresponding hash values.

create or replace function verify_source(p_source_type in varchar2,
p_source_name in varchar2,
p_source_owner in varchar2) return varchar2 AUTHID CURRENT_USER as
code_source clob;
md5hash varchar2(32);
v_h NUMBER; -- handle returned by OPEN
v_th NUMBER; -- handle returned by ADD_TRANSFORM
begin
IF p_source_type in ('VIEW','ROLE','TABLE','INDEX'
,'MATERIALIZED_VIEW','MATERIALIZED_VIEW_LOG','SEQUENCE','SYNONYM') THEN
v_h := DBMS_METADATA.OPEN(p_source_type);
if p_source_type not in ('ROLE') THEN
DBMS_METADATA.SET_FILTER(v_h,'SCHEMA',p_source_owner);
END IF;
DBMS_METADATA.SET_FILTER(v_h,'NAME',p_source_name);
v_th := DBMS_METADATA.ADD_TRANSFORM(v_h,'SXML');
if p_source_type not in ('VIEW','ROLE') THEN
DBMS_METADATA.SET_TRANSFORM_PARAM (v_th,'SEGMENT_ATTRIBUTES',FALSE);
DBMS_METADATA.SET_TRANSFORM_PARAM (v_th,'STORAGE',FALSE);
DBMS_METADATA.SET_TRANSFORM_PARAM (v_th,'TABLESPACE',FALSE);
DBMS_METADATA.SET_TRANSFORM_PARAM (v_th,'PARTITIONING',FALSE);
END IF;
code_source := DBMS_METADATA.FETCH_CLOB(v_h);
IF p_source_type in ('TABLE') THEN
-- get rid of sorting misconfigurations
SELECT UPDATEXML(
UPDATEXML(SRC,'/TABLE/RELATIONAL_TABLE/COL_LIST',
XMLELEMENT("COL_LIST", (
XMLQuery (
'for $i in /TABLE/RELATIONAL_TABLE/COL_LIST/COL_LIST_ITEM
order by $i/NAME
return $i'
passing by value SRC
RETURNING CONTENT
)
) )
) ,'/TABLE/RELATIONAL_TABLE/FOREIGN_KEY_CONSTRAINT_LIST',
XMLELEMENT("FOREIGN_KEY_CONSTRAINT_LIST", (
XMLQuery (
'for $i in
/TABLE/RELATIONAL_TABLE/FOREIGN_KEY_CONSTRAINT_LIST/FOREIGN_KEY_CONSTRAINT_LIST_ITEM
order by $i/NAME
return $i'
passing by value SRC
RETURNING CONTENT
)
) )
).getClobVal() INTO code_source
FROM ( SELECT XMLQuery(
'declare function local:removeNS($e as element()) as element()
{
element { QName("", local-name($e)) }
{
for $i in $e/node()|$e/attribute::*
return typeswitch ($i)
case element() return local:removeNS($i)
default return $i
}
}; (::)
local:removeNS($SRC/child::*)'
passing XMLType(code_source) as "SRC"
returning content
) SRC
FROM dual) INITSRC;
END IF;
DBMS_METADATA.CLOSE(v_h);
ELSIF p_source_type in ('PROCEDURE','FUNCTION','TYPE','TYPE BODY','TRIGGER','PACKAGE','PACKAGE BODY')
THEN
code_source := '';
for source_record in (select text from dba_source where owner = upper(p_source_owner) and name =
upper(p_source_name) and type = upper(p_source_type) order by line)
loop
code_source := code_source||source_record.text;
end loop;
ELSIF p_source_type in ('JAVA_CLASS','JAVA_SOURCE') THEN
code_source := DBMS_METADATA.GET_DDL(p_source_type,p_source_name,upper(p_source_owner));
ELSIF p_source_type in ('ROLE') THEN
SELECT
INSERTCHILDXMLAFTER(SRC,'/ROLE','TYPE[1]',
(SELECT
XMLAgg(
XMLELEMENT("privs",
XMLFOREST(OWNER, TABLE_NAME, PRIVILEGE)))
FROM
(SELECT DISTINCT *
FROM (
SELECT OWNER, TABLE_NAME, PRIVILEGE
FROM dba_tab_privs
WHERE GRANTEE IN
(SELECT GRANTED_ROLE
FROM DBA_ROLE_PRIVS
START WITH GRANTEE=p_source_name
CONNECT BY NOCYCLE PRIOR GRANTED_ROLE=GRANTEE
UNION ALL
SELECT p_source_name FROM DUAL)
UNION ALL
SELECT NULL OWNER, PRIVILEGE TABLE_NAME, NULL PRIVILEGE
FROM dba_sys_privs
WHERE GRANTEE IN (SELECT GRANTED_ROLE
FROM DBA_ROLE_PRIVS
START WITH GRANTEE=p_source_name
CONNECT BY NOCYCLE PRIOR GRANTED_ROLE=GRANTEE
UNION ALL
SELECT p_source_name FROM DUAL))
ORDER BY 1,2,3))).getClobVal() INTO code_source
FROM ( SELECT XMLQuery(
'declare function local:removeNS($e as element()) as element()
{
element { QName("", local-name($e)) }
{
for $i in $e/node()|$e/attribute::*
return typeswitch ($i)
case element() return local:removeNS($i)
default return $i
}
}; (::)
local:removeNS($SRC/child::*)'
passing XMLType(code_source) as "SRC"
returning content
) SRC
FROM dual) INITSRC;
END IF;

md5hash := rawtohex(dbms_crypto.hash(typ => dbms_crypto.HASH_MD5,
src => code_source));
return md5hash;
exception
WHEN OTHERS THEN
return p_source_type||p_source_name;
end;
/
show error

Spool code_level_checksum.lst
set pagesize 5000
set heading off
set echo off
set feedback off

col OBJECT_TYPE FORMAT a30
col OBJECT_NAME FORMAT a30
col HASHVAL FORMAT a35

prompt ###################################
prompt # TABLES #
prompt ###################################

SELECT 'TABLE' OBJECT_TYPE, TABLE_NAME OBJECT_NAME , verify_source('TABLE',TABLE_NAME, OWNER)
HASHVAL
FROM DBA_TABLES WHERE OWNER='&1'
AND IOT_NAME IS NULL
order by 1,2
/

prompt ###################################
prompt # INDEXES #
prompt ###################################

SELECT 'INDEX' OBJECT_TYPE, INDEX_NAME OBJECT_NAME , verify_source('INDEX',INDEX_NAME, OWNER)
HASHVAL
FROM DBA_INDEXES WHERE OWNER='&1'
order by 1,2
/

prompt ###################################
prompt # ROLES #
prompt ###################################

SELECT 'ROLE' OBJECT_TYPE, ROLE OBJECT_NAME , verify_source('ROLE',ROLE, 'TCTDBS') HASHVAL
FROM DBA_ROLES
order by 1,2
/

prompt ###################################
prompt # JAVA SOURCES #
prompt ###################################

select OBJECT_TYPE, OBJECT_NAME, verify_source('JAVA_SOURCE',OBJECT_NAME, OWNER) HASHVAL FROM
dba_objects
WHERE OWNER='&1' and OBJECT_TYPE IN ('JAVA SOURCE')
order by 1,2
/

prompt ###################################
prompt # JAVA CLASSES #
prompt ###################################

select 'JAVA_CLASS' OBJECT_TYPE, NAME OBJECT_NAME, verify_source('JAVA_CLASS',NAME, OWNER) HASHVAL
FROM dba_java_classes
WHERE OWNER='&1'
order by 1,2
/

prompt ###################################
prompt # MISC #
prompt ###################################

select OBJECT_TYPE, OBJECT_NAME, verify_source(OBJECT_TYPE,OBJECT_NAME, OWNER) HASHVAL FROM
dba_objects
WHERE OWNER='&1' and OBJECT_TYPE IN ('PROCEDURE','FUNCTION','TYPE','TYPE BODY','TRIGGER','VIEW',
'MATERIALIZED_VIEW','MATERIALIZED_VIEW_LOG','SEQUENCE','SYNONYM','PACKAGE','PACKAGE BODY')
order by 1,2
/

spool off;

drop function verify_source
/

I hope you will find it useful.

The output from this script can be compared as a simple flat file using any favourite comparison tool, e.g. WinMerge.
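For example, a hypothetical invocation (the script file name is my assumption, suggested by the spool file it creates), run against each database, after which the two spool files are diffed:

SQL> @code_level_checksum.sql APPSCHEMA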

Impact of Updating a Row Using a PL/SQL Record Type

Before I talk about the topic of this post I’d like to demonstrate a bit of obvious information, just to get it to the forefront of your mind. First off, we create a test table holding 5000 records.

create table redo_test as
select  rownum col1
,	rpad(rownum,100) col2
,	rpad(rownum,100) col3
from dual
connect by rownum <= 5000;
create index redo_test_i on redo_test(col1);
exec dbms_stats.gather_table_stats(user,'redo_test',cascade=>true)

And if we now update a 100-character column in each of the 5000 records we see some undo/redo generated (note that each test starts with a “conn /” reconnect, which resets the session statistics):

conn /
update	redo_test
set	col3 = col3;
commit;

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
undo change vector size                                              700132
redo size                                                           1432060
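The undo/redo figures come from the session statistics. The query producing them is not shown in the output above, but a sketch of the sort of query involved (run in the same session, after the commit) would be:

select	sn.name, ms.value
from	v$mystat ms, v$statname sn
where	ms.statistic# = sn.statistic#
and	sn.name in ('undo change vector size','redo size');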

And if we update the same 5000 records but more columns we see an increased amount of undo/redo generated.

conn /
update	redo_test
set	col2 = col2
,	col3 = col3;
commit;

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
undo change vector size                                             2026732
redo size                                                           4943892

All as expected, though what I didn’t expect was for the undo/redo statistics to be so erratic – the values above are the most frequently observed (further testing is required to understand that oddity).

So, now that we have all been taught how to suck eggs, on to “Updating the Database with PL/SQL Record Values” – a quote and example from the Oracle documentation below:

A PL/SQL-only extension of the UPDATE statement lets you update database rows using a single variable of type RECORD or %ROWTYPE on the right side of the SET clause, instead of a list of fields.

UPDATE departments SET ROW = dept_info WHERE department_id = 300;

So let’s see what happens behind the scenes when we update a row using a record type:

conn /
alter session set sql_trace = true;
declare
	redo_rec	redo_test%ROWTYPE;
begin
	select * into redo_rec from redo_test where col1 = 1;

	update	redo_test
	set	row = redo_rec
	where	col1 = 1;
	commit;
end;
/

In the TKPROF output we see the SQL used by the update. It updates all columns in the table.

UPDATE REDO_TEST SET COL1 = :B1 ,COL2 = :B2 ,COL3 = :B3
WHERE
 COL1 = 1

So – if you are using record types to update rows frequently, or to update tables with large rows, then you should maybe pause for thought. Likewise, if you see SQL updating every column on a table, this could be the reason. See the example below: changing the value of only one column in our 5000-row table makes the undo/redo generation jump by roughly a third, due to updating extra columns we don’t need to change.

conn /
begin
	for i in (select * from redo_test) loop
		update	redo_test
		set	col2 = rpad('x',100)
		where	col1 = i.col1;
	end loop;
	commit;
end;
/

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
undo change vector size                                             1082416
redo size                                                           2537112

conn /
declare
	redo_rec	redo_test%ROWTYPE;
begin
	for i in (select * from redo_test) loop
		redo_rec.col1 := i.col1;
		redo_rec.col2 := rpad('x',100);
		redo_rec.col3 := i.col3;

		update	redo_test
		set	row = redo_rec
		where	col1 = i.col1;
	end loop;

	commit;
end;
/
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
undo change vector size                                             1462416
redo size                                                           3299680

Parallel Enabled Pipelined PL/SQL Functions – an Old Feature That’s New to Me

This post describes an Oracle feature that is several versions old yet new to me (I can feel a series coming on “Old New Features For The Dog-eared DBA”). Apologies if I’m the only DBA unaware of this.

I came across an unusual request recently regarding a batch job that was running for longer than desired. For each record in a list it retrieves sensitive data from an external source, processes it and sends it onwards to another external source. The retrieval takes a short amount of time, say 0.05 seconds. However, if there are 1,000,000 look-ups to be done the time starts to mount up.

It is common for in-house developed batch processes to have custom parallel execution designed in; for example, several processes may run concurrently and collect their portion of data into a table. This data is then collated by a watcher process and sent onwards as appropriate – not one for the purist perhaps, but perfectly acceptable. As this structure was already in place, the obvious advice would be to utilise it for this process.

Now here comes the unusual bit – the data to be handled was very sensitive, so sensitive that it could not be stored anywhere, including database tables and archive logs. The data had to be retrieved, processed and sent on. So the question was:

How can we process sets of records in PL/SQL in a number of parallel streams, not store the resulting data set anywhere and generate a single onward data stream?

I figured pipelined PL/SQL functions must be usable somewhere here and, after a bit of Googling and AskTom searching, I came up with the following mechanism which I’m recording here as I find it quite exciting and you may too 🙂 . Some links I found very useful:

http://www.oracle-developer.net/display.php?id=207
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:4447489221109

First we create a list of id records to be processed and some types to support the pipelined function.

DROP TABLE id_list;
CREATE TABLE id_list AS
SELECT ROWNUM id FROM all_objects WHERE ROWNUM <= 5000;

CREATE OR REPLACE TYPE private_rec IS OBJECT (
	id 		NUMBER
,	data_1 		NUMBER(16)
,	data_2		NUMBER(20)
,	date_1		DATE
,	date_2		DATE
);
/
CREATE OR REPLACE TYPE private_tab AS TABLE OF private_rec;
/

A word of warning here – don’t be tempted to cut the row count above down too much – I wasted a couple of hours thinking I was missing something when all I was missing was enough rows for parallelism to kick in.

Next we create a pipelined function. This accepts a weak refcursor and for each row simulates a look-up to an external data source by sleeping for 0.05 seconds. I won’t go into the details of how this pipelining and partitioning works as I learned most of what I know about this within the last few days from the rather excellent oracle-developer.net site (plus I don’t fully understand it myself yet).

CREATE OR REPLACE FUNCTION retrieve_private_data(I_cursor IN SYS_REFCURSOR)
	RETURN private_tab PARALLEL_ENABLE(PARTITION I_cursor BY ANY) PIPELINED 
IS
	TYPE id_tab IS TABLE OF id_list%ROWTYPE;
	L_id_tab id_tab;
	L_num NUMBER := 0;
	L_start_time DATE := SYSDATE;
	L_end_time DATE;
BEGIN
	LOOP
		FETCH I_cursor BULK COLLECT INTO L_id_tab LIMIT 10; 

		L_num := L_num + L_id_tab.COUNT;
		FOR i IN 1 .. L_id_tab.COUNT LOOP
			--each retrieval of sensitive data takes 0.05 seconds
			dbms_lock.sleep(0.05);

			--in reality this is where the sensitive data would be piped back.
			--PIPE ROW(private_rec(L_id_tab(i).id,dbms_random.value(1,10),dbms_random.value(1,10),NULL,NULL));
		END LOOP;

		EXIT WHEN I_cursor%NOTFOUND;
	END LOOP;
	L_end_time := SYSDATE;

	-- pipe a row showing number of records processed by the call and the start and end time
	PIPE ROW(private_rec(L_num,NULL,NULL,L_start_time,L_end_time));

	CLOSE I_cursor;
END retrieve_private_data;
/

So now we can test a serial call, processing 5000 rows one after the other. The key information below is the elapsed time of over 4 minutes, and that a single row was returned by the function, showing that the call processed 5000 rows.

alter session set nls_date_format ='DD-MON-YYYY HH24:MI:SS';
set timing on
set autotrace on
SELECT * FROM table(retrieve_private_data(CURSOR(SELECT * FROM id_list i)));
set autotrace off

        ID     DATA_1     DATA_2 DATE_1               DATE_2
---------- ---------- ---------- -------------------- --------------------
      5000                       18-MAY-2010 23:02:28 18-MAY-2010 23:06:44

Elapsed: 00:04:16.49

------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                       |  8168 |  2073K|    24   (0)| 00:00:01 |
|   1 |  VIEW                              |                       |  8168 |  2073K|    24   (0)| 00:00:01 |
|   2 |   COLLECTION ITERATOR PICKLER FETCH| RETRIEVE_PRIVATE_DATA |       |       |            |          |
|   3 |    TABLE ACCESS FULL               | ID_LIST               |  5000 | 65000 |     4   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

And now the bit that I still find quite amazing: we’ll process the data in 4 parallel streams using a PARALLEL hint. Notice again the elapsed time and the number of records/rows processed returned by the function.

alter session set nls_date_format ='DD-MON-YYYY HH24:MI:SS';
set timing on
set autotrace on
SELECT * FROM table(retrieve_private_data(CURSOR(SELECT  /*+ PARALLEL(i 4)*/ * FROM id_list i)));
set autotrace off


        ID     DATA_1     DATA_2 DATE_1               DATE_2
---------- ---------- ---------- -------------------- --------------------
      1058                       18-MAY-2010 23:37:18 18-MAY-2010 23:38:12
      1314                       18-MAY-2010 23:37:18 18-MAY-2010 23:38:25
      1314                       18-MAY-2010 23:37:18 18-MAY-2010 23:38:25
      1314                       18-MAY-2010 23:37:18 18-MAY-2010 23:38:25

Elapsed: 00:01:07.82

---------------------------------------------------------------------------------------------------------
|Id| Operation                            | Name                  | Rows | Bytes | Cost(%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
| 0| SELECT STATEMENT                     |                       | 8168 |  2073K|   24   (0)| 00:00:01 |
| 1|  PX COORDINATOR                      |                       |      |       |           |          |
| 2|   PX SEND QC (RANDOM)                | :TQ10000              | 8168 |  2073K|   24   (0)| 00:00:01 |
| 3|    VIEW                              |                       | 8168 |  2073K|   24   (0)| 00:00:01 |
| 4|     COLLECTION ITERATOR PICKLER FETCH| RETRIEVE_PRIVATE_DATA |      |       |           |          |
| 5|      PX BLOCK ITERATOR               |                       | 5000 | 65000 |    2   (0)| 00:00:01 |
| 6|       TABLE ACCESS FULL              | ID_LIST               | 5000 | 65000 |    2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------
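One way to confirm that parallelism really did kick in (recall the earlier warning about needing enough rows) – my own addition, not part of the original test – is to check the session’s parallel query statistics straight after the run:

SELECT statistic, last_query
FROM   v$pq_sesstat
WHERE  statistic IN ('Queries Parallelized', 'Allocation Height');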

The mind boggles at the possibilities 🙂

Unhandled PL/SQL Error in SQL*Plus

Whilst deploying a data fix (written by someone else) yesterday, the SQL script that I was running hit an error midway through a PL/SQL anonymous block. A simplification of the script is shown below:

  1. Create table A
  2. Insert 2 records into table A
  3. Commit
  4. Run anonymous block:
    1. Cursor from table A to update table B, C & D
    2. Insert into E
    3. Commit
  5. Commit
  6. Drop table A

There are a few things wrong with how it was written, but that’s not the point I’m interested in and might not be obvious without the actual code.

When running the script, the insert into E failed and I was left with output showing that table A was created successfully and populated, the PL/SQL block failed, the commit outside the PL/SQL block was successful and table A was dropped…

The question for me was what had happened to the updates to B, C and D. There was no exception handler in the anonymous block, so I was confident that the commit in the block had never been reached; but would the commit outside of the block have committed the updates to B, C and D for the first record of the cursor? After a quick chat with a colleague we concluded that the updates within the block would have been neither committed nor rolled back by the PL/SQL, and then committed by the commit immediately after the block… It was getting late and I had plenty of other things to do, so I sent the output to the developer so he could provide a new script.

This morning I got a call from the developer and he explained that he’d forgotten to comment out the insert into E, as you do! He went on to ask if I’d rolled back the change, as neither of the two sets of updates had happened (there were 2 records created in table A that formed the cursor).

This pointed to my colleague and me having got it wrong, so it was time to prepare a quick test, and that’s what follows…

I created a table with 10 rows as shown.

create table a(a number);
insert into a select rownum from all_objects
 where rownum < 11;
commit;

Next I created a script “pl_test_01.sql”:

-- Start of pl_test_01.sql --
update a set a = a + 10 where a = 1;
begin
update a set a = a + 10 where a = 2;
update a set a = a + 10 where a = 3;
insert into b (b) values (100);
commit;
end;
/
update a set a = a + 10 where a = 4;
commit;
-- End of pl_test_01.sql --

* Note that table B did not exist.

I then ran the following commands:

SQL> select * from a;

         A
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10

10 rows selected.

SQL> @pl_test_01.sql

1 row updated.

insert into b (b) values (100);
            *
ERROR at line 4:
ORA-06550: line 4, column 13:
PL/SQL: ORA-00942: table or view does not exist
ORA-06550: line 4, column 1:
PL/SQL: SQL Statement ignored

1 row updated.

Commit complete.

SQL> select * from a;

         A
----------
        11
         2
         3
        14
         5
         6
         7
         8
         9
        10

10 rows selected.

SQL>

“OK,” I thought, “well that explains what the developer reported.” But what was initiating the rollback? This was undermining what the two of us believed about how PL/SQL handles transactions.

After a bit of playing around with commits in different places and considering various options, including the type of error I was generating, I ran “pl_test_05.sql”:

-- Start of pl_test_05.sql --
update a set a = a + 10 where a = 1;
declare
l_var number := 0;
begin
update a set a = a + 10 where a = 2;
update a set a = a + 10 where a = 3;
insert into a (a) values (100/l_var);
exception
   when others then
   null;
end;
/
update a set a = a + 10 where a = 4;
commit;
-- End of pl_test_05.sql --

“When others then null” – Tom Kyte would have a fit! I was working on the basis that this was a test of functionality and was therefore acceptable.

SQL> select * from a;

         A
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10

10 rows selected.

SQL> @pl_test_05.sql

1 row updated.

PL/SQL procedure successfully completed.

1 row updated.

Commit complete.

SQL> select * from a;

         A
----------
        11
        12
        13
        14
         5
         6
         7
         8
         9
        10

10 rows selected.

SQL>

Bingo! I’d managed to create the opposite behaviour of what I’d initially seen: SQL*Plus was rolling back after the unhandled error.
After a bit of reading around I came across the following on page 23 of “Oracle PL/SQL for DBAs” (Arup Nanda and Steven Feuerstein):

That environment (a tool like SQL*Plus, Oracle Forms, or a Java program) then takes an action appropriate to the situation; in the case of SQL*Plus, a ROLLBACK of any DML changes from within the top-level block’s logic is automatically performed.

I’ve ended another day feeling like I’ve learnt something new.

Added 11/03/2009

… Or maybe not! A colleague pointed me towards chapter 4 – Transactions – of Expert One-on-One Oracle by Tom Kyte, where there is a different explanation. According to Tom, this isn’t a case of SQL*Plus rolling back the transaction, but actually the result of:

It is interesting to note that Oracle considers PL/SQL anonymous blocks to be statements as well.

In other words, because the anonymous block is itself a single statement, statement-level atomicity rolls back all of the DML inside the block when the unhandled exception propagates out of it – no client-side ROLLBACK is required.