MULTISET experiments

Recently I was asked to make some investigation about the PL/SQL procedure running slowly. PL/SQL procedure does not contain any SQL or complicate math operation but for some reasons have unexpected delays in processing. After small investigation I localize the problematic statement as containing the join of two nested tables – “multiset union”.

I was not able to explain it and crate test case to get to the bottom of this issue.

CREATE OR REPLACE PACKAGE epani.multicast_pkg
  IS
     TYPE t_number_nt IS TABLE OF NUMBER; -- test nested table type
       v_new_number  t_number_nt; --second nested table
       v_number t_number_nt; --first nested table
     PROCEDURE init (num_init1 number
                                  , num_init2 number); -- procedure that populate first nested table
     PROCEDURE add_number_old; -- procedure that join nested table in old fashion
     PROCEDURE add_number_new; -- procedure that join nested tables with MULTISET
END;
/
CREATE OR REPLACE PACKAGE BODY epani.multicast_pkg
 IS
    PROCEDURE  init (num_init1 number  -- number of elements in the first table
                                  , num_init2 number) -- number of elements in the second table
    is
    begin
        SELECT num BULK COLLECT INTO v_number   FROM (
                              SELECT  LEVEL num  FROM DUAL CONNECT BY LEVEL <= num_init1);
        SELECT num BULK COLLECT INTO v_new_number  FROM (
                              SELECT  LEVEL num  FROM DUAL CONNECT BY LEVEL <= num_init2);
    end;

    PROCEDURE  add_number_old 
     IS
     BEGIN
          v_number.EXTEND(v_new_number.COUNT); -- allocate nested table 
        FOR i IN v_new_number.FIRST .. v_new_number.LAST
        LOOP
          v_number(v_number.LAST) := v_new_number(i);
        END LOOP;
     END add_number_old;

    PROCEDURE  add_number_new
     IS
     BEGIN
       v_number:=v_number multiset union v_new_number;
     END add_number_new;
END;
/

I have prepared few test cases
First : We join the small table to the big one

---We initialise the first nested table by significant number of rows and second by small number
SQL> exec EPANI.multicast_pkg.init(356000,5);
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.33
-- Run the old fashion join procedure
SQL> exec EPANI.multicast_pkg.add_number_old;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.01
-- the procedure does not really takes so long
SQL> exec EPANI.multicast_pkg.add_number_new;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.41
-- the procedure takes almost ½ of time to initially generate the table

Second: we join two relatively small tables but do it repeatedly 1000 times to get the meaningful timing

---We initialise the nested tables by small number of rows
SQL> exec EPANI.multicast_pkg.init(1000,100);
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.05
-- We run pl/sql block using old fashion method
SQL> begin
  for i in 1..1000 LOOP
    EPANI.multicast_pkg.add_number_old;
  END LOOP;
end;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.15
-- We run pl/sql block using new method
SQL> begin
  for i in 1..1000 LOOP
     EPANI.multicast_pkg.add_number_new;
  END LOOP;
end;
/
PL/SQL procedure successfully completed.
Elapsed: 00:04:29.75

Third: We join two big tables

---We initialise the nested tables by significant number of rows
SQL> exec EPANI.multicast_pkg.init(100000, 100000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.97
-- Run the old fashion join procedure
SQL> exec EPANI.multicast_pkg.add_number_old;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.15
-- the procedure does not  takes measurable time
---We reinitialise the nested tables by significant number of rows
SQL> exec EPANI.multicast_pkg.init(100000, 100000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:03.11
SQL> exec EPANI.multicast_pkg.add_number_new;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.28
-- the procedure takes almost extra 50% above the old method

Forth: We join big tables to the small one

---We initialise the nested tables by significant number of rows
SQL> exec EPANI.multicast_pkg.init(5, 356000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.08
-- Run the old fashion join procedure
SQL> exec EPANI.multicast_pkg.add_number_old;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.62
-- the procedure does takes measurable time
---We reinitialise the nested tables in the same fashion
SQL> exec EPANI.multicast_pkg.init(5, 356000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.27
SQL> exec EPANI.multicast_pkg.add_number_new;
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.07
-- the procedure takes almost extra 50% above the old method

We have proved that for all cases we got better performance on old fashion method in all cases, but why. What stands behind this degradation? The answer can be found if we rewrite our procedures in slightly different way.

CREATE OR REPLACE PACKAGE BODY epani.multicast_pkg
 IS
    PROCEDURE  init (num_init1 number  -- number of elements in the first table
                                  , num_init2 number) -- number of elements in the second table
    is
    begin
        SELECT num BULK COLLECT INTO v_number   FROM (
                              SELECT  LEVEL num  FROM DUAL CONNECT BY LEVEL <= num_init1);
        SELECT num BULK COLLECT INTO v_new_number  FROM (
                              SELECT  LEVEL num  FROM DUAL CONNECT BY LEVEL <= num_init2);
    end;

    PROCEDURE  add_number_old 
     IS
	v_result t_number_nt:= t_number_nt(); --resulting nested table
     BEGIN
        v_result.EXTEND(v_number.COUNT); -- allocate nested table 
        FOR i IN v_number.FIRST .. v_number.LAST
        LOOP
          v_result(v_number.LAST) := v_number(i);
        END LOOP;
        v_result.EXTEND(v_new_number.COUNT); -- allocate nested table 
        FOR i IN v_new_number.FIRST .. v_new_number.LAST
        LOOP
          v_result(v_number.LAST) := v_new_number(i);
        END LOOP;
     END add_number_old;

    PROCEDURE  add_number_new
     IS
	v_result t_number_nt:= t_number_nt(); --resulting nested table
     BEGIN
       v_result:=v_number multiset union v_new_number;
     END add_number_new;
END;
/

And repeat our tests

First : We join the small table to the big one

---We initialise the first nested table by significant number of rows and second by small number
SQL> exec EPANI.multicast_pkg.init(356000,5);
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.16
-- Run the old fashion join procedure
SQL> exec EPANI.multicast_pkg.add_number_old;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.93
-- the procedure does takes significantly longer
SQL> exec EPANI.multicast_pkg.add_number_new;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.64
-- faster by new method

Second: we join two relatively small tables but do it repeatedly 1000 times to get the meaningful timing

---We initialise the nested tables by small number of rows
SQL> exec EPANI.multicast_pkg.init(1000,100);
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.04
-- We run pl/sql block using old fashion method
SQL> begin
  for i in 1..1000 LOOP
    EPANI.multicast_pkg.add_number_old;
  END LOOP;
end;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.30
-- We run pl/sql block using new method
SQL> begin
  for i in 1..1000 LOOP
     EPANI.multicast_pkg.add_number_new;
  END LOOP;
end;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.96
-- Faster by new method

Third: We join two big tables

---We initialise the nested tables by significant number of rows
SQL> exec EPANI.multicast_pkg.init(100000, 100000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.56
-- Run the old fashion join procedure
SQL> exec EPANI.multicast_pkg.add_number_old;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.48
-- the procedure does not  takes measurable time
---We reinitialise the nested tables by significant number of rows
SQL> exec EPANI.multicast_pkg.init(100000, 100000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.08
SQL> exec EPANI.multicast_pkg.add_number_new;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.40
-- slightly faster by new method

Forth: We join big tables to the small one

---We initialise the nested tables by significant number of rows
SQL> exec EPANI.multicast_pkg.init(5, 356000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.31
-- Run the old fashion join procedure
SQL> exec EPANI.multicast_pkg.add_number_old;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.58
-- the procedure does takes measurable time
---We reinitialise the nested tables in the same fashion
SQL> exec EPANI.multicast_pkg.init(5, 356000);
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.38
SQL> exec EPANI.multicast_pkg.add_number_new;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.50
-- slightly faster by new method

It looks like that for new set of test MULTICAST shows better performance than the handmade one. The answer is lighting in a scope of functionality. When we add one collection to the already existing the old one works perfectly but when we create the new collection as join of two old one and later assign it to the one of them the MULTICAST shows better efficiency. Multicast is able to resolve the vide scope of tasks. The developers should clearly see the difference between operating two collections or three collections and choose the appropriate method based on application requirements.

About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s