METHOD_OPT of NULL

This is the start of a series of MEEK (maybe everyone else knows) posts as I seem to keep running into things about Oracle that I feel I should have known.

Until very recently I did not know what using METHOD_OPT => NULL in a DBMS_STATS call does. If you do then stop reading here šŸ™‚

Creation of a test table…

MNASH@orcl> create table t as select * from dba_objects;

Table created.

MNASH@orcl> select count(*) from t;

  COUNT(*)
----------
     72563

MNASH@orcl> select table_name, num_rows from user_tables where table_name = 'T';

TABLE_NAME   NUM_ROWS
---------- ----------
T

MNASH@orcl> select table_name, column_name, num_distinct from user_tab_columns where table_name = 'T';

TABLE_NAME COLUMN_NAME                    NUM_DISTINCT
---------- ------------------------------ ------------
T          OWNER
T          OBJECT_NAME
T          SUBOBJECT_NAME
T          OBJECT_ID
T          DATA_OBJECT_ID
T          OBJECT_TYPE
T          CREATED
T          LAST_DDL_TIME
T          TIMESTAMP
T          STATUS
T          TEMPORARY
T          GENERATED
T          SECONDARY
T          NAMESPACE
T          EDITION_NAME

15 rows selected.

MNASH@orcl>

Gathering statistics using DBMS_STATS.GATHER_TABLE_STATS with METHOD_OPT => NULL

MNASH@orcl> exec dbms_stats.gather_table_stats(null,'T',method_opt=>null)

PL/SQL procedure successfully completed.

MNASH@orcl> select table_name, num_rows from user_tables where table_name = 'T';

TABLE_NAME   NUM_ROWS
---------- ----------
T               72563

MNASH@orcl> select table_name, column_name, num_distinct from user_tab_columns where table_name = 'T';

TABLE_NAME COLUMN_NAME                    NUM_DISTINCT
---------- ------------------------------ ------------
T          OWNER
T          OBJECT_NAME
T          SUBOBJECT_NAME
T          OBJECT_ID
T          DATA_OBJECT_ID
T          OBJECT_TYPE
T          CREATED
T          LAST_DDL_TIME
T          TIMESTAMP
T          STATUS
T          TEMPORARY
T          GENERATED
T          SECONDARY
T          NAMESPACE
T          EDITION_NAME

15 rows selected.

MNASH@orcl>

Statistics are gathered for the table, but not for the columns of the table.

I found out about this through a friend after puzzling over how some of the tables in a database had statistics at a table level, but none for the columns.

Just for completeness, the following shows what happens when METHOD_OPT is allowed to default:

MNASH@orcl> exec dbms_stats.gather_table_stats(null,'T')

PL/SQL procedure successfully completed.

MNASH@orcl> select table_name, num_rows from user_tables where table_name = 'T';

TABLE_NAME   NUM_ROWS
---------- ----------
T               72563

MNASH@orcl> select table_name, column_name, num_distinct from user_tab_columns where table_name = 'T';

TABLE_NAME COLUMN_NAME                    NUM_DISTINCT
---------- ------------------------------ ------------
T          OWNER                                    32
T          OBJECT_NAME                           43784
T          SUBOBJECT_NAME                          108
T          OBJECT_ID                             72563
T          DATA_OBJECT_ID                         7705
T          OBJECT_TYPE                              44
T          CREATED                                1174
T          LAST_DDL_TIME                          1272
T          TIMESTAMP                              1318
T          STATUS                                    1
T          TEMPORARY                                 2
T          GENERATED                                 2
T          SECONDARY                                 2
T          NAMESPACE                                21
T          EDITION_NAME                              0

15 rows selected.

MNASH@orcl>

Whether or not you want to use the default, which is to allow Oracle to create histograms as it sees appropriate, is a separate point which has received plenty of attention elsewhere.

In the case I was looking into there was actually a bug in the code used as a wrapper to DBMS_STATS and it was not intended that no column statistics were gathered, but I started to wonder if there would be a situation where using this undocumented option would be attractive. I did some fairly simple testing to establish the overhead of collecting column statistics…

The test was:

1 – Increase the number of rows in table T to around 1 million
2 – Delete table statistics for table T
3 – Flush the buffer cache
4 – Gather table statistics using compute with method_opt => null
5 – Repeat (2) to (4) 5 times
6 – Delete table statistics for table T
7 – Flush the buffer cache
8 – Gather table statistics using compute with method_opt => ‘for all columns size 1’
9 – Repeat (6) to (8) 5 times

Looking at the elapsed time for the statistics gathering I got averages of:

null: 2.3 seconds
for all columns size 1: 14.5 seconds

That’s a pretty big overhead, but a lot of information you are robbing the CBO of if you don’t gather them!

I then ran the same test, but with AUTO_SAMPLE_SIZE as this is a more realistic case and got averages of:

null: 1.5 seconds
for all columns size 1: 3.4 seconds

Note: The tests were performed on a 11.2.0.1 database.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s