This is the start of a series of MEEK (maybe everyone else knows) posts as I seem to keep running into things about Oracle that I feel I should have known.
Until very recently I did not know what using METHOD_OPT => NULL in a DBMS_STATS call does. If you do then stop reading here 🙂
Creation of a test table…
MNASH@orcl> create table t as select * from dba_objects; Table created. MNASH@orcl> select count(*) from t; COUNT(*) ---------- 72563 MNASH@orcl> select table_name, num_rows from user_tables where table_name = 'T'; TABLE_NAME NUM_ROWS ---------- ---------- T MNASH@orcl> select table_name, column_name, num_distinct from user_tab_columns where table_name = 'T'; TABLE_NAME COLUMN_NAME NUM_DISTINCT ---------- ------------------------------ ------------ T OWNER T OBJECT_NAME T SUBOBJECT_NAME T OBJECT_ID T DATA_OBJECT_ID T OBJECT_TYPE T CREATED T LAST_DDL_TIME T TIMESTAMP T STATUS T TEMPORARY T GENERATED T SECONDARY T NAMESPACE T EDITION_NAME 15 rows selected. MNASH@orcl>
Gathering statistics using DBMS_STATS.GATHER_TABLE_STATS with METHOD_OPT => NULL
MNASH@orcl> exec dbms_stats.gather_table_stats(null,'T',method_opt=>null) PL/SQL procedure successfully completed. MNASH@orcl> select table_name, num_rows from user_tables where table_name = 'T'; TABLE_NAME NUM_ROWS ---------- ---------- T 72563 MNASH@orcl> select table_name, column_name, num_distinct from user_tab_columns where table_name = 'T'; TABLE_NAME COLUMN_NAME NUM_DISTINCT ---------- ------------------------------ ------------ T OWNER T OBJECT_NAME T SUBOBJECT_NAME T OBJECT_ID T DATA_OBJECT_ID T OBJECT_TYPE T CREATED T LAST_DDL_TIME T TIMESTAMP T STATUS T TEMPORARY T GENERATED T SECONDARY T NAMESPACE T EDITION_NAME 15 rows selected. MNASH@orcl>
Statistics are gathered for the table, but not for the columns of the table.
I found out about this through a friend after puzzling over how some of the tables in a database had statistics at a table level, but none for the columns.
Just for completeness, the following shows what happens when METHOD_OPT is allowed to default:
MNASH@orcl> exec dbms_stats.gather_table_stats(null,'T') PL/SQL procedure successfully completed. MNASH@orcl> select table_name, num_rows from user_tables where table_name = 'T'; TABLE_NAME NUM_ROWS ---------- ---------- T 72563 MNASH@orcl> select table_name, column_name, num_distinct from user_tab_columns where table_name = 'T'; TABLE_NAME COLUMN_NAME NUM_DISTINCT ---------- ------------------------------ ------------ T OWNER 32 T OBJECT_NAME 43784 T SUBOBJECT_NAME 108 T OBJECT_ID 72563 T DATA_OBJECT_ID 7705 T OBJECT_TYPE 44 T CREATED 1174 T LAST_DDL_TIME 1272 T TIMESTAMP 1318 T STATUS 1 T TEMPORARY 2 T GENERATED 2 T SECONDARY 2 T NAMESPACE 21 T EDITION_NAME 0 15 rows selected. MNASH@orcl>
Whether or not you want to use the default, which is to allow Oracle to create histograms as it sees appropriate, is a separate point which has received plenty of attention elsewhere.
In the case I was looking into there was actually a bug in the code used as a wrapper to DBMS_STATS and it was not intended that no column statistics were gathered, but I started to wonder if there would be a situation where using this undocumented option would be attractive. I did some fairly simple testing to establish the overhead of collecting column statistics…
The test was:
1 – Increase the number of rows in table T to around 1 million
2 – Delete table statistics for table T
3 – Flush the buffer cache
4 – Gather table statistics using compute with method_opt => null
5 – Repeat (2) to (4) 5 times
6 – Delete table statistics for table T
7 – Flush the buffer cache
8 – Gather table statistics using compute with method_opt => ‘for all columns size 1’
9 – Repeat (6) to (8) 5 times
Looking at the elapsed time for the statistics gathering I got averages of:
null: 2.3 seconds
for all columns size 1: 14.5 seconds
That’s a pretty big overhead, but a lot of information you are robbing the CBO of if you don’t gather them!
I then ran the same test, but with AUTO_SAMPLE_SIZE as this is a more realistic case and got averages of:
null: 1.5 seconds
for all columns size 1: 3.4 seconds
Note: The tests were performed on a 11.2.0.1 database.