Optimiser Statistics and Lower Case Columns

I was contacted recently about a performance problem in an application we support. The query summarises the number of issues that have not been closed. A simplified version is shown below:

SELECT  COUNT(*)
FROM    "issues"
WHERE   "state" IN ('open','new');

There are two things of note here. Firstly your eyes do not deceive you – those are lower case table and column names (more on this later). Secondly the “state” column is indexed yet the query is doing a full table scan on the 400,000 row “issues” table.

The reason for the full scan is skewed “state” data. There are a handful of new and open issues but a heck of a lot of closed ones (and no possibility of setting them all to NULL before you ask). So the index looks unattractive to the optimiser:

400,000 rows / 3 distinct values = rubbish index access

The obvious thing to do is put a histogram on the “state” column. Things did not go quite to plan. My demonstration of this is below (on a 9.2.0.8 database – the same as the application in question).

1) Create a test table with a mixture of upper and lower case columns, notice we have to wrap table and column names in double quotes (“”) to avoid a conversion to upper case.

create table "tab1"(col1 number,"col2" number,"col3" number);

begin
	for i in 1..10 loop
		insert into "tab1" values (i,i,i);
	end loop;
	commit;
end;
/

2) DBMS_STATS also performs upper case conversions

exec dbms_stats.gather_table_stats(null,'tab1');

column column_name format a10
select  tc.table_name,tc.column_name, tc.nullable nulls
,       tc.num_distinct, tc.num_buckets buckets
from    user_tab_columns tc
where tc.table_name = 'tab1'
order by 1,2;

TABLE_NAME   COLUMN_NAM N NUM_DISTINCT    BUCKETS
------------ ---------- - ------------ ----------
tab1         COL1       Y
tab1         col2       Y
tab1         col3       Y

And again with double quotes (“”) to prevent case conversion

exec dbms_stats.gather_table_stats(null,'"tab1"');

column column_name format a10
select  tc.table_name,tc.column_name, tc.nullable nulls
,       tc.num_distinct, tc.num_buckets buckets
from    user_tab_columns tc
where tc.table_name = 'tab1'
order by 1,2;

TABLE_NAME   COLUMN_NAM N NUM_DISTINCT    BUCKETS
------------ ---------- - ------------ ----------
tab1         COL1       Y           10          1
tab1         col2       Y           10          1
tab1         col3       Y           10          1

So far everything works as expected. Now for the histogram.

3) Create histograms on lower and upper case columns

exec dbms_stats.delete_table_stats(null,'"tab1"');
exec dbms_stats.gather_table_stats(null,'"tab1"'-
               ,method_opt=>'for columns col1 size 254');
exec dbms_stats.gather_table_stats(null,'"tab1"'-
               ,method_opt=>'for columns "col2" size 254');

column column_name format a10
select  tc.table_name,tc.column_name, tc.nullable nulls
,       tc.num_distinct, tc.num_buckets buckets
from    user_tab_columns tc
where   tc.table_name = 'tab1'
order by 1,2;

TABLE_NAME   COLUMN_NAM N NUM_DISTINCT    BUCKETS
------------ ---------- - ------------ ----------
tab1         COL1       Y           10          9
tab1         col2       Y
tab1         col3       Y

That wasn’t expected, there’s no histogram on “col2”. After trying an assortment of “special” characters and seeing the following message far too frequently…

ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at "SYS.DBMS_STATS", line 10502
ORA-06512: at "SYS.DBMS_STATS", line 10516
ORA-06512: at line 1

…I did what I should have done originally and searched Metalink:

Bug 4892211 – DBMS_STATS does not recognise lower case / special characters in column names

  • This issue is fixed in*
  • 10.2.0.3 (Server Patch Set) <javascript:taghelp(‘FIXED_A203’)>
  • 11.1.0.6 (Base Release) <javascript:taghelp(‘FIXED_B106’)>

4) At first I thought a workaround would be to add a histogram to all columns on the table and then remove the ones I don’t want. However I’m faced with the same problem – I can’t specify “col3” to remove that histogram:

exec dbms_stats.gather_table_stats(null,'"tab1"'-
               ,method_opt=>'for all columns size 254');

TABLE_NAME   COLUMN_NAM N NUM_DISTINCT    BUCKETS
------------ ---------- - ------------ ----------
tab1         COL1       Y           10          9
tab1         col2       Y           10          9
tab1         col3       Y           10          9

exec dbms_stats.gather_table_stats(null,'"tab1"'-
               ,method_opt=>'for columns "col3" size 1');

TABLE_NAME   COLUMN_NAM N NUM_DISTINCT    BUCKETS
------------ ---------- - ------------ ----------
tab1         COL1       Y           10          1
tab1         col2       Y           10          9
tab1         col3       Y           10          9

So in my case the best I could do was put the histogram on all indexed columns and use “method_opt=>’for all columns size repeat'”for future optimiser statistic collections to keep the histogram in place.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s