Optimiser Statistics and Lower Case Columns
I was contacted recently about a performance problem in an application we support. The query summarises the number of issues that have not been closed. A simplified version is shown below:
SELECT COUNT(*)
FROM "issues"
WHERE "state" IN ('open','new');
There are two things of note here. Firstly your eyes do not deceive you – those are lower case table and column names (more on this later). Secondly the “state” column is indexed yet the query is doing a full table scan on the 400,000 row “issues” table.
The reason for the full scan is skewed “state” data. There are a handful of new and open issues but a heck of a lot of closed ones (and no possibility of setting them all to NULL before you ask). So the index looks unattractive to the optimiser:
400,000 rows / 3 distinct values = rubbish index access
The obvious thing to do is put a histogram on the “state” column. Things did not go quite to plan. My demonstration of this is below (on a 9.2.0.8 database – the same as the application in question).
1) Create a test table with a mixture of upper and lower case columns, notice we have to wrap table and column names in double quotes (“”) to avoid a conversion to upper case.
create table "tab1"(col1 number,"col2" number,"col3" number); begin for i in 1..10 loop insert into "tab1" values (i,i,i); end loop; commit; end; /
2) DBMS_STATS also performs upper case conversions
exec dbms_stats.gather_table_stats(null,'tab1'); column column_name format a10 select tc.table_name,tc.column_name, tc.nullable nulls , tc.num_distinct, tc.num_buckets buckets from user_tab_columns tc where tc.table_name = 'tab1' order by 1,2; TABLE_NAME COLUMN_NAM N NUM_DISTINCT BUCKETS ------------ ---------- - ------------ ---------- tab1 COL1 Y tab1 col2 Y tab1 col3 Y
And again with double quotes (“”) to prevent case conversion
exec dbms_stats.gather_table_stats(null,'"tab1"'); column column_name format a10 select tc.table_name,tc.column_name, tc.nullable nulls , tc.num_distinct, tc.num_buckets buckets from user_tab_columns tc where tc.table_name = 'tab1' order by 1,2; TABLE_NAME COLUMN_NAM N NUM_DISTINCT BUCKETS ------------ ---------- - ------------ ---------- tab1 COL1 Y 10 1 tab1 col2 Y 10 1 tab1 col3 Y 10 1
So far everything works as expected. Now for the histogram.
3) Create histograms on lower and upper case columns
exec dbms_stats.delete_table_stats(null,'"tab1"');
exec dbms_stats.gather_table_stats(null,'"tab1"'-
,method_opt=>'for columns col1 size 254');
exec dbms_stats.gather_table_stats(null,'"tab1"'-
,method_opt=>'for columns "col2" size 254');
column column_name format a10
select tc.table_name,tc.column_name, tc.nullable nulls
, tc.num_distinct, tc.num_buckets buckets
from user_tab_columns tc
where tc.table_name = 'tab1'
order by 1,2;
TABLE_NAME COLUMN_NAM N NUM_DISTINCT BUCKETS
------------ ---------- - ------------ ----------
tab1 COL1 Y 10 9
tab1 col2 Y
tab1 col3 Y
That wasn’t expected, there’s no histogram on “col2″. After trying an assortment of “special” characters and seeing the following message far too frequently…
ERROR at line 1: ORA-06502: PL/SQL: numeric or value error ORA-06512: at "SYS.DBMS_STATS", line 10502 ORA-06512: at "SYS.DBMS_STATS", line 10516 ORA-06512: at line 1
…I did what I should have done originally and searched Metalink:
Bug 4892211 – DBMS_STATS does not recognise lower case / special characters in column names
- This issue is fixed in*
- 10.2.0.3 (Server Patch Set) <javascript:taghelp(‘FIXED_A203′)>
- 11.1.0.6 (Base Release) <javascript:taghelp(‘FIXED_B106′)>
4) At first I thought a workaround would be to add a histogram to all columns on the table and then remove the ones I don’t want. However I’m faced with the same problem – I can’t specify “col3″ to remove that histogram:
exec dbms_stats.gather_table_stats(null,'"tab1"'-
,method_opt=>'for all columns size 254');
TABLE_NAME COLUMN_NAM N NUM_DISTINCT BUCKETS
------------ ---------- - ------------ ----------
tab1 COL1 Y 10 9
tab1 col2 Y 10 9
tab1 col3 Y 10 9
exec dbms_stats.gather_table_stats(null,'"tab1"'-
,method_opt=>'for columns "col3" size 1');
TABLE_NAME COLUMN_NAM N NUM_DISTINCT BUCKETS
------------ ---------- - ------------ ----------
tab1 COL1 Y 10 1
tab1 col2 Y 10 9
tab1 col3 Y 10 9
So in my case the best I could do was put the histogram on all indexed columns and use “method_opt=>’for all columns size repeat’”for future optimiser statistic collections to keep the histogram in place.
leave a comment