Calculating Statistics For Function-Based Index Virtual Hidden Columns
I read a very interesting blog post the other day by Richard Foote entitled “Function Based Indexes and Missing Statistics - No Suprises.“ The post explains how in Oracle 10g that normal statistics are generated for function-based indexes but not on the underlying virtual hidden table columns that are created when a function-based index is added.
If Oracle knows the low value, the high value and the number of distinct values of the data associated with the function on a column, it can then accurately determine the associated selectivity and cardinality when the function is used on the column (assuming even distribution of data) and hence calculate an accurate cost and determine whether the use of the function-based index is appropriate.
However, and here comes the trap, when a function-based index is created, Oracle will now (since 10g) automatically calculate the statistics associated with the index (such as the blevel, number of leaf blocks, clustering factor, etc.) but it will NOT calculate the statistics associated with the hidden virtual column as these statistics are associated with the parent table, not directly with the index itself.
…
Therefore, even after you’ve created the function-based index, Oracle still has no idea on the selectivity associated the function because the necessary virtual column statistics are still missing. Oracle simply takes a “guess” and as with most guesses, it’s quite likely to be wrong which means the selectivity is likely to be wrong which means the costings is likely to be wrong which means the execution plan could very well be wrong as well.
After explaining the problem he goes on to provide the method by which to correct the issue. I’ve created a simple script that will look at all of our tables with hidden virtual columns and create an analyze script to get these statistics for our environment. To use elsewhere one would need to remove the reference to emtclientinfo.emtclients.
[sourcecode language='sql']set pagesize 0
set linesize 200
set echo off
set feedback off
set timing off
set serveroutput off
spool analyze_hidden_virtual_columns.sql
select ’set echo on’ from dual;
select ’set feedback on’ from dual;
select ’set timing on’ from dual;
select ’set serveroutput on’ from dual;
select ’spool analyze_results.rpt’ from dual;
select ‘exec dbms_stats.gather_table_stats(ownname=>”’||owner||”’,tabname=>”’||table_name||”’,estimate_percent=>null,cascade=>true,method_opt=>”FOR ALL HIDDEN COLUMNS SIZE 1”);’
from dba_tab_cols
where lower(owner) in (select distinct(name) from emtclientinfo.emtclients)
and (hidden_column=’YES’ or virtual_column=’YES’)
group by owner, table_name
order by owner, table_name;
select ’spool off’ from dual;
spool off
[/sourcecode]