Actually it’s probably not the NOT IN that’s nasty, it’s the thing you get if you don’t use NOT IN that’s more likely to be nasty. Just another odd little quirk of the optimizer, which I’ll demonstrate with a simple example (running under 11.2.0.3 in this case):
create table t1
as
select  *
from    all_objects
where   rownum <= 10000
;

update t1 set object_type = null where object_type = 'TABLE';
commit;

begin
        dbms_stats.gather_table_stats(
                ownname    => user,
                tabname    => 'T1',
                method_opt => 'for all columns size 1'
        );
end;
/

select  sample_size, num_nulls, num_distinct, histogram
from    user_tab_columns
where   table_name  = 'T1'
and     column_name = 'OBJECT_TYPE'
;

This code will give you 10,000 rows, but the number of distinct values for object_type and the number of nulls will depend on the version and options installed; you should, however, get about 15 distinct values for object_type and about 1,000 nulls in that column.
If you want to run a query to count the rows where the object_type is not one of ‘INDEX’, ‘SYNONYM’, or ‘VIEW’, there are two obvious ways of writing it, so let’s put them into a single query with a UNION ALL, and see what Oracle predicts as the cardinality (cut-n-pasted from an SQL*Plus session):
SQL> set autotrace traceonly explain
SQL> select
  2          count(*)
  3  from    t1
  4  where   object_type != 'INDEX'
  5  and     object_type != 'SYNONYM'
  6  and     object_type != 'VIEW'
  7  union all
  8  select
  9          count(*)
 10  from    t1
 11  where   object_type not in ('INDEX', 'SYNONYM', 'VIEW')
 12  ;

Execution Plan
----------------------------------------------------------
Plan hash value: 575959041

-----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     2 |    12 |    38  (53)| 00:00:01 |
|   1 |  UNION-ALL          |      |       |       |            |          |
|   2 |   SORT AGGREGATE    |      |     1 |     6 |            |          |
|*  3 |    TABLE ACCESS FULL| T1   |  5903 | 35418 |    19   (6)| 00:00:01 |
|   4 |   SORT AGGREGATE    |      |     1 |     6 |            |          |
|*  5 |    TABLE ACCESS FULL| T1   |  7308 | 43848 |    19   (6)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("OBJECT_TYPE"<>'INDEX' AND "OBJECT_TYPE"<>'SYNONYM' AND
              "OBJECT_TYPE"<>'VIEW')
   5 - filter("OBJECT_TYPE"<>'INDEX' AND "OBJECT_TYPE"<>'SYNONYM' AND
              "OBJECT_TYPE"<>'VIEW')

Look at operations 3 and 5: the way you write the query makes a significant difference to the cardinality predicted by the optimizer – despite the fact that the NOT IN predicate is transformed internally so that the predicate sections of the two parts of the UNION ALL are exactly the same.
If you’re wondering, the NOT IN option is the one that does the rational arithmetic – it has calculated the number of rows where object_type is not null, then allowed for filtering out 3/15ths of those rows because the predicate rejects 3 of the 15 distinct values reported by the statistics. The chain of inequalities applies arithmetic that, in effect, allows for the nulls three times – once per inequality – so the more values you reject, the worse the estimate gets.
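You can sketch the two calculations with a few lines of arithmetic (Python here, just as a calculator). The statistics are the round figures from the test case – 10,000 rows, roughly 1,000 nulls, 15 distinct values – so the results will be close to, but not exactly, the 7,308 and 5,903 in the plan above. The NOT IN line follows the description in the previous paragraph; the inequality line is only an illustrative model of “allowing for the nulls once per predicate”, not the optimizer’s published formula:

```python
# Assumed round-figure statistics (your user_tab_columns values will differ)
num_rows     = 10000
num_nulls    = 1000
num_distinct = 15
rejected     = 3      # 'INDEX', 'SYNONYM', 'VIEW'

# NOT IN -- the "rational" arithmetic: start from the non-null rows,
# then filter out rejected/num_distinct of them.
not_in_est = (num_rows - num_nulls) * (num_distinct - rejected) / num_distinct
print(round(not_in_est))    # 7200 with these figures (plan showed 7308)

# Chain of inequalities -- illustrative model: each "!=" predicate
# independently allows for the null fraction again, so the nulls are
# (in effect) counted once per predicate, three times here.
sel      = 1 - 1/num_distinct - num_nulls/num_rows   # per-predicate selectivity
ineq_est = num_rows * sel ** rejected
print(round(ineq_est))      # 5787 with these figures -- and it keeps
                            # shrinking with every extra inequality
```

Note how the gap between the two estimates widens as either the null fraction or the number of rejected values grows, which is exactly the “worse the estimate gets” behaviour described above.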
It’s quite likely that you won’t notice the effect in many cases – but if you run queries against a column with a large percentage of nulls, the differences can be very large, and the knock-on effects of this error can be dramatic.