VizualBusiness: Histogram Generation

Friday, September 14, 2012

Histogram Generation

Histogram Generation:
This really could be published in the Errata and Addenda of “Cost Based Oracle – Fundamentals”, but it’s more convenient to publish the whole thing here and just add a link to the Errata pages.
In chapter 7, on page 156, I gave an example of the type of SQL that Oracle runs (in the dbms_stats package) to generate a histogram. A sample of the code, and the plan from the 9.2.0.8 tkprof output, is listed below:

select
 min(minbkt),maxbkt,
 substrb(dump(min(val),16,0,32),1,120) minval,
 substrb(dump(max(val),16,0,32),1,120) maxval,
 sum(rep) sumrep, sum(repsq) sumrepsq,
 max(rep) maxrep, count(*) bktndv,
 sum(case when rep=1 then 1 else 0 end) unqrep
from
 (
 select
  val,min(bkt) minbkt, max(bkt) maxbkt,
  count(val) rep, count(val) *count(val) repsq
 from (
  select
   /*+
    noparallel(t) noparallel_index(t)
    cursor_sharing_exact dynamic_sampling(0)
    no_monitoring
   */
   "LN100" val,
   ntile(200) over (order by "LN100") bkt
  from
   sys.ora_temp_1_ds_616 t
  where
   "LN100" is not null
  )
 group by val
 )
group by
 maxbkt
order by
 maxbkt
;

Rows     Row Source Operation
-------  ---------------------------------------------------
    170  SORT GROUP BY
    825   VIEW
    825    SORT GROUP BY
 200028     VIEW
 200028      WINDOW SORT
 200028       TABLE ACCESS FULL ORA_TEMP_1_DS_616

The code in this case references a table called ora_temp_1_DS_616, which is a temporary table dynamically created by the package to hold a sample of just the columns that I was interested in. I had specified a histogram of 200 buckets, and you can see the ntile(200) analytic function in the code that Oracle has used to generate this – and “window sort” in the execution plan that supports it; it’s a window sort of the entire sampled data set which could be quite large and, as we’ve seen elsewhere, could be very resource-intensive.
In a demonstratin of how important it is (and how nearly impossible it is) to keep revisiting the things you think you know, I’ve recently discovered – while constructing a demonstration of a problem in 11.2.0.3 – that the implementation had changed by 10.2.0.3 (and it may have changed even earlier, of course – but that’s the earliest version of 10g I have acces to). Here’s the generated code for the same data set, with plan, from 10.2.0.3:

select
 substrb(dump(val,16,0,32),1,120) ep, cnt
from
 (
 select
  /*+
   no_parallel(t) no_parallel_index(t) dbms_stats
   cursor_sharing_exact use_weak_name_resl
   dynamic_sampling(0) no_monitoring
  */
  "LN100" val, count(*) cnt
 from
  sys.ora_temp_1_ds_1470 t
 where
  "LN100" is not null
 group by "LN100"
 )
order by
 val

Rows     Row Source Operation
-------  ---------------------------------------------------
    824  SORT GROUP BY (cr=499 pr=0 pw=0 time=42484 us)
 199451   TABLE ACCESS FULL ORA_TEMP_1_DS_1470 (cr=499 pr=0 pw=0 time=199490 us)

As you can see it doesn’t have any functionality built into it that could possibly provide the 200 buckets that I finally want to see. You’ll also notice that the sample size is slightly different from the 9i sample size – that’s one of the threats of sampled histogram creation, of course, you could get a slightly (but dangerously) different histogram every time you call dbms_stats even if the data hasn’t changed.
Based on the differences in the query, I think we can safely assume that Oracle is using some procedural code to count its way through the results from this query (ordered by column value) to construct the histogram details that we are used to seeing. I think it’s worth noting that this demonstrates the fact that using a simpler SQL statement with some surrounding PL/SQL can sometimes be more efficient than writing a devilishly clever piece of SQL that gets you the answer directly.

DIGITAL JUICE

VizualBusiness

Friday, September 14, 2012

Histogram Generation

No comments:

Post a Comment

Digital Juice for Primary Education

About Me

Translate

Total Pageviews

Friday, September 14, 2012

Histogram Generation

No comments:

Post a Comment

Digital Juice for Primary Education

About Me

Translate

Subscribe To

Total Pageviews