Friday, September 28, 2012

Row sizes

Row sizes:
A recent question on Oracle-L (a performance problem creating indexes that turned into an issue with migrated rows) prompted me to revisit a piece of code I first wrote about 20 years ago for an Oracle 6 system, and first published in November 2000. It addresses the question of analysing row-lengths in detail and, if you look at the example I’ve linked to, you’ll see that this could be very useful when you’re trying to work out suitable settings for pctfree and why you’re suffering from row migration.
The script is strictly limited to “simple” tables, by which I mean heap tables with columns that are basic data types and don’t include such things as Longs, LOBs, nested tables and all the other nasty things that usually break simple utilities. All it does is estimate the length of each row, then aggregating by row length. The estimate doesn’t allow for nulls at the ends of rows or columns longer than 254 bytes – technically the former don’t need length bytes and the latter use 3 bytes for the length when the column is more than 250 bytes. I don’t use the script often, and haven’t got around to including all the column types it won’t work for – that’s left as an exercise for the reader.
In the original example I just listed a script using the vsize() function being applied to a specific table; in this updated code I’ve switched to the sys_op_opnsize() function to measure the internal length of the column, and I’ve written a pl/sql block that generates the SQL from a table name (using the current schema) and then runs the code to produce the result:

define m_table='&1'

declare
m_string varchar2(32000) := 'select 5'; -- 3 if you want to ignore the row directory
m_cursor sys_refcursor;

m_length number;
m_ct number;
m_space number;
m_total number;
m_ratio number;

begin
for r in (
select
column_name, data_type
from user_tab_columns
where table_name = upper('&m_table')
and data_type not in (
'LONG',
'LONG RAW',
'CLOB',
'NCLOB',
'BLOB' -- etc. etc.
)
) loop -- add the lengths of each column
m_string := m_string ||
' + 1 + nvl(sys_op_opnsize(' ||
r.column_name ||
'),1)' ||
chr(10)
;

end loop;

--
-- Use the preceding SQL in an inline view
--

m_string :=
'select row_space, count(*) row_ct, row_space * count(*) sum_space, ' ||
'sum (row_space * count(*)) over (order by row_space) running_tot, ' ||
'round(100* ratio_to_report(row_space * count(*)) over(),2) ratio_to_rep ' ||
' from (' ||
m_string ||
' row_space from &m_table ) group by row_space order by row_space'
;

-- dbms_output.put_line(m_string);

dbms_output.put_line(
rpad('Length',8) ||
rpad('Row Count',12) ||
rpad('Space',12) ||
rpad('Running Total',14) ||
rpad('Space pct',12)
);

dbms_output.put_line(
rpad('-',7,'-') || ' ' ||
rpad('-',11,'-') || ' ' ||
rpad('-',11,'-') || ' ' ||
rpad('-',13,'-') || ' ' ||
rpad('-',11,'-')
);

open m_cursor for m_string;
loop
fetch m_cursor
into
m_length, m_ct, m_space, m_total, m_ratio
;

exit when m_cursor%notfound;

dbms_output.put_line(
lpad(to_char(m_length,'FM999,999'),7) || ' ' ||
lpad(to_char(m_ct,'FM999,999,999'),11) || ' ' ||
lpad(to_char(m_space,'FM999,999,999'),11) || ' ' ||
lpad(to_char(m_total,'FM999,999,999'),13) || ' ' ||
lpad(to_char(m_ratio,'FM999,999.00'),11)
);

end loop;

close m_cursor;

end;
/


Here’s a sample of the output, run against obj$ in a fairly standard install of 11.2.0.3 (with 58,773 rows in the table).

Length  Row Count   Space       Running Total Space pct
------- ----------- ----------- ------------- -----------
73 1 73 73 .00
74 13 962 1,035 .02
75 11 825 1,860 .01
76 18 1,368 3,228 .02
77 32 2,464 5,692 .04
78 62 4,836 10,528 .08
79 85 6,715 17,243 .11
80 242 19,360 36,603 .33
81 348 28,188 64,791 .48
82 368 30,176 94,967 .52
83 385 31,955 126,922 .55
84 418 35,112 162,034 .60
85 508 43,180 205,214 .74
86 584 50,224 255,438 .86
87 600 52,200 307,638 .89
88 808 71,104 378,742 1.22
89 967 86,063 464,805 1.47
90 1,155 103,950 568,755 1.78
91 1,186 107,926 676,681 1.85
92 1,441 132,572 809,253 2.27
93 1,456 135,408 944,661 2.32
94 1,659 155,946 1,100,607 2.67
95 1,888 179,360 1,279,967 3.07
96 2,206 211,776 1,491,743 3.63
97 2,508 243,276 1,735,019 4.17
98 2,612 255,976 1,990,995 4.38
99 2,785 275,715 2,266,710 4.72
100 2,934 293,400 2,560,110 5.02
101 3,153 318,453 2,878,563 5.45
102 3,705 377,910 3,256,473 6.47
103 3,336 343,608 3,600,081 5.88
104 3,827 398,008 3,998,089 6.82
105 15,683 1,646,715 5,644,804 28.20
106 579 61,374 5,706,178 1.05
107 429 45,903 5,752,081 .79
108 129 13,932 5,766,013 .24
109 125 13,625 5,779,638 .23
110 92 10,120 5,789,758 .17
111 114 12,654 5,802,412 .22
112 66 7,392 5,809,804 .13
113 55 6,215 5,816,019 .11
114 39 4,446 5,820,465 .08
115 26 2,990 5,823,455 .05
116 27 3,132 5,826,587 .05
117 22 2,574 5,829,161 .04
118 14 1,652 5,830,813 .03
119 8 952 5,831,765 .02
120 17 2,040 5,833,805 .03
121 7 847 5,834,652 .01
122 5 610 5,835,262 .01
123 12 1,476 5,836,738 .03
124 5 620 5,837,358 .01
125 6 750 5,838,108 .01
126 4 504 5,838,612 .01
127 2 254 5,838,866 .00
128 2 256 5,839,122 .00
129 1 129 5,839,251 .00
130 2 260 5,839,511 .00
131 1 131 5,839,642 .00

PL/SQL procedure successfully completed.



You will appreciate, of course, that if the table definition has a very large number of columns, or if the variation in column lengths is extreme, then the report could run easily run to two or three pages, so you might want a slightly less accurate, more concise report. The required code patch is fairly straightforward, though you have to count bracket carefully. All you have to do is edit the text used in the outer select after you’ve generated the inline view with all the columns in it. Here, for example, is the assignment you need to aggregate the column sizes to the nearest 10 bytes, and the results showing the impact this had on my output:

 m_string :=
'select 10 * round(row_space/10) row_space, count(*) row_ct, ' ||
'10 * round(row_space/10) * count(*) sum_space, ' ||
'sum (10 * round(row_space/10) * count(*)) over (order by 10 * round(row_space/10)) running_tot, ' ||
'round(100* ratio_to_report(10 * round(row_space/10) * count(*)) over(),2) ratio_to_rep ' ||
' from (' ||
m_string ||
' row_space from &m_table ) ' ||
'group by 10 * round(row_space/10) order by 10 * round(row_space/10)'
;

Length Row Count Space Running Total Space pct
------- ----------- ----------- ------------- -----------
70 14 980 980 .02
80 1,969 157,520 158,500 2.67
90 10,364 932,760 1,091,260 15.78
100 28,954 2,895,400 3,986,660 48.99
110 17,311 1,904,210 5,890,870 32.22
120 143 17,160 5,908,030 .29
130 18 2,340 5,910,370 .04

PL/SQL procedure successfully completed.


Don’t forget that this code is going to scan every column of every row in the table, applying a function to each column, so it’s likely to do a lot of work in a large table. If you want to reduce the workload then it would be perfectly reasonable to edit line 8 of the code above to insert a sample clause after the table name.
Once you’ve got some code like this, there’s another interesting game you can play with your table – but I’m saving that for another article.



DIGITAL JUICE

No comments:

Post a Comment

Thank's!