cancel
Showing results for 
Search instead for 
Did you mean: 

Index Fragmentation in ASE

Former Member
0 Kudos

I've been looking into index fragmentation recently not just to determine if the index needs to be rebuilt but also how efficient it is *right now* structurally speaking.  I've also been comparing to other DBMSs such as Oracle RDBMS 12, SQL Server 2012 and IQ.  Since I'm writing to a Sybase oriented group of people, I'll keep my comparisons on Sybase .. er.. SAP products. 

In IQ, I can easily pull up a histogram of useful metrics regarding the 'fragmentation', well one type of fragmentation anyways, of indexes. 

In ASE, we don't have anything that can retrieve similar metrics without walking the index tree(s) - remember with symantec partitioned tables, you can actually have more than one index tree (local indexes). 

We do have the various index metrics produced by update but they provide a single number that doesn't give us a clear picture where the (insert type of) fragmentation really resides.

White Sands Technology has ProActive DBA which can do this but it has the same issue as if I scan the index tree(s) myself.  The operation is expensive.  Damn expensive.  Sure, I can do a binary copy of the ASE server and run the analysis on another box, but by the time that is done, the window of action (maintenance window or whatever) may have passed.  Worse yet, it might be a mission critical 24x7 server with no maintenance window to speak of. 

White Sands Technology has an excellent, dated but excellent, white paper on fragmentation.  I strongly recommend reading that to understand the difference between the various types of fragmentation in ASE .. and in databases in general.

There are a myriad of reasons why you would want to find exactly where that fragmentation is hiding.

Fixing the fragmentation, if it is actually a problem, may be as simple as rebuilding the index (reorg or drop/create) or changing the order of the keys or fill factor, or some other option.

My question is thus:  Would you like to see SAP add more detail to the fragmentation metrics?  If so, would you be willing to share the reason why?

p.s.  David Cherin, can you bug someone to update that white paper to 15.7?  Pretty please?

Accepted Solutions (0)

Answers (3)

Answers (3)

corral
Explorer
0 Kudos

Jason asked:  Would you like to see SAP add more detail to the fragmentation metrics?  If so, would you be willing to share the reason why?   I am no longer working as a DBA, so I'll give my opinion supposing it was asked while I was still at the job. Call me lazy, but I think I would not have used any detailed fragmentation info. We decided which tables to reorg every weekend with coarse criteria. First, we targeted tables, not indexes. Although we reorganized some indexes instead of their tables, we did so just because reorg rebuild against the table is a minimally logged activity and sometimes we couldn't schedule minimally logged tasks; but we wished to reorg the table instead. So we ignored index fragmentation and focused only on table fragmentation. It's not a bad strategy, as the most severely fragmented indexes are probably those belonging to highly fragmented tables.  And second, we wished a quick answer, so relied on coarse statistics about the table. Entering on the details of which part of the table or index was fragmented the most and why was not a concern for us. Very often, we just compared the tables actual size with their estimate.  Regards, Mariano Corral Herranz

Former Member
0 Kudos

The more I look into ASE fragmentation, the more I wonder about the dpcr, drcr, sput, lgio metrics.  Specifically the definitions of them (see index metrics) seem a bit vague.  I'm not entirely sure how the metrics are calculated, how accurate they are after an update index stats job, where the 'fragmentation' actually resides. 

If, for example, I have an 8TB table with say a 2TB index, I want to know exactly where that fragmentation is and how that fragmentation is calculated.  I will need to justify switching to a standby system while I rebuild that index and even if that will make any real difference (note the "how that fragmentation is calculated" comes in to play here.) 

Knowing where that fragmentation is located can be critical as well.  If the fragmentation is located in part of the index that references data is accessed once a month can wait while fragmentation that is accessed heavily may warrant an out of schedule index maintenance. 

kevin_sherlock
Contributor
0 Kudos

What exactly are your doubts about dpcr, drcr, sput, and lgio?  When I was writing the original "sp__optdiag" procedure, it pre-dated the derived_stat() builtin.  I spent quite a bit of time trying to derive the formulas for dpcr, drcr, sput, and lgio.  The easiest to determine of them was sput, given it's the ratio of minimum number of pages required to store the rows vs the actual pages currently being used.

cluster ratios and lgio were the most difficult.  I came very close though.  It's always been my understanding that these formulas are protected information, so I kinda hesitate to directly publish the formulas I came close with.  I can however, describe them based on the math.

As for knowing "where" fragmentation is, your description to me seems to fit the criteria of semantic partitions where you can measure (and act on) the fragmentation of each partition.

Former Member
0 Kudos

Thanks for responding.  🙂

I'm not sure why the formulas would be covered under the NDA.  my guess is that it was documented in the source code and whomever provided that information wasn't in a position to release it officially.   a detailed whitepaper would be useful.

As far as semantic partitioning goes,  most companies have not purchased the add on feature. 

kevin_sherlock
Contributor
0 Kudos

So, sput is easy.   (rowcount * avgrowsize) / (pagesize - hdrsize) / (pagecnt).  However, sput can be misleading as it doesn't account for space management properties such as rows_per_page, fillfactor, reservepagegap, expected_row_size, etc.

dpcr/ipcr for APL tables is somewhat simple as well.  It's a measurement of "extent jumps" done while sequentially scanning each leaf page of the table/index.   With APL, pages are linked together by a next page pointer.  when you read a page and go the next page pointed to by that page, you either stay within the same extent, or you visit a new extent.  When an APL table/index is "perfectly clustered", the number of these jumps to a new extent is simply equal to the total number of extents in the table (each extent is jumped to exactly once).  The optdiag output includes datapageCRcnt (or indexpageCRcnt).  These are the total number of extent jumps done while sequentially scanning the table.  If it's higher than the ideal number of extents needed to store the pages, then your dpcr/ipcr goes down.   The formula is somewhat like this (like sput, (optimal / actual)), except you have to derive the ratio based on pages.   So,    ( pagecnt - pageCRcnt) / (pagecnt - (pagecnt/8)) .  Actually, pagecnt/8 isn't the exact optimal number of extents needed to store pagecnt.  you need to round up. 

I also don't consider dpcr/ipcr APL a very accurate measurement necessarily in every case.  You can have what you might consider "high" cluster ratios (like .90), but an inefficiently stored table.

More later

Former Member
0 Kudos

thanks again Kevin

it looks like my revised reply from last night was lost in the forums outage this morning.  I'll retype that up a bit later today. 

kevin_sherlock
Contributor
0 Kudos

>> As far as semantic partitioning goes,  most companies have not purchased the add on feature. 

But your example stated the case in terms of an 8TB table, and 2Tb index (among other large tables/indexes I would assume).  I assume you used these sizes to make your case stronger about the need to reduce maintenance windows, etc.  If i were the DBA in an environment like that, I'd most likely push the partitioning feature which is a fraction of the cost vs the license probably needed to support that env exactly to leverage those advatages related to stats, index maintenance, etc.

I guess what I'm saying is that what you may be asking for, is already there in that sense. 

Former Member
0 Kudos

Thank you, Kevin, for the clearest explanation I have seen about the meaning of [di]pcr.

former_member213365
Active Participant
0 Kudos

Jason asked: Would you like to see SAP add more detail to the fragmentation metrics?  If so, would you be willing to share the reason why?

My response: Is this a trick question?  Of course I want more details!

Keeping in mind that I'm not an Oracle expert but I do a lot of nitty gritty, under the covers work with Oracle I will say that there is a lot of disagreement in the Oracle community over rebuilding indexes.  So giving someone a metric that says their index is x% fragmented could lead to a lot of unnecessary rebuilds.  Right or wrong, that decision should still be mine to make and SAP/Sybase should make the metrics available.