cancel
Showing results for 
Search instead for 
Did you mean: 

15.7 SP110 -> CPU, proccache and memory consumption increase after IMDB

0 Kudos

Hello,

We finally switched our tempdbs databases to IMDB databases last weekend. We experienced the day after a blocking of dataserver , that we solved by emptying proccache. We already had only once this behaviour, and I asked prod DBAS for opening support ticket to perhaps activate TF 753 and/or 757.

Nevertheless , I can see that memory usage increase , and for some batch processes Proccache usage has drastically increased, for globally the same amount of work. Moreover exec time has not decreased , but increased by about 20%.

On the opposite, response time for screen (which also calls stored procs , but lighter) has decreased as expected.

Has someone experienced this kind of behaviour ?

Regards

Marc

Accepted Solutions (0)

Answers (1)

Answers (1)

0 Kudos

Hi ,

Some news since yesterday :

-- Our cpu overconsumption is linked to a well-known Sybase 15.x bug when Engine Local Cache is filled. We applied TF 757  , but we got same issue and finally we have scheduled every hour a job to free unused procs in cache.  Issue is still pending and it seems Sybase does plan nothing for this issue.

-- Execution time degradation seems linked , not to IMDB as we thought, but to dml_logging set to minimal for tempdbs.  In fact, when analyzing figures from MDA tables, same workload , either allocates 80 temp tables and does 200K transactions and commits , either , with dml_logging = minimal  , allocates 8000 tempdb tables and does 4 times number of commits and transactions.

Feeling that is increased with increase of nesting.

Sounds weird, configuring minus logging increases commits and transactions.

Is someone aware of something to solve procedure cache issue , or at lesat be sure to avoid it ?

Regards

Marc

former_member182259
Contributor
0 Kudos

That would be incorrect - not sure why you are saying what you are, but a quick glance through the ASE 16 NFG would have pointed out some changes to ELC.

What is more likely is a 12.5-ish proc design causing excessive recompilation.    However, at this stage, you are only offering your symptoms and conclusions - none of which gives us anything to go by.  

Turning on minimal logging....which tempdbs' are by default - would have no impact on the number of temp tables NOR the number of transactions.   Clearly this illustrates a difference in workload - not the same as you have assumed.

Do you have monProcedureCache, monProcedureCacheMemoryUsage, monProcessActivity, etc.???   If so, I would start checking monProcedureCache for increases in recompiles or high removals/inserts to correspond with the CPU spikes.....  I would also look at monProcessActivity and see how it corresponds with CPU Spikes....and I would also look at monSpinlockActivity.

0 Kudos

Thanks a lot Jeff, for your time.

We have , in fact , 2 issues :

-- For the first one, about CPU usage raise , we encountered it on several dataservers , after upgrading to 15.x version . We had an audit from a Sybase consultant, and what I said is only what he said. We noticed high cpu activity , and explanation was ELC has been filled, this is a well-known bug.   In documentation , you have an advice for applying TF757 , if cpu raising has no clear reasons.

We use a french opensource tool, asemon, which gathers information from MDA tables  and reports it in web server.  It has been done by more famous (here) sybase consultant. Briefly , actions were to regularly free the procedure cache, apply TF757 .  In one dataserver, they found a stored procedure , whose plans filled cache without being cleared.

ObjectName MemUsageKB sum_RequestCnt nb_plans
------------------------------ ----------- -------------- -----------
f_update_instruction_p              299554 196         195
t_54X_creatinstr_p                     85276     2349 9
f_controldata_instruction_p          23054 7           5

Anyway , I will gather all information I can as you requested , we had 3 pikes monday and tuesday.


--


The second issue is more strange , after creating 4 tempdbs IMDB , and binding activity from traditional ones from these new ones , we noticed globally an increase of average execution time for our main batches by 20% .

Our activity has not increased since last week, we have just done this update during the week-end and performance decreased from monday, so we have associated this decrease with update.

I am still investigating on this issue , and first thing I tested was to come back to a full dml_logging and results are encouraging.


I will keep you informed .


Again, thanks for your time


Regards

Marc




former_member182259
Contributor
0 Kudos

Unfortunately, proc cache management is a rather involved topic, but the above points to a rather confusion of issues.

First, TF757 simply reduces the scale down of large allocations when the proc cache is full by replacing a proc in cache.  There is a bit of an inverse relationship with ELC, but let's talk about that later.   When a proc is first loaded into cache, the entire plan for the proc has to be loaded into cache.  Prior to ASE 15, this was largely done by successive 2K memory pages.  ASE 15 supported procs grabbing 2K, 4K, 8K and 16K allocations.   However, as you can imagine, if proc cache is nearly full, then memory is likely fragmented and loading a large plan into proc cache could result in the plan having to grab a bunch of 2K pages all over the place - kinda like the impact on CPU in the OS when the OS memory pages free become smaller, the CPU necessary to find those holes becomes more.   TF757 simply allows ASE to instead of scaling the requests down to 2K chunks, to instead bump another proc plan out of cache to make (hopefully) some contiguous room to load the new proc plan.

However, I would NOT use TF757 if "cpu was increasing for no apparent reason" - I would only do so if the proc cache manager was showing high spinlock contention AND after monitoring monProcedureCacheMemoryUsage/monProcedureCacheModuleUsage, monProcedureCache and monCachedProcedure that I could see a pattern that indicated that proc cache was full a lot and proc plans were being recompiled or loaded quite frequently - and then likely only as a temporary measure to determine if the proc cache recompilation can be reduced.  More on that in a second.

Now, as far as ELC is concerned, you should really get a copy of Stefan Karlsson's ASE 16 performance paper that he has posted on SCN - specifically about page 18 where he starts to discuss proc cache and ELC.   In it, the difference between the ELC behavior in ASE 15 and ASE 16 - specifically pointing out that in ASE 16 the ELC size is now configurable and by default double what it was in ASE 15 - as well as some other changes that improve ELC operations.   Remember, the whole goal of ELC is to minimize contention on proc cache manager spinlock by caching free proc cache pages locally.    However, a *key* aspect of ELC is that we are discussing *free* proc cache pages - if the proc cache is nearly full.....welllllllllll .....then there is nothing to put into ELC and therefore the various engines have to free stuff from proc cache to make room....which means contention on the proc cache spinlocks.  

So, you can see that TF757 really has no direct bearing on ELC.    TF757 really only kicks in when the proc cache is nearly full, which implies that the ELC has been exhausted.   If the proc cache is NOT full and ELC is exhausted, the engine simply grabs another chunk of memory pages for the ELC.....

Now then - unfortunately, data can be contorted to mean many different things.   For example, in your list of procs, you listed one (f_update_instruction_p) that you (I think - I could be wrong) were pointing out that had a large number of plans in cache (195) as if to imply that proc plans were not being removed rapidly enough.   Unfortunately, the data you provide - summarized as it was - was insufficient to support such a conclusion.   For example, if 195 connections executed that proc concurrently, you would legitimately need 195 plans in cache as ASE 15 doesn't (normally) do plan sharing.

All of this is only one possibility driving your problem.   One of the most common issues that can drive proc cache higher while slowing execution speeds is quite simple - merge joins - specifically sort merge joins.   There are others - e.g. if there is a lot of recompilation - then proc cache will be under memory pressure....but let's deal with that later (again).   What you should really do is plot CPUTime and LogicalReads from monProcessActivity.....you will need two different scales, but if the spikes coincide - then it is more likely you have merge joins going on - especially if monOpenObjectActivity shows entries with IndexID=0 as the top objects when sorted by LogicalReads in descending order. 

Secondly, once you have eliminated LIO from driving the problem, I would make sure that you are seeing proc cache cache contention - and that is where monSpinlockActivity comes into play - or if not on ASE 15.7, you will need to run spinmon.

One aspect to consider is that ASE 15 requires a lot more proc cache than did ASE 12.5 - one of the reasons is quite simple - more statistics.   In ASE 12.5, while we supported 'update index statistics', few people used it.....and the default optimization techniques couldn't really leverage them (although there were quite a few cases where it could).   In ASE 15,  we started advocation (errrr....maybe even "demanding") that customers use update index statistics - which means now when there is any optimization to be done, a lot more statistics has to be loaded into proc cache for optimization.    I am assuming you are aware of this and increased proc cache considerably from 12.5 to 15 upgrade (since you brought this up) or else, that *could* be the source of your problem.

However, now lets talk about proc cache re-compilation.   There are number of instances in which this happens....e.g. we have a #temp table outside the proc that the proc references (a common pre-15 technique used to try to force ASE to pick up an index on a #temp table).   However, ASE 15 also added 'procedure deferred compilation'.....remember, this option allowed ASE to recompile a procedure at the first execution of a statement that contained a @var or #temp table.   Soooooo.....we load a proc into cache, we create a plan, start executing it.....and voila - first thing we know we are recompiling a statement due to a @var or #temp.   This creates a NEW plan - and subsequent executions of the same statement in that plan do not cause a recompilation....but you can easily see how that this impacts proc cache....interestingly, if you trace monSysStatement for that execution, you will actually see the PlanID change.   Net result, along with the increased number of stats, we need more proc cache for deferred compilation.....

To really see how this could be impacting, consider the following scenario:

1) User connects to ASE and creates a few #temp tables for their session.

2) User executes a stored proc...which populates the #temp tables and returns the results and then truncates the #temp table

Since there are concurrent users, the user grabs a proc plan from cache.   However, because the #temp tables, the plan is recompiled.   During execution, we hit an statement with @var and/or #temp....and you guessed it......

Much of the fault here lies with a 12.5 style of coding that was required due to trying to eliminate contention on system tables in tempdb.   That being to pre-create the #temp tables upfront.   However, my experience with customers that did this was that often is increase the log semaphore contention on tempdb due to the increase in logged IO in tempdb - and in some cases, the net result was identical - just the blocking shifted from system tables to the log semaphore which wasn't visible in syslocks/monLocks.   The other coding style was the use of subprocs to populate the #temp tables and then the outer proc continued executing.....this, in my mind, is a legitimate use of subprocs to simplify proc maintenance......but it has the nasty side effect of causing a recompilation....

However, think about it.....if I have 4 tempdb's or 1 tempdb, the amount of recompilation due to #temp should remain the same....as it wouldn't matter if everything was in 1 tempdb or not - the proc plan would have to be recompiled.    What we don't know (for example) is what impact the IMDB setup had on decreasing memory available for data cache - and THIS could result in an increase in response time by driving PIO higher.....but by simply stating the symptoms, we don't know if this is what is happening or not. 

Certainly, enabling minimal_logging would NOT have any difference on the number of temp tables created NOR would have any impact on the number of transactions.   Physically impossible.   The ONLY thing it does is change whether transactions are logged or not as an aspect of transaction size .....e.g. if the transaction exceeds the size of the session tempdb ULC, it will still be logged to allow it to be rolled back, but if it doesn't exceed the size of the session tempdb ULC, it won't be logged.     Sooooooo......

1 - If you are seeing a difference in the number of transactions or number of #temp tables, you are either experiencing a different workload - OR - your monitoring is faulty and you are missing those data elements in the previous captures.   I would bet on the former.

2 - That increase/change in workload is likely driving the difference in proc cache, which has it's own set of symptoms

You may not wish to believe me - but send this post to your "more famous" consultant and ask his opinion....just make sure you mention my name as the source of the post.

Either way, in order to reduce the amount of recompilation and impact on proc cache, you may want to consider rewriting some of the stored procs in such a way that recompilation is not as frequent.

0 Kudos

Again, Thanks a lot Jeff, for your time.

For temp tables creation, you were right, I finally found a "new" hashtable creation within a stored proc which is called several million times a day. So , IMDB and RDDB fetaures , are currently perfectly working for this application.

For another one, a very heavy usage of tempdb led us to huge figures in spinlocks for IMDB caches we have created. (20% contention) . I will add cache partitions and monitor.

For high CPU usages , I got figures from sysmon but I see nothing high when pikes occured.

Sampling Started at   : Nov 24 2014  1:55PM

Sampling Ended at     : Nov 24 2014  2:12PM

Sample Interval       : 990 s

Nb transactions       : 17198

Procedure Cache Management    per sec  per xact   count  % of total

---------------------------  ------------  ------------  ----------  ----------

  Procedure Requests             11.1       0.6   11025   n/a  
  Procedure Reads from Disk       0.2       0.0     220   2.0 %
  Procedure Writes to Disk        0.2       0.0     226   2.0 %
  Procedure Removals              2.2       0.1    2132   n/a  
  Procedure Recompilations        0.1       0.0     118   n/a  

  Recompilations Requests:

Execution Phase               0.1       0.0      64  54.2 %
Compilation Phase             0.1       0.0      54  45.8 %
Execute Cursor Execution      0.0       0.0       0   0.0 %
Redefinition Phase            0.0       0.0       0   0.0 %

  Recompilation Reasons:

Table Missing                 0.1       0.0     104   n/a  
Temporary Table Missing       0.1       0.0     104   n/a  
Schema Change                 0.0       0.0      14   n/a  
Index Change                  0.0       0.0       0   n/a  
Isolation Level Change        0.0       0.0       0   n/a  
Permissions Change            0.0       0.0       0   n/a  
Cursor Permissions Change     0.0       0.0       0   n/a  

  SQL Statement Cache:

Statements Cached             0.0       0.0      16   n/a  
Statements Found in Cache     1.2       0.1    1188   n/a  
Statements Not Found          0.0       0.0      16   n/a  
Statements Dropped            0.0       0.0      35   n/a  
Statements Restored           0.0       0.0       6   n/a  
Statements Not Cached         0.0       0.0       0   n/a  

BTW , we experienced no other issue since we free ProcCache regularly.

Again , thanks a lot for your help.

Regards

Marc