Technology Blogs by Members
Explore a vibrant mix of technical expertise, industry insights, and tech buzz in member blogs covering SAP products, technology, and events. Get in the mix!
cancel
Showing results for 
Search instead for 
Did you mean: 
stefan_koehler
Active Contributor

Introduction

In the last days i received a few AWR reports by mail from my colleague due to Oracle performance issues in a SAP environment. There were several different issues (based on insufficient) SQLs (and structures), but there was also a pretty interesting behavior about "library cache locks". This blog will focus on this "unusual" library cache issue only and how to troubleshoot such issues.

AWR reports

Particular sections of the AWR reports basically looked like this (ignoring details like snapshot time, database time and so on).

Notice the very unusual high "connection management call elapsed time" and the not tolerable wait event "library cache lock" in this snapshot period.  Let's check the official Oracle documentation for these metrics first.

library cache lock

This event controls the concurrency between clients of the library cache. It acquires a lock on the object handle so that either:

    • One client can prevent other clients from accessing the same object
    • The client can maintain a dependency for a long time which does not allow another client to change the object

This lock is also obtained to locate an object in the library cache.

Library cache lock will be obtained on database objects referenced during parsing or compiling of SQL or PL/SQL statements (table, view, procedure, function, package, package body, trigger, index, cluster, synonym). The lock will be released at the end of the parse or compilation.

connection management call elapsed time

Amount of elapsed time spent performing session connect and disconnect calls.

The documentation describes, that a "library cache lock" usually occurs by parsing or compiling a SQL statement, but what is the connection between parsing / compiling a SQL and the high wait times by connecting and disconnecting to/from the database. Are these two different problems? We need to dig much deeper to get the root cause of these two issues.

System state dump

There are several ways to analyze this issue now, but the easiest way to get all of the relevant information is a system state dump (if you are not on-site to query all of the different views and x$ table structures on your own). Before going into the details - let's explain a system state dump first:

"A system state is a set of process states for all processes on the instance when the dump is taken. A system state dump is useful in determining the interaction between processes. A systemstate dump will report on resources that are being held by each process. The <level> sets the amount of additional information that will be extracted from the processes found by SYSTEMSTATE dump based on the "STATE" of the node.

    • 1 Very basic process information only
    • 2 process + session state objects
    • 10 Most common level - includes state object trees for all processes
    • Level + 256 Adding 256 to the level will try to dump short stack info for each process.

"

You need to run an oradebug command (to get a useful system state dump) when that particular library cache lock issue occurs. I usually start with the common level 10.

shell> su - ora<SID>
shell> sqlplus / as sysdba
SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug dump systemstate 10

The system state dump file is created in RDBMS ADR (since 11g) or in the usertrace folder (until 10g). Now let's search for the wait event "library cache lock" in that trace file.

I just include the interesting parts (corresponding to that "library cache lock" issue) of the system state dump. There is a lot of other useful information in it, but let's keep the focus on our library cache lock issue here.

We got our both sessions (interestingly both from the same OS process) waiting on the library cache lock. We also see the corresponding library cache handle (7000000ed99d980) and its lock (7000000daa7f108) address. So let's search for these addresses in the trace file.

We can examine the details about that library cache object lock and that it is associated with the namespace "ACCOUNT_STATUS". It seems to be something like the status of an user account (like column ACCOUNT_STATUS in DBA_USERS) in the library cache just by interpreting the name of the namespace.

I checked MOS and found some relating bug (13496395) information like that:

"A session logging on keeps hold of a lock on the ACCOUNT_STATUS object for the user unnecessarily"

Unfortunately the client was not running this particular database version and the necessary steps to run into this bug were not relevant, but we verified our assumption about that particular library cache lock usage.

I found some information on other blogs and finally MOS note #1309738.1 after further researching:

"'Library cache lock' can be observed when concurrent users login with wrong password to the database"

Wow .. what a feature ;-)) .. a solution was already provided in 11.2.0.2 (luckily the database version, that the client was running on), but the event "28401 trace name context forever, level 1" was not set. Mystery solved.

Summary

Do not brute force your oracle database or you will badly end up with "library cache locks" (with default settings). System state dumps can be very helpful, if you need all the relevant information when you are not on-site or if you need to dig deeper into locking or latching issues. Additionally there are a lot of useful details in system state dumps for researching, troubleshooting and learning purposes.

Further information about that "new security feature" can be found in this blog: Simon Kwek - Logon failures causes "row cache lock" waits

If you have any further questions - please feel free to ask or get in contact directly, if you need assistance by troubleshooting Oracle database issues.

References

Labels in this area