Technology Blogs by Members
Explore a vibrant mix of technical expertise, industry insights, and tech buzz in member blogs covering SAP products, technology, and events. Get in the mix!
cancel
Showing results for 
Search instead for 
Did you mean: 
stefan_koehler
Active Contributor

Introduction

This is just a short blog post about a simple DTrace script (dtrace_kghal_pga_code), that i recently wrote and published due to a PGA memory leak troubleshooting assignment. A client of mine noticed a major PGA memory increase after upgrading to Oracle 12c. The PL/SQL code did not change - just the database release. He already troubleshooted the issue with help of Tanel Poder's blog post "Oracle Memory Troubleshooting, Part 4: Drilling down into PGA memory usage with V$PROCESS_MEMORY_DET..." and identified the corresponding heap and allocation reason. However there is one issue by just looking at V$PROCESS_MEMORY_DETAIL or heap dumps - you can not trace back the causing PL/SQL or SQL code very easily.

DTrace approach with script "dtrace_kghal_pga_code"

The basic idea behind the DTrace script is to check the memory allocation reason on probe entry level (for KGHAL memory allocator functions) and stop the Oracle process at this point if the check is true. After the Oracle process has been stopped by DTrace you can request an error stack with oradebug and continue the Oracle process. As the process continues you get an errorstack trace which includes the responsible code for this particular memory allocation request (in PGA).

The following demo is run with Oracle 12.1.0.2 on Solaris 11.2 x86. I also used Frits Hoogland's PL/SQL code from here and put it into PL/SQL function called mem_alloc to allocate a lot of PGA memory.

SQL> select mem_alloc() from dual;

After a few seconds Tanel Poder's script smem_detail.sql can be used to determine the detailed PGA memory usage.

SQL> @http://blog.tanelpoder.com/files/scripts/smem_detail.sql 25

       SID CATEGORY        NAME                       HEAP_NAME            BYTES ALLOCATION_COUNT

---------- --------------- -------------------------- --------------- ---------- ----------------

        25 PL/SQL          pmuccst: adt/record        koh-kghu call    896860424            54901

        25 PL/SQL          pl/sql vc2                 koh-kghu call    112961768             6915

        25 PL/SQL          pmucalm coll               koh-kghu call     31622336             1936

        25 Other           permanent memory           pga heap            263952               32

Now you know that most memory is allocated from private heap "koh-kghu call" due to reason "pmuccst: adt/record", but you still don't know which PL/SQL code line is responsible for requesting this memory. In this demo case it is pretty simple as Frit's code allocates such amount of memory at only one place, but this is just for demonstration purpose. In reality the PL/SQL code is usually much larger and much more complex - so no way to determine the corresponding code line easily.

Let's use the DTrace script "dtrace_kghal_pga_code" to intercept the process when it requests memory from heap "koh-kghu call" due to "pmuccst: adt/record".

shell> ./dtrace_kghal_pga_code 1502 kghalf "pmuccst: adt/record"

dtrace: description 'pid$target::kghalf:entry

    ' matched 1 probe

dtrace: allowing destructive actions

CPU     ID                    FUNCTION:NAME

   0  78530                     kghalf:entry

Process is stopped due to memory allocation reason "pmuccst: adt/record" from heap "koh-kghu call ".

Please run the following commands in separate SQL*Plus to dump an errorstack:

SQL> ORADEBUG SETOSPID 1502

SQL> ORADEBUG DUMP ERRORSTACK 3    <<<< ORADEBUG will hang until process is continued by prun - works as designed

----------------------------

Please run the following command in separate shell after executing ORADEBUG:

shell> /usr/bin/prun 1502


I think the DTrace output is self-explanatory and you just have to follow the instructions.

Received ORADEBUG command (#1) 'DUMP ERRORSTACK 3' from process '1519'

*** 2016-01-30 09:04:30.651

dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=3, mask=0x0)

----- Error Stack Dump -----

----- Current SQL Statement for this session (sql_id=8y13z00p0sgc6) -----

SELECT * FROM T2

----- PL/SQL Stack -----

----- PL/SQL Call Stack -----

  object      line  object

  handle    number  name

199ddab20        14  function TEST.MEM_ALLOC

----- Call Stack Trace -----

calling              call     entry                     

location             type     point                 

-------------------- -------- --------------------

*** 2016-01-30 09:04:30.946

ksedst()+307         call     skdstdst()

__sighndlr()+6       call     ssprtmin()          

call_user_handler()+718  call     __sighndlr()                                          

sigacthandler()+219  call     call_user_handler() 

kghalf()+2           signal   sigacthandler()     

kohalmc()+214        call     kghualloc()         

kohalc()+145         call     kohalmc()           

pmuocon2_con_recur()+145  call     kohalc()                                                

pmuocon()+127        call     pmuocon2_con_recur()                   

pfrrun_no_tool()+298 call     pfrinstr_EXECC()                                    

pfrrun()+1216        call     pfrrun_no_tool()    

plsql_run()+648      call     pfrrun()            

peidxr_run()+317     call     plsql_run()         

ssthrdmain()+558     call     opimai_real()       

main()+164           call     ssthrdmain()        

_start()+123         call     main()           


Checking the call stack trace in the generated trace file reveals that the errostack trace is created at the right time.

The function kghalf() is the last executed function right before the signal handler (by oradebug) kicks in: "sigacthandler()+219<-kghalf()+2". You also can see the PL/SQL call stack and the currently running SQL statement (if a "plain" SQL statement runs at this time - otherwise you see the "entry" PL/SQL call). PL/SQL code line 14 in function TEST.MEM_ALLOC is exactly related to "select * bulk collect into c_tmp from t2;".

Summary

I hope you may find this DTrace script a little bit useful in case of troubleshooting unclear PGA memory allocation.

Just let me know if you have some issues with it. If you have any further questions - please feel free to ask or get in contact directly, if you need assistance by troubleshooting Oracle database (performance) issues. You can also follow or drop me a note on Twitter of course.

References

3 Comments
Labels in this area