AWR reports (Part I) - 10 most important bits

salome_szanto · ‎11-03-2014

I have collected 10 points which can be checked for the very first step in your analysis. Here, I am not focusing on a single issue but providing an overview instead.

Automatic Workload Repository (AWR), a licensed feature of Oracle, is one of the best tools to monitor the performance issues. The report can be generated as of Oracle 10.2 on any platform.

There are 4 Important points one should consider before generating AWR:

before/meantime/after reports – create 3 different reports which cover the timeframe during the problem was experienced and compare it with the timeframe before and after.
duration of snapshot – choose a shorter timeframe to get a more precise report.
retention time – set this to 42 days, though it must be checked if SYSAUX tablspace has enough freespace. Configure the retention period
format – Choose html to be able to use the links, bookmarks inside the report.

How to create AWR

The following sections are the most important to be checked first to get an understanding of the performance problem.

1 Header Section – contains general information about the timescale and the system environment.

If the timescale is not chosen correctly (not covering the time when the problem encountered) then the analyzing the report will be meaningless.

2 Cache Sizes – contains information about the SGA (System Global Area) in the beginning and end of the snapshot.

3 Load Profile – contains information about the database workload during the snapshot.

4 Instance Efficiency Percentage – these needs to be looked very carefully, since these are not really a good measurement of the database performance.
For example in very processing-intensive SQL statements which are executed repeatedly, only read blocks from the buffer pool increases the hit rate of the buffer pool. After optimizing such statements the hit ratio decreased though performance improves.

Buffer Nowait – shows how often buffer cache were accessed with no wait time.

Buffer Hit – shows how often a requested block has been found in the buffer cache without requiring disk access.

Redo NoWait – shows if log_buffer size is set correctly. Preemptive redolog switches in Oracle 11.2

Parse CPU to Parse Elapsd – shows how much time was spent on parsing while waiting for resources.

Non-Parse CPU – in the following example the the figure is close to 100% meaning that the overall CPU usage is only 0.15 % for statement parsing.

5 Shared Pool Statistics – shows if there is an overhead on the system regarding shared pool.
The values should not be very high (preferably less than 75%).

6 Top 5 Timed Foreground Events – Shows top 5 wait events that are taking the most of time. The exact meaning of each and every event can be found here.

Based on what we found in here 2 other sections in the report must be checked.

SQL statistics section – check if there is a lot of read and which SQL statements involved.
IO stats section – check if there is I/O bottleneck.

In the following example a long time (70%) is spent for waiting on I/O related reads. There are 12,727 waits in 265 seconds (4 minutes) which is more significant than 3 million waits in 36,388 second (10 hours).

As well only 8% of the time has been spent on DB CPU. If this time is significant check the following points:

Is the CPUT times(s), (in our example 5119 seconds) significant compared to the total CPU time?
Total CPU time = number of CPU(s) * snapshot time (in seconds). find NUM_CPUS in the "Operating System Statistics" section.
Is there a SQL statement which takes most of the time, if yes check “SQL ordered by CPU Time” section.

7 SQL ordered by Elapsed Time – Shows which SQL statement runs for a longer time. Those statement needs to be focused that have less numbers of execution with high Elapsed Time per Exec (s). More over check if the % total is significant.

8 SQL ordered by CPU Time – contains information on which SQL statement takes the most CPU time
Total DB time

9 &10 IO Stats – Check if read and write of the datafiles /logs is taking longer. Section 6 reveals some information about the waits for I/O, wether this is a good number or not is dependent on the Hardware/OS. If the I/O is slow check "Tablespace IO Stats" and "File IO stats". "Rd(ms)" columns must not exceed 20 otherwise it worths involving your OS team and hardware vendor to investigate the I/O bottleneck.

AWR reports (Part I) - 10 most important bits

Get Your SAP HANA Idea Incubator Badge Today!

SCN Mission - SAP HANA Quiz Challenge is now retired

Share your #HANAStory and Win