Nonlinearity Check Using the Z_SE30_COMPARE

former_member194613 · ‎01-24-2008

In this Web log a small tool is introduced which can be used to compare hit lists of two ABAP Runtime measurements with different amounts of data. You will learn where you can get the tool, what you need as input and how to use it for a basic analysis.

1. Introduction

This web log continues the ideas developed in the web log Performance Problems Caused by Nonlinear Coding. There, it was stated that a good program should scale linearly, which means that the processing of an order, for example, with 500 positions should maximally need 10 times longer as a similar order with 50 positions. The web log explained that, in practice, linearity is not automatically given, as small programming bugs are easily overlooked in tests with 1 or 5 positions. It was outlined how a practical test to identify nonlinear coding could work. Such a test is based on two detailed runtime measurements with different amounts of data which would uncover the hidden nonlinearity. The total amounts of example data can be rather small.

This web log turns theory into practice. A report provides the comparison of two measurements:

The data which are used are runtime measurements done with the ABAP Runtime Trace (SE30). To be able to execute the Runtime Trace, you should know at least the basics about it. Please refer to web log The ABAP Runtime Trace (SE30) - Quick and Easy. Each executed operation gives one line in the hit list of the SE30, which shows the total number of executions, the gross and the net time. Gross time and net time illustrate the modular structure of a program. The gross time includes the times spent in lower level subroutines. The net time excludes the times spend on other levels. The traced operations are modularization units, i.e. functions, forms, methods and RFCs (remote function calls), and also other expensive operations, such as DB operations, internal table operations and many more.
The report Z_SE30_COMPARE can be found in another web log A Tool to Compare Runtime Measurements: Z_SE30_COMPARE. It explains all the technical prerequisites about the report; you can download the source code, do an immediate check with some example data, learn in detail how to execute your own measurements, and especially how to store the hit list on your PC. Also the basic operation of the report is explained there.

This web log offers a detailed description of how to analyze whether a program contains nonlinear coding. The background knowledge can be found in several other web logs, as referenced above.

2. Preparation of the Trace Files for a Nonlinearity Check

First, decide what you want to trace. Your test case must be bug free, all the customizing settings should be chosen the way a customer would use the program. The database tables must contain all the necessary data. The test case should reflect a typical customer use case.

For the nonlinearity check, the test case must be executed with two different amounts of data, i.e. items, positions or lines. It is necessary that there is a substantial increase from N1 to N2, such that x = N2/N1 should be 5 or, even better, 10. Also, the smaller measurement should not be N1 = 1, because this is very special. N1 should be at least 5 but larger is better.
Details on how to prepare the trace files can be found in A Tool to Compare Runtime Measurements: Z_SE30_COMPARE. Try combinations (N1, N2) between (5, 50) and (20,100) first and use larger combinations later, when you are sure that there are no very serious nonlinearities or after these have already been fixed.

Details about the trace execution can be found in A Tool to Compare Runtime Measurements: Z_SE30_COMPARE. For the nonlinearity check you should select your test case carefully, and choose useful amounts of data N1 and N2.

3. Comparison of Measurements

For each line in trace 1 the comparison report tries to find a line with identical keys in trace 2. The key fields are call, program and line. It is important to split the call into two parts:, in type of call and name of call. For example, ‘Call Method XYZ’ is split into ‘Call Method’ and ‘XYZ’, and ‘Read table IT_123’ into ‘Read Table’ and ‘IT_123’ etc. This is important because call type and call name play quite different roles in the comparison task. The call type is also classified into call classes, which results in another but dependent key field. So the five fields call class, call type, call name, calling program and calling position, are the key fields to identify one line uniquely.

The comparison might appear to be quite simple. However, the following problems make the task more complicated:

The calling position is not automatically displayed in the hit list and must be added manually before you download the trace file. Some users might forget to change the layout and save hit lists without calling position. Therefore it is desirable, that the tool has a fallback possibility to compare also hit lists which do not contain the calling position, even if this possibility does not offer the whole functionality.
The calling position is not counted inside a function or a method but counted for a whole include. Even small coding changes may change all following calling positions of the whole include, which can be quite a lot. So a comparison using the calling position will not be suitable to compare different program versions.
The internal tables are the most interesting operations for the nonlinearity check because they are the main source of nonlinearity problems. Unfortunately, the names of internal tables in the trace are unique identifiers such as ‘it0987’ etc. These numbers are chosen in the sequence of the execution. It can not be assumed that they are identical in two executions of the program. If larger amounts of data are processed, then it can happen that additional internal tables are created which can change the numbering. Even if nothing under your control changes at all, then randomly occurring system calls can create internal tables and change the numbering.

Fortunately, there is some redundancy in the five key fields, so that two compare modes are possible which work with different subsets of the five key fields:

The default mode uses call class, call type, calling program and calling position, but neglects the call-name. This method works very well and gives the most detailed results. It can be used to compare internal table calls as the call-name is not used.
The other mode neglects the calling position. It checks whether call class, call type, call name and calling program are identical. It cannot be used to compare internal tables, so if internal tables are in the trace, then they will appear uncompared. This mode is used automatically if the trace file does not contain calling positions. It can also be selected manually on the start screen of the report.

Before the compare is executed, the program takes care that the key fields are unique. If the calling position is used, then internal table calls are aggregated, because internal tables locally defined in a subroutine get new names in each execution of the subroutine and are not aggregated by the ABAP Runtime Analysis (SE30) in older releases. This aggregation is actually an advantage of the compare program, and therefore this feature is also implemented in newer releases of the SE30. The compare using the call name aggregates all different calling positions, which leads to the same result as the full aggregation of the SE30.

Once the key fields are identified, the values of the net time, executions and gross time of the two measurements can be compared.

The central functionality of the Z_SE30_COMPARE is the comparison of individual measurement results of execution 1 and 2. Two compare methods are implemented to compensate for specific technical issues.

4. The Detailed Result List

The results are displayed in an ALV table:

The key fields (blue) are ‘Call Class’, ‘Call Type’, Call Name’, ‘Calling Program’ and ‘Line’, i.e. calling position. The call class is useful for filtering.
The measured values (yellow, white and orange columns) are net time, number of executions, and gross time. They are always shown in groups of three columns displaying the two original values plus comparison which can be the ratio or the difference, i.e. ‘Net 2/1’, ‘Net 2’, ‘Net 1’, ‘Exec 2/1’, ‘Exec 2’, ‘Exec 1’, ‘Gross 2/1’, ‘Gross 2’ , and ‘Gross 1’. Note, it is recommended for trace 2 to be the larger one, because the ratio 2/1 or the difference 2-1 is calculated.
The green columns give additional information about the comparison. The column ‘Code’ indicates whether the compare worked or not. If it did not work, then the codes ‘T1’ or ‘T2’ indicate that this line was found only in one trace. The other codes are, ‘CL’ and ‘IL’ or ‘CN’, indicating compare success. ‘C’ is used for normal calls, and ‘I’ for internal table calls. And ‘L’ and ‘N’ identify the compare method, using line (calling position) or name (‘IN’ is not possible as internal tables cannot be compared with this method). The columns ‘Scale’ and ‘Nonlinear’ appear only if the values ‘N1’ and ‘N2’ have been provided at the start screen. They are explained in the next sections.

The main result screen of the report provides the detailed comparison results. The next sections tell you how to interpret the displayed information.

5. Classification of the Net Time Scaling Behavior

Now, how does this information help to identify nonlinear coding? As explained in Performance Problems Caused by Nonlinear Coding, we can expand the total runtime as a sum of different scaling behaviors

    T(N) = A * 1 + B * logN + C * N + D * N * logN + E * N * N + ...

But we can assume that the individual calls measured by the ABAP trace are so fine granular that they show no combinations of scalings but only one pure scaling! These can be constant, linear, quadratic or even more.

Constant  : Net(N) = a     => Net2/1 = a/a               = 1

Linear    : Net(N) = c*N   => Net2/1 = c*N2/(c*N1)       = x 

Quadratic : Net(N) = e*N*N => Net2/1 = e*N2*N2/(e*N1*N1) = x*x

where x is the ratio between the two amounts of data x = N2/N1.

The logarithms should not be handled as separate scalings but together with the other next lower natural scaling. The logarithm and the variability of real-life measurements propose the following classification intervals for the measured values of Net2/1:

Reduced   : Scale = R                     => Net2/1<0.5 

Constant  : Scale = C  Net(N) = a         => Net2/1 = 0.5  ... 2

Linear    : Scale = L  Net(N) = c*N       => Net2/1 = 2    ... 2x 

Quadratic : Scale = N  Net(N) = e*N*N     => Net2/1 = 2x   ... 2x*x

Cubic     : Scale = N  Net(N) = g*N*N*N   => Net2/1 = 2x*x ... 2x*x*x

Order 4   : Scale = N  Net(N) = i*N*N*N*N => Net2/1 = 2x*x*x ...

The factor 2 comes from log(100) / log(10) =2 and ‘Scale’ is the column in the result screen.

We should mention that strictly speaking, functions, forms and methods are still combinations of different scalings, especially if they contain many operations on internal tables. But the method works quite well, since even a very complicated function will contain only 10 or 20 factors while a whole program contains several thousand factors. A nonlinear factor will, therefore, dominate the scaling of the function very easily. However, best results are achieved if internal table operations are also traced.

The section explains scaling of the net times and why it can be used to classify the scaling behavior of the calls.

6. Scaling of Net Time Depends on Executions and Time per Execution

As the net time in the SE30 hit list is the total time for all executions, it is the product of the numbers of executions and the time for one execution. This also means that the scaling of the net time depends on the scaling of the executions and the scaling of the time per execution. For example, a net time with a quadratic scaling can be caused by a quadratic increase in the time per execution or by a quadratic increase of the execution numbers.

Both possibilities appear in practice and require a different analysis of the cause for the nonlinearity:

If the scaling of the time per execution is the problem, it can be analysed very easily: The cause must be the coding of the traced operation. So, if a line in the hit list is identified then check the source code with some experience it should be easy to find the problem in the coding.
If the scaling of the executions is the problem, then it is not the traced operation which must be checked, but the program logic of the call stack. Somewhere in the call stack the excessive increase of the executions must start. The call stack is not directly available in the aggregated hit list, but it is possible to find the problem with a sort by gross time. Details on how to this are explained in the next section.

The report result identifies the scaling of the net time but also the scaling of the executions. Therefore, the ratios of executions are classified in the same way as the net time. Then we can write the scaling exponents x, y and z of net time, executions and time per execution as

Net x = Exec y + Time/Exec z    with x, y, z = 0,1,2 3, or 4

The time per execution is not an explicitly measured value, it is only calculated. It is not displayed by the report. As it supports the line of argumentation and I have sometimes added in the text in brackets.

Displayed is the combined field

Net x - Exec y         with x, y = 0,1,2,3, or 4.

Only combinations (x, y), where x is not smaller than y, make sense, because the scaling of the net time can not be smaller than the scaling of the executions. The scaling of ‘Time per execution’ is given if you take the hyphen as a Minus-sign, i.e. Net x Exec y = Time/Exec z.

The interesting combinations are the following ones:

The green combinations are unproblematic.

The quadratic contribution ‘Net 2’ can have three causes:

The case ‘Net 2 Exec 0 (= Time/Exec 2)’ is caused by a quadratic increase of the time per execution. Therefore, the coding of the operation must be the cause of the nonlinearity. Very often this is nested loop processing where the inner operation scales linearly and should be optimized to a logarithmic or constant operation, see Runtimes of Reads and Loops on Internal Tables for details.´
The other clear case is ‘Net 2 Exec 2 (= Time/Exec 0)’, which is caused by a quadratic increase of the executions. The coding of the operation needs no check as its cost remains constant. The problem is caused by an excessive increase of the executions somewhere in the call stack of the operation. The reason can be a simple bug such as a missing refresh of an internal table, but can also be a problem in the program logic.
In the case ‘Net 2 Exec 1 (= Time/Exec 1)’ the cause is not so clear. Both executions and time per execution increase linearly, so the problem can be on both sides. Check which of the two linear increases can be avoided, there is a slightly higher probability that the coding can be optimized to a scaling faster than linear.

Cubic and order 4 scalings can create even more combinations. Then, the analysis is done in the same way. You must try to reduce the scaling order step-by-step either by changing the executions or the cost per execution.

Nonlinearities in the net times can result from problems in the coding or problems in the logic determining the executions. Each requires a different type of analysis. The column ‘Nonlinear’ in the result list should guide you in the right direction.

7. How to Do a Nonlinearity Analysis

First you should decide whether to include internal tables in your trace. It is especially useful if your program contains large functions with numerous operations on internal tables, because then it can be cumbersome to figure out which operation causes the problem. If the program code has a very fine modularization, where each method contains only one or two internal tables, then it is not necessary to trace internal tables.

Then you should check whether there are quadratic executions in the trace. It is not necessary to start with the cubic contributions as they are often a side-product of quadratic executions. If there are quadratic executions then sort the compare result by the gross time of the larger trace, i.e. by gross2. Scan the list from the top for the first appearance of a quadratic execution either by checking the column ‘Nonlinear’ or the column ‘Exec 2/1’. For this line, read the source code. Double-click on the line, if the calling position was downloaded, otherwise search the line in the original trace and view the source code there. Try to figure out where the quadratic executions come from and how they can be avoided.

From the source code you can also determine the call stack from top to bottom. Just go into the source and drill down to the calls inside. All lower levels will have also quadratic increases of executions or even more. Go back into the result list and scan for the next quadratic executions; check whether there are calls which are not in the already known call stack.

Also, check the major quadratic times per execution. To do this, sort the list by net2 and scan for the largest quadratic net time contributions, which are not caused by quadratic executions. When you find a line, then review the source code.

Nonlinear coding is very often connected with suboptimal nested loops on internal tables. Often programmers are not aware that the following operations on internal tables scale linearly with the size of the internal table and are therefore not suited for use inside a loop on another internal table:

Loop at ... where on a standard internal table
Read ... with key on a standard internal table

Read also the web log Runtimes of Reads and Loops on Internal Tables.

When all identified nonlinearities are corrected, then I would recommend you to run the test case again with a larger object size (N1, N2) between (50, 500) and (100, 1000). You can only be confident, that the program scales linearly even to very large amounts of data, if no nonlinearities appear in such a large test case.

This last section summarizes the steps of a nonlinearity analysis. First, decide whether to include internal tables, then check the quadratic increases of the executions and at the end the problems in the coding leading to nonlinear runtimes.