BPath. Lesson 10. Aggregations

former_member193368 · ‎08-05-2014

Table of Contents ...	... Table of Code Examples
<< Lesson 09. Groupings	Lesson 11. Enhancements on sub-queries

Hello everybody,

as indicated in the last lesson groupings and aggregations somehow belong together. So I speeded a bit up with this lesson, which is a available a bit earlier. Content will be aggregations solely, with just one small exception. Nevertheless it will be an extensive lesson.

First and Last and Always...Aggregations. Theory.

Aggregations are aggregating a number of datasets to one value. The most prominent representants here are the COUNT, SUM and the AVERAGE function. But one should not forget "easy" aggregations like MIN, MAX, FIRST or LAST. On the other side there are also a couple of more complex aggregations as SEARCH, GEOAVG and STDEV. The full list can be obtained in the table at the end of this blog post.

Some general remarks to the usage of aggregations:

All aggregations refer to a group, technically they can only be used in the Loop-Block of a group definition. Attempts to use it in other locations will cause a dump.
Aggregations on a full dataset (not on a group) always have to be done via $(*::....)-approach. Either in the main-query or in a sub query.
The initialization part is automatically done by the aggregation. Unlike the examples of the hand-made aggregations, the Initblock of the group definition will typically remain empty.
Aggregations can be built upon every valid BPATH expression.
Any aggregation which is at the intermediate state of processing row i ( smaller then the total number of rows in that group) will have the value as if i would be the last row in that group. This has the consequence that nested aggregations may return different results then anticipated. E.g. the inner statement in COUNT(@AMOUNT=MAX(@AMOUNT)), is true for all rows, which have no bigger AMOUNT in a row smaller then its row. In case you really want to count the rows which are equal to the global maximum, the used approach is not sufficient. In this case a hand-made solution has to be taken (For the sake of completeness: $(MyGroup:!t=0;!a=0:!t=iff(@AMOUNT>!a,1,!t+iff(@AMOUNT=!a,1,0);!a=iff(@AMOUNT>!t,@AMOUNT,!a))

Aggregations. Practice

Basic aggregations: Count, Sum() & Product()

~*[!CLASS]/SearchResFlightRel/FlightBookRel$(@CLASS::!COUNT=Count();!SUM=Sum(@LOCCURAM))
Code Example 43,$(@GROUP::!X=Count();!Y=Sum(@FELD)), Count() and Sum()

Well, that's easy now, isn't it? Slightly easier compared to the hand-made approach.

The code explains itself. Hopefully. Count can be enhanced with a condition, so only rows fullfilling the condition are counted. As with the sum() function the same can be achieved using a sum(iff(condition,@Feld,0)) construction.

Product() works similar to Sum() but muliplies all available parameters. Might be useful for probabilities. Since the result may be a fracture, possibility to round result is provided.

Advanced Aggregations: Conc, Avg() / Geoavg() and Stddev()

~*[!CLASS_LONG,!FORCURKEY]/SearchResFlightRel/FlightBookRel$(@@CLASS,@FORCURKEY::!Average=AVG(@LOCCURAM,2);!PASSENGERLIST=CONC(@PASSNAME,",",@PASSNAME<>""))
Code Example 44,$(@@GROUP::!X=Avg(@Feld,Prec);!Y=Conc(@FELD,Sep,Cond)), Avg() and Conc()

The Average function without second parameter returns the value with 34 digits. This is probably more exact than you need, so it is possible to add the number of decimal places (supported by all 3 "advanced aggregations" which work on Numbers).

Conc() (please distinguish with the concatenate function, which is not an aggregation function) concatenates the handed over (with parameter 1) value separated by parameter 2. As third parameter it is possible to specify a condition. In this case it is used to sort out non-filled entries. In contrast to the LIST() function, this is not done automatically.

Please note that the Group condition is a long text field here, consequently the sort condition has to reflect this also.

Geoavg() calculates the geometrical average (nth root on the product of all elements). Calculation is done using logarithm. All numbers in the calculations must be positive.

Stddev() calculates the standard deviation.

Minima. Maxima.

~*/SearchResFlightRel/FlightBookRel$(@@CLASS,@FORCURKEY::!R=Min(@LOCCURAM,concatenate(@PASSNAME," paid ",""+@FORCURAM," ",@FORCURKEY)))
Code Example 45, $(@GROUP::!X=Min(@Feld,Returnvalue)), Min() and Max()

The example shows the usage of the Min/Max method. The result is assembled based on the first Min/Max row found with the second parameter. If no second parameter is provided, the min/max value is returned directly.

Example 46 shows, that aggregations are not necesarilly basing on attributes directly. The example calculates the maximum paid for a kilogram of luggage, with the side condition that the weight should be bigger than zero. Using the iff() function is a standard way to deal with these side condition, but please note that in this case it is actually not needed since the division is anyway defined in a way that 0 is returned if it is attempted to divide by zero. This is done to avoid dumps at run time which can not be detected at design time.

~*/SearchResFlightRel/FlightBookRel$(*::!R=Max(iff(@LUGGWEIGHT>0,@LOCCURAM%@LUGGWEIGHT,0)))
Code Example 46, $(*::!X=Max(function(@Feld1,@Feld2))), Max based on complex expression

First(), Last(), GrpIndex(), Any()

Some "easy" aggregations. First() returns the value corresponding to the first row, the aggregation may be equiped with a condition . Last() basically does ... nothing. It is just included for the sake of completeness, you will have the same effect if you state the parameter directly.

With GrpIndex() you might access the nth row (n is the first parameter, second parameter specifies the value to be returned). In contrast to direct indexing, n has to be positive here.

Any() uses the randomizer library to choose an arbitrary row from which the return is assembled.

Search(), SearchFirst()

SearchFirst() and Search() are searching for specific rows fullfilling the specified search condition. SearchFirst chooses the first found entry, Search all - or if no self linking is used, in consequence the last row.

~*/SearchResFlightRel/FlightBookRel$(*::!R=SearchFirst(@LUGGWEIGHT>30,@Passname))
Code Example 47, $(*::!X=SearchFirst(@F1,@F2)), SearchFirst

The following example shows the teamwork of the Search and the list function. The search function searches all bookings with "overweight" (considered to be above 24) and adds the name of the passenger to the list of overweighted passengers:

~*/SearchResFlightRel/FlightBookRel$(@@CLASS::!OVERWGHT=Search(@LUGGWEIGHT>24,LIST(!OVERWGHT,",",@PASSNAME)))
Code Example 48, $(*::!X=Search(Cond,func(!X))), Searches returning lists

Median()

Median() sorts the attributes and chooses the one in the middle (respectively one of the middle pair). Median is supported for Numeric Values, Strings and Dates. Median() has no optional rounding parameter.

Even though it sounds similar to the average, technically it is a bit more costly. For all other aggregations it is sufficient to keep one value per group, for the Median the complete list has to be kept. Performance seems to be harmed only by a smaller factor as the sort_in features of ABAP tables are used.

A Small leftover: ToRoot-Relation ("...")

~*/SearchResFlightRel/FlightBookRel/...$/*
Code Example 49,..., to root

Additional to the relation to the object itself ("."), the relation to the parent (".."), the relation to the root object is now available using the ... syntax. In contrast to the parent, the root is unambigous.

Aggregation Table

The letters refer to the 4 possible datatypes ( String, Date, Boolean, Number ). The GRPINDEX function for example has a first parameter of type Number follwed by a second parameter of any type. Conc has either two parameters of type String, or two string parameters followed by a Boolean parameter as third parameter.

Name	Main Parameters	Optional Parmeters	Description
COUNT	no	B	counts all elements where the condition evals to true
SUM	N	no	sums up handed over value
PRODUCT	N	N	multiplizes handed over entries. Optional parameter provided possibility to round the return
MIN	SDNB	SDNB	returns second parameter for the first found global minimum. If second parameter is not supplied first parameter will be returned
MAX	SDNB	SDNB	maximum, equivalent to minimum
AVG	DN	N	returns arithmetical average. Optional parameter provided possibility to round the return (not in case of date values), e.g. 0 rounds to whole digits, 2 to 2 decimal places.
GEOAVG	DN	N	returns geometrical average. Optional parameter see AVG.
FIRST	SDNB	no	returns entry from first row
LAST	SDNB	no	returns entry from last row (equivalent to the usage of the parameter directly)
GRPINDEX	N, SDNB	no	returns entry from row specified with parameter 1. Negative indexing is not possible here, since the size of the group is not known.
ANY	SDNB	no	returns one of the rows chosen by equal probability
STDEV	N	N	returns standard deviation. Optional parameter provided possibility to round the return.
CONC	S, S	B	concatenates first parameter separated by second parameter. Optionally only rows which fullfill the condition specified as third parameter are taken into account. If all rows evaluate to false, an empty string will be returned.
SEARCH	B	SNDB	returns parameter 2 belonging to row where condition is met. This means typically the last row fulfilling the condition, or, in case parameter 2 includes its own target an hand-made aggregation of all found rows (see examples)
SEARCHFIRST	B	SNDB	returns parameter 2 belonging to first row where condition is met.
MEDIAN	SND	no	returns median element from the list of attributes

Lesson 11 will deal with enhancements done mainly on the sub-function.