
DELETE WHERE sample case that i would like to share with you

former_member192616
Active Contributor
0 Kudos

Hi,

Analyzing a program the other day, I got the following hit list in SE30 as TOP 2 and TOP 3 of the most expensive calls, sorted by net time descending:


  Number        Gross =         Net Gross (%) Net (%) Call  
   11.975  23.651.879    15.612.054   12,0    7,9  Loop at IT_126570   
8.698.597   8.039.825  =  8.039.825    4,1    4,1  Append IT_126574 

source code excerpt:


REFRESH it_dict[].

it_dict[] = ist_dict[].
DELETE it_dict WHERE charactname  NE pv_charactname
                   AND   language_int = pv_language_int.

it_dict and ist_dict are both sorted tables with a non-unique key consisting of charactname, language_int, tabix, and value_char.
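For reference, the declarations would look roughly like this (a sketch only; the concrete field types are my assumption, not taken from the original program):

TYPES: BEGIN OF ty_dict,
         charactname  TYPE atnam,      " assumed: characteristic name
         language_int TYPE spras,      " assumed: language key
         tabix        TYPE sytabix,
         value_char   TYPE atwrt,      " assumed: characteristic value
       END OF ty_dict.

DATA: it_dict  TYPE SORTED TABLE OF ty_dict
               WITH NON-UNIQUE KEY charactname language_int tabix value_char,
      ist_dict LIKE it_dict.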

Note: The DELETE WHERE is internally processed as a LOOP WHERE which APPENDs the lines to be deleted to an internal system table; that table is afterwards used in a system call that does the actual delete (not visible). Since ST12 (which I normally use) combines the LOOP and the APPEND into a single line that shows DELETE as the call, I used SE30 here to show all the details.
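Conceptually, the internal processing corresponds to something like the following ABAP sketch (illustration only; the kernel does this in C with internal structures):

DATA: lt_idx TYPE STANDARD TABLE OF sytabix,
      lv_idx TYPE sytabix,
      lv_i   TYPE sytabix.

* collect the line numbers of the rows to be deleted
LOOP AT it_dict TRANSPORTING NO FIELDS
     WHERE charactname NE pv_charactname
       AND language_int = pv_language_int.
  APPEND sy-tabix TO lt_idx.
ENDLOOP.

* the system call then deletes the collected lines in one go; in ABAP
* we would have to delete back to front to keep the indexes valid
lv_i = lines( lt_idx ).
WHILE lv_i > 0.
  READ TABLE lt_idx INDEX lv_i INTO lv_idx.
  DELETE it_dict INDEX lv_idx.
  lv_i = lv_i - 1.
ENDWHILE.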

Now what?

(How) can we optimize it?

How much time can we save?

I will follow up with more details later, since there are some interesting things to show, I think.

Kind regards,

Hermann


Former Member
0 Kudos

I wonder if this would be faster:

REFRESH it_dict[].

LOOP AT ist_dict WHERE
   charactname  EQ pv_charactname  AND
   language_int NE pv_language_int.
  APPEND it_dict.
ENDLOOP.

Not entirely sure of the logic, but the concept is to do it in one step.

Rob

0 Kudos

Hi Rob,

that was not tried, but this:

>

REFRESH it_dict[].
> 
> LOOP AT ist_dict WHERE
>    charactname  EQ pv_charactname  AND
>    language_int = pv_language_int.
>   INSERT it_dict.
> ENDLOOP.

Instead of inserting everything and deleting the unwanted lines, it was changed to inserting only the wanted ones.

What do YOU think? Should it be faster?

Kind regards,

Hermann

0 Kudos

Since both tables are presumably sorted in the same sequence, I would expect that the APPEND might be faster, but it would be interesting to compare all three cases.

Rob

0 Kudos

Hi Rob,

so you would say:

the slowest one is the original?

then the loop with INSERT?

and the fastest one the loop with APPEND?

I have to ask for some patience, since I will get more details tomorrow...

Kind regards,

Hermann

0 Kudos

Yes - but it's not much more than a studied guess. I know nothing of the workings of the kernel.

I'm basing this on the fact that to APPEND to sorted tables, the appended records have to be in the correct sort sequence. This should be the case here. To INSERT records, the kernel would have to figure out where to insert each one. Here it would always be the last record.

I think.

(Where's Siegfried when you need him?)

Rob

0 Kudos

Hi Rob,

> Yes - but it's not much more than a studied guess. I know nothing of the workings of the kernel.

> I'm basing this on the fact that to APPEND sorted tables, the appended records have to be in the correct sort sequence.

> This should be the case here. To INSERT records, the kernel would have to figure out where to insert each.

> Here it would always be the last record.

and you are correct with that.

Inserting into a sorted table requires a (binary) search to find the place to insert, and putting the record there.

Appending to a sorted table requires a check of the last line of the table before the append, in order to confirm that the sort sequence is still correct (if not, a dump (illegal sort order) is thrown). Appending to sorted tables is a tiny little bit more "expensive" than appending to standard tables due to this check...
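A minimal sketch of that check (if the new line is smaller than the current last line, you get the illegal-sort-order dump):

DATA lt_sorted TYPE SORTED TABLE OF i WITH NON-UNIQUE KEY table_line.

APPEND 1 TO lt_sorted.  " ok, table is empty
APPEND 3 TO lt_sorted.  " ok, 3 >= last line (1)
APPEND 2 TO lt_sorted.  " runtime error: 2 < 3 violates the sort order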

Kind regards,

Hermann

former_member182566
Active Contributor
0 Kudos

How many distinct values in charactname and language_int, and how are they distributed? Is that known?

0 Kudos

Hi Rui,

> How many distinct values in charactname and language_int, and how are they distributed? Is that known?

that was/is not known. What we can see from the trace is that on average each DELETE removed 726 lines.

Some samples were taken from the debugger:

some deletes removed ~200 out of ~200 lines and some removed ~1150 out of ~1200 lines...

Kind regards,

Hermann

Former Member
0 Kudos

Hi Hermann,

> The DELETE WHERE is internally processed as a LOOP WHERE which APPENDS the lines to be deleted to an internal system table which afterwards is used in a system call that does the delete

When you say that it APPENDs "the lines to be deleted", do you mean content (sounds like it) or references (index/line numbers, which I'd hope)?

Side note: Yours and Rob's code examples look like tables with header lines, where you add the header line it_dict as many times as lines in ist_dict match the condition (so more like a shorthand for showing intent than actual coding?).

> (How) can we optimize it? How much time can we save?

I'd say, as always: it depends. For overall optimization I favor the "insert only required entries" approach. I suspect that despite the fast lazy copy for it_dict[] = ist_dict[], the following DELETE statement probably triggers a complete copy, correct?

From your book (http://www.sap-press.com/products/ABAP-Performance-Tuning.html) I understood that for sorted tables we have three possible cases: stored sorted in memory, or stored unsorted with either a linear or a tree index. So in theory one could imagine that instead of doing a full copy of the table before a modification statement, SAP just creates another index that points to the same table and then just manipulates that index (the disadvantage would be that indirect access is then required even for a LOOP AT, which doesn't use key fields). I'm pretty sure though that this is most likely not what happens...

Anyhow, we're lacking an efficient single statement like INSERT FROM itab WHERE for the insertion of chosen table rows. I suspect that in many (most?) cases the switch from one statement to many statements (especially loops) causes a significant and measurable overhead.

However, I still prefer the "insert only required entries" approach as long as the involved table is large and a considerably smaller percentage is to be copied. The convenient and hidden memory organization, in my opinion, often causes problems with inexperienced developers who think of performance in terms of runtime only...

Also, I suspect the runtime measurement for the DELETE approach does not include the runtime of the garbage collector for cleaning up, but that might not be substantial (correct me if I'm wrong).

Please feel free to enlighten us with more details on how this all works internally... ☺

Cheers, harald

P.S.: I guess the other upside of the "insert only required entries" approach in this particular example should be that we wouldn't need an index on the target table...

0 Kudos

Hi Experts,

Whether to append or to delete may be decided by the number of rows in the source and after the deletion, but some hints may be good for both.

1) Check in the declaration of this sorted table whether field charactname is positioned as the first column of the sort key, because only then will the access use the table index; if not, a full scan of the ist_dict table will be performed.

2) Remove (if possible) the sorted declaration of table it_dict and do a manual SORT at the end of the LOOP.

If this doesn't solve it at kernel execution level, I would add to Rob's suggestion the usage of field symbols, to access the row directly in memory rather than copying it to a work area before copying it again to the destination.

LOOP AT ist_dict ASSIGNING <fs_dict> WHERE
   charactname  EQ pv_charactname  AND
   language_int NE pv_language_int.
  APPEND <fs_dict> TO it_dict.
ENDLOOP.

Just my 2 cents...

Regards, Fernando Da Rós

P.S.: If I were you I would look around the code to check whether such a huge volume in that internal table is really necessary; if it came from the database, maybe not...

0 Kudos

Hi Harald,

> When you say that <i>APPENDS the lines to be deleted</i> do you mean content (sounds like it) or references (index/line numbers, which I'd hope)?

Indexes or keys (more is not necessary)

> Your's and Rob's code examples look like tables with header lines, where you add the header line <i>it_dict</i> as many times as lines in <i>ist_dict</i> match the condition (so more like a shorthand for showing intent than actual coding?).

yes, for me it was laziness. (I'd rather write code that writes code instead of writing code...)

> I'd say as always - it depends. For overall optimization I favor the "insert only required entries" approach. I suspect that despite the fast lazy copy for it_dict[] = ist_dict[] the following DELETE statement probably triggers a complete copy, correct?

Yes, the whole table body is copied.

Kind regards,

Hermann

0 Kudos

Hi Rob, Rui, Harald, Fernando,

ok, the next step was the INSERT:


LOOP AT ist_dict ASSIGNING <fs>
   WHERE charactname  EQ pv_charactname
     AND language_int = pv_language_int.
  INSERT <fs> INTO TABLE it_dict.
ENDLOOP.

result:


  Number        Gross =         Net Gross (%) Net (%) Call

5.573.397  29.387.079  = 29.387.079   14,0   14,0  Insert IT_140813
   11.975  39.419.066    10.031.987   18,8    4,8  Loop at IT_140791

I was a little bit surprised, since I expected an improvement, but we made things worse.

The optimized loop + single-line insert was in sum more expensive than the DELETE WHERE, which is an array operation consisting of a non-optimized LOOP WHERE + APPEND of the indexes to be deleted + the actual delete.

We "improved" from 23 seconds to 39.3 seconds, great.

We can see that a total of 5.573.397 rows is inserted (the rows to keep), that is, on average 465 rows per execution (the delete removed on average 726 lines).

We then switched to APPEND:


  Number        Gross =         Net Gross (%) Net (%) Call

11.975  14.821.852     9.314.814    7,7    4,8  Loop at IT_140790
5.573.397   5.507.038  =  5.507.038    2,9    2,9  Append IT_140812
11.964   3.491.469  =  3.491.469       1,8     1,8  Insert IT_140813 

That was appending single lines to a temporary standard table and an array insert from that temp table to the sorted table it_dict.
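In code, that variant looked roughly like this (a sketch; lt_tmp and the ty_dict line type are assumed names, see the declaration sketch above):

DATA lt_tmp TYPE STANDARD TABLE OF ty_dict.   " same line type as it_dict
FIELD-SYMBOLS <fs> TYPE ty_dict.

LOOP AT ist_dict ASSIGNING <fs>
     WHERE charactname  = pv_charactname
       AND language_int = pv_language_int.
  APPEND <fs> TO lt_tmp.                      " cheap single-line appends
ENDLOOP.

INSERT LINES OF lt_tmp INTO TABLE it_dict.    " one array insert into the sorted table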

Now we had ~18.3 seconds (coming from 23.6 seconds in the original version).

Directly appending should lead to a total of ~14-15 seconds (no insert anymore, and the append only a tiny little bit slower).

So what do you think? Happiness?

Kind regards,

Hermann

0 Kudos

Hhmm, I'd say it's kind of expected that the INSERT .. INTO TABLE is rather awkward. Assuming that the lookup for the insertion position is done using a traditional binary search, each INSERT should end up with the maximum number of lookups to get to the end of the table, i.e. log₂(lines(it_dict)). For a table with 5 million entries this should be on average 21 lookups per insert...

For the DELETE on the other hand we just need the one-time table scan to identify entries (so the same loop as required for the INSERT), collect all relevant index references of the lines to be deleted, and can then DELETE the entries via index. I'm not sure how this works, but it's clear that there has to be some index adjustment and possibly some memory clean-up (though this might be a deferred task for the garbage collector?!). The index adjustment should be fast for a linear index and probably slower for a tree index (though the times still seem kind of slow). But this is really guesswork unless you reveal some more internals here...

> So what do you think? Happiness?

Yes, in the end it's nice to see the APPEND solution turn up as the best one. Bliss would be to get a statement INSERT/APPEND .. LINES OF itab WHERE ...

Anyhow, one last comment to check whether my understanding is correct: only the DELETE solution should lead to building an extra index for the table. The INSERT and APPEND solutions should both result in a directly sorted table without any additional index requirement. If that's true, I'm almost wondering whether that might also be a possible performance improvement for future table lookups, due to one less level of indirection (but maybe too tiny to measure).

Please correct me if I stated any rubbish...

Cheers, harald

0 Kudos

Hi Harald,

>Hhmm, I'd say it's kind of expected that the INSERT .. INTO TABLE ...end up with the maximum number of lookups to get to the end of the table,

You are right here. But I didn't know the number of inserts beforehand. With the figures given we tried it, since it was stated that the majority of the lines gets deleted. I was surprised by the high number of single-line inserts. Furthermore, the DELETE WHERE scans the whole table while the LOOP WHERE is optimized (we have the leading part of the key), but again: the bigger the result set (the hits of the WHERE), the smaller the gain. My first thought was: if we can get rid of the unoptimized loop (from the delete) and replace it with an optimized one, we will be faster. I didn't take into account that there might be many, many inserts.

>For the DELETE ... But this is really guesswork unless you reveal some more internals here...

I'm no kernel developer and only read and understand C/C++ a tiny little bit...

But I learned that the effort of the delete system call depends on the distribution of the rows to be deleted: adjacent ranges are better than "every 2nd line"... Index and memory management is of course a different topic as well, and should depend on the same, I guess.

Generally we should not have to think about this.

>>So what do you think? Happiness?

>Yes, in the end it's nice to see the APPEND solution turn up as the best one.

the best one, SO FAR.

>Bliss would be to get a statement INSERT/APPEND .. LINES OF itab WHERE ...

You are on the right track... You know that we have array inserts/appends:

INSERT/APPEND LINES OF itab1 FROM i1 TO i2 TO itab2...

When it comes to tuning we check:

- whether the execution of the coding is necessary at all -> this is the case here

- whether we can make the execution faster -> we did this here

Another step is to check whether we can reduce the number of executions, to speed it up further.

Do you think we can improve it further to reach the ultimate happiness?

> Only the DELETE ... should both result in a directly sorted table

Theoretically it should be like this, but you never know, since there are many rules/exceptions governing when an index is created and when not. If I'm interested in it (and normally I'm not) I check it in the debugger.

Kind regards,

Hermann

0 Kudos

Hi Hermann,

Can you please repeat what you had in the beginning and what you have now?

Maybe I am not seeing the logic correctly but I don't think that with this:

DELETE it_dict WHERE charactname  NE pv_charactname
                   AND   language_int = pv_language_int.

and this:

LOOP AT ist_dict ASSIGNING <fs>
   WHERE charactname  EQ pv_charactname
     AND language_int = pv_language_int.
  APPEND ...
you'll end up with the same records.

0 Kudos

> You know that we have array inserts/appends: INSERT/APPEND LINES OF itab1 FROM i1 TO i2 TO itab2

Yes I knew, but... each time I need an array insert/append it's either all or based on some condition. Never encountered a need so far for adding consecutive ranges of rows...

> Do you think we can improve it further to reach the ultimate happiness?

Unqualified comment: Nope, I suspect that's it. Depending on usage one could of course think of adjustments, like having it_dict just as a standard table instead of a sorted one, or, if ist_dict stays around anyhow, defining our own index of the rows that we need (so just a key with a reference to the corresponding row in ist_dict). I'm tempted to think though that most of such improvements would be rather obscure and violate our beloved KISS principle...

For me this was an interesting, yet somehow disconnected discussion. The programs I've had to tune so far usually were dominated by database operations; any overly long ABAP processing could always be traced to silly mistakes like doing lookups in huge internal standard tables and similar things. It's probably the use model, where essentially large data quantities first have to be retrieved from the database, but in the end the actual processing never was overly complicated and could easily be tuned (giving the "red lantern" back to the SQL queries).

Thanks for the new insights. Please keep posting.

0 Kudos

Hi Rui,

hm...

REPORT z_del_test_1.

TYPES: BEGIN OF t1,
         arbgb LIKE t100-arbgb,
         sprsl LIKE t100-sprsl,
       END OF t1.

DATA: wa  TYPE t1,
      itm TYPE STANDARD TABLE OF t1,
      it  TYPE STANDARD TABLE OF t1.

wa-sprsl = 'D'.
wa-arbgb = 'A'.
APPEND wa TO itm.

wa-sprsl = 'D'.
wa-arbgb = 'B'.
APPEND wa TO itm.

wa-sprsl = 'D'.
wa-arbgb = 'C'.
APPEND wa TO itm.

it = itm.

DELETE it WHERE arbgb NE 'B' AND sprsl = 'D'.

LOOP AT it INTO wa.
  WRITE / wa.
ENDLOOP.

ULINE.

LOOP AT itm INTO wa WHERE arbgb = 'B' AND sprsl = 'D'.
  WRITE / wa.
ENDLOOP.

Result:

test
B D
----------
B D

0 Kudos

Hi Harald,

> Nope, I suspect that's it.

we ended at 1.3 seconds.

I'll tell you how, after lunch...

Kind regards,

Hermann

0 Kudos

Hermann,

D is not the only language that exists: that's called being ethnocentric..

Try changing one of the languages to another value.

0 Kudos

If you started with:

char = p_char AND lang NE p_lang

If you want the opposite (you want to APPEND the ones you DON'T want to DELETE), and from what I learned in school:

NOT ( char = p_char AND lang NE p_lang ) 
becomes
char NE p_char OR lang = p_lang

Right !?

0 Kudos

Hi Rui,

you are right if we have more than one language.

This however, they stated, will not be the case since

the selections are based on sy-langu.

But tests are planned for sure.

Kind regards,

Hermann

0 Kudos

Hi Rui,

>

char = p_char AND lang NE p_lang

the other way round:

>

char NE p_char AND lang = p_lang

Kind regards,

Hermann

0 Kudos

>

> Hi Rui,

>

> you are right if we have more than one language.

> This however, they stated, will not be the case since

> the selections are based on sy-langu.

Ok, then I suppose the problem is simpler because it is not:

DELETE it_dict WHERE charactname  NE pv_charactname
                   AND   language_int = pv_language_int.

but just

DELETE it_dict WHERE charactname  NE pv_charactname

(since language doesn't really matter). Which is cool, because your 1.3-seconds solution will be valid for more real-life scenarios.

Is it easy to know what other values there are for charactname (the ones we want to discard)?

Rui

0 Kudos

I also seem to remember that DELETE FROM .. TO.. is much faster. For that we would need to determine where that characteristic starts (that is fast, with a single READ) and where it ends (for that we would need to loop until a different one appears, I suppose).

Don't know how fast that would be, but I suppose not the 1.3 secs we are aiming at.

0 Kudos

Hi Rui,

> I also seem to remember that DELETE FROM .. TO.. is much faster.

Interesting comment: that is something we didn't test so far. It would still copy all lines from ist_dict to it_dict and then delete the unwanted ones. I assume it will be slower than 1.3 seconds; it would be interesting to see, though. But I think I should not ask for further tests given the result already achieved.

Kind regards,

Hermann

0 Kudos

Hi Rob, Rui, Harald,

ok, we arrived here:


 Number        Gross =         Net Gross (%) Net (%) Call
   11.975  14.821.852     9.314.814    7,7    4,8  Loop at IT_140790
5.573.397   5.507.038  =  5.507.038    2,9    2,9  Append IT_140812

14 seconds. Neither the LOOP (which works with ASSIGNING; INTO is slower) nor the APPEND (which shows 1 microsecond per call (line)) can be made faster; that is, I don't have any more ideas there.

Next step: how often do we execute the (optimized) calls?

Well, the APPEND obviously 5.573.397 times (giving 5.5 seconds, since each append needs 1 microsecond). And the LOOP? 11.975 times (777 microseconds per call). The LOOP of course does 5.573.397 assigns as well... but, as tested, INTO is slightly slower.

Can we reduce these numbers?

As Rui already said, if we were able to determine the range that we need, we could switch from single-line processing to an array operation.

Finding the first line: easy... READ ... BINARY SEARCH ... TRANSPORTING NO FIELDS ... and we have sy-tabix.

Finding the last one: moving on step by step, checking the key? That would take 5.507.038 steps and 5.507.038 checks. Would this be faster than the LOOP? We didn't try... Since we would do it on the ABAP statement level, I would assume the kernel can do it faster. So that would only address the APPEND, which is 5.5 seconds; even if we eliminated that operation entirely, we would still be left with 9.3 seconds for the LOOP.

The last bit came from the application developer, who knows the application in detail: ist_dict is filled in chunks per charactname. At that point he fills an additional global sorted table with key charactname and a lines count. That value is read after the first line has been found with the binary search. Now we have the first and the last line (first + the lines value from the global sorted table). We then do

INSERT LINES OF ist_dict FROM ind_from TO ind_to INTO TABLE it_dict.
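Putting it together, the whole pattern looks roughly like this (a sketch; the names and the layout of the global lookup table are assumptions):

TYPES: BEGIN OF ty_cnt,
         charactname TYPE atnam,   " assumed type
         lines       TYPE i,
       END OF ty_cnt.
DATA: gt_cnt   TYPE SORTED TABLE OF ty_cnt WITH UNIQUE KEY charactname,
      ls_cnt   TYPE ty_cnt,
      ind_from TYPE sytabix,
      ind_to   TYPE sytabix.

* find the first line of the block; charactname is the leading key field,
* so the READ is an implicit binary search on the sorted table
READ TABLE ist_dict WITH KEY charactname = pv_charactname
     TRANSPORTING NO FIELDS.
ind_from = sy-tabix.

* the line count per charactname was recorded while ist_dict was filled
READ TABLE gt_cnt INTO ls_cnt WITH TABLE KEY charactname = pv_charactname.
ind_to = ind_from + ls_cnt-lines - 1.

INSERT LINES OF ist_dict FROM ind_from TO ind_to INTO TABLE it_dict.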

Now we have (not counting the filling of the global sorted table, which is in the one-digit millisecond range in total):


 Number        Gross =         Net Gross (%) Net (%) Call
 11.964   1.098.808  =  1.098.808    0,7    0,7  Insert IT_140817
 11.964     126.771  =    126.771    0,1    0,1  Read Table IT_140794
 11.964      42.611  =     42.611    0,0    0,0  Read Table IT_140795

From 23.6 seconds down to 1.3. Now we have reached ultimate happiness.

Kind regards,

Hermann

0 Kudos

Hi,

I'm now closing the "question"/thread, giving 2 points to everybody who answered, because I appreciated your contributions.

Hope you found this sample case as interesting as I did.

Kind regards,

Hermann

0 Kudos

Hermann (et al),

Thanks for the exercise, it was interesting.

I was feeling like working this afternoon, so I made some tests. Let me add my conclusions. Please also check whether I am measuring what really needs to be measured; it is easy to make silly mistakes and include between the GET RUN TIMEs things that really shouldn't be there.

I started with two internal tables:

DATA: lt_cawnt  TYPE SORTED TABLE OF cawnt WITH NON-UNIQUE KEY atinn spras adzhl datuv,
      lt_cawnt2 TYPE SORTED TABLE OF cawnt WITH NON-UNIQUE KEY atinn spras adzhl datuv.

From CAWNT I will read 3 different atinn's:

0000000907 >> 2396 entries

0000000924 >> 6215 entries

0000000973 >> 1942 entries

I will then filter only 924. I chose these because 924 is around 60% of the total (I think it was more or less the same in your example) and because I wanted the "correct" one to be in the middle (because of the FROM and TO logic).

First case: Copy and Delete

lt_cawnt2[] = lt_cawnt[].
DELETE lt_cawnt2 WHERE atinn <> lc_atinn.

Second case: Loop and Insert

LOOP AT lt_cawnt[] ASSIGNING <fs> WHERE atinn = lc_atinn.
  INSERT <fs> INTO TABLE lt_cawnt2.
ENDLOOP.

Third case: Loop and Append

LOOP AT lt_cawnt[] ASSIGNING <fs> WHERE atinn = lc_atinn.
  APPEND <fs> TO lt_cawnt2.
ENDLOOP.

Fourth case: Copy and Delete From To

lt_cawnt2[] = lt_cawnt[].
READ TABLE lt_cawnt2 ASSIGNING <fs> WITH KEY atinn = lc_atinn.
lv_to = sy-tabix - 1.
DELETE lt_cawnt2 FROM 1 TO lv_to.
LOOP AT lt_cawnt2 ASSIGNING <fs> WHERE atinn <> lc_atinn.
  lv_from = sy-tabix.
  EXIT.
ENDLOOP.
DELETE lt_cawnt2 FROM lv_from.

Fifth case: Loop and Append From To

READ TABLE lt_cawnt ASSIGNING <fs> WITH KEY atinn = lc_atinn.
lv_from = sy-tabix.
LOOP AT lt_cawnt ASSIGNING <fs> FROM lv_from WHERE atinn <> lc_atinn.
  lv_to = sy-tabix - 1.
  EXIT.
ENDLOOP.
APPEND LINES OF lt_cawnt FROM lv_from TO lv_to TO lt_cawnt2.

0 Kudos

Hi Rui,

interesting. Please share your results with us.

Looks good at the first glance.

you may want to add a 6th test case if you want a little surprise.

same as case 5, only with the APPEND changed to INSERT (that is currently our "final version"):

READ TABLE lt_cawnt ASSIGNING <fs> WITH KEY atinn = lc_atinn.
lv_from = sy-tabix.
LOOP AT lt_cawnt ASSIGNING <fs> FROM lv_from WHERE atinn NE lc_atinn.
  lv_to = sy-tabix - 1.
  EXIT.
ENDLOOP.
INSERT LINES OF lt_cawnt FROM lv_from TO lv_to INTO TABLE lt_cawnt2.

Kind regards,

Hermann

0 Kudos

I also measured only the code to determine FROM and TO (to see how much you'd save if you already know them in advance).

READ TABLE lt_cawnt ASSIGNING <fs> WITH KEY atinn = lc_atinn.
lv_from = sy-tabix.
LOOP AT lt_cawnt ASSIGNING <fs> FROM lv_from WHERE atinn NE lc_atinn.
  lv_to = sy-tabix - 1.
  EXIT.
ENDLOOP.

These were my results (they slightly vary from execution to execution, but not that much):

Start (     10.553  lines )                                                                                
Copy + Delete                     4.130  (      6.215  lines )  
Loop + Insert                    24.104  (      6.215  lines )  
Loop + Append                    10.676  (      6.215  lines )  
Copy + Delete From To             1.494  (      6.215  lines )  
Loop + Append From To             2.932  (      6.215  lines )  
Determine From To                   960

So:

a) In my case the 1st approach was faster than the 3rd one (in your case you noticed an improvement there).

b) Copy + Delete From To seems to be the fastest alternative in this case (especially if we already have FROM and TO).

c) CAWNT only has 10 fields. I suppose that for a bigger table COPYing all the records (to delete afterwards) would not be that good, so the 5th alternative would probably start to be better.

0 Kudos

Just read your last comment, so now with the surprise..

Start (     10.553  lines )                                                                                
Copy + Delete                     4.154  (      6.215  lines )     
Loop + Insert                    24.308  (      6.215  lines )     
Loop + Append                    10.845  (      6.215  lines )     
Copy + Delete From To             1.487  (      6.215  lines )     
Loop + Append From To             2.932  (      6.215  lines )     
Loop + Insert From To             1.772  (      6.215  lines )     
Determine From To                   958

(I must confess it is a surprise.. I probably wouldn't even think of trying it.)

0 Kudos

Hi Rui,

thanks for your tests. Looking at your figures

>

Start (     10.553  lines )                                        
>                                                                    
> Copy + Delete                     4.154  (      6.215  lines )     
> Loop + Insert                    24.308  (      6.215  lines )     
> Loop + Append                    10.845  (      6.215  lines )     
> Copy + Delete From To             1.487  (      6.215  lines )     
> Loop + Append From To             2.932  (      6.215  lines )     
> Loop + Insert From To             1.772  (      6.215  lines )     
> Determine From To                   958

maybe we should try the DELETE FROM TO as well, since it could save a little bit more...

> Just read your last comment, so now with the surprise..

> (I must confess it is a surprise.. I probably wouldn't even think of trying it.)

I was surprised too.

Please don't ask me why the INSERT is faster than the APPEND: I don't know (yet).

I would have expected the append to be the same or faster... but not slower.

Kind regards,

Hermann

0 Kudos

More tests while I haven't deleted the program...

If I keep the first characteristic (and not the one in the middle), the winner is the same (but the 1st case is now slower than the 3rd):

(I had to make small adjustments to correctly use FROM and TO when the characteristic is not in the middle)

Copy + Delete                     5.381  (      2.396  lines )
Loop + Insert                     7.888  (      2.396  lines )
Loop + Append                     2.994  (      2.396  lines )
Copy + Delete From To               509  (      2.396  lines )
Loop + Append From To             1.041  (      2.396  lines )
Loop + Insert From To               691  (      2.396  lines )
Determine From To                   400

If I keep the last one, then option 6 is now clearly the best:

Start (     10.553  lines )                                                                                
Copy + Delete                     5.582  (      1.942  lines ) 
Loop + Insert                     6.573  (      1.942  lines ) 
Loop + Append                     3.361  (      1.942  lines ) 
Copy + Delete From To             1.581  (      1.942  lines ) 
Loop + Append From To               922  (      1.942  lines ) 
Loop + Insert From To               566  (      1.942  lines ) 
Determine From To                   301

Finally, if the characteristic we are looking for is actually not there, then:

Start (     10.553  lines )                                                                                
Copy + Delete                     6.280  (          0  lines )      
Loop + Insert                        14  (          0  lines )      
Loop + Append                         4  (          0  lines )      
Copy + Delete From To               173  (          0  lines )      
Loop + Append From To                10  (          0  lines )      
Loop + Insert From To                 8  (          0  lines )      
Determine From To                     4

Well, sorry for continuing to post on your answered thread. We are ready for the next one..

0 Kudos

Hi Rui,

thanks for your interesting comments and tests. Appreciated.

Regarding APPEND/INSERT, I asked development:

The generic INSERT has an internal optimization that kicks in in special cases, e.g. if the source and the target table are of the same type AND the target table is empty. That's why the INSERT is faster in our tests, where we have the same table types and empty target tables.
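If that explanation is right, the fast path applies in a situation like this (sketch):

DATA: lt_src TYPE SORTED TABLE OF t100 WITH NON-UNIQUE KEY sprsl arbgb msgnr,
      lt_dst LIKE lt_src.   " same table type as the source

* ... fill lt_src (at least 20 lines for this example) ...

* lt_dst is still empty here, so the generic insert can take the optimized path
INSERT LINES OF lt_src FROM 10 TO 20 INTO TABLE lt_dst.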

Kind regards,

Hermann

0 Kudos

Rui, Hermann,

Interesting stuff, missed out on the last parts of the conversation because I had to get some sleep...

One small comment on the additional coding for the range inserts/deletes. Whenever we do a READ on an internal table or LOOP at an internal table and we don't need any of the data, i.e. like in our case where we just wanted to retrieve index positions, I'd use the variation READ TABLE .. TRANSPORTING NO FIELDS (http://help.sap.com/abapdocu_70/en/ABAPREAD_TABLE_OUTDESC.htm#!ABAP_ALTERNATIVE_4@4@) or LOOP AT .. TRANSPORTING NO FIELDS (http://help.sap.com/abapdocu_70/en/ABAPLOOP_AT_ITAB_RESULT.htm).

I couldn't do any testing, but I suspect the runtime doesn't really change. In theory though it might still be a minor optimization, e.g. if the table is accessed via index we should be able to skip the pointer assignment for referencing the actual data. However, I think the code is much cleaner, because it clearly shows intent and possibly even avoids the definition of a useless variable (which we would then have to explain in the extended program check). I know that within this thread we were not out for clean coding, but I still thought I'd mention it, because I'm always surprised how seldom this option is used by developers...
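Applied to the "determine FROM and TO" snippet, that would be (a sketch, reusing Rui's variable names):

READ TABLE lt_cawnt WITH KEY atinn = lc_atinn TRANSPORTING NO FIELDS.
lv_from = sy-tabix.
LOOP AT lt_cawnt TRANSPORTING NO FIELDS FROM lv_from WHERE atinn NE lc_atinn.
  lv_to = sy-tabix - 1.
  EXIT.
ENDLOOP.

No work area and no field symbol needed, which is exactly the point.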

Oh, and Rui, my compliments for spotting the inaccurate application of De Morgan's laws (though it sounds like in the end, in this particular case, it was still OK, but initially that wasn't obvious). I never even thought that far to judiciously check the conditions.

Thanks to all for the interesting contributions. In the end I'd say that the real benefit would come if SAP introduced the option to bulk append/insert lines from one internal table into another based on a WHERE condition. Even if it's just syntactic sugar for now, we'd all benefit from later optimizations.

Curiously awaiting similar future postings.

former_member194613
Active Contributor
0 Kudos

I did not read the last page yet. But I have massive problems with the inversion of logical expressions, which is discussed here


NOT ( A NE A0 AND B = B0)

is


A = A0 OR B NE B0

This is standard logic.

The rest is problematic, as the same coding is probably used both for cases where

+ only a few records are deleted

+ and for cases where a lot of records are deleted.

Depending on the case, either the positive or the negative expression should be faster. I do not see how you can know in advance which case it is.


former_member194613
Active Contributor
0 Kudos

If no other language than sy-langu is involved, then leave it out.

Be aware that the sorted table allows an APPEND, but you must take care that the APPEND works and does not lead to a dump, i.e. if the first table is a sorted table and the second one shall have the same order, then you can use the LOOP and the APPEND. This should be the fastest solution.

Only in cases where just a few records have to be deleted will the DELETE WHERE be faster.

0 Kudos

Hi,

the language has been removed in the meantime.

But neither the APPEND (line by line) nor the array APPEND FROM TO was the fastest; it was the generic INSERT FROM TO. In our case the APPEND FROM TO was about 3 times slower (3 seconds) than the INSERT FROM TO (1 second). This was a little surprise for me. But I learned that in very specific cases (same table type and empty target table) the generic INSERT has an internal optimization, which is not the case for the APPEND.

Kind regards,

Hermann

former_member194613
Active Contributor
0 Kudos

When I compare your measurements I have more problems:

DELETE WHERE


  Number        Gross =         Net Gross (%) Net (%) Call  
   11.975  23.651.879    15.612.054   12,0    7,9  Loop at IT_126570   
8.698.597   8.039.825  =  8.039.825    4,1    4,1  Append IT_126574 

LOOP and INSERT


Number        Gross =         Net Gross (%) Net (%) Call
 
5.573.397  29.387.079  = 29.387.079   14,0   14,0  Insert IT_140813
   11.975  39.419.066    10.031.987   18,8    4,8  Loop at IT_140791

The LOOP AT ... WHERE with the positive condition (the lower one) should be faster than the LOOP with the NE condition, much faster.

The smaller number of APPENDs should also be faster. And the COPY in the first case does not appear at all!!

So the LOOP and APPEND should be much faster here.