01-27-2012 7:38 PM
Hello all,
I have been given a task to improve the performance of a program that uses the DELETE statement to delete directly from the database.
This is a custom table - the client is not foolish enough to, say, delete directly from MARA...
the statement is simple:
DELETE FROM ztab WHERE cldat EQ p_date.
Note that cldat is the first field in the primary key of this table.
The issue arises because ztab has approximately 8 million records and the delete will be removing on average about 1.5 million records.
The database we are using is DB2 (don't know the version yet), and from previous research I have seen that DELETE... CLIENT SPECIFIED may help performance - possibly due to how the optimizer works? Does anyone have experience with this?
Another method I plan to test is breaking up the DELETE into more manageable chunks - i.e. deleting 100,000 records at a time. I have not been able to find any literature on how to determine the optimal size of these data intervals. Is there anything out there besides trial and error?
Finally, would having a COMMIT WORK after every DELETE help performance by clearing the LUW? Or is having it at the end of the process sufficient?
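For reference, the chunked variant I have in mind would look roughly like this - a sketch only, with a made-up block size, since the block size and COMMIT placement are exactly what I'm asking about:

```abap
" Sketch: delete ztab in blocks via an internal table (100,000 is a guess).
DATA: lt_block TYPE STANDARD TABLE OF ztab.

DO.
  " Fetch the next block of rows still matching the date.
  SELECT * FROM ztab
           INTO TABLE lt_block
           UP TO 100000 ROWS
           WHERE cldat = p_date.
  IF sy-subrc <> 0.                  " nothing left to delete
    EXIT.
  ENDIF.
  DELETE ztab FROM TABLE lt_block.   " delete by primary key
  COMMIT WORK.                       " end the LUW, free the rollback space
ENDDO.
```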
Thank you all,
James
01-27-2012 7:51 PM
I think you should break it up into smaller chunks with a COMMIT after each delete. The reason is not performance, but the rollback area. If you delete too many records without a commit, the rollback area may be exceeded with a dump.
Rob
01-27-2012 8:00 PM
Rob,
Thanks. I am thinking the same thing and have suggested that, but the client is looking for some hard - or at least hardish - numbers on performance and rollback size before we make any changes.
James
01-27-2012 8:09 PM
I think the numbers you are looking for are specific to your system. Your DBAs should be able to give you some helpful data.
It might also help to do a database re-org before the delete.
Rob
01-30-2012 1:11 PM
Hello guys,
you are giving some advice to split the deletion into smaller chunks, but you don't say how.
I did not see the "packagesize" parameter in the delete statement and I don't think it exists.
If there are already additional fields that allow splitting the deletion into smaller intervals - fine. But what if there are none?
Dropping/recreating indexes is a good idea. If there are no indexes, I expect the deletion to be very fast. And I hope it will not cause problems with the rollback segment. At the end of the day, 1.5 million entries is not THAT many.
Regards,
Yuri
01-30-2012 1:38 PM
My impression from reading the initial post was that James already knew how to split and was now wondering about the optimal block size.
Standard program RSTBPDEL can serve as an example, here they chose 2000 as block size. RBDCPCLR uses more complex logic. The common approach seems to be to fill an internal table and use that in "DELETE dbtab FROM itab" statements.
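Roughly, the shape of that pattern (a sketch from memory, not the actual RSTBPDEL code; table and field names follow James' example):

```abap
" Sketch of the block-wise delete pattern with a held cursor.
DATA: lt_block TYPE STANDARD TABLE OF ztab,
      lv_cur   TYPE cursor.

OPEN CURSOR WITH HOLD lv_cur FOR
  SELECT * FROM ztab WHERE cldat = p_date.
DO.
  FETCH NEXT CURSOR lv_cur INTO TABLE lt_block PACKAGE SIZE 2000.
  IF sy-subrc <> 0.
    EXIT.
  ENDIF.
  DELETE ztab FROM TABLE lt_block.
  " WITH HOLD keeps the cursor open across the database commit.
  COMMIT WORK.
ENDDO.
CLOSE CURSOR lv_cur.
```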
Thomas
01-30-2012 2:49 PM
> My impression from reading the initial post was that James already knew how to split and was now wondering about the optimal block size.
>
> Standard program RSTBPDEL can serve as an example, here they chose 2000 as block size. RBDCPCLR uses more complex logic. The common approach seems to be to fill an internal table and use that in "DELETE dbtab FROM itab" statements.
>
> Thomas
And all this only to resolve the problem of the rollback segment size.
Because from the performance point of view, it should definitely be slower.
01-30-2012 2:59 PM
Yuri, I know you're an expert working at SAP on these topics, so is there any formula for this? Of course it is faster without such block processing, however I was taking the rollback area size problem quite seriously so far, are you saying we should not worry about it? I'm thinking there must be a threshold from when on it starts becoming a potential problem.
Thanks
Thomas
01-30-2012 3:09 PM
> Yuri, I know you're an expert working at SAP on these topics, so is there any formula for this? Of course it is faster without such block processing, however I was taking the rollback area size problem quite seriously so far, are you saying we should not worry about it? I'm thinking there must be a threshold from when on it starts becoming a potential problem.
> Thanks
> Thomas
Well, it depends on the size of the area. And probably on other factors like the width of the table. I guess that deleting from a table with 50 fields of 20 bytes each will be different from deleting from a table with 5 fields of 10 bytes each.
Thomas, I can try and test tomorrow when I have some free time.
01-30-2012 6:44 PM
Back from the weekend and lots of great replies - thanks to all.
To start with, performance is my #1 priority. When looking at the program and being informed of the number of records being deleted, my next thought was that there was no COMMIT WORK, so there is a chance of the operation failing in the middle and rolling back. My third thought was that SAP would probably want to handle this in smaller bites, so I was looking for an optimal size.
UPDATE: The 8 million records was apparently the test system. Production has 82 million records, so I was off by a factor of 10. Not sure if the number of records to delete will increase as well - looking into it.
Also, I have not had my access approved to the system, so I cannot test anything yet - apparently it takes more than a week. I will keep everyone posted.
James
01-30-2012 6:58 PM
Also bear in mind that system parameters may be different in production vs. test system. So results from test system are not necessarily a reliable indicator of how things will work in production.
Rob
01-30-2012 7:08 PM
Rob,
I already KNOW that they are different and have mentioned this to the client, so any improvement I get will only be a guide and not a hard prediction. As you would expect in this situation, there are more app servers and CPUs available on the production side. I have mentioned this to the client but have not had a chance to look at load balancing, session balancing, etc. on the servers.
At the moment I am focusing on the ABAP side because that is all I can do at the moment, but I am definitely keeping an eye on the resource/Basis side of this.
James
01-30-2012 7:26 PM
More info because I know you all love it:
Custom table has 32 fields, 507 char wide.
In the dev system I am looking at, the table size category is 8 (4.3M to 8.6M). The production table has 82M records. Would it make much of a difference to bump this up to size category 9 (8.6M to 340M)?
Data class is APPL1 (transaction data) and buffering is not allowed.
01-27-2012 9:13 PM
Rob,
Most definitely this will be collaborative with the Basis team. I was hoping for some general guidelines a bit more specific than "more than 100 and less than 100,000,000" to start with. Fine tuning we can do as we go and get data relevant to this particular instance.
It is day 3 of this contract and I don't have a ton of info yet, but the client is still looking for some numbers to start from.
Thanks.
James
01-27-2012 9:19 PM
I mentioned this in another thread in this forum a couple of days ago and since no one disagreed, I'll mention it here.
If there are secondary indexes on the table, consider dropping them and then recreating them after the delete. This should save the database the overhead of maintaining the index for each delete.
Rob
01-27-2012 9:49 PM
For commit blocks, I think I have seen values between 500 and 10,000 rows in SAP programs, it depends on some factors like row length, available resources, system parameters...
If the blocks are too small, then the frequent commits cause too much overhead, if they are too large, you run into the problems described by Rob.
In any case you should implement this block size as a parameter in the deletion program, so you can zoom in on the best size for your scenario after a few tries.
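Something like this on the selection screen (names are made up):

```abap
" Sketch: block size as a tunable selection-screen parameter.
PARAMETERS: p_date  TYPE ztab-cldat OBLIGATORY,
            p_block TYPE i DEFAULT 5000.   " rows per DELETE/COMMIT cycle

" ...inside the deletion loop, the parameter then drives the fetch size:
" SELECT * FROM ztab INTO TABLE lt_block
"          UP TO p_block ROWS WHERE cldat = p_date.
```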
Thomas
01-27-2012 9:59 PM
Thomas,
Yup, was planning to set up a parameter - if only for testing. Thanks for the reply.
James
01-31-2012 11:26 PM
Marking as ANSWERED.
Thanks to everyone for their comments. Ended up using blocks of up to 25,000 max and deleting them with a COMMIT WORK after each block. The run is much, much faster now - under 30 min. Client is happy and I am on to the next issue.
03-31-2016 8:23 AM
Hi James ,
Can you share your deletion statement for creating chunks and committing?
My problem is that the client has a copy of BSEG and now wants to delete data.
The new custom table has no indexes created, so no rollback is required.
The client currently uses SE14, but for security reasons wants a Z program.
Please suggest.
02-01-2012 1:26 PM
there is no optimal size for these blocks (or it would depend on too many factors). The difference between 10 x 10.000, 2 x 50.000 and 1 x 100.000 is marginal. Blocking has huge effects when you start it (100 x 1 versus 1 x 100) but becomes saturated.
>size category 9 (8.6M to 340M)?
size category has no effect; it is only informational and not very reliable.