11-26-2012 11:57 AM
Hi All,
We are working on performance improvement for a program and decided to take the approach of breaking the data (close to 500,000 records) into packets and processing each packet in a separate dialog work process, by calling an RFC-enabled function module in a separate task (CALL FUNCTION ... STARTING NEW TASK). The number of tasks started depends on the number of available work processes, determined via function module SPBT_INITIALIZE.
The executable program (report) will be scheduled in the background. It will select all the data (500,000 records), break it into packets (say, 4 packets of 125,000 records each) and call the function module once per packet, starting 4 tasks in parallel.
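A minimal sketch of the resource check we mean (the RFC server group name is a placeholder; the group must be defined in RZ12):

```abap
DATA: lv_max_wps  TYPE i,
      lv_free_wps TYPE i.

CALL FUNCTION 'SPBT_INITIALIZE'
  EXPORTING
    group_name                   = 'parallel_generators' " RZ12 group (example)
  IMPORTING
    max_pbt_wps                  = lv_max_wps   " total WPs in the group
    free_pbt_wps                 = lv_free_wps  " WPs currently free
  EXCEPTIONS
    invalid_group_name           = 1
    internal_error               = 2
    pbt_env_already_initialized  = 3
    currently_no_resources_avail = 4
    no_pbt_resources_found       = 5
    OTHERS                       = 6.
IF sy-subrc <> 0.
  " no parallel resources available: fall back to sequential processing
ENDIF.
```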
Since the volume of data is quite high, we need help with a few questions:
1) Although the records are independent, a packet could take more than 10 minutes (600 seconds) to process, and the dialog work process triggered by the asynchronous RFC would then time out. Is there a way to handle this?
2) What other ways are there to process the data?
Thanks in advance!
Shyam
11-27-2012 9:56 PM
Sounds quite simple: choose smaller packets to reduce the runtime of each individual task.
You need to place your RFC calls inside a DO loop over the number of packages, count the tasks started and finished in global variables, and start new tasks only as long as processes are available (use WAIT UNTIL for this). Also have a look at this example:
http://help.sap.com/saphelp_nwpi71/helpdata/en/43/5621defc1be74eb25de334f464b9cf/content.htm
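A minimal sketch of such a dispatch loop. The RFC-enabled module Z_PROCESS_PACKET, the packet table GT_PACKETS and the server group name are assumptions, not standard objects:

```abap
REPORT zparallel_dispatch.

DATA: gv_started TYPE i,
      gv_done    TYPE i,
      gv_task    TYPE string.

START-OF-SELECTION.
  LOOP AT gt_packets INTO gs_packet.   " gt_packets/gs_packet assumed
    gv_task = |PKT{ sy-tabix }|.
    DO.
      CALL FUNCTION 'Z_PROCESS_PACKET'
        STARTING NEW TASK gv_task
        DESTINATION IN GROUP 'parallel_generators'
        PERFORMING on_task_end ON END OF TASK
        TABLES
          it_records            = gs_packet-records
        EXCEPTIONS
          communication_failure = 1
          system_failure        = 2
          resource_failure      = 3.
      IF sy-subrc = 3.
        " no free work process: let at least one running task return
        WAIT UNTIL gv_done >= gv_started UP TO 5 SECONDS.
      ELSE.
        gv_started = gv_started + 1.
        EXIT.
      ENDIF.
    ENDDO.
  ENDLOOP.

  " block until every started task has called back
  WAIT UNTIL gv_done >= gv_started.

FORM on_task_end USING p_task TYPE clike.           "#EC CALLED
  " RECEIVE RESULTS FROM FUNCTION 'Z_PROCESS_PACKET' would go here
  gv_done = gv_done + 1.
ENDFORM.
```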
Thomas
11-28-2012 8:56 AM
Thanks Thomas for responding.
On further test runs, we realized that the data set might be huge in some cases, and a single task might then take more than 10 minutes. In such a case, the dialog process would time out.
Can we conclude that parallel processing using RFC will not work in such a case? If so, what are the other options for achieving parallel processing?
Regards,
Shyam
11-28-2012 9:40 AM
Hi,
if you use aRFC you have to make sure that each package will finish within the timeout value.
The sometimes-mentioned dirty trick of resetting the time slice should not be used. Work on the
package size and the run time of a package in order to achieve a stable solution.
Alternatively, you can use background jobs to process packages in parallel. They will not time
out, but they also will not report back when they are finished. Therefore you either need a master job
that pushes work to slave jobs, starting each one with a package, or you have the slave jobs pull the next package whenever they finish one, until none are left.
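A sketch of the "master job" variant, scheduling one background job per packet via JOB_OPEN / SUBMIT ... VIA JOB / JOB_CLOSE. The slave report ZPACKET_WORKER, its parameter P_PACKID and the variable LV_PACKET_ID are assumptions:

```abap
DATA: lv_jobname  TYPE btcjob VALUE 'ZPACKET_WORKER',
      lv_jobcount TYPE btcjobcnt.

CALL FUNCTION 'JOB_OPEN'
  EXPORTING
    jobname          = lv_jobname
  IMPORTING
    jobcount         = lv_jobcount
  EXCEPTIONS
    cant_create_job  = 1
    invalid_job_data = 2
    jobname_missing  = 3
    OTHERS           = 4.
IF sy-subrc = 0.
  " the slave report selects its own packet based on p_packid
  SUBMIT zpacket_worker
    WITH p_packid = lv_packet_id
    VIA JOB lv_jobname NUMBER lv_jobcount
    AND RETURN.

  CALL FUNCTION 'JOB_CLOSE'
    EXPORTING
      jobname              = lv_jobname
      jobcount             = lv_jobcount
      strtimmed            = 'X'        " release the job immediately
    EXCEPTIONS
      cant_start_immediate = 1
      invalid_startdate    = 2
      jobname_missing      = 3
      job_close_failed     = 4
      job_nosteps          = 5
      job_notex            = 6
      lock_failed          = 7
      OTHERS               = 8.
ENDIF.
```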
Kind regards,
Hermann
11-28-2012 12:11 PM
What exactly keeps you from preparing smaller packages that will not time out?
We probably need to understand more about the data to be processed and the dependencies within.
Thomas
11-28-2012 3:48 PM
"preparing smaller packages"
This!
It is often better to have more packages than processes; this usually leads to a much better (less skewed) distribution. Otherwise, one of the 4 packages often gets most of the "big" (expensive) data to do, while the other three are finished in seconds. The trade-off is a higher overall packet initialization overhead.
Besides that, depending on the SAP system (e.g. a system with more than 40 dialog work processes), I would not try to allocate all of them at once, so specifying an upper limit for the number of processes can make sense, ideally changeable through the report variant.
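A sketch of that cap, assuming the variant parameter P_MAXWP and the server group used with SPBT_INITIALIZE (both names are examples):

```abap
PARAMETERS p_maxwp TYPE i DEFAULT 10.   " upper limit, maintained in the variant

DATA: lv_free_wps TYPE i,
      lv_limit    TYPE i.

CALL FUNCTION 'SPBT_INITIALIZE'
  EXPORTING
    group_name   = 'parallel_generators'
  IMPORTING
    free_pbt_wps = lv_free_wps
  EXCEPTIONS
    OTHERS       = 1.

lv_limit = lv_free_wps.
IF p_maxwp > 0 AND p_maxwp < lv_limit.
  " never grab more work processes than the variant allows
  lv_limit = p_maxwp.
ENDIF.
```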
Cheers Michael
11-29-2012 3:41 AM
Hi Thomas,
The scenario is that we are creating data packets based on the customer and company code combination, selecting open line items and clearing them (transaction FB05) using the function module POSTING_INTERFACE_CLEARING.
When we did a data analysis in a system with production-like data, we realized that the data for some specific kunnr/bukrs combinations was huge and could not be processed within the timeout set for dialog processes. It is not possible to slice that data further. Hence, I guess the background jobs option would work in our case. Your thoughts?
Thanks,
Shyam
11-29-2012 7:53 AM
Hi,
Background processing would work in any case.
A workaround could be:
1) increase the timeout parameter,
2) run your program,
3) reset the timeout parameter.
Some customers are doing it like this.
Things you have to consider: the parameter rdisp/max_wprun_time is a dynamically changeable instance parameter and affects everybody on an application server (and you need to set it on every application server on which your program is supposed to run). Thus, any user can start long dialog processes in the time frame in which the parameter is raised. This can lead to out-of-work-process situations when many users start such programs. If your program runs, e.g., at night and you don't have many dialog users in parallel: fine. Otherwise you clearly have that risk.
The change of the parameter setting can be done in ABAP.
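One way this is commonly done is via the class CL_SPFL_PROFILE_PARAMETER; the exact signature and availability should be verified on your release (read the old value with GET_VALUE first rather than hard-coding the reset value, as done in this sketch):

```abap
DATA lv_rc TYPE i.

* raise the limit, e.g. to one hour
lv_rc = cl_spfl_profile_parameter=>change_value(
          name  = 'rdisp/max_wprun_time'
          value = '3600' ).
IF lv_rc <> 0.
  " change refused, e.g. missing authorization: do not start the run
ENDIF.

* ... start the long-running packages here ...

* reset to the previous value afterwards
lv_rc = cl_spfl_profile_parameter=>change_value(
          name  = 'rdisp/max_wprun_time'
          value = '600' ).
```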
In any case, involve a Basis colleague. And of course you would need the proper authorizations.
Additionally, the workload distribution needs extra care: you have to start the long-running package first (not as one of the last ones) and process (all) other packages in parallel to it...
Kind regards,
Hermann