on 12-09-2014 8:11 AM
Hi all,
we have about 3,500 processes a day and the DB is growing very fast. So I began archiving via the "java archiving cockpit", which is very "awesome" to use, but that's another story. My main problem is that archiving one day of production takes about 3 to 4 HOURS. The system has already been running for about 1.5 years. I'm scared to even calculate the time it would take to get the system "clean". Is there any chance to make this faster?
Many thanks,
Tobias
Hi Tobias,
Parallel archiving is typically quite fast. Of course, database health is a significant factor in how fast it can run, and you need available CPU cores. Monitor the database during the archiving run to check query and index performance, and verify that proper indexes are used for all the queries. Use the Open SQL monitors in NWA to spot expensive queries or DML operations.
Regarding your IO errors, make sure the ICM settings described in SAP Note 1623502 are correct. My preference is to write small files: leave the default of 50 MB, and perhaps lower it if you still get IO errors after applying note 1623502. If the IO errors persist, have a system administrator check the IO on the system where you write the archives; if you are writing over a shared filesystem, there may be network problems.
To understand your performance expectations, it's best to start without parallel archiving and work up to the maximum parallelism your system can support.
Begin with:
archiving.maxNumberofJobs = 5
archiving.maxNumberofProcessPerJob = 1000
archiving.maxNumberofSeries = 1 <== ensures you only run one Job at a time
This will archive a maximum of 5,000 process instances in one archiving session. You mention your daily average is 3,500, so this would be all that's needed to keep the DB at a constant number of instances. Check your archive store to see how many files you have after a session, and check the duration of the jobs. Then increase the parallelism (maxNumberofSeries) and the number of jobs in each series as needed.
Understand that maxNumberofJobs determines how long a single series runs, and maxNumberofSeries determines how many jobs run in parallel. Each series (parallel job) can easily consume a full CPU core, so be careful not to overload the system: if you only have two cores and you try to run 6 series, you won't see any speed improvement. As a rule of thumb, I never set maxNumberofSeries higher than 1/2 the number of CPU cores on either the APP or DB servers when I archive in off-hours, and I wouldn't set it higher than 1/6 the number of cores when archiving while business operations are ongoing.
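That rule of thumb and the session math can be sketched as a small helper (a hypothetical calculator, not an SAP tool; the core counts and divisors are just the numbers from this post):

```python
# Hypothetical sizing helper for the rule of thumb above: cap the number
# of parallel series at half the CPU cores off-hours, one sixth during
# business hours, and compute how many instances one session can archive.

def max_series(cpu_cores: int, off_hours: bool) -> int:
    """Suggested upper bound for archiving.maxNumberofSeries."""
    divisor = 2 if off_hours else 6
    return max(1, cpu_cores // divisor)

def instances_per_session(jobs: int, processes_per_job: int) -> int:
    """Maximum process instances one archiving session can handle."""
    return jobs * processes_per_job

# With the starter settings above (5 jobs x 1000 processes, 1 series):
print(instances_per_session(5, 1000))   # 5000
print(max_series(8, off_hours=True))    # 4
print(max_series(8, off_hours=False))   # 1
```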
regards, Nick
We're getting desperate here: archiving is still very slow, and the most time-consuming part seems to be the deletion of entries in BC_BPEM_BL_ENTRY, which contains half a billion entries.
Am I correct in assuming that this table contains all the business logs that can be seen in the process monitoring perspective in NWA?
If so, wouldn't it be an option to delete entries from this table directly in the database, e.g. "delete from BC_BPEM_BL_ENTRY where OCCURRED_ON < [some date 6 months in the past]", and then use the standard archiving?
Would this only affect process monitoring in NWA, because detail information for processes older than 6 months would no longer be present, or might it cause other problems?
My hope is that the system would behave as if the business log level had been set to "error" or "none" in the first place.
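As a purely illustrative aside: if anyone did attempt a direct delete like the statement above (which is not a supported operation and may break referential integrity with other BPM tables, so check with SAP support first), deleting half a billion rows in one statement would create an enormous transaction. Batching keeps log usage bounded. A minimal sketch, using SQLite as a stand-in for the real database:

```python
# Hedged sketch only: deleting from BC_BPEM_BL_ENTRY directly is NOT a
# supported operation -- this merely illustrates batched deletion so a
# single huge transaction is avoided. SQLite stands in for the real DB.
import sqlite3

def delete_in_batches(conn, cutoff, batch_size=10_000):
    """Delete old business-log rows in small committed batches."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM BC_BPEM_BL_ENTRY WHERE rowid IN ("
            "  SELECT rowid FROM BC_BPEM_BL_ENTRY"
            "  WHERE OCCURRED_ON < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()
        if cur.rowcount == 0:
            return total
        total += cur.rowcount

# Demo: 25 old rows and 5 recent ones in an in-memory stand-in table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BC_BPEM_BL_ENTRY (OCCURRED_ON TEXT)")
conn.executemany("INSERT INTO BC_BPEM_BL_ENTRY VALUES (?)",
                 [("2014-01-01",)] * 25 + [("2014-12-01",)] * 5)
print(delete_in_batches(conn, "2014-06-01", batch_size=10))  # 25
```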
Hi Tobias,
I would like to get some more details about your system/processes.
1. What is the typical size of each process instance?
2. How many nodes are there in system where BPM is running?
3. Are these simple processes, or do they also contain reference processes? If yes, typically how many of them are there per instance?
Regards
Satish Bihari
Hi Satish,
1. How do I get the typical size of a process? Is there any statistic I can use?
2. We have two nodes running BPM
3. We have referenced processes. These are called between five and seven times per instance. They are not all different process definitions; we also call the same referenced process several times with different values, depending on the business case.
Regards
Tobias
You can roughly estimate the size of a process by its process context size. The process context consists of the data objects that store values coming from outside the process and are used to transfer values across artifacts.
Say I send 1 MB of XML content to the start event and store it as a data object: my process context would be ~1 MB. Similarly, if I have intermediate events that also store values coming from outside, those need to be added, as do any data objects I create and fill as part of output mapping.
This aspect is relevant because process context size impacts archiving performance: the larger the process context, the more time it takes to archive the process instance.
Another aspect that impacts archiving performance is the business log level. If it is set to "full", the process creates a large number of business log entries, and archiving accordingly takes more time.
Another shortcut to find the size of a process is to archive one instance (give the process instance ID) in the archiving cockpit, then check the file size in the archive location. This gives a fair idea of the size of the process (hoping that most process instances are of similar size). This approach also tells us how long it takes to archive one instance.
One more aspect: as you have two nodes, running parallel archiving with more than 2 jobs may not be very effective, since more than one job would get scheduled on the same node. So, in your case, the max number of jobs should be 2.
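The single-instance measurement above can be extrapolated into a rough session estimate (a back-of-the-envelope sketch; all numbers are placeholders, and near-linear scaling across jobs is an assumption, not a guarantee):

```python
# Rough extrapolation from a single-instance archiving test. The
# per-instance timing and job counts are placeholders -- measure your
# own system; linear scaling across parallel jobs is an assumption.

def estimate_session_hours(seconds_per_instance: float,
                           instances_per_day: int,
                           parallel_jobs: int) -> float:
    """Estimated wall-clock hours to archive one day of production."""
    return seconds_per_instance * instances_per_day / parallel_jobs / 3600

# e.g. 4 s per instance, 3500 instances/day, 2 jobs (one per node):
print(round(estimate_session_hours(4.0, 3500, 2), 2))  # 1.94
```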
Regards
Satish Bihari
Dear Tobias,
Sorry for the delayed response. It should be maxNumberofJobs. It works as follows:
Series1 Series2
|-----------------|-----------------|
| job1 | job3 |
| job2 | job4 |
|-----------------|-----------------|
Here job1 and job2 run in parallel if that many nodes are available. Similarly, job3 and job4 run in parallel, but only after job1 and job2 finish. Basically, Series 2 starts after Series 1.
Note: There is a slight chance that job1 and job2 get scheduled on the same node even though a free node is available.
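That scheduling can be modeled in a few lines (a toy model of the behavior described in this thread, not official SAP documentation): series run one after another, and each series costs as long as its slowest parallel job.

```python
# Toy model of the scheduling above (an assumption based on this thread,
# not official documentation): jobs inside one series run in parallel,
# and each series starts only after the previous one finishes.

def session_duration(job_durations_per_series: list[list[float]]) -> float:
    """Total wall-clock time: series are sequential, jobs within a
    series run concurrently, so each series costs its longest job."""
    return sum(max(series) for series in job_durations_per_series)

# Two series of two jobs each, durations in minutes:
print(session_duration([[30, 25], [28, 32]]))  # 62
```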
Hope this clears your confusion.
Regards
Satish Bihari
I have another question. I set up the parameters so that a maximum of 5k processes are archived in one run.
What happens if I schedule a daily job that selects all completed processes older than 90 days, which are of course way more than 5k? Would the system just select the first 5k completed processes it can find, then the next batch the next day, and so on?
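For planning purposes, the backlog arithmetic can at least be sketched (the figures below reuse this thread's 3,500/day and 5,000/session numbers; whether the cockpit really picks the oldest 5k first is exactly the open question here):

```python
# Planning arithmetic only: if each daily session archives at most 5,000
# instances while ~3,500 newly eligible ones appear per day, the backlog
# shrinks by the difference each day.
import math

def days_to_drain(backlog: int, per_session: int, new_per_day: int) -> int:
    """Days until the eligible backlog is gone at these rates."""
    net = per_session - new_per_day
    if net <= 0:
        raise ValueError("backlog never shrinks at these rates")
    return math.ceil(backlog / net)

# e.g. 90 days of completed processes waiting (90 * 3500 = 315,000):
print(days_to_drain(90 * 3500, 5000, 3500))  # 210
```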
Regards Tobias.
Are you using parallel archiving?
See Parallel Archiving (New) - What's New in SAP NetWeaver 7.3 EHP1 (Release Notes) - SAP Library
Hello Christian,
yes we are using parallel archiving. Currently we are able to run 4 jobs at the same time (with 1000 process instances each). If we try more jobs, we get an IO exception while writing to the filesystem. Also we are not really aware of how much the additional jobs may affect the performance of the production.
Regards,
Tobias
Hi Tobias,
it is recommended that you run archiving outside of regular business hours so that it does not affect production performance.
If you face performance issues during archiving, it is best to have a DB admin monitor the database and check for bottlenecks (e.g. reorganizing indexes).
Regards,
Christian
SAP provides out-of-the-box functionality to archive BPM processes which is comparatively faster (way faster) than other methods; it takes hardly a few minutes.
Refer to the document: BPM Archiving Functionality from SAP
Hope it helps.