Archiving BPM Processes extremely slow

former_member191044
Active Contributor

Hi all,

we have about 3,500 processes a day and the DB is growing very fast, so I began archiving via the "Java Archiving Cockpit", which is very "awesome" to use, but that's another story. My main problem is that archiving one day of production takes about 3 to 4 HOURS. The system has already been running for about one and a half years. I'm scared to even calculate the time it would take to get the system "clean". Is there any chance to make this faster?

Many thanks,

Tobias

Accepted Solutions (1)

Former Member

Hi Tobias,

Parallel archiving is typically quite fast. Of course, database health is a significant factor in how fast it can run, and you need available CPU cores. Monitor the database during the archiving run to check query and index performance, and make sure proper indexes are used for all queries. Use the OpenSQL monitors in NWA to spot any expensive queries or DML operations.

Regarding your IO errors, make sure the ICM settings described in SAP Note 1623502 are correct. My preference is to write small files: leave the default of 50 MB, and perhaps lower it if you still get IO errors after applying the note. If the IO errors persist, have a system administrator check the IO on the system where you write the archives. If you are writing over a shared filesystem, there may be a network problem.

To understand your performance expectations, it's best to start without parallel archiving and work up to the maximum parallelism your system can support.

Begin with:

archiving.maxNumberofJobs = 5

archiving.maxNumberofProcessPerJob = 1000

archiving.maxNumberofSeries = 1  <== ensures you only run one Job at a time

This will archive a maximum of 5,000 process instances in one archiving session. You mention your daily average is 3,500, so this would be all that's needed to keep the DB at a constant number of instances. Check your archive store to see how many files you have after a session, and check the duration of the jobs. Subsequently, increase the parallelism (NumberofSeries) and the number of jobs in each series as needed.

Understand that NumberofJobs determines how long a single series runs, and NumberofSeries determines how many jobs run in parallel. Each series (parallel job) can easily consume a full CPU core, so be careful not to overload the system: if you only have two cores and you try to run 6 series, you won't see any speed improvement. As a rule of thumb, I never set maxNumberofSeries higher than 1/2 the number of CPU cores on either the APP or DB servers when I archive in off-hours, and I wouldn't set it higher than 1/6 the number of cores when business operations are ongoing (so on an 8-core server: at most 4 series off-hours, at most 1 during business hours).
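For example, once the baseline above runs cleanly, a cautious next step (using the same property names) could be:

archiving.maxNumberofJobs = 5

archiving.maxNumberofProcessPerJob = 1000

archiving.maxNumberofSeries = 2  <== two jobs now run in parallel

Then watch the CPU load on the APP and DB servers before raising it further.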

regards, Nick

former_member191044
Active Contributor

Hi Nick,

thank you very much for the detailed answer. I have just checked the settings and I wonder why there are two parameters named "archiving.maxNumberofProcessPerJob". Is this intentional? Which one should I adjust if I want more processes per job?

Regards,

Tobias

junwu
Active Contributor

some typo there....

Answers (5)

steffenmorawietz
Explorer

We're getting desperate here, because archiving is still very slow, and the most time-consuming part seems to be the deletion of entries from BC_BPEM_BL_ENTRY, which contains half a billion rows:

Am I correct in assuming that this table contains all the business logs that can be seen in the process monitoring perspective of the NWA?

If so, wouldn't it be an option to delete entries from this table directly in the database, e.g. "delete from BC_BPEM_BL_ENTRY where OCCURRED_ON < [some date 6 months in the past]", and then use the standard archiving?
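Concretely, something like the following is what I have in mind (a rough sketch only, assuming an Oracle database and that OCCURRED_ON is an indexed timestamp column; with half a billion rows the delete would have to run in small, committed batches rather than as one huge transaction):

-- repeat until no more rows are deleted, committing after each batch
DELETE FROM BC_BPEM_BL_ENTRY
 WHERE OCCURRED_ON < ADD_MONTHS(SYSDATE, -6)
   AND ROWNUM <= 100000;
COMMIT;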

Would this only affect process monitoring in the NWA, because detail information for processes older than 6 months would no longer be present, or might it cause other problems?

My hope is that the system would behave as if the business log level had been set to "error" or "none" in the first place.

satish_bihari
Explorer

Dear Steffen,

I would recommend that you raise a CSN on component BC-BMT-BPM-MON so that we can have a closer look at your system and involve other experts who can help here.

Regards

Satish Bihari

satish_bihari
Explorer

Hi Tobias,

I would like to get some more details about your system/processes.

1. What is the typical size of each process instance?

2. How many nodes are there in the system where BPM is running?

3. Are these simple processes, or do they contain referenced processes as well? If yes, typically how many of them are there per instance?

Regards

Satish Bihari

former_member191044
Active Contributor

Hi Satish,

1. How do I get the typical size of a process? Is there any statistic I can use?

2. We have two nodes running BPM

3. We have referenced processes. These are called between five and seven times per instance. They are not all different process definitions; we also call the same referenced process several times with different values, depending on the business case.

Regards

Tobias

satish_bihari
Explorer

You can roughly estimate the order of magnitude of a process by its process context size. The process context consists of the data objects that store values coming from outside the process and are used to transfer values across artifacts.

Say I send 1 MB of XML content to the start event and store it in a data object. This means my process context would be ~1 MB. Similarly, if I have intermediate events that also store values coming from outside, those need to be added, as do any data objects I create and fill as part of an output mapping.

This aspect is relevant because the process context size has an impact on archiving performance: the larger the process context, the more time it takes to archive the process instance.
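To put rough numbers on it: if each of your ~3,500 daily instances carried, say, a 1 MB context (an illustrative assumption, not a measurement), one day's archiving run would have to read, serialize, and write roughly 3.5 GB of context data alone, before any business logs are counted.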

Another aspect that impacts archiving performance is the business log level. If this is set to full, the process creates a large number of business log entries, and archiving accordingly takes more time.

Another shortcut to find out the size of a process is to archive one instance (by giving the process instance ID) in the archiving cockpit, then check the file size in the archive location. This gives a fair idea of the size of the process (assuming most process instances are of similar size). It also tells us how long it takes to archive one instance.

One more aspect: as you have two nodes, running parallel archiving with more than 2 jobs may not be very effective, since more than one job would get scheduled on the same node. So in your case the maximum number of jobs should be 2.

Regards

Satish Bihari

former_member191044
Active Contributor

I am a bit confused by the parameter setup. Do you want me to set archiving.maxNumberofJobs to 2, or archiving.maxNumberofSeries?

Regards Tobias

satish_bihari
Explorer

Dear Tobias,

Sorry for the delayed response. It should be maxNumberofJobs. It works as follows:

|    Series 1     |    Series 2     |
|-----------------|-----------------|
|      Job 1      |      Job 3      |
|      Job 2      |      Job 4      |
|-----------------|-----------------|

Here Job 1 and Job 2 run in parallel if that many nodes are available. Similarly, Job 3 and Job 4 run in parallel, but only after Job 1 and Job 2 have finished. Basically, Series 2 starts after Series 1 completes.

Note: There is a slight chance that in some cases Job 1 and Job 2 get scheduled on the same node even though a second node is free.

Hope this clears up your confusion.

Regards

Satish Bihari

former_member191044
Active Contributor

Dear Satish,

thank you for that very good explanation, now it is clear to me. I will try it with a maximum of 2 jobs but more series, check the runtimes, and give feedback in this thread.

Regards,

Tobias

former_member191044
Active Contributor


I have another question. I set up the parameters so that a maximum of 5,000 processes are archived in one run.

What happens if I schedule a daily job that selects all completed processes older than 90 days, which are of course far more than 5,000? Would the system just select the first 5,000 completed processes it can find, and then the next batch the next day, and so on?

Regards Tobias.

steffenmorawietz
Explorer

Also, would the system select the processes in a specific order, i.e. would the oldest processes get archived first in this scenario?

ch_loos
Product and Topic Expert
former_member191044
Active Contributor

Hello Christian,

yes, we are using parallel archiving. Currently we are able to run 4 jobs at the same time (with 1,000 process instances each). If we try more jobs, we get an IO exception while writing to the filesystem. Also, we are not really sure how much additional jobs might affect production performance.

Regards,

Tobias

ch_loos
Product and Topic Expert
Product and Topic Expert

Hi Tobias,

it is recommended to run the archiving outside of regular business hours so as not to affect production performance.

If you face performance issues during archiving, it is best to have a DB admin monitor the database and check for bottlenecks (e.g. by reorganizing indexes).

Regards,

Christian

former_member191643
Active Contributor

SAP provides out-of-the-box functionality to archive BPM processes, which is comparatively faster (way faster) than other methods. It hardly takes more than a few minutes.

Refer to the document: BPM Archiving Functionality from SAP

Hope it helps.

former_member191044
Active Contributor

Hello Siddhant and thanks for your reply,

this document shows how to use the "Java Archiving Cockpit", which I am already using. That tool is anything but fast.

former_member191643
Active Contributor

Hey Tobias,

Sorry, I did not go through your question properly. But I don't think there is any other way as of today.