How to avoid modeling errors in Netweaver BPM? Par...

Former Member · ‎04-27-2009

Preface

As a seasoned BPM practitioner, you will be aware of www.workflowpatterns.com, which is one of the best BPM resources on the Web. Conceived and maintained by BPM "overlords" Wil van der Aalst, Arthur ter Hofstede and others, it provides a comprehensive survey and classification of recurring control flow, data flow and resource perspective patterns. As such, it ranks among my favorite BPM web sites. And by the way, I currently have a separate SDN article in the making which will give you an overview of how we support many of these patterns in Galaxy.

For the time being (and as a "sneak preview"), I would like to elaborate on some of the more complicated patterns which do not even have a counterpart among the BPMN model elements. I am specifically talking about the "critical section" (WCP39), "thread split" (WCP42), and "thread merge" (WCP41) patterns.

Thread Split

Let's start with the simple stuff, which is how to spawn a number of concurrent threads (aka "tokens"). If that number is small and known at design time, we may actually make use of the infamous AND-split/XOR-merge pattern which I already talked about in "How to avoid modeling errors in Netweaver BPM? Part 1: Gateway fun!":

In the example, the downstream "Postfix Process" activity is triggered three times by the tokens spawned by the upstream AND-split/XOR-merge combination. If the number of threads to split is only known at runtime, a different pattern applies:

Here, a flexible "Total" number of threads is spawned using a plain loop pattern. For the sake of simplicity I omitted the details of how the "Prefix Process" fragment initializes the "Index" and "Total" data objects to "1" and the number of to-be-spawned threads, respectively.
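As a rough textual stand-in for the loop, here is a minimal Python sketch of how such a counter-driven split could behave. It is only one plausible reading of the pattern: the function name `thread_split` and the token labels are hypothetical, and the sketch assumes "Total" is at least 1.

```python
def thread_split(total):
    """Spawn `total` tokens with a plain loop over an "Index" counter.

    Assumption: total >= 1. Names and token labels are illustrative only.
    """
    tokens = []
    index = 1                              # "Index" data object, initialized to 1
    while True:
        tokens.append(f"token-{index}")    # one token leaves for the downstream flow
        if index >= total:                 # loop exit: all threads have been spawned
            break
        index += 1                         # otherwise loop back and spawn the next one
    return tokens
```

Calling `thread_split(3)` yields three tokens, mirroring the three triggers of "Postfix Process" in the design-time variant.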

Thread Merge

To later merge back concurrent flow (i.e., multiple tokens) on a single control flow branch, a "Thread Merge" pattern needs to reliably consume those tokens and (upon consumption of the last token) produce a single token for the downstream flow.

With AND-join gateways, BPMN already delivers synchronization gateways for concurrent flow (tokens) on multiple branches out of the box. When it comes to tokens residing on a single control flow branch, things are far more complicated than they may seem at first glance. For an illustration, let's have a look at a straightforward (yet erroneous) approach to synchronizing a "Total" number of tokens from a single branch:

The idea was to directly redirect ("Total"-1) tokens to the end event (thereby visiting the "Increment Index" activity) and only pass the last ("Total"-th) token to the "Postfix Process" downstream activity. The problem here is an inherent race condition between the XOR-split gateway (which compares the integer "Index" data object to the "Total" number of threads to be merged) and the "Increment Index" activity (which increments the "Index" data object by one). Specifically, there is no guarantee as to the order in which the Galaxy runtime executes particular process steps (here: the XOR-split gateway and the "Increment Index" activity). For instance, if two tokens are to be merged, it may actually happen that the XOR-split is triggered twice without any interleaved execution of the "Increment Index" activity. As a result, both tokens would be redirected to the "Increment Index" branch, omitting the "Postfix Process".
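To make the race tangible, the following Python sketch replays two fixed interleavings of the gateway check and the increment for two tokens. It is a deterministic simulation, not Galaxy runtime behavior: the function `naive_merge`, the step names and the token names "A" and "B" are all hypothetical, and "Index" is assumed to start at 1.

```python
def naive_merge(schedule, total):
    """Replay a fixed order of gateway checks and increments.

    `schedule` is a list of (step, token) pairs, where step is either
    "check" (the XOR-split) or "increment" (the "Increment Index" activity).
    Returns the branch each token was routed to.
    """
    index = 1                                  # assumed initial value of "Index"
    routed = {}
    for step, token in schedule:
        if step == "check":
            # XOR-split: only the token that sees Index == Total goes downstream
            routed[token] = "Postfix Process" if index == total else "Increment Index"
        elif step == "increment":
            index += 1
    return routed


# Token A finishes its increment before token B reaches the gateway.
good_order = [("check", "A"), ("increment", "A"), ("check", "B")]

# Both tokens pass the gateway before either increment has run.
racy_order = [("check", "A"), ("check", "B"), ("increment", "A"), ("increment", "B")]
```

With `good_order`, token B reaches "Postfix Process". With `racy_order`, both tokens see the stale "Index" value, both take the "Increment Index" branch, and "Postfix Process" is never triggered.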

We obviously have to avoid that race condition and make sure that before a successor token triggers the XOR-split, the predecessor token must have completed the "Increment Index" activity:

What we have done is to artificially put the "Increment Index" activity and the downstream XOR-split into a joint "critical section". Multiple tokens (representing threads) that emerge from the "Prefix Process" process fragment (note that this cannot be a plain activity, since a plain activity never spawns more than one token on its outbound edge) queue up in front of the AND-join gateway, where the first token is instantly synchronized with another token from the upstream AND-split. The successor tokens are only synchronized (one by one) once a predecessor token has left the critical section (i.e., passed through the "Increment Index" - XOR-split sequence). The very last token (when "Total=Index" holds true) will be directed to the "Postfix Process" activity.
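In programming terms, the AND-join with its single extra token behaves much like a binary semaphore. The Python sketch below illustrates that analogy with real threads. It is an analogy under stated assumptions, not how Galaxy implements the pattern: the function `thread_merge` and the token names are hypothetical, and "Index" is assumed to start at 0 so that the last token sees "Total=Index" right after its own increment.

```python
import threading


def thread_merge(total):
    """Merge `total` concurrent tokens into exactly one downstream token."""
    permit = threading.Semaphore(1)    # stands in for the token from the upstream AND-split
    state = {"index": 0}               # "Index" data object (assumed to start at 0)
    postfix_runs = []                  # tokens that reached "Postfix Process"

    def token(name):
        with permit:                           # AND-join: wait until the section is free
            state["index"] += 1                # "Increment Index"
            if state["index"] == total:        # XOR-split: "Total=Index"
                postfix_runs.append(name)      # only the last token continues
        # leaving the `with` block hands the permit to the next queued token

    threads = [threading.Thread(target=token, args=(f"token-{i}",))
               for i in range(total)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return postfix_runs
```

Because the increment and the comparison happen while the permit is held, no token can evaluate the gateway against a stale "Index", and `thread_merge(n)` always returns a list with exactly one entry.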

Critical Section

This example can, in fact, be generalized to a universal "Critical Section" pattern where at most one token passes through a process fragment at a time. Critical sections in executable process models come in handy to enforce exclusive use of resources.

The only assumption we are making is that the number of concurrent threads about to enter the critical section is explicitly known at runtime. For the sake of simplicity, I (again) omitted the details of how the "Index" and "Total" data objects are initialized.
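The generalized pattern can be sketched in Python as a small wrapper that lets an arbitrary fragment run under mutual exclusion and records how many tokens were ever inside at once. Again, this is only an analogy: the class `CriticalSection`, its attributes and the `demo` helper are hypothetical names, and a lock replaces the AND-join/permit-token construction of the process model.

```python
import threading


class CriticalSection:
    """Lets at most one token at a time execute a process fragment."""

    def __init__(self):
        self._permit = threading.Lock()   # plays the role of the single permit token
        self.inside = 0                   # tokens currently inside the section
        self.max_inside = 0               # highest occupancy observed so far

    def run(self, fragment, *args):
        with self._permit:                # queue up until the section is free
            self.inside += 1
            self.max_inside = max(self.max_inside, self.inside)
            try:
                return fragment(*args)    # the protected process fragment
            finally:
                self.inside -= 1          # leave the section, freeing the permit


def demo(total):
    """Send `total` tokens through one critical section and report occupancy."""
    section = CriticalSection()
    log = []
    threads = [threading.Thread(target=section.run, args=(log.append, i))
               for i in range(total)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return section.max_inside, sorted(log)
```

`demo(4)` reports a maximum occupancy of 1 while all four tokens still get their turn, which is the guarantee the pattern is meant to provide.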

So much on that end! Any comments on these patterns or suggestions on how to further simplify these are highly welcome. Stay tuned for my upcoming SDN article on workflow pattern coverage in Galaxy which will (hopefully) appear soon (sometime in May).