
CPS up and running but jobs not started / how to know if "CPS is alive"?

Former Member
0 Kudos

Hello,

Last weekend we had a major incident with our CPS database.

The root cause was a full database log ("db log full"). Unfortunately for us, the monitoring via Solman was not set up properly, and as a consequence no alert was generated.

Since the incident, the Solman monitoring has been reviewed and some metrics have been improved.

But we still have a problem. In Solman, the metrics have been designed to ping or check the availability of the database, the central instance, the J2EE status, etc., but we would like to monitor the real capacity of CPS to execute jobs, to be sure it is alive.

Currently, to be sure that CPS is up, running, and processing jobs, a new job “sent mail” has been created. This job runs 4x/day and sends an SMS with the subject “I’m alive”. A test has been done: if the DB log is full, the job cannot be processed, so no message is delivered.

But we would prefer to receive a message only when a problem occurs, meaning when the job sending the “I’m alive” message cannot run.

How can we catch a “missing” job, i.e. raise an alert in Solman once it has failed to run as scheduled twice?

Or if you have a better idea for managing the “I’m alive and working” check, you are welcome to share it.

Thanks,

Delphine

Accepted Solutions (0)

Answers (3)


Former Member
0 Kudos

Hello,

I prefer the first suggestion, from Guy. The reason is that when we detected the problem, I was still able to log on to CPS and the related NetWeaver, and in SAPMMC everything was "green", but no jobs had been submitted for the previous 5 hours.

On the other hand, the Solman specialist let me know that in this case a GRMG scenario would not be able to detect this.

Thanks a lot for your suggestions/collaboration.

Kind regards,

Delphine

Former Member
0 Kudos

Delphine,

Here is the ABAP code I use in SAP to check if the heartbeat job I set up in Redwood for each SAP instance has run or not.

Hopefully, you find it useful if you wish to implement my suggestion.

Regards

Guy

nanda_kumar21
Active Contributor
0 Kudos

I recommend setting up availability monitoring through GRMG in Solman.

Thanks

Nanda

Former Member
0 Kudos

We use a fairly simple method that is easy to implement: heartbeat monitors. For each system that CPS runs jobs on, we also set up an hourly CPS job called HEARTBEAT for that system that executes at the top of the hour. It can run a simple ABAP report like RSUSR000, or, if the system is a non-ABAP environment, it can create a flag file via a Redwood agent, for example.


An ordinary (SM36) SAP batch job (or a script run via cron in my other example) is then scheduled to run hourly on the half hour. In an ABAP environment, the job runs a report that checks table TBTCO to see if a job named HEARTBEAT has been executed by the CPS user in the last 45 minutes. If it can't find such a job, it assumes that there is a problem with CPS and sends an SMTP email to our support staff to check the issue. In my other, non-ABAP example, a shell script could check that the flag file created by the HEARTBEAT job is less than an hour old, and send an SMTP email if it is older than an hour.
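
For the non-ABAP case, the flag-file check could look roughly like the sketch below. It is written in Python rather than shell, and it is only an illustration under assumptions: the flag file path, the sender and recipient addresses, the subject text, and the SMTP host ("localhost") are all hypothetical placeholders, not values from our setup. A missing flag file is treated the same as a stale one.

```python
import os
import smtplib
import time
from email.message import EmailMessage

# "Older than an hour" as described above; adjust to your schedule.
MAX_AGE_SECONDS = 3600


def heartbeat_is_stale(flag_path, max_age_seconds=MAX_AGE_SECONDS, now=None):
    """Return True if the flag file is missing or older than max_age_seconds."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(flag_path)
    except OSError:
        # No flag file at all: the HEARTBEAT job never wrote one.
        return True
    return (now - mtime) > max_age_seconds


def build_alert(flag_path, sender, recipient):
    """Build the alert email (subject and wording are placeholders)."""
    msg = EmailMessage()
    msg["Subject"] = "CPS heartbeat missing"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"No recent HEARTBEAT flag file at {flag_path}; please check CPS."
    )
    return msg


def check_and_alert(flag_path, sender, recipient, smtp_host="localhost"):
    """Send an alert only when the heartbeat is stale; return True if sent."""
    if not heartbeat_is_stale(flag_path):
        return False
    # Assumes an SMTP relay is reachable on smtp_host without authentication.
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(build_alert(flag_path, sender, recipient))
    return True
```

Run from cron on the half hour, a call such as check_and_alert("/path/to/heartbeat.flag", "cps-monitor@example.com", "support@example.com") stays silent while the HEARTBEAT job keeps refreshing the file, and sends one email per run once it stops.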

If you don't have a 24x7 team that would pick up these alert emails, you can always use an app like Boxcar on your iOS device. It has an email address associated with it that makes the app sound a loud alert if a message is sent to that address. You then use that address in your SMTP alerts.