You have the CCMS that goes PING?

JimSpath · ‎11-07-2009

CCMSPING

This tale starts with, perhaps, a foolish inconsistency. For a while, I noticed that Solution Manager Early Watch Alert reports were showing invalid system availability metrics. Not off by a little, but off by a lot, showing 49% or 50% uptime on systems that were absolutely present for end users for the entire week, barring known short backup periods. [UPDATE 18-Jan-2010: for the finale, see link at bottom]

After head scratching with the Basis team, I opened a ticket with SAP. No answer for a couple days, and then the surprise answer, "are you running CCMSPING?" A surprise, because neither the Early Watch report, nor any documentation I recalled seeing said that was the only valid method to gauge uptime. Why didn't the Early Watch report show the data source, and the quality of the analysis, and the alternatives? I'm not sure, but I will ask. Once I see whether the alternative works correctly. Which is where we are today.

Early Watch Alert

What does the alert look like?

Each week, a chart is produced, something like this:

That graphic was copied and pasted intact from a recent EWA report. If I had to guess why there is a spike in ABAP errors in July, I would be thinking, support packs went in. The metric we're interested in, because our customers feel it, is availability, shown as blue diamonds.

Let's take a closer look, shall we? This graphic is the lower left hand corner of the EWA image, zoomed a few hundred percent, with red lines and "100%" added for emphasis.

Rather challenging to be alerted that your system is not 100% available each week? How many 9s? As the Aussies say, "Yowie!"

Links:

SAP Help link

- Monitoring

My prior blogs mentioning ccmsping:

Master Strategy

Meanwhile, we were informed by one of our SAP contacts that the Solution Manager overview guide, ponderously titled "Master Guide - SAP Enhancement Package 1 for SAP Solution Manager 7.0" had been updated to include an analysis of how many Solution Manager systems are required for best practices.

The guide (10 MB PDF) is available below http://service.sap.com/instguides or, if the internet daemons are kind and have not moved the URL around:

https://service.sap.com/~sapidb/011000358700000291922009E.PDF

I redrew the 2 tier system drawing in Visio, as the original was fuzzy and included components we didn't use. To fit that 8.5x11 inch view onto a 500x700 pixel image for SCN, I exported it as a Scalable Vector Graphic, brought that into OpenOffice(.org), then rescaled it and saved it as a Portable Network Graphic file. The text didn't scale with the boxes, so you can generally read the font here. I can probably post the SVG file somewhere if anyone wants it.

VSD, SVG, ODG, PNG

One to Many, Many to Many?

After rereading the manuals for CCMS Monitoring and finding what I think are the latest online manuals (Monitoring Setup Guide / for SAP NetWeaver 7.0 / 2004s SPS 12), I still wasn't sure how many CCMSPING agents to deploy, and where. Our classroom exercise had us set up 2 agents, monitoring 2 systems, reporting to 2 central systems. I think one agent (or process) can monitor multiple systems, but we chose to set up our initial CCMSPING trial with one agent (or one process) connected to one remote system.

On the right side of the drawing above is our Solution Manager development environment. We started 4 agents, each with a different serial number, to connect to our 4 sandboxes. If this works, we'll set up the remainder of the agents on production, connected to everything but the sandboxes.

Agent pre-work

It might be a little confusing, but port values need to go into /etc/services (or Windows equivalent) before the availability checking can reach the destination. They look something like this:

sapmsDNA 3642/tcp
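Before starting an agent, it can help to confirm the entry really is in the services file. The function below is only a sketch, not something from the SAP manuals: the name has_service is made up for illustration, and the file is a parameter (defaulting to /etc/services) so the check can be tried against a copy first.

```shell
#!/bin/sh
# has_service NAME [FILE]
# Succeeds (exit 0) if FILE has a non-comment line whose first field is
# exactly NAME; fails otherwise. FILE defaults to /etc/services.
# The function name is hypothetical, chosen for this sketch.
has_service() {
    name=$1
    file=${2:-/etc/services}
    awk -v n="$name" '
        /^[[:space:]]*#/ { next }        # skip comment lines
        $1 == n          { found = 1 }   # service name is the first field
        END              { exit (found ? 0 : 1) }
    ' "$file"
}
```

For example, `has_service sapmsDNA || echo "sapmsDNA is missing from /etc/services"` would flag a host where the message server port was never added.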

Agent registration

The manuals say to start up the agent via a command line process. We used Windows in class, UNIX in the real world. A sample:

$ ccmsping -R -push -n13

No spaces between the "n" and the "13". Yes it said it in the book, but it's so normal to type "-n 13" for parameter lines! If you include the space, you get the not-very-friendly "usage" report.

What happens after this is you supply a bunch of answers to prompts. The main key is that you're connecting the agent to Solution Manager, not to the target machine, and secondarily, use client 000. Probably. We were tempted to include the application server to be monitored, but that is done inside the GUI.

Our systems have the gateway and message processors on the same host, so those answers are identical. And, make sure you know the account password for the central registry. We used CSMREG (the account, not the password, of course).

If nothing goes wrong, you'll get:

Start agent? n/[y] :

Again, the book goes on to tell you how to unregister the agent, when it should go into more detail on how to stop it and start it, since those will be done much more often when things go wrong. So, start back up with:

$ ccmsping -DCCMS -push -n13

And end nicely with:

$ ccmsping -stop -push -n13

INFO: CCMS agent ccmsping working directory is /usr/sap/tmp/ccmsping_13/ccmsping

INFO: CCMS agent ccmsping config file is /usr/sap/tmp/ccmsping_13/ccmsping/csmconf
INFO: Stopping Agent Process 1495094...
INFO: Process 1495094 was stopped. Return value: 0
INFO: Background process was stopped.
INFO: Agent was stopped.

EXITING with code 0

Isn't the verbosity wonderful? We're probably going to set these commands up in our Enterprise Scheduler as well as build in a process check for easier continuity.
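For a scheduler job, one option is a small helper that builds the command line from an agent number, so nobody retypes "-n 13" with a space. This is a rough sketch under stated assumptions: the function name ccmsping_cmd and the action words register, start and stop are made up here, and it only emits the three flag combinations shown in this post. It prints the command rather than running it, so it can be checked before being wired into anything.

```shell
#!/bin/sh
# ccmsping_cmd ACTION NN
# Prints the ccmsping command line for agent number NN.
# ACTION is register, start, or stop (hypothetical names for this sketch).
# Only the flag combinations shown in this post are used.
ccmsping_cmd() {
    action=$1
    nn=$2
    # Reject empty or non-numeric agent numbers (catches "1 3", "n13", etc.).
    case $nn in
        ''|*[!0-9]*) echo "agent number must be digits: '$nn'" >&2; return 2 ;;
    esac
    case $action in
        register) flag=-R ;;
        start)    flag=-DCCMS ;;
        stop)     flag=-stop ;;
        *) echo "unknown action: '$action'" >&2; return 2 ;;
    esac
    # No space between -n and the number, or ccmsping prints its usage text.
    printf 'ccmsping %s -push -n%s\n' "$flag" "$nn"
}
```

Calling `ccmsping_cmd stop 13` prints `ccmsping -stop -push -n13`, which a scheduler script could log first and then execute.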

Configuring

What follows are a bunch of screen shots showing our progress in getting the agents known to Solution Manager, and where we are now.

The first 2 shots are the basic start monitoring and overview options. We followed the instructions. I've had to shrink and crop the images to fit the SCN blog limit, but they should get you started.

For "Responsible CCMSPING" we chose to use a different agent, with a different serial number, for each target system. You might be able to ping multiple systems from one agent, though the instructions seem to be saying use a different "n". Not totally clear to me. But what we did works, so we're staying with it unless someone has a reason not to do this.

My Basis contact showed me you can drag and drop systems from un-monitored state to monitored. I was clicking and clicking. And clicking.

There's a CCMSPING pulldown to test basic functions. Use it. If that fails, you're missing something.

And even if it works, you aren't assured everything is right. Continue from RZ21 to RZ20.

Above, you can see what the report looks like when the agent has started and been configured, but not correctly. I needed to fix the agent identity.

Once the right "responsible agent" was selected, the data started to show up.

Above, the SCM sandbox has been through a cycle and shows 100% uptime, a dialog response time, and a user count, from the message server login process. Some of the dates and times are zero but will update on later cycles. The other sandbox is connected, but has not completed a cycle, so all values are null.

A bit later, the SCM sandbox shows a partial uptime, as it had been bounced for parameter changes. This change was not done as part of my agent work, though it confirmed that the report is correct.

Lastly

We added Java monitoring after the ABAP checks. While we followed the cookbook, we're getting 100% red lights. More debugging of the debugger is needed.

If you notice timestamp discrepancies above, it's because I switched from CET to EST after I noticed the screens weren't showing me wall time. I'll probably ask that the system clock be set to local time, not Walldorf time.

POST-LASTLY

For the conclusion to this issue, see the next blog, "You could hear a ping drop."
