When we first installed CA Wily Introscope in our SAP environment, a consultant did the installation, and it would not even run after he left. Someone decided the cause was a lack of memory, so they doubled the amount of allocated RAM. It still wouldn't run, so it was doubled again. Lather, rinse, repeat.
To make a (very) long story short, we finally found the right sizing process and worked through it to determine how much we actually needed. But the sizing relied on many assumptions that we could not verify, so we had to use ballpark numbers. As a result, we had to watch the system very closely to see whether it was performing as expected.
What we found was that, on occasion, we would see gaps in metrics. This was most noticeable when looking at graphs with a 15-second interval, and it looked something like the following, where the gaps between the dots indicate that dots (metrics) are missing. It's sort of like connect-the-dots: if a dot is missing, you skip over that one and connect the next two:
In fact, all of our graphs during this period of time had gaps in the metrics just as this one did.
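If you can export the timestamps of the data points behind a graph, the same gaps can be spotted without eyeballing it. Here is a minimal sketch, assuming the timestamps are plain epoch seconds and the interval is the 15 seconds mentioned above. The function name `find_gaps` and the input format are my own assumptions for illustration, not anything Introscope provides:

```python
def find_gaps(timestamps, interval=15):
    """Return (previous, current, missing_count) for each gap.

    timestamps: sample times in epoch seconds (assumed export format).
    interval:   expected spacing between samples, in seconds.
    """
    gaps = []
    ordered = sorted(timestamps)
    # Compare each sample with the one right after it.
    for prev, cur in zip(ordered, ordered[1:]):
        # Number of whole intervals skipped between the two samples.
        missing = (cur - prev) // interval - 1
        if missing > 0:
            gaps.append((prev, cur, missing))
    return gaps


if __name__ == "__main__":
    # Samples at 0, 15 and 30 seconds, then nothing until 75 seconds.
    print(find_gaps([0, 15, 30, 75, 90]))
```

In the example, the dots at 45 and 60 seconds are the missing ones, so the function reports one gap with two missing samples.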
So, on advice from someone in our company knowledgeable about Introscope, we learned that there are metrics you can look at to determine the health of the Introscope server itself. These metrics tell you whether Introscope has enough resources to do its job correctly. It became obvious that ours did not, as evidenced by the following graph of Harvest Duration, a metric that should mostly stay below 3,000 ms. Ours was averaging over 20,000 ms:
So we restarted Introscope (not a "fix", but that's what was done in this case), and things returned to normal, as you can see in the following graph. Again, this is Harvest Duration, now with no gaps in metrics and only one spike over 8,000 ms, which is acceptable.
All of the metrics to watch with respect to the health of Introscope can be found in the Introscope Investigator. Here is how to get to them, along with the general rules to go by.
- Open an Investigator window
- Navigate to *SuperDomain*>Custom Metric Host (Virtual)>Custom Metric Process (Virtual)>Custom Metric Agent (Virtual)(*SuperDomain*)>Enterprise Manager
- Review the metrics listed below. You will likely want to look at a 24-hour or 1-week window of time to see whether the EM (Enterprise Manager) has historically been within acceptable working parameters.
- Health>Harvest Capacity (%) - Recommended is <=75%; in trouble if it’s constantly > 75%. Spikes are ok.
- Health>Heap Capacity (%) - Recommended is <=75%; in trouble if it’s constantly > 75%. Spikes are ok.
- Health>Incoming Data Capacity (%) - Recommended is <=75%; in trouble if it’s constantly > 75%. Spikes are ok.
- Health>SmartStor Capacity (%) - Recommended is <=75%; in trouble if it’s constantly > 75%. Spikes are ok.
- Tasks>Harvest Duration (ms) - Recommended is < 3000 ms; in trouble if it’s > 7500 ms.
- Tasks>SmartStor Duration (ms) - Recommended is < 3500 ms; in trouble if it’s >= 15000 ms.
- Overall capacity (%) - Recommended is <=75%; in trouble if it’s constantly > 75%. Spikes are ok.
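If you export these metrics for a 24-hour or 1-week window, the rules above can be turned into a quick check. The sketch below is only an illustration under my own assumptions. The list above doesn't define "constantly", so I treat the median of the window as the typical value, which lets occasional spikes pass. The names `RULES` and `classify` are hypothetical, and the metric values are assumed to arrive as plain lists of numbers:

```python
from statistics import median

# Each rule is (is_recommended, is_trouble), taken from the list above.
CAPACITY = (lambda v: v <= 75, lambda v: v > 75)

RULES = {
    "Health>Harvest Capacity (%)": CAPACITY,
    "Health>Heap Capacity (%)": CAPACITY,
    "Health>Incoming Data Capacity (%)": CAPACITY,
    "Health>SmartStor Capacity (%)": CAPACITY,
    "Tasks>Harvest Duration (ms)": (lambda v: v < 3000, lambda v: v > 7500),
    "Tasks>SmartStor Duration (ms)": (lambda v: v < 3500, lambda v: v >= 15000),
    "Overall capacity (%)": CAPACITY,
}


def classify(metric, samples):
    """Return 'ok', 'watch' or 'trouble' for one metric over a time window.

    Assumption: the median stands in for the 'constant' level, so a few
    spikes in an otherwise healthy window do not trigger 'trouble'.
    """
    if not samples:
        raise ValueError("no samples to classify")
    is_recommended, is_trouble = RULES[metric]
    typical = median(samples)
    if is_trouble(typical):
        return "trouble"
    if is_recommended(typical):
        return "ok"
    # Between the recommended value and the trouble value.
    return "watch"


if __name__ == "__main__":
    # One spike over 8,000 ms in an otherwise healthy window.
    print(classify("Tasks>Harvest Duration (ms)", [2000, 2500, 8200, 2100]))
    # Averaging around 20,000 ms, as in our bad period.
    print(classify("Tasks>Harvest Duration (ms)", [20000, 22000, 19000]))
```

The "watch" result only shows up for the two duration metrics, since the capacity rules leave no space between recommended and in trouble. If the median feels too forgiving for your environment, swapping in a stricter measure (say, the share of samples over the limit) is a small change.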