Subject: ALL_SERVICES_ALERTS Danger Event

BillW
Participant
0 Kudos

Hi All,

I received this alert but can't find anything in the logs to back it up. I know this is a canned watch. The closest thing I can find is that the CMS seems to have restarted automatically for some reason, as its Date Modified in the CMC shows. If the CMS restarts, then everything on that server would also restart. The other servers have older Date Modified entries.

What would cause this to happen?

Subject: ALL_SERVICES_ALERTS Danger Event

Danger Rule evaluated to true for "ALL_SERVICES_ALERTS" watch.

Danger Rule: BOProdCluster.APS.Visualization$'Health State'==0 || BOProdCluster.APS.Visualization$'Health State'==5 || BOProdCluster.APS.Analysis$'Health State'==0 || BOProdCluster.APS.Analysis$'Health State'==5 || BOProdCluster.APS.Auditing$'Health State'==0 || BOProdCluster.APS.Auditing$'Health State'==5 || BOProdCluster.APS.Connectivity$'Health State'==0 || BOProdCluster.APS.Core$'Health State'==0 || BOProdCluster.APS.Core$'Health State'==0 || BOProdCluster.APS.DF$'Health State'==0 || BOProdCluster.APS.LCM$'Health State'==0 || BOProdCluster.APS.Monitoring$'Health State'==0 || BOProdCluster.APS.Search$'Health State'==0 || BOProdCluster.APS.WEBI$'Health State'==0 || BOProdCluster.APS.WEBIDSLBridge$'Health State'==0 || BOProdCluster.APS.WEBIDSLBridge1$'Health State'==0 || BOProdCluster.AdaptiveJobServer$'Health State'==0 || BOProdCluster.CentralManagementServer$'Health State'==0 || BOProdCluster.ConnectionServer$'Health State'==0 || BOProdCluster.ConnectionServer32$'Health State'==0 || BOProdCluster.ConnectionServer32$'Health State'==0 || BOProdCluster.InputFileRepository$'Health State'==0 || BOProdCluster.OutputFileRepository$'Health State'==0 || BOProdCluster.WebApplicationContainerServer$'Health State'==0 || BOProdCluster.WebIntelligenceProcessingServer$'Health State'==0 || BOProdCluster.WebIntelligenceProcessingServer1$'Health State'==0 || BOProdCluster.WebIntelligenceProcessingServer2$'Health State'==0 || BOProdCluster.WebIntelligenceProcessingServer3$'Health State'==0 || Cluster58.APS.Analysis$'Health State'==0 || Cluster58.APS.Auditing$'Health State'==0 || Cluster58.APS.Connectivity$'Health State'==0 || Cluster58.APS.Core$'Health State'==0 || Cluster58.APS.DF$'Health State'==0 || Cluster58.APS.LCM$'Health State'==0 || Cluster58.APS.Search$'Health State'==0 || Cluster58.APS.Visualization$'Health State'==0 || Cluster58.APS.WEBI$'Health State'==0 || Cluster58.APS.WEBIDSLBridge$'Health State'==0 || Cluster58.APS.WEBIDSLBridge1$'Health State'==0 || 
Cluster58.AdaptiveJobServer$'Health State'==0 || Cluster58.CentralManagementServer$'Health State'==0 || Cluster58.ConnectionServer$'Health State'==0 || Cluster58.ConnectionServer32$'Health State'==0 || Cluster58.DashboardsCacheServer$'Health State'==0 || Cluster58.DashboardsProcessingServer$'Health State'==0 || Cluster58.EventServer$'Health State'==0 || Cluster58.InputFileRepository$'Health State'==0 || Cluster58.OutputFileRepository$'Health State'==0 || Cluster58.WebApplicationContainerServer$'Health State'==0 || Cluster58.WebIntelligenceProcessingServer$'Health State'==0 || Cluster58.WebIntelligenceProcessingServer1$'Health State'==0 || Cluster58.WebIntelligenceProcessingServer2$'Health State'==0 || Cluster58.WebIntelligenceProcessingServer3$'Health State'==0 || BOProdCluster.APS.Connectivity$'Health State'==0 || BOProdCluster.APS.Core$'Health State'==0 || BOProdCluster.APS.Core$'Health State'==0 || BOProdCluster.APS.DF$'Health State'==0 || BOProdCluster.APS.LCM$'Health State'==0 || BOProdCluster.APS.Monitoring$'Health State'==0 || BOProdCluster.APS.Search$'Health State'==0 || BOProdCluster.APS.WEBI$'Health State'==0 || BOProdCluster.APS.WEBIDSLBridge$'Health State'==0 || BOProdCluster.APS.WEBIDSLBridge1$'Health State'==0 || BOProdCluster.AdaptiveJobServer$'Health State'==0 || BOProdCluster.CentralManagementServer$'Health State'==0 || BOProdCluster.ConnectionServer$'Health State'==0 || BOProdCluster.ConnectionServer32$'Health State'==0 || BOProdCluster.ConnectionServer32$'Health State'==0 || BOProdCluster.InputFileRepository$'Health State'==0 || BOProdCluster.OutputFileRepository$'Health State'==0 || BOProdCluster.WebApplicationContainerServer$'Health State'==0 || BOProdCluster.WebIntelligenceProcessingServer$'Health State'==0 || BOProdCluster.WebIntelligenceProcessingServer1$'Health State'==0 || BOProdCluster.WebIntelligenceProcessingServer2$'Health State'==0 || BOProdCluster.WebIntelligenceProcessingServer3$'Health State'==0 || Cluster58.APS.Analysis$'Health 
State'==0 || Cluster58.APS.Auditing$'Health State'==0 || Cluster58.APS.Connectivity$'Health State'==0 || Cluster58.APS.Core$'Health State'==0 || Cluster58.APS.DF$'Health State'==0 || Cluster58.APS.LCM$'Health State'==0 || Cluster58.APS.Search$'Health State'==0 || Cluster58.APS.Visualization$'Health State'==0 || Cluster58.APS.WEBI$'Health State'==0 || Cluster58.APS.WEBIDSLBridge$'Health State'==0 || Cluster58.APS.WEBIDSLBridge1$'Health State'==0 || Cluster58.AdaptiveJobServer$'Health State'==0 || Cluster58.CentralManagementServer$'Health State'==0 || Cluster58.ConnectionServer$'Health State'==0 || Cluster58.ConnectionServer32$'Health State'==0 || Cluster58.DashboardsCacheServer$'Health State'==0 || Cluster58.DashboardsProcessingServer$'Health State'==0 || Cluster58.EventServer$'Health State'==0 || Cluster58.InputFileRepository$'Health State'==0 || Cluster58.OutputFileRepository$'Health State'==0 || Cluster58.WebApplicationContainerServer$'Health State'==0 || Cluster58.WebIntelligenceProcessingServer$'Health State'==0 || Cluster58.WebIntelligenceProcessingServer1$'Health State'==0 || Cluster58.WebIntelligenceProcessingServer2$'Health State'==0 || Cluster58.WebIntelligenceProcessingServer3$'Health State'==0

The metrics that have crossed their respective thresholds:

BOProdCluster.CentralManagementServer$'Health State'

BOProdCluster.CentralManagementServer$'Health State'

Appreciate any suggestions.

BW

Accepted Solutions (1)


Toby_Johnston
Advisor
0 Kudos

Hey Bill,

If the CMS restarted, that would trigger this alert, since the BOProdCluster.CentralManagementServer$'Health State' condition in the watch evaluates to true while the server is stopped.
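To make that concrete, here is a rough Python sketch of how an OR-ed rule like the one in the alert email reduces to the single metric that crossed its threshold. This is not how the monitoring service evaluates watches internally; the function name `triggered_terms` and the sample metric values are made up for illustration, and only the simple `metric==value` terms seen in the email are handled.

```python
import re

# Matches one term such as: BOProdCluster.CentralManagementServer$'Health State'==0
TERM = re.compile(r"\s*(?P<metric>.+?\$'[^']+')\s*==\s*(?P<value>\d+)\s*")

def triggered_terms(rule, metrics):
    """Return the distinct metric names whose '==' term is true (hypothetical helper)."""
    hits = []
    for raw in rule.split("||"):
        m = TERM.fullmatch(raw)
        if m is None:
            continue  # skip anything that is not a simple equality term
        name, value = m.group("metric"), int(m.group("value"))
        if metrics.get(name) == value and name not in hits:
            hits.append(name)
    return hits

# Shortened rule; the duplicate term mirrors the repetition in the alert email.
rule = ("BOProdCluster.APS.Core$'Health State'==0 || "
        "BOProdCluster.CentralManagementServer$'Health State'==0 || "
        "BOProdCluster.CentralManagementServer$'Health State'==0")

# Assumed sample values: 0 is the value the rule tests for, 2 is just "something else".
metrics = {
    "BOProdCluster.APS.Core$'Health State'": 2,
    "BOProdCluster.CentralManagementServer$'Health State'": 0,
}

print(triggered_terms(rule, metrics))
```

With these sample values, only the CentralManagementServer term is true, which lines up with the "metrics that have crossed their respective thresholds" section of the email.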

One thing you can do is edit the watch and change the threshold so that it only triggers if it has been in the danger state for more than 10 minutes, for example. This way, if the server is simply restarted, the watch won't send an alert.
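The idea behind that setting can be sketched in a few lines of Python. This only illustrates the "danger must persist for more than 10 minutes" logic; it is not the watch configuration itself, and `should_alert`, `hold`, and the sample timestamps are assumptions made up for the example.

```python
from datetime import datetime, timedelta

def should_alert(samples, hold=timedelta(minutes=10)):
    """samples: list of (timestamp, in_danger) tuples ordered by time.
    Return True once danger has persisted for longer than `hold`."""
    danger_since = None
    for ts, in_danger in samples:
        if not in_danger:
            danger_since = None  # recovery resets the clock
        else:
            if danger_since is None:
                danger_since = ts  # first sample of this danger period
            if ts - danger_since > hold:
                return True
    return False

t0 = datetime(2024, 1, 1, 12, 0)  # arbitrary start time for the example
minute = timedelta(minutes=1)

# A quick restart: danger for a few minutes, then healthy again.
restart = [(t0, False), (t0 + 1 * minute, True),
           (t0 + 3 * minute, True), (t0 + 4 * minute, False)]

# A real outage: danger for longer than the hold period.
outage = [(t0, True), (t0 + 5 * minute, True), (t0 + 11 * minute, True)]

print(should_alert(restart), should_alert(outage))
```

The short restart never reaches the hold period, so no alert is raised, while the longer outage does raise one.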

Cheers

Toby

BillW
Participant
0 Kudos

Yeah Toby,

That's what I did. My concern is that if the CMS did restart and nothing in the logs shows it happening, where can I find out what really happened? I've looked at all the logs and the event logs. Have you ever run across this type of issue?

Toby_Johnston
Advisor
0 Kudos

Hey Bill,

If the CMS is restarted, then on the server where the CMS is running, the Application event log will contain a warning entry from the source Server Intelligence Agent that says something like:

[Node Name: BI42LCM2]

[User Name: BI42LCM2-0$]

Server Intelligence Agent is requesting server BI42LCM2.CentralManagementServer to terminate.

You could also check under CMC -> Monitoring -> Metrics -> 'Server Running State' metric, then view its history and change the date range to see when the server was stopped over the past days, weeks, months, etc.
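If you export the Application event log to a text file, a small Python sketch like the one below can pick out the Server Intelligence Agent message quoted above. The sample lines and the helper name `find_terminations` are hypothetical, and the exact wording in your log may differ, so adjust the pattern to match what you actually see.

```python
import re

# Pattern based on the message text quoted above; real entries may vary.
PATTERN = re.compile(
    r"Server Intelligence Agent is requesting server (\S+) to terminate"
)

def find_terminations(lines):
    """Return the server names from lines that contain the terminate message."""
    found = []
    for line in lines:
        m = PATTERN.search(line)
        if m:
            found.append(m.group(1))
    return found

# Made-up sample lines standing in for an exported event log.
sample = [
    "[Node Name: BI42LCM2]",
    "Server Intelligence Agent is requesting server "
    "BI42LCM2.CentralManagementServer to terminate.",
    "Some unrelated warning entry",
]

print(find_terminations(sample))
```

To run it against a real export, pass the open file object instead of `sample`, since the function only needs an iterable of lines.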


Regards

Toby




