Heartbeat Analysis Script Causes Backlog on Consolidator (295986)



The information in this article applies to:

  • Microsoft Operations Manager 2000

This article was previously published under Q295986

SYMPTOMS

In the Microsoft Operations Manager (MOM) Administrator console, under Monitor/All Agents, some agents are shown as unavailable or do not appear at all, even though the OnePoint service may be running properly on the target computers. If you stop and restart the OnePoint service on the Data Access Server/Consolidator-Agent Manager (DCAM) server, most agents reappear, but over time they again appear to be down. While this is happening, events are not collected and alerts do not appear. In addition, event 21272 (Consolidator queue is full) might be logged in the Application log.

CAUSE

When there are frequent network communication problems, slow network connections, or firewalls between many agents and the DCAM, agents may be unable to send heartbeats to the Consolidator. When this occurs, event 21209 is generated.

When MOM receives event 21209, the Agent Heartbeat Failure (Consolidation) rule runs. If many 21209 events are generated (for example, during a network problem in an environment with many agents), the rule significantly slows down the Consolidator. Because the Consolidator queues become full, agent heartbeats do not get through frequently enough, and agents might appear to be down.
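
The backlog described above can be illustrated with a toy queue model. This is only a sketch under stated assumptions, not a description of how the Consolidator is actually implemented: the queue capacity, the per-tick processing budget, and the relative cost of a failure event versus a heartbeat are all hypothetical numbers chosen to show the effect.

```python
from collections import deque


def simulate(ticks, heartbeats_per_tick, failures_per_tick,
             capacity=100, budget_per_tick=10,
             heartbeat_cost=1, failure_cost=5):
    """Toy model of a bounded queue shared by heartbeats and failure events.

    All parameters are hypothetical. Returns a tuple of
    (heartbeats_processed, items_dropped).
    """
    queue = deque()
    processed = 0
    dropped = 0
    for _ in range(ticks):
        # New items arrive; anything beyond the queue capacity is dropped.
        arrivals = (["failure"] * failures_per_tick
                    + ["heartbeat"] * heartbeats_per_tick)
        for item in arrivals:
            if len(queue) < capacity:
                queue.append(item)
            else:
                dropped += 1

        # The consumer has a fixed amount of work it can do per tick.
        # Failure events are assumed to cost more because a rule runs on them.
        budget = budget_per_tick
        while queue:
            cost = failure_cost if queue[0] == "failure" else heartbeat_cost
            if cost > budget:
                break
            budget -= cost
            if queue.popleft() == "heartbeat":
                processed += 1
    return processed, dropped


healthy = simulate(ticks=100, heartbeats_per_tick=5, failures_per_tick=0)
degraded = simulate(ticks=100, heartbeats_per_tick=5, failures_per_tick=5)
print("healthy :", healthy)
print("degraded:", degraded)
```

In this model, the run without failure events processes every heartbeat and drops nothing, while the run with expensive failure events fills the queue, drops items, and processes fewer heartbeats. That mirrors the symptom of agents appearing to be down.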

WORKAROUND

Disable the "Agent Heartbeat Failure (Consolidation)" rule in environments with more than 50 agents and frequent network communication problems, slow (56K frame or dial-up) network connections, or firewalls between many agents and the DCAM (where the firewall is configured to prevent the heartbeat from the DCAM to the agent). This rule is located under Rules\Processing Rule Groups\Microsoft Operations Manager\Consolidator and is enabled by default.
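
The criteria above can be written as a small checklist function. This is a hypothetical helper for illustration only and is not part of MOM. It assumes the criteria mean "more than 50 agents" combined with at least one of the three network conditions, and the function and parameter names are invented here.

```python
def should_disable_heartbeat_rule(agent_count,
                                  frequent_network_problems=False,
                                  slow_links=False,
                                  firewalls_block_heartbeat=False):
    """Return True if the environment matches the workaround criteria.

    Assumed reading of the article: more than 50 agents AND at least one
    of the listed network conditions.
    """
    network_risk = (frequent_network_problems
                    or slow_links
                    or firewalls_block_heartbeat)
    return agent_count > 50 and network_risk


# Example: 200 agents, many of them on dial-up links.
print(should_disable_heartbeat_rule(200, slow_links=True))
```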

STATUS

Microsoft has confirmed that this is a problem in the Microsoft products that are listed at the beginning of this article.

Modification Type: Minor
Last Reviewed: 6/13/2005
Keywords: kbbug kbnetwork KB295986