Tuning and Troubleshooting Monitor Health

Monitors report their health on the Monitors List page, which you access from the left navigation rail. A monitor’s health is also visible in the header of the view or editing page for that monitor. Reported health is one of the following states:

  • Running - The monitor is working as expected.

  • Warnings - The monitor executes as expected but has issues that need investigating. Warnings may be caused by trouble reaching destinations, upstream data changes that invalidate findings, or similar concerns.

  • Failed - The monitor’s most recent attempt to evaluate data ended in an error.

  • Disabled - The monitor is not evaluating. This is most likely an administrative decision, but Observe can also disable a hyperactive monitor.
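
When you review many monitors at once, it can help to sort them so that the states needing action come first. The following is a minimal sketch, not an Observe API: the state names are copied from the list above, while `MonitorHealth`, `REVIEW_ORDER`, `monitors_to_review`, and the monitor names are hypothetical and assume you have recorded each monitor's state yourself, for example by reading it from the Monitors List page.

```python
from enum import Enum


class MonitorHealth(Enum):
    """The four health states described above."""
    RUNNING = "Running"
    WARNINGS = "Warnings"
    FAILED = "Failed"
    DISABLED = "Disabled"


# Hypothetical review order (an assumption, not an Observe rule):
# errors first, then warnings, then monitors that are not evaluating.
REVIEW_ORDER = [MonitorHealth.FAILED, MonitorHealth.WARNINGS, MonitorHealth.DISABLED]


def monitors_to_review(health_by_monitor):
    """Return (name, state) pairs for every monitor that is not Running,
    ordered by REVIEW_ORDER and then by monitor name."""
    flagged = [
        (name, state)
        for name, state in health_by_monitor.items()
        if state is not MonitorHealth.RUNNING
    ]
    return sorted(flagged, key=lambda item: (REVIEW_ORDER.index(item[1]), item[0]))


# Hand-recorded example states; the monitor names are made up.
observed = {
    "checkout-latency": MonitorHealth.RUNNING,
    "disk-usage": MonitorHealth.WARNINGS,
    "login-errors": MonitorHealth.FAILED,
    "legacy-batch": MonitorHealth.DISABLED,
}

for name, state in monitors_to_review(observed):
    print(f"{state.value}: {name}")
```

Running monitors are left out of the result because they need no follow-up.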

Common Trouble States

  • Failed to evaluate monitor – This causes a Failed health state that lasts until the next successful execution. A monitor may fail to evaluate when a dataset it depends on is being re-materialized. Materialization failures may indicate that you have hit an on-demand materialization limit. To resolve this state, review your Usage Dashboard and Acceleration Manager to determine whether the datasets the monitor needs are unable to materialize.

  • Detected upstream data updates – Observe tracks the stability of monitored data and sets the Warnings state when data that has already been evaluated changes. Changes in upstream data can lead to flapping alert states, false negatives, or false positives. The root cause can be late-arriving data or a stabilization delay that is too short for the expected upstream data. Note that upstream stability is not only a matter of external data providers; joining a dataset with one that updates periodically can also produce instability. To correct this issue, gradually increase the monitor’s stabilization delay value.
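
To illustrate why raising the stabilization delay helps, the following is a minimal sketch under stated assumptions, not Observe's implementation. It assumes you can estimate, for a sample of records, when each event occurred and when it became visible to the monitor. The `Event` class, the helper functions, the sample values, and the candidate delays are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_time: float    # seconds: when the event occurred
    arrival_time: float  # seconds: when the event became visible to the monitor


def late_fraction(events, delay_seconds):
    """Fraction of events that arrive more than delay_seconds after they
    occurred, i.e. after a monitor waiting that long would have evaluated."""
    if not events:
        return 0.0
    late = sum(1 for e in events if e.arrival_time - e.event_time > delay_seconds)
    return late / len(events)


def smallest_sufficient_delay(events, candidate_delays, tolerated_late_fraction=0.0):
    """Step through candidate delays in increasing order and return the first
    one whose late fraction is within tolerance, or None if none qualifies."""
    for delay in sorted(candidate_delays):
        if late_fraction(events, delay) <= tolerated_late_fraction:
            return delay
    return None


# Made-up sample: three events arrive within seconds, one arrives 75 s late.
sample = [Event(0, 5), Event(10, 12), Event(20, 95), Event(30, 31)]

for delay in (0, 30, 60, 120):
    print(f"delay={delay}s late_fraction={late_fraction(sample, delay):.2f}")

print("chosen delay:", smallest_sufficient_delay(sample, [0, 30, 60, 120]))
```

In this sample, delays of 30 and 60 seconds still leave one event arriving after evaluation, which is the kind of change that triggers the warning. Stepping up through the candidates mirrors the advice to increase the delay gradually: a longer delay postpones evaluation, so choose the smallest value that covers the lateness you expect.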