Troubleshooting

My data is not arriving in Observe

To debug the agent, first run the observe-agent status command to confirm that the agent is running and to collect useful metrics. This guide covers the most common issues and where to look in the status output to determine whether each one applies.

$ observe-agent status

  Status: Running

  Host Info
  ================
  HostID: ec237474-deac-8894-bb30-8cc3d4ab5dcd
  Hostname: ip-172-31-16-133
  BootTime: 2024-04-18T23:42:38Z
  Uptime: 11m1s
  OS: linux
  Platform: ubuntu
  PlatformFamily: debian
  PlatformVersion: 12.2
  KernelArch: x86_64
  KernelVersion: 6.1.0-13-cloud-amd64

  Agent Metrics
  ================
  ExporterQueueSize: 0
  CPUSeconds: 1.548263s
  MemoryUsed: 29.15625MB
  TotalSysMemory: 34.034195MB
  Uptime: 4434.3013s
  AvgServerResponseTime: 40.093124ms
  AvgClientResponseTime: 1.528125ms

    Logs Stats
    ===============
    ReceiverAcceptedCount: 11
    ReceiverRefusedCount: 0
    ExporterSentCount: 11
    ExporterSendFailedCount: 0

    Metrics Stats
    ===============
    ReceiverAcceptedCount: 1025
    ReceiverRefusedCount: 0
    ExporterSentCount: 1025
    ExporterSendFailedCount: 0

    Traces Stats
    ===============
    ReceiverAcceptedCount: 11
    ReceiverRefusedCount: 0
    ExporterSentCount: 11
    ExporterSendFailedCount: 0

If the agent is not running

$ observe-agent status

  Status: Not Running

This means the agent process has crashed. You can enable debug logging by setting debug to true in your observe-agent.yaml config file.

...
# Debug mode - Sets agent log level to debug
debug: true
...

On Linux, restart the agent with systemctl restart observe-agent, then run SYSTEMD_LESS=FRXMK journalctl -u observe-agent to view the agent logs.

On Windows, if the agent service isn’t running, use the Event Viewer app to find error events and logs emitted by the agent. They appear under Event Viewer (Local)/Windows Logs/Application (Source = ObserveAgent).

If the agent is running but the receiver is rejecting inbound logs/metrics

This issue can be identified via the ReceiverRefusedCount metric. If this count is higher than expected, the collector is refusing inbound logs, metrics, or spans. This generally indicates a compatibility issue between the configured receivers and the data being delivered to the collector.
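If you want to check these counters on many hosts, the status output can be parsed with a small script. The following is a minimal sketch, assuming the output keeps the layout shown earlier in this guide ("Logs Stats", "Metrics Stats", and "Traces Stats" sections with "Name: integer" lines). The helper names parse_signal_stats, signals_with_nonzero, and read_status are hypothetical and not part of the agent, and SAMPLE is an illustrative excerpt with a made-up nonzero refused count, not real agent output.

```python
import re
import subprocess


def read_status():
    """Run `observe-agent status` and return its text output."""
    result = subprocess.run(
        ["observe-agent", "status"], capture_output=True, text=True, check=False
    )
    return result.stdout


def parse_signal_stats(status_text):
    """Parse the per-signal "<Signal> Stats" sections into {signal: {counter: int}}.

    Assumes the layout shown in this guide; lines outside a "Stats" section
    (Host Info, Agent Metrics) are ignored.
    """
    stats = {}
    current = None
    for raw in status_text.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"(\w+) Stats", line)
        if header:
            current = header.group(1)
            stats[current] = {}
            continue
        counter = re.fullmatch(r"(\w+):\s*(\d+)", line)
        if counter and current is not None:
            stats[current][counter.group(1)] = int(counter.group(2))
    return stats


def signals_with_nonzero(stats, counter_name):
    """Return {signal: value} for every signal whose counter is above zero."""
    return {
        signal: counters[counter_name]
        for signal, counters in stats.items()
        if counters.get(counter_name, 0) > 0
    }


# Illustrative excerpt only: the refused count of 3 is made up for the example.
SAMPLE = """
    Logs Stats
    ===============
    ReceiverAcceptedCount: 11
    ReceiverRefusedCount: 3
    ExporterSentCount: 11
    ExporterSendFailedCount: 0

    Metrics Stats
    ===============
    ReceiverAcceptedCount: 1025
    ReceiverRefusedCount: 0
    ExporterSentCount: 1025
    ExporterSendFailedCount: 0
"""

refused = signals_with_nonzero(parse_signal_stats(SAMPLE), "ReceiverRefusedCount")
print(refused)
```

On a real host, pass read_status() instead of SAMPLE. The same helper can be called with "ExporterSendFailedCount" to find signals whose exporter is failing.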

If the agent is running but the exporter is receiving errors

This issue can be identified via the ExporterSendFailedCount metric. AvgServerResponseTime is also relevant. Exporter errors most commonly indicate authorization or networking issues. The first step is to run the diagnose command, which attempts to reach various Observe endpoints to narrow down the issue.

$ observe-agent diagnose

Running diagnosis checks...
Networking Check
================

Running network check against https://123456789.collect.observeinc.com/
Request to test URL responded with response code 200

Auth Check
================

Running auth check against https://123456789.collect.observeinc.com/status...
Request to test URL failed with error {"ok":false,"message":"Unauthorized"}. Please check that the token is present in the
`observe-agent.yaml` config file and that the token is valid.

If the diagnose command returns a networking error, double check the egress firewall and networking settings and make sure that the host on which the agent is running has outbound access to the public internet.
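To confirm outbound access independently of the agent, you can issue a plain HTTPS request from the same host. The sketch below is one way to do that with the Python standard library; it is not how the diagnose command is implemented. The function name check_outbound and its opener parameter are hypothetical, and the URL in the comment is the example collection endpoint from the diagnose output above, which you would replace with your own.

```python
import urllib.error
import urllib.request


def check_outbound(url, timeout=5.0, opener=urllib.request.urlopen):
    """Report whether `url` can be reached from this host.

    Any HTTP response, including 4xx/5xx, counts as reachable because it
    proves the request got through the network path to a server.
    `opener` exists only so the function can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return f"reachable (HTTP {resp.status})"
    except urllib.error.HTTPError as err:
        # The server answered with an error status: the network path works.
        return f"reachable (HTTP {err.code})"
    except urllib.error.URLError as err:
        # DNS failure, refused connection, blocked egress, TLS problems, etc.
        return f"unreachable ({err.reason})"
    except OSError as err:
        # Lower-level socket errors such as a read timeout.
        return f"unreachable ({err})"


# Example call (replace with your own collection URL):
# print(check_outbound("https://123456789.collect.observeinc.com/"))
```

An "unreachable" result points at DNS, firewall, or proxy settings on the host, while a "reachable" result with a 401 or 403 status suggests the problem is authentication and not networking.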

If the diagnose command returns an authentication error, double check that the required fields token and observe_url are set to valid values.
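A quick presence check on the config file can rule out a missing or empty field before you investigate the token itself. This is a minimal sketch, assuming token and observe_url are unindented top-level keys in observe-agent.yaml, as debug is in the snippet earlier in this guide. The function name missing_required_fields is hypothetical, the parsing is a simple line scan and not a full YAML parser, and it cannot tell whether a token is valid, only whether one is set.

```python
import re

REQUIRED_FIELDS = ("token", "observe_url")


def missing_required_fields(config_text, required=REQUIRED_FIELDS):
    """Return the required top-level keys that are absent or empty.

    Only unindented `key: value` lines are considered (an assumption about
    the config layout). Inline comments and surrounding quotes are stripped
    before checking whether a value is empty.
    """
    found = {}
    for line in config_text.splitlines():
        match = re.match(r"^([A-Za-z_]+):\s*(.*)$", line)
        if not match:
            continue
        value = re.sub(r"(^|\s)#.*$", "", match.group(2)).strip().strip("\"'")
        found[match.group(1)] = value
    return [key for key in required if not found.get(key)]


# Example with an empty token; the URL is the sample endpoint from this guide.
example_config = (
    'token: ""\n'
    "observe_url: https://123456789.collect.observeinc.com/\n"
    "debug: true\n"
)
print(missing_required_fields(example_config))
```

To check a real file, read it with open(path).read() and pass the text in. An empty result only means both fields are set; the diagnose command is still the way to confirm the token is accepted.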

My data is being ingested but it’s not showing up in the Explorer UI

Use the in-product Contact Support button to contact Observe. In the left navigation menu, click Docs & Support, then Contact Support, then Send Us a Message to reach an Observe Data Engineer.