Troubleshooting
My data is not arriving in Observe
To debug the agent, first run the observe-agent status command to pull useful metrics and confirm that the agent is running. This guide covers the most common issues and shows where to look in the status output to identify each one.
$ observe-agent status
Status: Running
Host Info
================
HostID: ec237474-deac-8894-bb30-8cc3d4ab5dcd
Hostname: ip-172-31-16-133
BootTime: 2024-04-18T23:42:38Z
Uptime: 11m1s
OS: linux
Platform: ubuntu
PlatformFamily: debian
PlatformVersion: 12.2
KernelArch: x86_64
KernelVersion: 6.1.0-13-cloud-amd64
Agent Metrics
================
ExporterQueueSize: 0
CPUSeconds: 1.548263s
MemoryUsed: 29.15625MB
TotalSysMemory: 34.034195MB
Uptime: 4434.3013s
AvgServerResponseTime: 40.093124ms
AvgClientResponseTime: 1.528125ms
Logs Stats
===============
ReceiverAcceptedCount: 11
ReceiverRefusedCount: 0
ExporterSentCount: 11
ExporterSendFailedCount: 0
Metrics Stats
===============
ReceiverAcceptedCount: 1025
ReceiverRefusedCount: 0
ExporterSentCount: 1025
ExporterSendFailedCount: 0
Traces Stats
===============
ReceiverAcceptedCount: 11
ReceiverRefusedCount: 0
ExporterSentCount: 11
ExporterSendFailedCount: 0
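The counters in this output can also be checked programmatically. The following is a minimal sketch, assuming the status output keeps the layout shown above (a section title, a rule of `=` characters, then `Key: value` lines). The helper names `parse_status` and `find_problems` are hypothetical and are not part of the agent.

```python
def parse_status(text):
    """Group 'Key: value' lines under the section title that precedes a rule of '='."""
    sections = {}
    current = None   # name of the section being filled
    previous = None  # last non-empty line seen, a candidate section title
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if set(line) == {"="}:
            # A rule of '=' marks the previous line as a section title.
            if previous is not None:
                current = previous
                sections[current] = {}
        elif ":" in line and current is not None:
            # Split on the first colon only, so timestamps stay intact.
            key, _, value = line.partition(":")
            sections[current][key.strip()] = value.strip()
        previous = line
    return sections


def find_problems(sections):
    """Return (section, counter, count) for every non-zero refusal or send-failure counter."""
    problems = []
    for name, counters in sections.items():
        for key in ("ReceiverRefusedCount", "ExporterSendFailedCount"):
            value = counters.get(key, "0")
            if value.isdigit() and int(value) > 0:
                problems.append((name, key, int(value)))
    return problems
```

To use it, capture the command output (for example with `subprocess.run(["observe-agent", "status"], capture_output=True, text=True).stdout`) and pass the text to `parse_status`. An empty list from `find_problems` means none of these counters is above zero.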
If the agent is not running
$ observe-agent status
Status: Not Running
This means the agent process is not running, typically because it has crashed. To get more detail, enable debug logging by setting debug to true in your observe-agent.yaml config file.
...
# Debug mode - Sets agent log level to debug
debug: true
...
On Linux, try restarting the agent with systemctl restart observe-agent, then run SYSTEMD_LESS=FRXMK journalctl -u observe-agent to view the agent logs.
On Windows, if the agent service isn’t running, use the Event Viewer app to find any error events and logs emitted by the agent. They appear under Event Viewer (Local)/Windows Logs/Application (Source = ObserveAgent).
If the agent is running but the receiver is rejecting inbound logs/metrics
This issue can be identified via the ReceiverRefusedCount metric, which the status output reports separately for logs, metrics, and traces. If this count is higher than expected, the collector is refusing inbound logs, metrics, or spans. This generally indicates a compatibility issue between the configured receivers and the data being delivered to the collector.
If the agent is running but the exporter is receiving errors
This issue can be identified via the ExporterSendFailedCount metric. The AvgServerResponseTime metric is also relevant. Exporter errors most commonly indicate authorization or networking issues. The first step is to run the diagnose command, which attempts to ping various Observe endpoints to narrow down the issue.
$ observe-agent diagnose
Running diagnosis checks...
Networking Check
================
Running network check against https://123456789.collect.observeinc.com/
Request to test URL responded with response code 200
Auth Check
================
Running auth check against https://123456789.collect.observeinc.com/status...
Request to test URL failed with error {"ok":false,"message":"Unauthorized"}. Please check that the token is present in the
`observe-agent.yaml` config file and that the token is valid.
If the diagnose command returns a networking error, double check the egress firewall and networking settings and make sure that the host on which the agent is running has outbound access to the public internet.
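As a rough way to test outbound access independently of the agent, the sketch below tries to open a TCP connection to the host in a collection URL. It is not what the diagnose command does internally; it only checks that a connection can be established. The function name `can_reach` is hypothetical, and the URL should be the full observe_url value including its scheme.

```python
import socket
from urllib.parse import urlsplit


def can_reach(url, timeout=5.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(url)
    # Use the explicit port if the URL has one, otherwise the scheme default.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `can_reach("https://123456789.collect.observeinc.com/")` uses the URL from the sample output above. A result of False points to DNS, firewall, or routing problems on the host rather than to the token.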
If the diagnose command returns an authentication error, double check that the required fields token and observe_url are set to valid values.
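A quick way to confirm that both fields are present is to scan the config file for them. The sketch below is a minimal line-based check, not a YAML parser, and it assumes token and observe_url are top-level keys in observe-agent.yaml. It only detects missing or empty values; it cannot tell whether a token is actually valid. The function name `missing_settings` is hypothetical.

```python
import re


def missing_settings(config_text, required=("token", "observe_url")):
    """Return the required top-level keys that are absent or empty."""
    found = {}
    for raw in config_text.splitlines():
        # Indented lines belong to nested blocks; only top-level keys matter here.
        if raw[:1] in (" ", "\t"):
            continue
        # Drop full-line comments and trailing comments introduced by whitespace + '#'.
        line = re.sub(r"(^|\s)#.*$", "", raw).strip()
        # Split on the first colon only, so URLs and tokens containing ':' stay intact.
        key, sep, value = line.partition(":")
        if sep:
            found[key.strip()] = value.strip().strip("'\"")
    return [name for name in required if not found.get(name)]
```

Calling `missing_settings(open(path).read())`, where `path` points to your observe-agent.yaml, returns an empty list when both fields have a non-empty value.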
My data is being ingested but it’s not showing up in the Explorer UI
Use the in-product Contact Support button to reach Observe. In the left-side navigation menu, click Docs & Support, then Contact Support, then Send Us a Message to contact an Observe Data Engineer.