Service Explorer¶

The Service Explorer is a core feature of Observe’s APM offering and supports a number of use cases across incident response and development.

Use cases of the Service Explorer¶

Service & database discovery: see all of the microservices and databases in your system.
Service & database dependency mapping: see the communication relationships among microservices as traffic flows through them.
Service & database inspection: get a detailed snapshot of the health of a particular service, including RED metrics, endpoints, errors & exceptions, deployments, and Kubernetes pods.
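RED metrics are rate, errors, and duration. As a rough illustration of what those three numbers summarize, the sketch below computes them from a list of span records. The `Span` class, its field names, and the `red_metrics` helper are hypothetical and are not part of Observe; the p95 uses a simple nearest-rank rule, which may differ from how the product aggregates latency.

```python
from dataclasses import dataclass


@dataclass
class Span:
    # Hypothetical, simplified span record (not Observe's schema).
    service: str
    duration_ms: float
    is_error: bool


def red_metrics(spans, window_seconds):
    """Return request rate, error ratio, and p95 duration for the given spans."""
    if not spans:
        return {"rate_per_s": 0.0, "error_ratio": 0.0, "p95_ms": 0.0}
    durations = sorted(s.duration_ms for s in spans)
    # Nearest-rank p95 using integer arithmetic: rank = ceil(0.95 * n).
    rank = max(1, (95 * len(durations) + 99) // 100)
    return {
        "rate_per_s": len(spans) / window_seconds,
        "error_ratio": sum(s.is_error for s in spans) / len(spans),
        "p95_ms": durations[rank - 1],
    }
```

Calling `red_metrics` once per service, endpoint, or deployment gives the kind of per-group comparison that the inspection views present as charts.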

Workflows supported by the Service Explorer¶

Service Explorer includes several workflows that help you drill down into service performance, correlate deployments and dependencies, and view underlying telemetry signals such as logs and traces. Use them to find the root cause of a problem during an incident, or to identify optimizations that improve baseline performance.

Correlate service health with new deploys¶

When inspecting a service or endpoint, view deployment markers on RED metric charts to correlate spikes in latency or errors to a new deploy. Use the Deployments tab to see all active versions of your service or endpoint, view RED metrics grouped by deploy to spot anomalous deploys, and pivot to traces related to a specific deploy.

Correlate service health with downstream dependencies, and view the blast radius¶

Use the service-scoped dependency map when inspecting a service to visualize the communication flow between your service and its upstream callers and downstream dependencies. View the throughput of upstream callers and the errors and latency of downstream dependencies to quickly spot hotspots elsewhere in your system that may be affecting, or affected by, your service.
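One way to picture where such a map comes from: every span whose parent belongs to a different service represents a call across a service boundary. The sketch below counts those caller-to-callee edges. It assumes spans are plain dictionaries with hypothetical `span_id`, `parent_id`, and `service` keys, and it is an illustration only, not a description of how Observe builds its dependency map.

```python
from collections import Counter


def dependency_edges(spans):
    """Count (caller_service, callee_service) edges from parent/child spans."""
    by_id = {s["span_id"]: s for s in spans}
    edges = Counter()
    for span in spans:
        parent = by_id.get(span.get("parent_id"))
        # Skip root spans, spans whose parent is missing, and in-service calls.
        if parent is not None and parent["service"] != span["service"]:
            edges[(parent["service"], span["service"])] += 1
    return edges
```

For a given service, edges where it is the callee describe its upstream callers, and edges where it is the caller describe its downstream dependencies.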

Correlate service health with Kubernetes infrastructure¶

View service or endpoint performance over time broken down by Kubernetes pod. Pivot to traces running on a particular pod to isolate the pod as a potential performance bottleneck. From traces, pivot into pod infrastructure dashboards.

Pivot to logs, metrics, and traces in context¶

Once you spot an anomaly, the next step is usually to examine the underlying data. Service Explorer provides several workflows to support this:

When troubleshooting slow/erroring services in a service map, pivot directly to logs, metrics, and traces linked to the service to gain more insight into potential root causes.
When troubleshooting slow/erroring edges in a service map, pivot to traces where that edge is present to gain more insight into that particular communication flow between services.
As covered in the other workflows, there are several other pivot points into related traces to help continue the troubleshooting journey.

Inspect endpoints¶

When inspecting a service, see the performance of each endpoint so you can quickly find slow or erroring endpoints, and pivot directly to traces containing the endpoint. Select an endpoint to filter the entire inspector view to just that endpoint. All of the KPIs and other information (deployments, errors/exceptions, Kubernetes pods, etc.) are then scoped to that endpoint.

View errors and exceptions and identify newly-occurring exceptions¶

Errors and exceptions are the strongest signals of potential problems in a microservice or an endpoint. Service Explorer displays the top errors and exceptions from span and span event data, as well as time series grouped by unique error message and exception stack trace. From there you can pivot into traces and exception logs for a given error or exception.
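To make the idea of a newly occurring exception concrete, the sketch below groups exception span events by type and message, then reports the groups that appear in a current window but not in a baseline window. It assumes spans are plain dictionaries whose `events` follow the OpenTelemetry convention of an event named `exception` with `exception.type` and `exception.message` attributes. The helper names and the grouping key are assumptions for illustration; the source does not describe how the product itself decides that an exception is new.

```python
from collections import Counter


def exception_counts(spans):
    """Count exception span events, keyed by (type, message)."""
    counts = Counter()
    for span in spans:
        for event in span.get("events", []):
            if event.get("name") != "exception":
                continue
            attrs = event.get("attributes", {})
            key = (
                attrs.get("exception.type", "unknown"),
                attrs.get("exception.message", ""),
            )
            counts[key] += 1
    return counts


def new_exceptions(baseline_spans, current_spans):
    """Return exceptions seen in the current window but absent from the baseline."""
    baseline = exception_counts(baseline_spans)
    current = exception_counts(current_spans)
    return {key: n for key, n in current.items() if key not in baseline}
```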

View database dependencies and inspect database health¶

Observe automatically discovers databases from OpenTelemetry spans that are identified as database calls, and surfaces them in the Service Explorer. Service maps include communication between services and databases, and databases can be inspected much like microservices. When inspecting a database, see slow/erroring operations and statements, and pivot to traces where those operations or statements are invoked to continue troubleshooting.
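As an illustration of what "identified as database calls" can mean, OpenTelemetry's semantic conventions mark database client spans with attributes such as `db.system` (or `db.system.name` in newer versions of the conventions). The sketch below collects distinct databases from spans carrying those attributes. Which attributes Observe actually inspects is not stated here, so treat the attribute choice, the dictionary span shape, and the `discover_databases` helper as assumptions.

```python
def discover_databases(spans):
    """Return the set of (system, database_name) pairs seen in database spans."""
    found = set()
    for span in spans:
        attrs = span.get("attributes", {})
        # Newer semantic conventions use db.system.name; older ones use db.system.
        system = attrs.get("db.system.name") or attrs.get("db.system")
        if not system:
            continue  # Not a database call under this assumption.
        name = attrs.get("db.namespace") or attrs.get("db.name") or "unknown"
        found.add((system, name))
    return found
```

Spans without any of these attributes are ignored, so services that do not instrument their database clients would not surface a database under this approach.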

View slow and erroring traces for a service or endpoint¶

Often the most interesting traces are slow or erroring traces. When inspecting a service or endpoint, use the Traces tab to view the slowest traces, error traces, or both slow and erroring traces that flow through the currently-inspected service or endpoint.

Save interesting findings to a Worksheet¶

Most charts (including service dependency maps) in the Service Explorer can be added to Worksheets, which can be used as notebooks during an incident or simply to extend the out-of-the-box functionality by building custom views on the data.

The Service Explorer is powered by OpenTelemetry data and requires the OpenTelemetry App. Learn more about how the Service Explorer uses OpenTelemetry data provided by the app.