Observe Basic Data Processing Model
The Observe platform is built to capture and visualize machine data from a wide variety of sources. On top of the Observe platform, we’ve built a task-specific application to manage IT infrastructure deployed on the AWS cloud infrastructure and the Kubernetes container orchestration system, as a first example of how to use the features of the platform.
The Observe platform contains many affordances for capturing, processing, and presenting data, with special attention paid to relationships between entities, and to the evolution of such relationships over time.
The main steps along this path are:
Data Capture and Forwarding
All data originate somewhere. They might be messages printed to the console by a batch process, or log files on disk written by a web, application, or database server. They might be events emitted by a small internet-of-things sensor or device, by a machine on a factory floor, or by a cash register in a department store. Data may even already live in a database table, prepared by some other system (this is common for customer profile information, for example). The origin of the data is important to capture as metadata, so that later steps can process the data appropriately. For example, the time zone in effect at the time of capture may matter later, when correlating events across a globally distributed enterprise. The physical host, machine, container, or database the data come from is also important metadata, and is distinct from the entities mentioned within the data. For example, a web server may emit a log statement saying that user X logged in from IP Y to application area Z. The log message contains references to entities X, Y, and Z, but the identity of the web server that emitted the log statement is metadata about the origin, not found within the datum itself.
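To make the distinction between a datum and its origin metadata concrete, here is a minimal Python sketch. The `Origin` and `Observation` types, their field names, and the sample host and path are all hypothetical and chosen for illustration; they are not Observe's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Origin:
    """Metadata about where a datum came from (field names are illustrative)."""
    host: str
    source: str               # e.g. a log file path or a sensor name
    utc_offset_minutes: int   # time zone in effect at capture time


@dataclass(frozen=True)
class Observation:
    """A datum kept verbatim, with origin metadata stored alongside it."""
    data: str
    origin: Origin
    captured_at: datetime


def capture(line: str, origin: Origin, now: datetime) -> Observation:
    # The datum is not parsed or altered here; metadata is only attached.
    return Observation(data=line, origin=origin, captured_at=now)


origin = Origin(host="web-07", source="/var/log/app/access.log",
                utc_offset_minutes=-480)
obs = capture(
    "user X logged in from ip Y to application area Z",
    origin,
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)

# The emitting host is recoverable only from the metadata, not from the datum.
print("web-07" in obs.data)  # False
print(obs.origin.host)       # web-07
```

Keeping the datum as an opaque string means later steps can still decide how to parse it, while the origin fields are already structured.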
The Observe model is agnostic about how data are generated: we don’t have a custom API that a customer has to use to generate data for Observe. Instead, we capture data through whatever means are already available. If a customer wants to use a rich data generation API (such as OpenTracing or Prometheus, or logging full JSON-encoded objects), that’s easy to add using whatever mechanism works best for that customer.
Data Capture and Forwarding
Data are captured either using special collection services known as “agents,” or by pointing data producers directly at collection services. A customer can use industry-standard agents like fluent-bit to capture log files or other data inputs, and can also choose to host one or more instances of the observe-agent for capturing data. The observe-agent is especially useful on systems using Prometheus, as it can query Prometheus endpoints and push the results into the Observe system. The observe-agent also adds metadata about where it’s running and where it’s capturing data.
Observe runs a set of ingestion endpoints in the cloud. For data that don’t come in through Snowflake data sharing, this is the destination where the customer hands them off, after which they can no longer be directly modified by the customer. At this point, information such as which registered customer provided the data, and through which customer-defined integration name, is attached as metadata. Service authentication is also done at this point: data that are not accompanied by proper customer-specific credentials are not accepted into the system.
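The two responsibilities of the ingestion endpoint described above, authenticating the sender and attaching customer metadata, can be sketched as follows. This is an illustration under stated assumptions, not Observe's implementation: the `TOKENS` table, the `Rejected` exception, and the `ingest` function are hypothetical names, and a real endpoint would sit behind HTTPS with a proper credential store.

```python
import hmac

# Hypothetical credential store: token -> customer ID.
TOKENS = {"s3cr3t-token-a": "customer-a"}


class Rejected(Exception):
    """Raised when data arrive without valid credentials."""


def authenticate(token: str) -> str:
    # Compare against each known token in constant time.
    for known, customer_id in TOKENS.items():
        if hmac.compare_digest(known.encode(), token.encode()):
            return customer_id
    raise Rejected("invalid credentials")


def ingest(token: str, integration: str, payload: bytes) -> dict:
    customer_id = authenticate(token)
    # The payload passes through untouched; metadata travel next to it.
    return {
        "metadata": {"customer_id": customer_id, "integration": integration},
        "data": payload,
    }


accepted = ingest("s3cr3t-token-a", "k8s-logs", b'{"msg": "pod started"}')
print(accepted["metadata"])
```

A call with an unknown token raises `Rejected` before any payload is stored, which mirrors the rule that unauthenticated data are not accepted into the system.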
Customer data production may be bursty. This is especially true when new systems are onboarded, and historical data are captured. Additionally, while Observe works to maintain industry-leading uptime, there exists the possibility of an outage on the Observe platform or dependent Snowflake data processing side. To avoid having to reject data provided by customers, all data collected go through a buffer stage, with sufficient storage capacity for several days of data ingest. Under normal circumstances, the queuing latency in this buffer is negligible, but during ingest spikes or temporary capacity outages, this buffer makes sure the data will eventually be processed if they have been accepted.
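The buffering behavior can be illustrated with a small in-memory stand-in. The `IngestBuffer` class below is hypothetical, and a production buffer would be durable and distributed rather than a Python `deque`. The point of the sketch is only that accepted items stay queued while the downstream loader is unavailable, and are processed once it recovers.

```python
from collections import deque


class IngestBuffer:
    """In-memory stand-in for a durable buffer between collection and loading."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._queue = deque()

    def accept(self, item) -> bool:
        # Refuse only when the buffer itself is full.
        if len(self._queue) >= self.capacity:
            return False
        self._queue.append(item)
        return True

    def drain(self, load, max_items: int) -> int:
        """Hand up to max_items to `load`; keep an item if loading it fails."""
        loaded = 0
        while self._queue and loaded < max_items:
            item = self._queue[0]
            try:
                load(item)
            except ConnectionError:
                break  # downstream outage: leave the item queued, retry later
            self._queue.popleft()
            loaded += 1
        return loaded

    def __len__(self):
        return len(self._queue)


buffer = IngestBuffer(capacity=1000)
for i in range(5):
    buffer.accept({"seq": i})


def unavailable(item):
    raise ConnectionError("warehouse unavailable")


stored = []
during_outage = buffer.drain(unavailable, max_items=10)
after_recovery = buffer.drain(stored.append, max_items=10)
print(during_outage, after_recovery, len(buffer))  # 0 5 0
```

An item is removed from the queue only after `load` returns, so a failure part-way through a drain never drops accepted data.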
Data are pulled from the buffer, and loaded into the Snowflake data warehouse, through a process known as the loader. The function of this loader is to collate data arriving for individual customers into per-customer load requests, as well as format and forward data and metadata in a mode suitable for semi-structured SQL processing.
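As a rough sketch of what collating and formatting could look like, the following groups buffered records by customer and serializes each one as a JSON document with data and metadata in separate fields. The record layout and the function names `collate` and `to_rows` are assumptions made for illustration; the actual loader format is not described here.

```python
import json
from collections import defaultdict


def collate(buffered):
    """Group buffered records into one load request per customer."""
    batches = defaultdict(list)
    for record in buffered:
        batches[record["metadata"]["customer_id"]].append(record)
    return dict(batches)


def to_rows(records):
    # One JSON document per row, with data and metadata in separate fields,
    # a shape that semi-structured SQL columns can ingest.
    return [
        json.dumps({"metadata": r["metadata"], "data": r["data"]}, sort_keys=True)
        for r in records
    ]


buffered = [
    {"metadata": {"customer_id": "a", "integration": "k8s"}, "data": "pod started"},
    {"metadata": {"customer_id": "b", "integration": "aws"}, "data": "instance up"},
    {"metadata": {"customer_id": "a", "integration": "k8s"}, "data": "pod stopped"},
]

requests = collate(buffered)
for customer_id, records in requests.items():
    print(customer_id, len(to_rows(records)))
```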
All the stages until now keep a clear separation between “the data” that the customer initially provided, and “the metadata” that were captured around the data. Because the data are loaded into the first permanent store in a form as untouched as possible, it is always possible for the customer to change their mind about how to process data, and go back to the initial data store to apply new processing rules. We call this unmodified original data “evidence.”
Once evidence is loaded into the base layer (which we call the “Observations table” or the “firehose”) the process of refining and shaping it to well-behaved entities with relations starts. When starting with Observe, a customer will get one or more pre-installed transformation configurations, for example for AWS infrastructure or Kubernetes clusters, but the platform allows customers to modify these initial configurations, to extend them with further derived configurations, and to create new basic configurations from scratch.
Transformation is viewed as successive steps of refinement, where datasets are selected, filtered, and processed out of the raw observation stream. For example, a set of rules may select observations from a Kubernetes apiserver that talk about container creation, lifetime, and death, extract the container ID, cluster ID, and other relevant fields out of those log events, and create a dataset called “container events.” A further derived transform may take these container events, identify resource keys in the events (in this case, cluster ID plus cluster-specific container ID), and make the system build a resource out of this set of updates. Those resources are then available to other processing streams that happen to have the same kind of identifier in them, so we can talk about “services running in containers” and so forth.
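The two refinement steps in this example can be sketched in plain Python. In Observe these transforms are written in OPAL; the code below is only an analogy, and the observation layout, field names, and function names are hypothetical.

```python
observations = [
    {"source": "apiserver", "ts": 1,
     "msg": {"kind": "container", "event": "created", "cluster": "c1", "id": "a"}},
    {"source": "nginx", "ts": 2, "msg": {"kind": "access", "path": "/"}},
    {"source": "apiserver", "ts": 3,
     "msg": {"kind": "container", "event": "died", "cluster": "c1", "id": "a"}},
    {"source": "apiserver", "ts": 4,
     "msg": {"kind": "container", "event": "created", "cluster": "c1", "id": "b"}},
]


def container_events(observations):
    """Step 1: select and shape container lifecycle events from the raw stream."""
    for o in observations:
        msg = o["msg"]
        if o["source"] == "apiserver" and msg.get("kind") == "container":
            yield {
                "ts": o["ts"],
                "cluster_id": msg["cluster"],
                "container_id": msg["id"],
                "event": msg["event"],
            }


def build_resources(events):
    """Step 2: fold events into one resource per (cluster ID, container ID)."""
    resources = {}
    for e in sorted(events, key=lambda e: e["ts"]):
        key = (e["cluster_id"], e["container_id"])
        resource = resources.setdefault(key, {"first_seen": e["ts"]})
        resource["state"] = e["event"]
        resource["last_seen"] = e["ts"]
    return resources


events = list(container_events(observations))
resources = build_resources(events)
print(resources[("c1", "a")])
```

The composite key `(cluster_id, container_id)` is what lets other datasets that carry the same identifiers refer to the same container resource.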
The majority of the Observe platform implementation focuses on making all necessary data and metadata available for the transform step, and efficiently implementing the transform step both for pre-configured, and user-configured datasets. Decisions made in this area include anything from how frequently to pre-process incoming data, to whether to process the data only on demand, or accelerate the result of a transform to make it immediately accessible to queries without further processing.
Transforms are described using statements in the temporal algebra query language we created called OPAL. These transforms also run in an environment that is defined for the transforms in question — for example, if a transform joins four different datasets, that transform runs after the transforms creating those datasets have output their results. This is an implementation decision made by choosing to treat stream processing as a never-ending sequence of small batches, which makes processing more efficient than a pure stream-based system.
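The ordering constraint described above, where a transform runs only after the transforms producing its inputs, amounts to a topological sort of the dataset dependency graph, applied once per small batch. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The dataset names, the `run_batch` helper, and the toy transforms are hypothetical and do not reflect how Observe schedules OPAL transforms internally.

```python
from graphlib import TopologicalSorter

# dataset -> datasets it reads from (hypothetical names)
dependencies = {
    "container_events": {"observations"},
    "containers": {"container_events"},
    "service_logs": {"observations"},
    "services_in_containers": {"containers", "service_logs"},
}

# Upstream datasets always appear before the datasets derived from them.
order = list(TopologicalSorter(dependencies).static_order())

transforms = {
    "container_events": lambda r: [
        o for o in r["observations"] if o["kind"] == "container"],
    "containers": lambda r: sorted({e["id"] for e in r["container_events"]}),
    "service_logs": lambda r: [
        o for o in r["observations"] if o["kind"] == "service"],
    "services_in_containers": lambda r: [
        (s["name"], s["id"]) for s in r["service_logs"]
        if s["id"] in r["containers"]],
}


def run_batch(order, transforms, inputs):
    """Run one small batch: each transform sees its upstream results."""
    results = dict(inputs)
    for name in order:
        if name in transforms:
            results[name] = transforms[name](results)
    return results


batch = {"observations": [
    {"kind": "container", "id": "a"},
    {"kind": "service", "id": "a", "name": "checkout"},
    {"kind": "service", "id": "z", "name": "orphan"},
]}
results = run_batch(order, transforms, batch)
print(results["services_in_containers"])  # [('checkout', 'a')]
```

Calling `run_batch` repeatedly on successive slices of input is the "never-ending sequence of small batches" view of stream processing.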
Once a dataset is defined, and if the system decides to accelerate its transform (rather than just remembering the transform rules and applying them on demand when a query is run), one or more tables are created for its results in the Snowflake data warehouse. Tables may be partitioned across attributes, across time, and across functional areas. Frequently changing attributes of a resource, such as metrics, may be stored separately from seldom-changing attributes, like the designated name or CPU type of a host. Datasets may be partitioned in time, to allow for larger datasets without exceeding built-in per-table size limits in the underlying Snowflake database.
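One way to picture the two partitioning ideas is shown below: attributes are classified by how often their value changes, and timestamps are mapped to fixed-width time partitions. Both the change-count heuristic and the one-day partition width are assumptions made for this sketch; the text does not specify how the storage subsystem actually makes these choices.

```python
from collections import defaultdict


def split_by_change_rate(updates, threshold):
    """Classify attributes as fast- or slow-changing by counting value changes."""
    changes = defaultdict(int)
    last = {}
    for update in updates:  # updates are assumed to be ordered by time
        for name, value in update["attrs"].items():
            if name in last and last[name] != value:
                changes[name] += 1
            last[name] = value
    fast = {name for name in last if changes[name] >= threshold}
    return fast, set(last) - fast


def time_partition(ts_seconds, width_seconds=86400):
    """Return the start of the fixed-width time partition containing ts."""
    return ts_seconds - ts_seconds % width_seconds


updates = [
    {"ts": 0, "attrs": {"name": "web-07", "cpu_type": "x86_64", "cpu_load": 0.1}},
    {"ts": 60, "attrs": {"name": "web-07", "cpu_type": "x86_64", "cpu_load": 0.5}},
    {"ts": 120, "attrs": {"name": "web-07", "cpu_type": "x86_64", "cpu_load": 0.9}},
]

fast, slow = split_by_change_rate(updates, threshold=2)
print(sorted(fast), sorted(slow))  # ['cpu_load'] ['cpu_type', 'name']
print(time_partition(90000))       # 86400
```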
To the user, the specific storage chosen by the storage subsystem is not visible, as the datasets present themselves and behave as per their definitions in dataset schema and metadata. However, the correct choice of storage for each part of the dataset has significant efficiency impact.
Once the user wants to query some state of the system being observed, a query is formed on top of the existing datasets, using the OPAL query language. This query is additionally conditioned to be easily presented in the user interface. For example, an OPAL statement that runs as a transform will unconditionally process all matching data, whereas a UI may limit the number of rows presented to something like 1,000 rows, because the user will not be expected to scroll through millions of results, but will instead further aggregate and filter the query to find the results they are interested in.
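The difference between a transform, which processes every matching row, and an interactive query, which is capped for presentation, can be sketched with a single function that takes an optional limit. The `run_query` name and the sample rows are hypothetical; the 1,000-row cap comes from the example in the text.

```python
from itertools import islice


def run_query(rows, predicate, limit=None):
    """Apply a filter; stop early once `limit` matching rows are found."""
    matching = (row for row in rows if predicate(row))
    if limit is None:
        return list(matching)            # transform mode: process everything
    return list(islice(matching, limit))  # UI mode: cap the presented rows


rows = [{"id": i, "status": "error" if i % 2 == 0 else "ok"} for i in range(10_000)]


def is_error(row):
    return row["status"] == "error"


as_transform = run_query(rows, is_error)
for_ui = run_query(rows, is_error, limit=1000)
print(len(as_transform), len(for_ui))  # 5000 1000
```

Because the filter is a generator, the capped query stops scanning as soon as the limit is reached rather than filtering all rows first.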
The queries formulated in the user interface generally do not come from direct user input of OPAL statements, but instead are built by the user using affordances in the user interface, such as “follow link to related dataset” and “show only values in the top-10 list,” or clicking to focus on a specific set of entities or time range.
Another user of the query language is the user interface created for browsing datasets and metadata.
Interactive data exploration benefits from data further conditioned than what a raw processing query can provide. Thus, the presentation includes affordances such as “rolling up” resource states (returning one row per resource instance, with all the states of that resource over time merged into a single column), and “linking” key columns, which shows the name of the target entity in place of the specific key value used to declare a foreign key relationship. The presentation layer also supports calculating summaries and statistics about the columns of data being presented, allowing the user interface to display sparklines, histograms, top-k displays, and other helpful affordances in context with the data being displayed. Because many such summaries cannot be efficiently stream transformed, they are rendered as part of the presentation layer, rather than as part of the underlying tree of transform streams.
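The three presentation affordances mentioned above (rolling up, linking, and top-k summaries) can each be sketched in a few lines of Python. The function names, row layouts, and sample values are hypothetical; they illustrate the shape of the result, not how the presentation layer computes it.

```python
from collections import Counter


def roll_up(states):
    """One entry per resource, with its time-ordered states merged into a list."""
    rolled = {}
    for s in sorted(states, key=lambda s: s["ts"]):
        rolled.setdefault(s["resource"], []).append((s["ts"], s["state"]))
    return rolled


def link(rows, key_column, names):
    """Replace a foreign key value with the display name of its target."""
    return [
        {**row, key_column: names.get(row[key_column], row[key_column])}
        for row in rows
    ]


def top_k(rows, column, k):
    """Most frequent values in a column, with their counts."""
    return Counter(row[column] for row in rows).most_common(k)


states = [
    {"resource": "host-1", "ts": 2, "state": "running"},
    {"resource": "host-1", "ts": 1, "state": "starting"},
    {"resource": "host-2", "ts": 1, "state": "running"},
]
rolled = roll_up(states)

rows = [
    {"container": "a", "host_id": "h-1"},
    {"container": "b", "host_id": "h-1"},
    {"container": "c", "host_id": "h-2"},
]
linked = link(rows, "host_id", {"h-1": "web-07", "h-2": "db-01"})

print(rolled["host-1"])             # [(1, 'starting'), (2, 'running')]
print(linked[0]["host_id"])         # web-07
print(top_k(linked, "host_id", 1))  # [('web-07', 2)]
```

The top-k summary needs the full set of presented rows before it can rank values, which is one reason such summaries fit the presentation layer better than an incremental stream transform.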