Worksheets

Explore the details of your data in a Worksheet: filter to rows of interest, extract new fields, create a visualization, and more. Since the underlying data isn’t changed, you can try multiple scenarios as you shape and transform.

Worksheet for a Container Resource Set

Note

A future version of this tutorial will include a sample dataset.

Introduction to data modeling

This example uses data from a set of IoT devices. They report measurements like temperature and humidity, when doors are opened or closed, and power usage of appliances. Like other data sources, the data is visible in the Firehose, also known as the Observation Event Stream. The Firehose has basic search capabilities, but a spreadsheet-like Worksheet gives you many more options, from searching and filtering to creating additional Event Streams and Resource Sets.

Quick Search dialog, search for Observation and open in a Worksheet

Open a Worksheet button Worksheet for the Observation Event Stream using Quick Search icon Quick Search, located in the left rail.

The Observation Event Stream, open in a Worksheet

This basic Worksheet contains a single Stage, a table showing the results of the transformations applied to it. In this case, there are none yet. It is still an undifferentiated collection of raw observations. Since there is so much data here, the first step is to filter to what you want to look at: the IoT sensor data.

Filter

When this HTTP data source was configured, it included a unique path: smartthings-bridge. This became a value for path in the EXTRA field.

Column menu for the EXTRA column

To filter on this value, select Filter JSON from the EXTRA column menu to open the Filter JSON dialog.

The Filter JSON dialog, with the value smartthings-bridge selected

The Filter JSON dialog allows you to select which data you want to see, either by field name or individual value. Select Value from the dropdown and search for smartthings-bridge to show only the matching rows.

Add a second Stage

With the data filtered, OBSERVATION_KIND and EXTRA now contain redundant information. You could delete them right now, but perhaps they might be useful for a later investigation. Click Link New Stage to create a second Stage based on the first one.

Stages progressively build on one another, but they are not themselves independent datasets. Every Stage in a Worksheet maintains a history of the actions applied to it, inheriting the state of the parent. All the data remains in the base dataset. Worksheet Stages are transient views of that data.

Delete Column menu item highlighted in a second Stage

Now you can use Delete Column from the column heading menu to remove OBSERVATION_KIND and EXTRA, without affecting the previous Stage.

Create a dataset

The second Stage now shows only the data of interest: the ingest timestamp and the JSON payload. This is a good point to think about how you might want to use this data.

The underlying source for this Worksheet is the Observation Event Stream, which contains everything from all your data sources. Even with only a few sources, this is an enormous amount of data. Operating directly on all of it, every time, won’t give the best performance. A better option is to create a new Event Stream dataset, so subsequent operations are only applied to the relevant data.

With the second Stage selected (the right rail should say “Linked from Observation”) click Publish New Event Stream. Give this dataset a unique name, such as “Worksheet example IoT raw events”, and click Publish.

The two Stages in the Worksheet are consolidated to a single Stage, with data from the newly created Event Stream. The history is not lost, but is maintained in the Event Stream Definition.

Extract fields

Next, explore the contents of FIELDS:

Extract From JSON dialog, with attribute, deviceId, and value selected

Select Extract from JSON in the FIELDS column menu to create new fields from the JSON payload. For this data, deviceEvent.attribute, deviceEvent.deviceID, and deviceEvent.value are a good place to start. This gives you the ID of the sensor, what type of information it reports, and the value of that reading.

Column heading menu with Remove Empty highlighted

But some of these Observations aren’t actually valid sensor data. To show just the good readings, remove the null value rows by selecting Remove Empty from the deviceId column menu.

Create a new Resource Set

These extracted fields show a basic picture of the data, but the raw deviceId isn’t particularly meaningful. There is a deviceInfo.label in the payload, you could extract a field for that as well. But if you later want to shape this same data in different ways, each new Worksheet would need to extract label every time. A more convenient option is to create a Device Resource each can link to.

To model each sensor as a Resource, start with a new Worksheet for the “Worksheet example IoT raw events” dataset you created earlier. But this time, extract fields related to the sensor itself: deviceInfo.deviceId and deviceinfo.label. Use Remove Empty on deviceId to show only valid values.

At the bottom of the right rail, select Create Resource in the Actions menu to open the Create Resource dialog. (If you don’t see it, make sure you don’t have a column selected.) Select the label and deviceId fields, and also deviceId as the Primary Key so there is a unique key for other datasets to link to. Click Create. This creates a second Stage for deviceId.

Open menu with Set as Resource Set Label highlighted

Next, indicate which field contains the Resource names. This allows other Worksheets or Landing Pages to show a meaningful device name instead of deviceId. From the label column menu, select Set as Resource Set Label.

Right rail with Publish New Resource Set button

If the list of label values look good, click Publish New Resource Set to create the Resource Set dataset. Give it a unique name, like “Worksheet example IoT Device” and publish.

Save the updated Worksheet (optional)

You now have two Worksheets, an Event Stream, and a Resource Set. The Worksheets use data from the Event Stream and Resource Set, but those datasets no longer depend on the Worksheets they were created from. You may save them for later use, or create a new Worksheet next time you need one.

Explore more complex data

You may have Event Streams and Resource Sets created by other members of your team. In this case, shaping and linking existing datasets may be a more common way to use a Worksheet than working with raw data from a new source.

The Device Events Event Stream, in a Worksheet.

Here is an example of the same IoT data, modeled as a collection of linked Event Streams, Resource Sets, and Resources. More detailed shaping and linking makes it easier to understand the relationship between different types. From here, you can perform many different types of investigations with the data.

Filter and visualize

Start with a question: When did someone last make coffee? Coffee at Flamingo House starts by turning on the electric kettle to boil water. Starting with a Worksheet for Device Events, filter Device to just the Water Kettle events.

The Device Events Event Stream, showing only the water kettle

The kettle has two sensors: power and switch. The power sensor reports how much power the kettle is using, so when this value spikes that means someone has turned it on. That is a good choice for the next filter.

With only the power readings, the timechart now shows a spike in usage a few hours ago. To see this in more detail, create a visualization.

Select Add Visualization from the More menu in the upper right and select value for the Y-Axis and Average for Function. Since there’s only one kettle, there is no need to select a GROUP BY option.

Chart showing a spike in power usage around 8AM

The large spike confirms this is when someone turned on the kettle for coffee, at approximately 8AM.

Explore linked datasets

Next question: How does the temperature vary throughout the house?

Open a new Worksheet for Device Events, change the time range to Past 24 hours, and filter sensor to just the temperature sensors. Then add a visualization to show temperature in each room.

Chart of temperature by room. Living room temperature varies from 76 F mid-afternoon to 62 F at 6AM

The temperature readings show a clear pattern throughout the day: slowly cooling through evening and early morning, and then quickly rising around 6:30 in the morning. The living room appears to have the widest range. How do you view just those readings in the visualization? Go back to its parent Stage and filter Room to only show Living Room. The visualization also shows only the living room.

Each value in Room is a link, right click on one and select the Living Room Resource to view other details about the living room. The Living Room Resource Landing Page shows that the living room belongs to, or is a type of, the Resource Set Room. If you want to know more about other rooms, click Room to open the Room Resource Set Landing Page.

Back on the Living Room Resource Landing Page, the Room Overview board in the Fields tab shows details about the humidity, temperature, and other metrics. The Events tab looks much like the Device Events Event Stream in the Worksheet, although everything here only applies to the Living Room.

While you work with data in a Worksheet, you can jump to other related datasets to investigate a detail, and then return to modeling in the Worksheet.

The OPAL console

So far, all this modeling has only used the UI. But every UI action also generates an OPAL statement, visible at the bottom of the page in the console.

OPAL, the Observe Processing and Analysis Language, is a full featured query and modeling language. You can perform the same UI actions with OPAL statements, or add additional statements to your OPAL script for operations not yet available in the UI. Choose whichever you are most comfortable with.

OPAL statements in the console, filter and col_drop

For example, the Filter JSON menu action to filter to smartthings-bridge is equivalent to this OPAL filter statement:

filter (string(EXTRA.path) = "/smartthings-bridge")

For more about OPAL, see Observe Processing and Analysis Language. For additional OPAL examples, see Helpful hints: OPAL. Please contact us if you have questions about data modeling or OPAL, and look for more more resources in the near future.