Worksheets

Explore the details of your data in a Worksheet to filter to rows of interest, extract new fields, create a visualization, and more. Since the underlying data doesn’t change, you can try multiple scenarios as you shape and transform.

Worksheet for a Container Resource Set

Figure 1 - Worksheet for a container resource set

Introduction to data modeling

This example uses data from a set of IoT devices. The devices report measurements such as temperature and humidity, when doors open or close, and power usage of appliances. As with other data sources, you can view the data in the Observation Event Stream. You can perform basic searches, but using the spreadsheet-like Worksheet gives you many more options, from searching and filtering to creating additional Event Streams and Resource Sets.

Quick Search dialog, search for Observation and open in a Worksheet

Figure 2 - Quick search

Open a Worksheet button Worksheet for the iot/Raw Events event stream using Quick Search icon Quick Search, located in the left menu.

The Observation Event Stream, open in a Worksheet

Figure 3 - iot/Raw events event stream

This basic Worksheet contains a single Stage of data without any transformations, and appears as an undifferentiated collection of raw observations. To reduce the quantity of data in the Worksheet, filter the data to view the IoT sensor data.

Filtering data

When you configured the HTTP data source initially, the Worksheet included a unique path, smartthings.

Column menu for the Fields column

Figure 4 - Column menu for Fields

To filter on this value, select Filter from the fields column menu to open the Filter dialog.

The Filter  dialog, with the value smartthings-bridge selected

Figure 5 - Filter dialogue options

The Filter dialog allows you to select the data you want to see, either by field name or individual value. Select Value from the dropdown menu and search for smartthings to only display the matching rows.

Adding a second Stage

With the data filtered, OBSERVATION_KIND and EXTRA columns contain redundant information. You could delete them, but you might want to use them to model data at a future date. Click Link New Stage to create a second Stage based on the first one.

Stages progressively build on one another, but are not independent datasets. Every Stage in a Worksheet maintains a history of the actions applied to it, inheriting the state of the parent. All of the data remains in the original dataset. You use Worksheet Stages to display transient views of the data.

Delete Column menu item highlighted in a second Stage

Figure 6 - Delete column

Now you can Delete Column from the column heading menu to remove OBSERVATION_KIND and EXTRA, without affecting the previous Stage.

Creating a dataset

The second Stage now displays only the data of interest, the ingest timestamp and the JSON payload. Give careful consideration to how you want to use the data to gain insights into your devices.

The underlying source for this Worksheet uses the Observation Event Stream, which contains everything from all of your data sources. Even with only a few sources, this can be a large amount of data. Performing operations directly on all of it, every time, doesn’t provide the best performance. Instead, create a new Event Stream dataset, so subsequent operations only apply to the relevant data.

With the second Stage selected, (the right menu should display “Linked from Observation”) click Publish New Event Stream. Create a unique name for the dataset, such as “Worksheet example IoT raw events”, and click Publish.

Observe consolidates the two Stages in the Worksheet into a single Stage, with data from the newly created Event Stream. You don’t lose any history, because Observe maintains in the Event Stream Definition.

Extracting fields

Next, explore the contents of FIELDS:

Extract From JSON dialog, with attribute, deviceId, and value selected

Figure 7 - Extract from JSON

Select Extract from JSON in the FIELDS column menu to create new fields from the JSON payload. For this data, deviceEvent.attribute, deviceEvent.deviceID, and deviceEvent.value provide a good starting point. This gives you the ID of the sensor, the type of reported information, and the value of the reading.

Column heading menu with Remove Empty highlighted

Figure 8 - Remove Empty fields

But some of these Observations don’t provide valid sensor data. To show just the valid readings, remove the null value rows by selecting Remove Empty from the deviceId column menu.

Creating a new Resource Set

These extracted fields provide you with a basic picture of the data, but the raw deviceId doesn’t provide particularly meaningful information. There is a deviceInfo.label in the payload, you could extract a field for that as well. But if you later want to shape this same data in different ways, you need to extract label for each new Worksheet. Instead, create a Device Resource linking the information.

To model each sensor as a Resource, start with a new Worksheet for the “Worksheet example IoT raw events” dataset you created earlier. But this time, extract the fields related to the sensor : deviceInfo.deviceId and deviceinfo.label. Use Remove Empty on deviceId to show only valid values.

At the bottom of the right menu, select Create Resource in the Actions menu to open the Create Resource dialog. If you don’t see it, make sure you don’t have a column selected. Select the label and deviceId fields, and also deviceId as the Primary Key so you have a unique key for other datasets to link information. Click Create. This creates a second Stage for deviceId.

Open menu with Set as Resource Set Label highlighted

Figure 9 - Set as Resource Set

Next, indicate which field contains the Resource names. This allows other Worksheets or Landing Pages to show a meaningful device name instead of deviceId. From the label column menu, select Set as Resource Set Label.

Right rail with Publish New Resource Set button

Figure 10 - Publish new Resource Set

If the list of label values looks good, click Publish New Resource Set to create the Resource Set dataset. Create a unique name such as “Worksheet example IoT Device” and publish.

Linking to another dataset

In the original Worksheet, you can now display a more readable label instead of deviceId.

Right rail with Link to Resource Set dialog

Figure 11 - Link to Resource Set

From the deviceId column menu, select Link to Resource Set and choose the Resource Set you just created. The correct Linked Resource Key should already be set to deviceId. Update the Link Name to give the linked field a more meaningful name, such as “Device”. Click Apply to finish.

The deviceId column is now the Device column, with device names instead of ID values.

The Device field now contains names of devices, instead of device IDs

Figure 12 - Device field with device names

Saving the updated Worksheet (optional)

You now have two Worksheets, an Event Stream, and a Resource Set. The Worksheets use data from the Event Stream and Resource Set, but those datasets no longer depend on the previously created Worksheets. You may save them for later use, or create a new Worksheet next time you need one.

Exploring more complex data

You may have Event Streams and Resource Sets created by other members of your team. In this case, shaping and linking existing datasets may be a more common way to use a Worksheet than working with raw data from a new source.

The Device Events Event Stream, in a Worksheet.

Figure 13 - Device Events in a Worksheet

Here is an example of the same IoT data, modeled as a collection of linked Event Streams, Resource Sets, and Resources. More detailed shaping and linking makes it easier to understand the relationship between different types. From here, you can perform many different types of investigations with the data.

Filtering and visualizing data

Start with a question: When did someone last make coffee? Coffee at Flamingo House starts by turning on the electric kettle to boil water. Starting with a Worksheet for Device Events, filter Device to just the Water Kettle events.

The Device Events Event Stream, showing only the water kettle

Figure 14 - Display only water kettle events

The kettle has two sensors: power and switch. The power sensor reports how much power the kettle uses, and when this value spikes, someone has turned on the kettle. That is a good choice for the next filter.

With only the power readings, the timechart now shows a spike in usage a few hours ago. To see this in more detail, create a visualization.

Select Add Visualization from the More menu and select value for the Y-Axis and Average for Function. Since there’s only one kettle, you don’t have to select a GROUP BY option.

Chart showing a spike in power usage around 8AM

Figure 15 - Power usage around 8 a.m.

The large spike confirms that the kettle became active at approximately 8 a.m..

Exploring linked datasets

Ask a second question: how does the temperature vary throughout the house?

Open a new Worksheet for Device Events, change the time range to Past 24 hours, and filter sensor to just the temperature sensors. Then add a visualization to show temperature in each room.

Chart of temperature by room. Living room temperature varies from 76 F mid-afternoon to 62 F at 6AM

Figure 16 - Living room temperatures

The temperature readings show a clear pattern throughout the day and the rooms slowly cool through the evening and early morning, and then quickly rise around 6:30 a.m.. The living room appears to have the widest range. How do you view just those readings in the visualization? Go back to its parent Stage and filter Room to only show Living Room. The visualization also displays only the living room.

Each value in Room is a link, right click on one and select the Living Room Resource to view other details about the living room. The Living Room Resource Landing Page shows that the living room belongs to, or is a type of, the Resource Set Room. If you want to know more about other rooms, click Room to open the Room Resource Set Landing Page.

Return to the Living Room Resource Landing Page, and the Room Overview board in the Fields tab shows details about the humidity, temperature, and other metrics. The Events tab looks much like the Device Events Event Stream in the Worksheet, although everything here only applies to the Living Room.

While you work with data in a Worksheet, you can jump to other related datasets to investigate a detail, and then return to modeling in the Worksheet.

The OPAL console

So far, all this modeling has only used the UI. But every UI action also generates an OPAL statement, visible at the bottom of the page in the console.

OPAL, the Observe Processing and Analysis Language, is a full featured query and modeling language. You can perform the same UI actions with OPAL statements, or add additional statements to your OPAL script for operations not yet available in the UI. Choose whichever you are most comfortable with.

OPAL statements in the console, filter and col_drop

Figure 17 - OPAL statements in the console

For example, the Filter JSON menu action to filter to smartthings-bridge is equivalent to this OPAL filter statement:

filter (string(EXTRA.path) = "/smartthings-bridge")

For more about OPAL, see Observe Processing and Analysis Language. For additional OPAL examples, see Helpful hints: OPAL.