Language syntax

Basic statements, expressions, and queries

OPAL verbs and functions may be combined to perform more complex operations. For example, create a new field with a constructed string value by using make_col and strcat() together:

// New column "location" containing string with both city and country
make_col location:strcat(city, ",", country)

Or calculate a value:

// Convert degrees C to degrees F
make_col temperature_f:(temp*9/5 + 32)

Each line (step) of an OPAL script builds on previous ones, so the city example might continue on like this:

make_col location:strcat(city, ",", country)
filter location="Boston, US"


A standard OPAL query is a linear series of steps, with one or more dataset inputs. Each action contributes to the final output, but the data coming from those inputs can’t be modified in the process.

Subqueries provide more flexibility, as they allow each input to have unique shaping or other pre-processing. Subqueries dramatically reduce the need for intermediate datasets in complex shaping.

The first subquery in a query may filter a dataset one way, the second another, and a third combine those two results with an entirely different dataset and more actions. Together they produce a result, which might be the final result of your OPAL script or passed as an input for additional shaping.

Subqueries are a new feature. If you have questions about how to use them, please let us know!


Subquery containing a single filter

The simplest subquery has one definition block and one result block. A subquery is not required for a basic filter of course, but here it stands in for whatever OPAL your task may require.

In this example, <- @ indicates we are using the default primary input, weather, without explicitly specifying it by name. The final <- @test_a {} invokes @test_a to generate the result.

// Starting from the "weather" dataset, which contains field "city"
@test_a <- @ {
  filter city="Chicago"

// this is the final result of the entire query,
// with subquery @test_a as its input
<- @test_a {}

Either of these blocks may have empty bodies, although without filter city="Chicago" the @test_a subquery would have no effect.

Example Worksheet
Newly-opened Untitled Worksheet showing temperature data for Chicago. The event table has fields timestamp (PST), id, city, description, temp, and country.

You may also explicitly specify the primary input dataset by name:

@test_a <- @weather {
  filter city="Chicago"

For the primary input, you may use either method. Because this is the dataset you originally started with in your Worksheet, Observe can determine what the input dataset should be. Any additional linked datasets, however, must be referenced by name. (See below for examples.)


Spaces and other special characters in dataset names must be HTML escaped (example: %2F for /.) You may find it more convenient to select a dataset from the input menu instead:

OPAL console showing an open menu after the second @ on the first line, with the "Test%20Package%2FWeather%20Events" dataset highlighted.

Subquery results may also be used as an input for any verb that accepts a dataset:

// Filter Countries resources to North America
// Resulting @NACountries is now available to use elsewhere
@NACountries <- @Countries {
	filter country = "US" or country = "MX" or country = "CA"

// Join (inner join) the subquery results with primary input City
// to get the data for just countries in North America
join [email protected]

Queries containing multiple subqueries

Chain multiple subqueries together by using the output of the first as the input to the second:

// Subquery a
@test_a <- @ {
  filter city="Chicago"

// Subquery b
@test_b <- @test_a {
  filter contains(description, "cloud")

// Final result only needs the output from @test_b,
// since that builds on previous operations done in @test_a
<- @test_b {}

Two subqueries with different operations on the primary input, combined using union. Note that the final result block is also a subquery, which may contain additional OPAL statements:

// Subquery a
@test_a <- @Weather {
  filter city="Chicago"

// Subquery b
@test_b <- @Weather {
  filter city="Portland"

// Start with results of @test_a, union with those
// from @test_b, then filter
<- @test_a {
  union @test_b
  filter contains(description, "cloud")

Adding other datasets as inputs

A subquery may use any dataset as an input by linking it. To do this, select + Add New Input from the input menu in the OPAL console:

Input menu, with Add New Input highlighted
// Filter and extract from Observation
@test_a <- @Observation {
  filter (not is_null(EXTRA.poller_type)) and (string(EXTRA.poller_type) = "weather")
  make_col city:string(
  filter city="Chicago"

// Weather/Raw Events also has weather data
@test_b <- @WeatherRawEvents_41074100 {
  make_col city:string(
  filter city="Portland"

// Combine the two, extracting an additional field
@test_c <- @test_a {
  union @test_b
  make_col description:string([0].description)

// Add another filter to the final result
<- @test_c {
  filter contains(description, "cloud")

Additional OPAL syntax details


OPAL allows single line comments beginning with // anywhere whitespace is permitted, except inside a string literal. For multi-line comments, you may also use /* and */ start and end delimiters.

// only need deviceId and label
make_col deviceId:string(FIELDS.deviceInfo.deviceId), label:string(FIELDS.deviceInfo.label)

 * TODO: better handle null deviceID values

filter not is_null(deviceId) // ignore anything without a deviceId

Multi-line statements

Indent to continue a statement on the next line:

// select only the needed fields

Also, regular expressions may be broken into smaller units on multiple lines. Note that each component of a larger regex must itself be a valid regex, and are whitespace delimited:

// these two statements are equivalent
extract_regex data, /(?P<deviceid>[^|]*)\|count:(?P<counts>[^|]*)\|env:(?P<env>[^|]*)/

extract_regex data, /(?P<deviceid>[^|]*)\|/

Field names

In most cases, field (column) names may contain any character except double quote ", period ., colon :, or backslash \.

make_col "ΔT":float64(field3)
make_col "占用率":float64(field4)
make_col "0_3µm":float64(um03)
make_col "✅":bool(done)

To reference a field with non-alphanumeric characters in an OPAL statement, use double quotes and prepend @..

make_col temp_difference:@."ΔT"

Regex extracted columns from extract_regex are limited to alphanumeric characters (A-Z, a-z, 0-9).