Example: Using Unified Hosts-Domains and URL Threatlists#

In this example, you explore using the Unified Hosts-Domains Threatlist and Unified URL Threatlist against data that contains Web activity URLs.

You cover the following topics in this tutorial:

Shaping the Data#

You have a Dataset with a FIELDS column with the following data:

{"source":"weblogs","timestamp":"1684950343.405","url":"https://docs.observeinc.com/en/latest/content/common-topics/GS-observe.html"}
Example Data URL Activity

Figure 1 - Example Data URL Activity

  1. Click on Untitled Worksheet and rename it to Example/Host-Domain-URL IOC Matches.

  2. Click on Stage 1 and rename it to Form Data.

Extracting Fields from the JSON Payload#

3. Open the menu for the FIELDS column and select Extract from JSON.

4. From the right menu, select the following fields. Some fields may be in nested objects:

  • source

  • timestamp

  • url

5. Click Apply.

The extraction creates the following OPAL:

// Note we use a lower function on the url to avoid case matching issues later
make_col source:string(FIELDS.source),
    timestamp:string(FIELDS.timestamp),
    url:lower(string(FIELDS.url))

Forming timestamp from Event Data#

6. You want to designate the timestamp, in Epoch seconds, from the data as the event timestamp. The JSON extraction creates a string by default. You want to change it to a valid timestamp object.

Change the string function to the from_seconds function to create a proper timestamp object type.

// Extract fields from JSON
make_col source:string(FIELDS.source),
    timestamp:from_seconds(FIELDS.timestamp),
    url:lower(string(FIELDS.url)) 

7. Add the following OPAL to set a set_valid_from. For better performance, don’t use a new value that deviates too much from the original ingest time.

// Set valid from for the data
set_valid_from options(max_time_diff:4h), timestamp

Forming Host Field from the URL Field#

8. Next you extract the host value from the url because you want to check this value as well as the full url value against the Threat Intelligence IOCs. Add the following OPAL code:

// extract host from url
extract_regex url, /\/(?P<url_host>[^\/:]+)[:\/]?/

9. Add the following OPAL to reduce the columns to only the fields of interest.

// Keep fields of interest
pick_col timestamp,
    source,
    url,
    url_host

10. Click RUN.

Formed Example Data URL Activity

Figure 2 - Formed Example Data URL Activity

11. Click Save Worksheet.

Matching the url_host Field to the IOCs#

  1. Minimize the OPAL console so you can select Link New Stage.

  2. Rename the new Stage Host IOC Matching.

  3. Open the OPAL console.

  4. Click the Inputs tab.

  5. Search and add Unified Hosts-Domains Threatlist.

  6. Note the name of the added Input as you use it in the join statement.

Inputs Host Value IOC Matches

Figure 3 - Inputs Host Value IOC Matches

6. Click the OPAL tab and add the following OPAL code.

// Note we use the lower() function to ensure values match consistently

join on(lower(url_host)=lower(@"Threat_Intel_Basic/Unified Hosts-Domains Threatlist".tip_domain)),
  tip_match:@"Threat_Intel_Basic/Unified Hosts-Domains Threatlist".tip_domain,
  tip_provider:@"Threat_Intel_Basic/Unified Hosts-Domains Threatlist".tip_provider,
  tip_provider_id:@"Threat_Intel_Basic/Unified Hosts-Domains Threatlist".tip_provider_url,
  tip_severity:@"Threat_Intel_Basic/Unified Hosts-Domains Threatlist".tip_severity,
  tip_tags:@"Threat_Intel_Basic/Unified Hosts-Domains Threatlist".tip_tags,
  tip_source:@"Threat_Intel_Basic/Unified Hosts-Domains Threatlist".tip_source,
  tip_tlp:@"Threat_Intel_Basic/Unified Hosts-Domains Threatlist".tip_tlp,
  tip_category:@"Threat_Intel_Basic/Unified Hosts-Domains Threatlist".tip_category,

  tip_match_field:"url_host"

7. Click Run.

8. Click Save worksheet.

Host Value IOC Matches

Figure 4 - Host Value IOC Matches

Matching the url Field to IOCs#

  1. Minimize the OPAL console.

  2. Return to the Stage Form Data and select Link New Stage. This links you to the original data.

  3. Rename the new Stage URL IOC Matching.

  4. Open the OPAL console.

  5. Click Inputs.

  6. Search and add the Unified URL Threatlist dataset.

  7. Note the name of the added Input as you use in the join statement.

Inputs URL Value IOC Matches

Figure 5 - Inputs URL Value IOC Matches

8. Click the OPAL tab and update it using the following OPAL code.

// Note we use the lower() function to ensure values match consistently

join on(lower(url)=lower(@"Threat_Intel_Basic/Unified URL Threatlist".tip_url)),
  tip_match:@"Threat_Intel_Basic/Unified URL Threatlist".tip_url,
  tip_provider:@"Threat_Intel_Basic/Unified URL Threatlist".tip_provider,
  tip_provider_id:@"Threat_Intel_Basic/Unified URL Threatlist".tip_provider_url,
  tip_severity:@"Threat_Intel_Basic/Unified URL Threatlist".tip_severity,
  tip_tags:@"Threat_Intel_Basic/Unified URL Threatlist".tip_tags,
  tip_source:@"Threat_Intel_Basic/Unified URL Threatlist".tip_source,
  tip_tlp:@"Threat_Intel_Basic/Unified URL Threatlist".tip_tlp,
  tip_category:@"Threat_Intel_Basic/Unified URL Threatlist".tip_category,
  tip_match_field:"url"
URL Value IOC Matches

Figure 6 - URL Value IOC Matches

Merging url and url_host IOC Matches#

You now have two different stages from the main data matching each unified list.

  1. To merge the results, locate the stage URL IOC Matching and select Link New Stage.

  2. Rename the new Stage Merged Results.

  3. On the OPAL console, click Inputs.

  4. Search and add the Host IOC Matching.

  5. Note the name of the added Input as you use it in the union statement.

Inputs Merged IOC Matches

Figure 7 - Inputs Merged IOC Matches

6. Add the following OPAL to combine the results in the stage.

//merge in the host IOC matches
union @"Host IOC Matching_-vlua"

// use a statsby to reduce to what all matches by the URL in case you get matches on host and full url

statsby timestamp:last(timestamp),
  tip_match:string_agg_distinct(tip_match,","),
  tip_match_field:string_agg_distinct(tip_match_field,","),
  tip_provider:string_agg_distinct(tip_provider,","),
  tip_provider_id:string_agg_distinct(tip_provider_id,","),
  tip_severity:string_agg_distinct(tip_severity,","),
  tip_tags:array_agg_distinct(tip_tags),
  tip_source:string_agg_distinct(tip_source,","),
  tip_tlp:string_agg_distinct(tip_tlp,","),
  tip_category:string_agg_distinct(tip_category,","),
  group_by(url)
Merged Value IOC Matches

Figure 8 - Merged IOC Matches

You have now completed the usage example. You could also use this Worksheet to make a Dataset and subsequent Monitor if so desired.