Data Export

Warning

Data Export is an advanced feature currently in Public Preview. Please file a support ticket if you wish to participate in the preview program. We are also rolling out to new regions regularly, so check whether your region is currently supported.

Overview

Observe provides industry-leading retention, but there are scenarios where you may wish to move data in Observe to an AWS S3 bucket that is owned by your organization. You can use Observe's Data Export feature to move data automatically from any Event Dataset to an S3 bucket via two distinct job types: duplicate data and post-retention data.

S3 Data Export is a good fit for the following scenarios:

  • You need to retain data for compliance purposes, but it does not need to be searchable.

  • You need to share subsets of Observe data with other teams.

  • You need post-ingestion portability for your data.

When using Data Export, be aware of the following prerequisites:

  • The S3 bucket must be in the same region as your Observe tenant.

  • Exports can lag event arrival by up to 2 hours.

  • You can only export data from Event Datasets.

  • Hibernation cannot be applied to datasets that are associated with an S3 Export job.
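
A simple pre-flight check can catch most of these prerequisites before you create a job. The sketch below is illustrative only; the job dictionary shape and field names are assumptions for this example, not an Observe API.

```python
# Hypothetical pre-flight check mirroring the prerequisites above.
# The `job` dict shape is invented for illustration, not an Observe API.
def check_export_prereqs(job: dict, tenant_region: str) -> list:
    """Return a list of prerequisite violations; an empty list means OK."""
    problems = []
    if job.get("bucket_region") != tenant_region:
        problems.append("S3 bucket must be in the same region as the Observe tenant")
    if job.get("dataset_kind") != "Event":
        problems.append("only Event Datasets can be exported")
    if job.get("hibernation_enabled"):
        problems.append("hibernation cannot be applied to exported datasets")
    return problems
```

Running the check before job creation avoids waiting for the job to enter a Failed state after the fact.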

Export Jobs

To create an export job, navigate to the Account Settings page of your Observe tenant and select the Data export option from the left-nav. Export jobs support either NDJSON format (gzip compression) or Parquet format (snappy compression). All export job names must be unique, and all export Destination values must have a trailing / character. There are two export job types:

Duplicate data

This export job continuously copies data from the selected Dataset to the designated S3 bucket, starting at job creation and exporting new data as it arrives.

Post-retention data

This export job copies data from the selected Dataset to the designated S3 bucket after the data has reached its retention limit.
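
Exported NDJSON objects are gzip-compressed, with one JSON document per line, so they can be read with nothing but the standard library. A minimal reader sketch (the file path is illustrative; the exact object naming scheme in your bucket may differ):

```python
import gzip
import json

def read_ndjson_export(path: str) -> list:
    """Read a gzip-compressed NDJSON export file: one JSON object per line."""
    records = []
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip any blank lines
                records.append(json.loads(line))
    return records
```

Parquet objects, by contrast, require a columnar reader such as pyarrow; they are a better fit when downstream tools consume the exports analytically.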

Export Job Creation

When configuring your export job, ensure that your S3 bucket is in the same region as your Observe tenant. Each Observe region also has a unique IAM Role value that you must add to your S3 bucket policy.

Figure 1 - Example Export Job Creation

S3 Policy Configuration

For Observe to write data to your S3 bucket, attach the following bucket policy to the bucket. Replace the Principal value with the Role value from the Observe UI, and replace <YOUR BUCKET NAME> in the Resource ARNs with your bucket name.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "<ROLE VALUE FROM THE OBSERVE EXPORT JOB CREATION MODAL>"
            },
            "Action": [
                "s3:GetBucketLocation",
                "s3:GetObject",
                "s3:GetObjectVersion",
                "s3:ListBucket",
                "s3:DeleteObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::<YOUR BUCKET NAME>/*",
                "arn:aws:s3:::<YOUR BUCKET NAME>"
            ]
        }
    ]
}
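
Because both the role ARN and the bucket name appear in the template, it can be convenient to generate the policy programmatically rather than editing the JSON by hand. A stdlib-only sketch that fills in the two placeholders from the policy above:

```python
import json

# The six actions required by the bucket policy shown above.
REQUIRED_ACTIONS = [
    "s3:GetBucketLocation",
    "s3:GetObject",
    "s3:GetObjectVersion",
    "s3:ListBucket",
    "s3:DeleteObject",
    "s3:PutObject",
]

def render_bucket_policy(role_arn: str, bucket: str) -> str:
    """Return the bucket policy above with the role and bucket filled in."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": REQUIRED_ACTIONS,
                "Resource": [
                    f"arn:aws:s3:::{bucket}/*",  # objects in the bucket
                    f"arn:aws:s3:::{bucket}",    # the bucket itself
                ],
            }
        ],
    }
    return json.dumps(policy, indent=4)
```

The rendered string can be passed directly to the AWS console's bucket-policy editor or to `aws s3api put-bucket-policy`.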

Export Job Details

All export jobs have the following attributes:

  • State: Active or Failed. You can hover over the Failed state to learn more about the failure reason.

  • Job / Description: The name and description of the export job.

  • Dataset: The name of the dataset you are exporting.

  • Destination: The S3 bucket to which you are exporting your data.

  • Older than: The age at which data is exported, measured from its arrival in the dataset.

  • Earliest timestamp: The timestamp of the oldest event that has been exported.

  • Latest export: The timestamp of the most recent event that has been exported.

  • Created by: The name of the user who created the export job, and the export job creation date.

Export Job State

Export jobs can be in either an Active or Failed state. Hovering over the state pill provides additional details about why an export job has failed. If your job has failed because your bucket is in the wrong region or your bucket policy is misconfigured, you can fix the issue and then retry the job from the failure tooltip. For example, if your export job fails due to a misconfigured bucket policy, update the policy and then retry. Note that it can take up to 90 seconds for the export job state to reflect Failed or Active after creation or retry.
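
If you automate job creation, that 90-second settling window means you should poll rather than check the state once. A generic polling sketch; `get_state` here is a caller-supplied function (there is no documented Observe client assumed by this example):

```python
import time

def wait_for_job_state(get_state, timeout_s: float = 90.0, poll_s: float = 5.0) -> str:
    """Poll get_state() until it returns 'Active' or 'Failed', or until timeout_s elapses.

    get_state is a zero-argument callable supplied by the caller; how it fetches
    the state (UI, API, etc.) is outside the scope of this sketch.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        state = get_state()
        if state in ("Active", "Failed"):
            return state
        time.sleep(poll_s)  # job may still be settling; wait and re-check
    raise TimeoutError(f"export job did not settle within {timeout_s:.0f} seconds")
```

A 90-second default timeout matches the settling window noted above.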

Figure 2 - Failed Export Job

Export Job Deletion

To delete an export job, hover over the row of the job you want to delete and click the trash-can icon on the right-hand side. You will be presented with a confirmation dialog. Note that deleting an export job does not delete data that has already been exported.

Figure 3 - Delete Export Job