pick_col
Aliases: colpick (deprecated).
pick_col [columnbinding: expression]+
Projects the pipeline to an explicit list of columns in argument order, binding each output name to an expression and dropping every input column that is not selected.
Datasets intended as parents of child datasets often use pick_col to pin a stable public schema, preserve explicit output column order, and prevent new upstream columns from automatically appearing in children.
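The projection behavior described above can be modeled in a few lines of Python. This is only an illustrative sketch, not the engine's implementation: `pick_col_model`, the `days` column, and the `quarter` column are hypothetical names introduced for the example. It shows that the output contains only the bound names, in argument order, and that a column added upstream later does not leak into the output.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
Binding = Tuple[str, Callable[[Row], Any]]  # (output name, expression)


def pick_col_model(rows: List[Row], bindings: List[Binding]) -> List[Row]:
    """Hypothetical model: keep only the bound names, in argument order."""
    return [{name: expr(row) for name, expr in bindings} for row in rows]


# Models: pick_col num:month_number, label:string(month_name)
bindings: List[Binding] = [
    ("num", lambda r: r["month_number"]),
    ("label", lambda r: str(r["month_name"])),
]

# "days" and "quarter" are assumed extra columns, not part of the original docs.
months = [{"month_number": 1, "month_name": "January", "days": 31}]
months_v2 = [{**months[0], "quarter": 1}]  # upstream later gains a column

out = pick_col_model(months, bindings)
out_v2 = pick_col_model(months_v2, bindings)
```

Because the bindings list is the whole output schema, `out` and `out_v2` are identical, which is the property that makes pick_col useful for pinning a public schema.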
Time columns and resource keys
If the input defines valid-from and/or valid-to columns, the output must still designate those roles, either by picking or renaming the existing time columns or, where applicable, by supplying a compatible row_timestamp binding. On resource-shaped inputs, every primary-key column must appear in the output, either as a direct column reference or as an equivalent binding.
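As a rough illustration of the rule above, the following Python sketch checks whether a set of bindings still covers the time and primary-key columns. It is a simplified model under stated assumptions: `missing_roles` and the `host_id` / `host_name` columns are hypothetical, and the check only recognizes direct picks and renames, not the row_timestamp case or other equivalent bindings.

```python
from typing import Dict, List, Optional


def missing_roles(
    bindings: Dict[str, Optional[str]],
    time_cols: List[str],
    primary_key: List[str],
) -> List[str]:
    """Return role columns of the input that no binding references directly.

    `bindings` maps each output name to the input column it picks or
    renames, or to None when the binding is a richer expression.
    """
    referenced = {src for src in bindings.values() if src is not None}
    return [col for col in time_cols + primary_key if col not in referenced]


time_cols = ["valid_from", "valid_to"]
primary_key = ["host_id"]  # assumed resource key for the example

# Every role column is picked or renamed, so nothing is missing.
ok = missing_roles(
    {"valid_from": "valid_from", "valid_to": "valid_to",
     "id": "host_id", "name": "host_name"},
    time_cols, primary_key,
)

# Only a label is picked, so both time columns and the key are missing.
bad = missing_roles({"name": "host_name"}, time_cols, primary_key)
```

In this model the first projection passes and the second would be rejected, because it drops the time columns and the key.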
Names and metadata
Duplicate output column names in a single pick_col are rejected. A binding that only renames or reorders an existing column preserves that column’s field metadata when the declared type matches the input column type; richer expressions rebuild metadata from the expression. If required fields for a declared interface are omitted, that interface is dropped from the output metadata.
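The naming and metadata rules can be sketched the same way. The model below is an assumption-laden illustration, not the real implementation: `plan_metadata` and the type names `int64` and `string` are hypothetical. It rejects duplicate output names and reports, per output column, whether field metadata would be preserved (a plain pick or rename with a matching type) or rebuilt (any richer expression).

```python
from typing import Dict, List, Optional, Tuple

# (output name, source column or None for a richer expression, declared type)
Binding = Tuple[str, Optional[str], str]


def plan_metadata(
    bindings: List[Binding], input_types: Dict[str, str]
) -> Dict[str, bool]:
    """Map each output name to True if metadata is preserved, else False."""
    names = [name for name, _, _ in bindings]
    if len(set(names)) != len(names):
        raise ValueError("duplicate output column names")
    return {
        name: src is not None and input_types.get(src) == declared
        for name, src, declared in bindings
    }


input_types = {"month_number": "int64", "month_name": "string"}

# Models: pick_col num:month_number, label:string(month_name)
plan = plan_metadata(
    [("num", "month_number", "int64"), ("label", None, "string")],
    input_types,
)

# Binding the same output name twice is rejected.
try:
    plan_metadata(
        [("num", "month_number", "int64"), ("num", "month_name", "string")],
        input_types,
    )
    rejected = False
except ValueError:
    rejected = True
```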
Use make_col to add columns, drop_col to remove them without full projection, and rename_col when only names change.
Categories
Accelerable
pick_col is accelerable whenever its input is accelerable. A dataset that uses only accelerable verbs can be accelerated, which makes queries on that dataset respond faster.
Examples
pick_col month_number, month_name
Keeps only the two listed columns in argument order, dropping every other field from the shared months table.
pick_col num:month_number, label:string(month_name)
Renames and casts columns while projecting away the rest of the schema so downstream steps see short, explicit names.
pick_col month_name, month_number
Reorders the same two columns by listing month_name first, showing that output column order follows the argument list.