parse_csv¶
Description¶
Parses an input string as character-separated values, where the default separator is a comma (comma-separated values.) Additionally, can un-quote backslash-quoted entities in the input string, which is a common extension to the base CSV format described in RFC4180.
The return value is an array of strings, or NULL if the input value could not be parsed.
The value
argument is the string to parse. Values are separated by a
particular separator character or short string, which defaults to ‘,’. Values
that contain the separator, or contains double-quote characters, must be
enclosed with double-quotes. Double-quotes within the value must be duplicated.
This follows the conventions of RFC 4180. This value is any valid utf-8 string.
The optional unescape
boolean argument allows you to specify that strings in
double-quotes will also unescape characters like ‘\n’ to newline, ‘\r’ to
carriage return, and \u00XX as a hex character (but high character plane hex
literals (\uXXXX) are not supported.) By default, RFC 4180 CSV files do not
unescape backslashes, so the default is for this feature to be off. The full
set of supported characters are shown in the table below.
The optional separator
string argument allows you to specify some separator
character other than comma, such as space ‘ ‘, semicolon ‘;’, tab ‘\t’ or colon
‘:’, depending on what your input data requires. Emoji and other utf-8 as
separators are supported. The default separator is a comma ‘,’.
Quoted Character |
Meaning |
---|---|
\ |
backslash () |
“ |
double-quote (“) |
b |
backspace (\u0008) |
f |
formfeed (\u000C) |
n |
newline (\u000A) |
r |
carriage return (\u000D) |
t |
horizontal tab (\u0009) |
u00XX |
hexadecimal character value XX |
Return type¶
generic array
Domain¶
This is a scalar function (calculates a single output value for a single input row.)
Categories¶
Usage¶
parse_csv(value, [ unescape ], [ separator ])
Argument |
Type |
Optional |
Repeatable |
Restrictions |
---|---|---|---|---|
value |
string |
no |
no |
none |
unescape |
bool |
yes |
no |
constant |
separator |
string |
yes |
no |
constant |
Examples¶
make_col cols:parse_csv(substring(log, 24))
Parse the comma-separated data starting at offset 24 from the input line ‘log’, into an array of strings. The field separator is comma.
make_col c:parse_csv('3.14;"hello; world!";;true', ';')
Example output:
c |
---|
[“3.14”,”hello; world!’,””,”true”] |