Action: extract

Extract data from plain text, using a pattern

No automatic conversion of numbers takes place: captured values are stored as strings. To get typed values, use convert in place of output-fields, which indirectly invokes the convert action. You may use named capture groups instead of output-fields; either way, field names are restricted to letters, digits and underscores.
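The field-name restriction can be checked ahead of time. The following is a minimal Python sketch, not part of the Pipe Language itself; the helper name is_valid_field_name is hypothetical, and it assumes "letters" means ASCII letters, which the engine may or may not enforce.

```python
import re

# Assumption: "letters" means ASCII letters; the engine's exact rule may differ.
FIELD_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_field_name(name: str) -> bool:
    """Return True if name uses only letters, digits and underscores."""
    return FIELD_NAME.fullmatch(name) is not None


candidates = ["m1", "m15", "load_avg", "load-avg", "round trip"]
valid = [name for name in candidates if is_valid_field_name(name)]
```

Here valid keeps m1, m15 and load_avg, and rejects the names containing a hyphen or a space.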

Field Summary

Field Name     Type                          Description                                             Default
condition      expression                    Only run this action if the specified condition is met  -
input-field    field                         Field containing the data                               _raw
remove         bool                          Remove the field after usage                            false
warning        bool                          Warn on non-matching events                             false
drop           bool                          Remove non-matching events                              false
pattern        regex                         The pattern to match on                                 -
output-fields  array of fields               Field names where values are stored                     -
convert        array of (field,type) pairs   Instead of output-fields, invoke convert action         -

Fields

condition

Type: expression

Only run this action if the specified condition is met

input-field

Type: field

Default: _raw

Field containing the data

Example

Input:

{"uptime":" 10:34:51 up  2:06,  1 user,  load average: 0.40, 0.28, 0.23"}

Pipe Language Snippet:

extract:
  input-field: uptime
  remove: true
  pattern: 'load average: (\S+), (\S+), (\S+)'
  output-fields:
    - m1
    - m5
    - m15

Output:

{"m1":"0.40","m5":"0.28","m15":"0.23"}

remove

Type: bool
Alias: remove-field
Default: false

Remove the field after usage

Example: Parse output of uptime command

Input:

10:34:51 up  2:06,  1 user,  load average: 0.40, 0.28, 0.23

Pipe Language Snippet:

extract:
  input-field: _raw
  remove: true
  pattern: 'load average: (\S+), (\S+), (\S+)'
  output-fields:
    - m1
    - m5
    - m15

Output:

{"m1":"0.40","m5":"0.28","m15":"0.23"}

Example: Without removing the input field

Input:

10:34:51 up  2:06,  1 user,  load average: 0.40, 0.28, 0.23

Pipe Language Snippet:

extract:
  input-field: _raw
  pattern: 'load average: (\S+), (\S+), (\S+)'
  output-fields:
    - m1
    - m5
    - m15

Output:

{"_raw":"10:34:51 up  2:06,  1 user,  load average: 0.40, 0.28, 0.23","m1":"0.40","m5":"0.28","m15":"0.23"}
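The effect of remove can be emulated in a few lines of Python. This is a sketch only: the extract function below is hypothetical and is not the engine's implementation. It assumes capture groups are paired with output-fields by position and that events are plain dictionaries.

```python
import re


def extract(event, input_field, pattern, output_fields, remove=False):
    """Hypothetical emulation of the extract action for one event (a dict)."""
    match = re.search(pattern, event.get(input_field, ""))
    if match is None:
        return event  # in this sketch, non-matching events pass through unchanged
    result = dict(event)
    if remove:
        result.pop(input_field, None)
    # Pair each field name with the capture group in the same position.
    for name, value in zip(output_fields, match.groups()):
        if value is not None:
            result[name] = value
    return result


line = "10:34:51 up  2:06,  1 user,  load average: 0.40, 0.28, 0.23"
pattern = r"load average: (\S+), (\S+), (\S+)"
fields = ["m1", "m5", "m15"]

removed = extract({"_raw": line}, "_raw", pattern, fields, remove=True)
kept = extract({"_raw": line}, "_raw", pattern, fields)
```

With remove=True the result holds only m1, m5 and m15; without it, _raw is kept alongside them.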

warning

Type: bool

Default: false

Warn on non-matching events

Example

Input:

PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.060 ms

Pipe Language Snippet:

extract:
  input-field: _raw
  remove: true
  drop: true
  warning: true
  pattern: '(\S+) ms$'
  output-fields:
    - latency

Output:

{"_raw":"PING localhost (127.0.0.1) 56(84) bytes of data."}
{"latency":"time=0.060"}
[WARN] extract: no captures with regex action-extract step 1
LINE: {"_raw":"PING localhost (127.0.0.1) 56(84) bytes of data."}

Example: Without warning

Input:

PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.060 ms

Pipe Language Snippet:

extract:
  input-field: _raw
  remove: true
  pattern: '(\S+) ms$'
  output-fields:
    - latency

Output:

{"_raw":"PING localhost (127.0.0.1) 56(84) bytes of data."}
{"latency":"time=0.060"}

drop

Type: bool

Default: false

Remove non-matching events

Example

Input:

PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.060 ms

Pipe Language Snippet:

extract:
  input-field: _raw
  remove: true
  drop: true
  pattern: '(\S+) ms$'
  output-fields:
    - latency

Output:

{"latency":"time=0.060"}

Example: Without the drop

Input:

PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.060 ms

Pipe Language Snippet:

extract:
  input-field: _raw
  remove: true
  pattern: '(\S+) ms$'
  output-fields:
    - latency

Output:

{"_raw":"PING localhost (127.0.0.1) 56(84) bytes of data."}
{"latency":"time=0.060"}
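How drop and warning treat non-matching lines can be illustrated with a small Python emulation. The function extract_all and the wording of its warning message are hypothetical; the sketch assumes a single capture group and that unmatched lines are otherwise passed through as _raw.

```python
import re
import sys


def extract_all(lines, pattern, field, drop=False, warning=False):
    """Hypothetical emulation: return one dict per surviving input line."""
    regex = re.compile(pattern)
    events = []
    for line in lines:
        match = regex.search(line)
        if match is None:
            if warning:
                # Message format is illustrative, not the engine's.
                print(f"[WARN] no match: {line}", file=sys.stderr)
            if not drop:
                events.append({"_raw": line})
            continue
        events.append({field: match.group(1)})
    return events


ping = [
    "PING localhost (127.0.0.1) 56(84) bytes of data.",
    "64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.060 ms",
]
dropped = extract_all(ping, r"(\S+) ms$", "latency", drop=True)
passed = extract_all(ping, r"(\S+) ms$", "latency")
```

Because (\S+) captures the whole non-space token before " ms", the value is time=0.060. In Python's re, a pattern such as time=(\S+) ms$ would capture only 0.060.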

pattern

Type: regex

The pattern to match on

Example

Input:

num=1
num=2
num=3

Pipe Language Snippet:

extract:
  pattern: 'num=(?P<n>\d+)'

Output:

{"_raw":"num=1","n":"1"}
{"_raw":"num=2","n":"2"}
{"_raw":"num=3","n":"3"}
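Python's re module accepts the same (?P<name>...) syntax for named groups, so the naming behaviour can be tried outside a pipe. This is a sketch under the assumption that each named group becomes a field of the same name; it does not show how the engine itself is implemented.

```python
import re

pattern = re.compile(r"num=(?P<n>\d+)")

events = []
for line in ["num=1", "num=2", "num=3"]:
    event = {"_raw": line}
    match = pattern.search(line)
    if match:
        # groupdict() maps each group name to its captured text (a string).
        event.update(match.groupdict())
    events.append(event)
```

Each event keeps _raw and gains a string field n, since no output-fields list is needed when the groups are named.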

output-fields

Type: array of fields

Field names where values are stored

Example: extract the round trip time for the ping

Input:

PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.060 ms

Pipe Language Snippet:

extract:
  input-field: _raw
  remove: true
  drop: true
  pattern: '(\S+) ms$'
  output-fields:
    - latency

Output:

{"latency":"time=0.060"}

Example: an optional match may or may not set a field

Input:

4-01
02

Pipe Language Snippet:

extract:
  input-field: _raw
  remove: true
  pattern: '(?:(\d+)-)*(\d+)'
  output-fields: [day, hour]

Output:

{"day":"4","hour":"01"}
{"hour":"02"}
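The optional-match behaviour can be reproduced in Python, where a group that did not take part in the match is reported as None. The helper to_fields is hypothetical. The sketch uses a non-capturing outer group, an assumption made so that exactly two groups are numbered and line up with day and hour.

```python
import re

# Outer group is non-capturing, so only the two (\d+) groups are numbered.
PATTERN = re.compile(r"(?:(\d+)-)*(\d+)")


def to_fields(line, names=("day", "hour")):
    """Map capture groups to field names, skipping groups that did not match."""
    match = PATTERN.search(line)
    if match is None:
        return {}
    return {
        name: value
        for name, value in zip(names, match.groups())
        if value is not None
    }


with_day = to_fields("4-01")
without_day = to_fields("02")
```

For "4-01" both fields are set; for "02" the first group is unset, so only hour appears.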

convert

Type: array of (field,type) pairs

Use in place of output-fields: each captured value is passed to the convert action and stored under the given field name with the given type

Example: Parse output of uptime command

Input:

 10:34:51 up  2:06,  1 user,  load average: 0.40, 0.28, 0.23

Pipe Language Snippet:

extract:
  input-field: _raw
  remove: true
  pattern: 'load average: (\S+), (\S+), (\S+)'
  convert:
    - m1: num
    - m5: num
    - m15: num

Output:

{"m1":0.40,"m5":0.28,"m15":0.23}
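The difference between output-fields and convert is that the latter yields typed values rather than strings. The following Python sketch is a rough emulation: extract_convert is a hypothetical name, only the num type is handled, and mapping num to a Python float is an assumption.

```python
import re


def extract_convert(line, pattern, conversions):
    """Hypothetical emulation; conversions is a list of (field, type) pairs."""
    match = re.search(pattern, line)
    if match is None:
        return None
    result = {}
    for (field, kind), value in zip(conversions, match.groups()):
        # Assumption: "num" maps to float; any other type is left as a string.
        result[field] = float(value) if kind == "num" else value
    return result


line = " 10:34:51 up  2:06,  1 user,  load average: 0.40, 0.28, 0.23"
loads = extract_convert(
    line,
    r"load average: (\S+), (\S+), (\S+)",
    [("m1", "num"), ("m5", "num"), ("m15", "num")],
)
```

The values in loads are numbers, so they can be compared or summed without further parsing.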