This page explains how to use the parse operator in APL.
The parse operator in APL enables you to extract and structure information from unstructured or semi-structured text data, such as log files or strings. You can use the operator to specify a pattern for parsing the data and define the fields to extract. This is useful when analyzing logs, tracing information from text fields, or extracting key-value pairs from message formats.
The parse operator is helpful when you need to process raw text fields and convert them into a structured format for further analysis. It’s particularly effective when working with data that doesn’t conform to a fixed schema, such as log entries or custom messages.
If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
Splunk SPL users
In Splunk, the rex command is often used to extract fields from raw events or text. In APL, the parse operator performs a similar function. You define the text pattern to match and the fields to extract, which turns unstructured strings into structured data.
ANSI SQL users
In ANSI SQL, there isn’t a direct equivalent to the parse operator. Typically, you use string functions such as SUBSTRING or REGEXP to extract parts of a text field. However, APL’s parse operator simplifies this process by allowing you to define a text pattern and extract multiple fields in a single statement.
- kind: Optional parameter to specify the parsing mode. Its value can be simple for exact matches, regex for regular expressions, or relaxed for relaxed parsing. The default is simple.
- Expression: The string expression to parse.
- StringConstant: A string literal or regular expression pattern to match against.
- FieldName: The name of the field to assign the extracted value.
- FieldType: Optional parameter to specify the data type of the extracted field. The default is string.
- *: Wildcard to match any characters before or after the StringConstant.
- ...: You can specify additional StringConstant and FieldName pairs to extract multiple values.

The parse operator returns the input dataset with new fields added based on the specified parsing pattern. The new fields contain the extracted values from the parsed string expression. If the parsing fails for a particular row, the corresponding fields have null values.
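To make the roles of the parameters concrete, the following Python sketch emulates the behavior described above for simple mode. It is an illustration only, not Axiom's implementation: the `parse_simple` function, the `WILDCARD` marker, and the list-based pattern format are hypothetical names introduced for this example, and the exact matching rules of APL may differ.

```python
import re

WILDCARD = object()  # stands in for the * token of a parse pattern


def parse_simple(text, pattern):
    """Emulate a simple-mode parse over one string.

    `pattern` is a list whose items are either WILDCARD, a string
    literal (StringConstant), or a (FieldName, FieldType) tuple.
    Returns a dict of extracted fields. Every field is None when the
    text does not match or a value cannot be converted to its type.
    """
    regex = ""
    fields = []
    for part in pattern:
        if part is WILDCARD:
            regex += ".*?"            # skip any characters
        elif isinstance(part, str):
            regex += re.escape(part)  # literal text that must be present
        else:
            fields.append(part)
            regex += "(.*?)"          # capture the value of one field
    nulls = {name: None for name, _ in fields}
    match = re.fullmatch(regex, text, flags=re.DOTALL)
    if match is None:
        return nulls
    try:
        return {
            name: cast(raw)
            for (name, cast), raw in zip(fields, match.groups())
        }
    except ValueError:
        return nulls


pattern = [WILDCARD, "duration=", ("req_duration_ms", int)]
print(parse_simple("/api/v1/resource?duration=200", pattern))
# {'req_duration_ms': 200}
```

A row that doesn't contain the literal `duration=` yields `{'req_duration_ms': None}`, which mirrors the null values described above for rows where parsing fails.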
For log analysis, you can extract the HTTP request duration from the uri field using the parse operator.
Query
Output
_time | req_duration_ms | uri |
---|---|---|
2024-10-18T12:00:00 | 200 | /api/v1/resource?duration=200 |
2024-10-18T12:00:05 | 300 | /api/v1/resource?duration=300 |
This query extracts req_duration_ms from the uri field and projects the timestamp, the extracted duration, and the original URI for each HTTP request.
In OpenTelemetry traces, the parse operator is useful for extracting components of trace data, such as the service name or status code.
Query
Output
_time | service.name | trace_id |
---|---|---|
2024-10-18T12:00:00 | frontend | a1b2c3d4-frontend |
2024-10-18T12:01:00 | cartservice | e5f6g7h8-cartservice |
This query extracts service.name from the trace_id field and projects the timestamp, the service name, and the trace ID for each trace.
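The query for this example isn't shown here, so the following Python sketch only illustrates the kind of split that would produce the output table. It assumes the service name is everything after the first hyphen in trace_id. The function name `service_from_trace_id` is hypothetical.

```python
def service_from_trace_id(trace_id):
    """Return the text after the first hyphen, or None if there is no hyphen.

    Assumption: trace IDs look like '<id>-<service name>', as in the
    sample output rows.
    """
    _, separator, service = trace_id.partition("-")
    return service if separator else None


for trace_id in ["a1b2c3d4-frontend", "e5f6g7h8-cartservice"]:
    print(trace_id, "->", service_from_trace_id(trace_id))
```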
For security logs, you can use the parse operator to extract status codes and the method of HTTP requests.
Query
Output
_time | method | status |
---|---|---|
2024-10-18T12:00:00 | GET | 200 |
2024-10-18T12:00:05 | POST | 404 |
This query extracts the HTTP method and status from the method field and shows them along with the timestamp.
This example parses the content_type field to extract the datatype and format values separated by a /. The extracted values are projected as separate fields.
Original string
Query
Output
This example parses the user_agent field to extract the operating system name (os_name) and version (os_version) enclosed within parentheses. The extracted values are projected as separate fields.
Original string
Query
Output
This example parses the uri field to extract the endpoint value that appears after /api/v1/. The extracted value is projected as a new field.
Original string
Query
Output
This example demonstrates how to parse the id field into three parts: region, tenant, and userId. The id field is structured with these parts separated by hyphens (-). The extracted parts are projected as separate fields.
Original string
Query
Output
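As a rough analogue of the hyphen-separated example above, this Python sketch splits an id into region, tenant, and userId. The sample value `usw-acme-12345` and the function name `parse_id` are hypothetical, and the sketch assumes the first two parts contain no hyphens, so any further hyphens stay in userId.

```python
def parse_id(identifier):
    """Split '<region>-<tenant>-<userId>' into a dict of three fields.

    All fields are None when fewer than three parts are present,
    mirroring the null values of a failed parse.
    """
    parts = identifier.split("-", 2)  # at most three pieces
    if len(parts) != 3:
        return {"region": None, "tenant": None, "userId": None}
    region, tenant, user_id = parts
    return {"region": region, "tenant": tenant, "userId": user_id}


print(parse_id("usw-acme-12345"))
# {'region': 'usw', 'tenant': 'acme', 'userId': '12345'}
```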
The parse operator supports a relaxed mode that allows for more flexible parsing. In relaxed mode, Axiom treats the parsing pattern as a regular string and matches results in a relaxed manner. If some parts of the pattern are missing or do not match the expected type, Axiom assigns null values.
This example parses the log field into four separate parts (method, url, status, and responseTime) based on a structured format. The extracted parts are projected as separate fields.
Original string
Query
Output
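The following Python sketch illustrates the relaxed behavior described above: a part that is missing or has the wrong type becomes null while the other fields are still extracted. The space-separated log format, the integer types for status and responseTime, and the names `parse_relaxed` and `to_int_or_none` are assumptions made for this illustration, not details taken from the example's query.

```python
def to_int_or_none(value):
    """Convert to int, returning None for missing or non-numeric input."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_relaxed(log):
    """Parse '<method> <url> <status> <responseTime>' leniently."""
    parts = log.split(" ")
    parts += [None] * (4 - len(parts))  # pad parts that are missing
    method, url, status, response_time = parts[:4]
    return {
        "method": method,
        "url": url,
        "status": to_int_or_none(status),
        "responseTime": to_int_or_none(response_time),
    }


print(parse_relaxed("GET /api/items 200 35"))
print(parse_relaxed("GET /api/items OK 35"))  # status is not a number
```

In the second call, only status is None. The method, url, and responseTime fields keep their values, which is the difference from a strict match that would null every field.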
The parse operator supports a regex mode that allows you to parse using regular expressions. In regex mode, Axiom treats the parsing pattern as a regular expression and matches results based on the specified regex pattern.
This example demonstrates how to parse Kubernetes pod log entries using regex mode to extract various fields such as podName, namespace, phase, startTime, nodeName, hostIP, and podIP. The parsing pattern is treated as a regular expression, and the extracted values are assigned to the respective fields.
Original string
Query
Output
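To show the idea behind regex mode, this Python sketch extracts three of the fields with named capture groups. The log line format (`podName=... namespace=... phase=...`) and the names `POD_PATTERN` and `parse_pod_log` are hypothetical. Python's `re` module also differs in details from the regular expression engine that APL uses, so treat this as an analogy rather than a translation.

```python
import re

# Hypothetical log format: key=value pairs separated by single spaces.
POD_PATTERN = re.compile(
    r"podName=(?P<podName>\S+) "
    r"namespace=(?P<namespace>\S+) "
    r"phase=(?P<phase>\S+)"
)


def parse_pod_log(line):
    """Return the named groups as a dict, with None values if no match."""
    match = POD_PATTERN.search(line)
    if match is None:
        return dict.fromkeys(POD_PATTERN.groupindex)
    return match.groupdict()


print(parse_pod_log("podName=web-1 namespace=prod phase=Running"))
# {'podName': 'web-1', 'namespace': 'prod', 'phase': 'Running'}
```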
When using the parse operator, consider the following best practices:
By following these best practices and understanding the capabilities of the parse operator, you can effectively extract and transform data from string fields in APL, enabling powerful querying and insights.
- Use the extend operator when you want to add calculated fields without parsing text.
- Use project to select and rename fields after parsing text.
- Use extract to retrieve the first substring matching a regular expression from a source string.
- Use extract_all to retrieve all substrings matching a regular expression from a source string.