String operators
Axiom processing language provides you with different query operators for searching string data types. Below are the list of string operators we support on Axiom processing language. Note: The following abbreviations are used in the table below:- RHS = right hand side of the expression.
- LHS = left hand side of the expression.
- instead of
=~, use== - instead of
in~, usein - instead of
contains, usecontains_cs
| Operator | Description | Case-Sensitive | Example |
|---|---|---|---|
| == | Equals | Yes | "aBc" == "aBc" |
| != | Not equals | Yes | "abc" != "ABC" |
| =~ | Equals | No | "abc" =~ "ABC" |
| !~ | Not equals | No | "aBc" !~ "xyz" |
| contains | RHS occurs as a subsequence of LHS | No | parentSpanId contains Span |
| !contains | RHS doesn’t occur in LHS | No | parentSpanId !contains abc |
| contains_cs | RHS occurs as a subsequence of LHS | Yes | parentSpanId contains_cs “Id” |
| !contains_cs | RHS doesn’t occur in LHS | Yes | parentSpanId !contains_cs “Id” |
| startswith | RHS is an initial subsequence of LHS | No | parentSpanId startswith parent |
| !startswith | RHS isn’t an initial subsequence of LHS | No | parentSpanId !startswith “Id” |
| startswith_cs | RHS is an initial subsequence of LHS | Yes | parentSpanId startswith_cs “parent” |
| !startswith_cs | RHS isn’t an initial subsequence of LHS | Yes | parentSpanId !startswith_cs “parent” |
| endswith | RHS is a closing subsequence of LHS | No | parentSpanId endswith “Id” |
| !endswith | RHS isn’t a closing subsequence of LHS | No | parentSpanId !endswith Span |
| endswith_cs | RHS is a closing subsequence of LHS | Yes | parentSpanId endswith_cs Id |
| !endswith_cs | RHS isn’t a closing subsequence of LHS | Yes | parentSpanId !endswith_cs Span |
| in | Equals to one of the elements | Yes | abc in (“123”, “345”, “abc”) |
| !in | Not equals to any of the elements | Yes | ”bca” !in (“123”, “345”, “abc”) |
| in~ | Equals to one of the elements | No | ”abc” in~ (“123”, “345”, “ABC”) |
| !in~ | Not equals to any of the elements | No | ”bca” !in~ (“123”, “345”, “ABC”) |
| !matches regex | LHS doesn’t contain a match for RHS | Yes | parentSpanId !matches regex g.*r |
| matches regex | LHS contains a match for RHS | Yes | parentSpanId matches regex g.*r |
| has | RHS is a whole term in LHS | No | Content Type has text |
| has_cs | RHS is a whole term in LHS | Yes | Content Type has_cs Text |
Use string operators efficiently
String operators are fundamental in comparing, searching, or matching strings. Understanding the performance implications of different operators can significantly optimize your queries. Below are performance tips and query examples.Equality and Inequality Operators
- Operators:
==,!=,=~,!~,in,!in,in~,!in~
- Use
==or!=for exact match comparisons when case sensitivity is important, as they are faster. - Use
=~or!~for case-insensitive comparisons, or when the exact case is unknown. - Use
inor!infor checking membership within a set of values, which can be efficient for a small set of values.
Subsequence Matching Operators
- Operators:
contains,!contains,contains_cs,!contains_cs,startswith,!startswith,startswith_cs,!startswith_cs,endswith,!endswith,endswith_cs,!endswith_cs.
- Use case-sensitive operators (
contains_cs,startswith_cs,endswith_cs) when the case is known, as they are faster.
Regular Expression Matching Operators
- Operators:
matches regex,!matches regex
- Avoid complex regular expressions or use string operators for simple substring, prefix, or suffix matching.
Term Matching Operators
- Operators:
has,has_cs
- Use
hasorhas_csfor term matching which can be more efficient than regular expression matching for simple term searches. - Use
has_cswhen the case is known, as it is faster due to case-sensitive matching.
Best Practices
- Always use case-sensitive operators when the case is known, as they are faster.
- Avoid complex regular expressions for simple matching tasks; use simpler string operators instead.
- When matching against a set of values, ensure the set is as small as possible to improve performance.
- For substring matching, prefer prefix or suffix matching over general substring matching for better performance.
has operator
Thehas operator in APL filters rows based on whether a given term or phrase appears within a string field.
Importance of the has operator:
-
Precision Filtering: Unlike the
containsoperator, which matches any substring, thehasoperator looks for exact terms, ensuring more precise results. - Simplicity: Provides an easy and readable way to find exact terms in a string without resorting to regex or other more complex methods.
has operators using the abbreviations provided:
- RHS = right-hand side of the expression
- LHS = left-hand side of the expression
| Operator | Description | Case-Sensitive | Example |
|---|---|---|---|
| has | Right-hand-side (RHS) is a whole term in left-hand-side (LHS) | No | ”North America” has “america” |
| has_cs | RHS is a whole term in LHS | Yes | ”North America” has_cs “America” |
| hassuffix | LHS string ends with the RHS string | No | ”documentation.docx” hassuffix “.docx” |
| hasprefix | LHS string starts with the RHS string | No | ”Admin_User” hasprefix “Admin” |
| hassuffix_cs | LHS string ends with the RHS string | Yes | ”Document.HTML” hassuffix_cs ”.HTML” |
| hasprefix_cs | LHS string starts with the RHS string | Yes | ”DOCS_file” hasprefix_cs “DOCS” |
Syntax
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| Field | string | ✓ | The field filters the events. |
| Expression | scalar or tabular | ✓ | An expression for which to search. The first field is used if the value of the expression has multiple fields. |
Returns
Thehas operator returns rows from the dataset where the specified term is found in the given field. If the term is present, the row is included in the result set; otherwise, it is filtered out.
Example
Output
| event_count | content_type |
|---|---|
| 132,765 | text/html |
| 132,621 | text/plain-charset=utf-8 |
| 89,085 | text/csv |
| 88,436 | text/css |