This page explains how to use the variance aggregation function in APL.
The variance
aggregation function in APL calculates the variance of a numeric expression across a set of records. Variance is a statistical measurement that represents the spread of data points in a dataset. It’s useful for understanding how much variation exists in your data. In scenarios such as performance analysis, network traffic monitoring, or anomaly detection, variance
helps identify outliers and patterns by showing how data points deviate from the mean.
If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
Splunk SPL users
In SPL, variance is computed using the stats
command with the var
function, whereas in APL, you can use variance
for the same functionality.
ANSI SQL users
In ANSI SQL, variance is typically calculated using VAR_POP
or VAR_SAMP
. APL provides a simpler approach using the variance
function without needing to specify population or sample.
Expression
: A numeric expression or field for which you want to compute the variance. The expression should evaluate to a numeric data type.The function returns the variance (a numeric value) of the specified expression across the records.
You can use the variance
function to measure the variability of request durations, which helps in identifying performance bottlenecks or anomalies in web services.
Query
Output
variance_req_duration_ms |
---|
1024.5 |
This query calculates the variance of request durations from a dataset of HTTP logs. A high variance indicates greater variability in request durations, potentially signaling performance issues.
You can use the variance
function to measure the variability of request durations, which helps in identifying performance bottlenecks or anomalies in web services.
Query
Output
variance_req_duration_ms |
---|
1024.5 |
This query calculates the variance of request durations from a dataset of HTTP logs. A high variance indicates greater variability in request durations, potentially signaling performance issues.
For OpenTelemetry traces, variance
can be used to measure how much span durations differ across service invocations, helping in performance optimization and anomaly detection.
Query
Output
variance_duration |
---|
1287.3 |
This query computes the variance of span durations across traces, which helps in understanding how consistent the service performance is. A higher variance might indicate unstable or inconsistent performance.
You can use the variance
function on security logs to detect abnormal patterns in request behavior, such as unusual fluctuations in response times, which may point to potential security threats.
Query
Output
status | variance_req_duration_ms |
---|---|
200 | 1534.8 |
404 | 2103.4 |
This query calculates the variance of request durations grouped by HTTP status codes. High variance in certain status codes (e.g., 404 errors) can indicate network or application issues.
stdev
when you need the spread of data in the same units as the original dataset.avg
with variance
to analyze both the central tendency and the spread of data.count
alongside variance
to get a sense of data size relative to variance.percentile
for a more detailed distribution analysis.max
when you are looking for extreme values in addition to variance to detect anomalies.This page explains how to use the variance aggregation function in APL.
The variance
aggregation function in APL calculates the variance of a numeric expression across a set of records. Variance is a statistical measurement that represents the spread of data points in a dataset. It’s useful for understanding how much variation exists in your data. In scenarios such as performance analysis, network traffic monitoring, or anomaly detection, variance
helps identify outliers and patterns by showing how data points deviate from the mean.
If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
Splunk SPL users
In SPL, variance is computed using the stats
command with the var
function, whereas in APL, you can use variance
for the same functionality.
ANSI SQL users
In ANSI SQL, variance is typically calculated using VAR_POP
or VAR_SAMP
. APL provides a simpler approach using the variance
function without needing to specify population or sample.
Expression
: A numeric expression or field for which you want to compute the variance. The expression should evaluate to a numeric data type.The function returns the variance (a numeric value) of the specified expression across the records.
You can use the variance
function to measure the variability of request durations, which helps in identifying performance bottlenecks or anomalies in web services.
Query
Output
variance_req_duration_ms |
---|
1024.5 |
This query calculates the variance of request durations from a dataset of HTTP logs. A high variance indicates greater variability in request durations, potentially signaling performance issues.
You can use the variance
function to measure the variability of request durations, which helps in identifying performance bottlenecks or anomalies in web services.
Query
Output
variance_req_duration_ms |
---|
1024.5 |
This query calculates the variance of request durations from a dataset of HTTP logs. A high variance indicates greater variability in request durations, potentially signaling performance issues.
For OpenTelemetry traces, variance
can be used to measure how much span durations differ across service invocations, helping in performance optimization and anomaly detection.
Query
Output
variance_duration |
---|
1287.3 |
This query computes the variance of span durations across traces, which helps in understanding how consistent the service performance is. A higher variance might indicate unstable or inconsistent performance.
You can use the variance
function on security logs to detect abnormal patterns in request behavior, such as unusual fluctuations in response times, which may point to potential security threats.
Query
Output
status | variance_req_duration_ms |
---|---|
200 | 1534.8 |
404 | 2103.4 |
This query calculates the variance of request durations grouped by HTTP status codes. High variance in certain status codes (e.g., 404 errors) can indicate network or application issues.
stdev
when you need the spread of data in the same units as the original dataset.avg
with variance
to analyze both the central tendency and the spread of data.count
alongside variance
to get a sense of data size relative to variance.percentile
for a more detailed distribution analysis.max
when you are looking for extreme values in addition to variance to detect anomalies.