Bucket spans

What is a bucket span?

When Prelert analyzes data, we use the concept of a bucket to divide up a continuous stream of data into batches for processing. For example, if you were using Prelert to monitor the average response time of a system and received a data point every 10 minutes, using a bucket span of 1 hour means that at the end of each hour we would calculate the average (mean) value of the last hour’s worth of data and compute the anomalousness of that average value compared to previous hours.
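The hourly averaging described above can be sketched in a few lines of Python. This is only an illustration of the bucketing idea, not Prelert's implementation: the function name bucket_means, the timestamps (seconds from an arbitrary start) and the response-time values are all assumptions made for the example.

```python
from collections import defaultdict

def bucket_means(points, bucket_span):
    """Return {bucket_start: mean value} for (timestamp_seconds, value) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # bucket_start -> [sum, count]
    for timestamp, value in points:
        start = timestamp - timestamp % bucket_span  # align to the bucket boundary
        totals[start][0] += value
        totals[start][1] += 1
    return {start: total / n for start, (total, n) in sorted(totals.items())}

# Hypothetical response times (ms): one point every 10 minutes for two hours,
# with a single slow response in the second hour.
points = [(i * 600, 100.0) for i in range(12)]
points[8] = (8 * 600, 400.0)

means = bucket_means(points, bucket_span=3600)
print(means)
```

With a 1 hour bucket span, each bucket holds six data points, and the single 400 ms point raises the mean of the second hour from 100 to 150. It is this per-bucket mean, rather than each individual point, that would be compared against previous hours.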

As you can see from this example, the bucket span has two purposes: it dictates the time span over which to look for anomalous features in the data, and it determines how quickly anomalies can be detected. Choosing a shorter bucket span allows anomalies to be detected more quickly, but at the risk of being too sensitive to natural variations or noise in the input data. Choosing too long a bucket span, however, can mean that interesting anomalies are averaged away.

This plot shows the data points within each bucket:

Hour-long bucket span example

As another example, suppose you are monitoring the order rate of an online store to check for system malfunctions and receive a record for every transaction processed. A bucket span of 5 minutes might be a good choice: it allows for some variability in the order rate, but if orders are unusually low for several minutes, Prelert will generate an anomaly within 5 minutes.

Overlapping buckets

Some of the Prelert analytical functions look for single anomalous data points, e.g. max, which identifies the maximum value seen within a bucket. Others perform some aggregation over the length of the bucket, e.g. mean, which calculates the mean of all the data points seen within the bucket, or count, which calculates the total number of data points within the bucket. Depending on where the bucket boundaries fall in time, the aggregation might smooth out some anomalies. In our earlier example of order rates for an online store, if the processing system malfunctioned and all the transactions were blocked for 4 minutes before being processed in a flurry, the overall count within the 5-minute bucket could appear normal even though the system malfunctioned.

This plot illustrates the data points from the malfunctioning system. The regular bucket boundary is marked by the blue lines, while the overlapping bucket is marked by red.

5 minute overlapping bucket example

To avoid this, you can use overlapping buckets; the Engine API analyzes the data points in two buckets simultaneously, one starting half a bucket span later than the other.
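The following Python sketch reproduces the online store scenario with made-up numbers to show why the second, offset set of buckets helps. It is a simplified illustration, not the Engine API's implementation: the function bucket_counts, the order timestamps and the 30 second order interval are assumptions chosen for the example.

```python
from collections import Counter

def bucket_counts(timestamps, span, offset=0):
    """Count events per bucket, where bucket starts fall on offset + k * span."""
    counts = Counter((t - offset) // span * span + offset for t in timestamps)
    return dict(sorted(counts.items()))

SPAN = 300   # 5 minute bucket span, in seconds
END = 1500   # 25 minutes of hypothetical data

# Hypothetical orders: one every 30 seconds. Orders placed during the
# 4 minute blockage [600, 840) are all processed in a flurry at t = 840.
orders = [840 if 600 <= t < 840 else t for t in range(0, END, 30)]

regular = bucket_counts(orders, SPAN)

# The second set of buckets starts half a bucket span later. Keep only the
# buckets that lie entirely inside the data range.
overlapping = {
    start: n
    for start, n in bucket_counts(orders, SPAN, offset=SPAN // 2).items()
    if start >= 0 and start + SPAN <= END
}

print(regular)
print(overlapping)
```

In this sketch every regular bucket contains 10 orders, so the malfunction is invisible. The offset buckets starting at 450 and 750 seconds contain 5 and 15 orders respectively, which exposes both the blockage and the flurry that followed it.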

Overlapping buckets are only beneficial for aggregating functions, and should not be used for non-aggregating functions.

Interim results

With overlapping buckets, we need to decide which bucket was the most anomalous and avoid reporting more than one anomalous bucket for the same anomaly in the data. In fact, we look at the 3 most recent overlapping buckets before deciding which one was the most anomalous, and this creates a delay of up to 2 full bucket spans in outputting the finalized results.
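As a rough illustration of the selection step, the sketch below picks the highest-scoring bucket out of three consecutive overlapping buckets. The scoring itself is not described here, so the anomaly scores, the bucket start times and the function most_anomalous are hypothetical, and the real selection logic may well differ.

```python
def most_anomalous(recent):
    """Return the (bucket_start, score) pair with the highest score."""
    return max(recent, key=lambda bucket: bucket[1])

# Hypothetical anomaly scores for three consecutive overlapping buckets
# (5 minute bucket span, each bucket starting half a span after the previous one).
recent = [(450, 62.0), (600, 3.0), (750, 71.0)]

winner = most_anomalous(recent)
print(winner)
```

Only the winning bucket would be reported as the anomaly, so a single underlying problem does not produce several anomalous buckets.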

The very latest overlapping buckets are still available to view as interim results, but until we have decided which bucket was the most anomalous, these interim results are liable to change. Once an interim anomaly has appeared, it will not be removed unless it is replaced by a more significant anomaly.