Engine API

Overview

The Engine API is a RESTful interface that enables developers to incorporate our advanced analytics engine into their applications using any language that can communicate over HTTP. Prelert’s Behavioral Analytics uses artificial intelligence in the form of unsupervised machine learning and advanced computational mathematics to process huge volumes of streaming data. It automatically learns the normal behavior patterns represented by the data, then identifies and cross-correlates anomalies.

Data can be streamed from Big Data stores or proprietary databases, or supplied by uploading a file.

Results are provided in JSON format and can be programmatically incorporated into existing systems for analysis alongside operational data. The system is self-learning and automatically models the data without needing to be configured or trained. The components in a typical end-to-end system are as follows:

Engine architecture

The API is modeled around the concept of jobs, where each job encapsulates an analytics task, the configuration for that task and the results of the analytics. A job's configuration is immutable: once the job has been created, its configuration cannot be modified. Each job has a unique ID that must be used in all job-related actions, including uploading data and querying results. Jobs can run in batch mode, where analysis is performed on a fixed data set, or against streamed data, providing real-time anomaly detection.
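The job lifecycle described above (create a job, upload data against its ID, query results by the same ID) can be sketched in Python as follows. This is only an illustration of the flow: the base URL, the endpoint paths and the "description" configuration field are hypothetical placeholders, not taken from the Engine API reference, and the functions build HTTP requests without sending them.

```python
import json
import urllib.request

# ASSUMPTION: the base URL and all endpoint paths below are hypothetical
# placeholders chosen for illustration; consult the API reference for real ones.
BASE_URL = "http://localhost:8080/engine/v1"


def create_job_request(config: dict) -> urllib.request.Request:
    """Build (but do not send) a request that creates a job from a configuration."""
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/jobs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def upload_data_request(job_id: str, payload: bytes) -> urllib.request.Request:
    """Build a request that uploads data to an existing job, addressed by its ID."""
    return urllib.request.Request(
        f"{BASE_URL}/data/{job_id}", data=payload, method="POST"
    )


def results_request(job_id: str) -> urllib.request.Request:
    """Build a request that queries the results of a job, addressed by its ID."""
    return urllib.request.Request(f"{BASE_URL}/results/{job_id}", method="GET")
```

Each request could then be passed to urllib.request.urlopen. Because a job's configuration is immutable, a changed configuration would mean creating a new job rather than updating an existing one.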

Today, the analytics engine routinely processes millions of data points in real time and identifies performance, security and operational anomalies, along with their causes, as they develop, so that they can be acted on before they affect the business.

Typical use cases

Enterprises, government organizations and cloud-based service providers process volumes of machine data every day that are so massive as to make real-time human analysis impossible. Changing behaviors hidden in this data provide the information needed to quickly resolve massive service outages, detect security breaches before they result in the theft of millions of credit records, or identify the next big trend in consumer patterns. Current search and analysis, performance management and cyber security tools are unable to find these anomalies without significant human work in the form of thresholds, rules, signatures and data models.

Advanced anomaly detection techniques learn the normal behavior patterns represented by the data, then identify and cross-correlate anomalies. As a result, performance, security and operational anomalies and their causes can be identified as they develop and acted on before they impact the business.

Whilst anomaly detection is applicable to any type of data, we focus on machine data scenarios. Enterprise application developers, cloud service providers and technology vendors need to harness the power of machine-learning-based anomaly detection to better manage complex online services, detect the earliest signs of advanced security threats and gain insight into the business opportunities and risks represented by changing behaviors hidden in their massive data sets. Here are some real-world examples.

Eliminating noise generated by threshold-based alerts

Modern IT systems are highly instrumented and can generate terabytes of machine data a day. Traditional methods for analyzing this data involve alerting when metric values exceed a known value (static thresholds), or looking for simple statistical deviations (dynamic thresholds).

Setting accurate thresholds for each metric at different times of day is practically impossible. As a result, static thresholds generate large volumes of false positives (when the threshold is set too low) and false negatives (when it is set too high).
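The two traditional approaches can be contrasted with a short Python sketch. It is illustrative only and is not how the Engine API works: the sample values, the window size and the multiplier k are assumptions made for the example, and the "dynamic" rule shown is just one simple form of statistical deviation check.

```python
from statistics import mean, stdev

# ASSUMPTION: made-up response times (ms) with a single spike at index 6.
SAMPLE = [100, 102, 98, 101, 99, 100, 180, 101, 99]


def static_alerts(values, threshold):
    """Return the indices of values that exceed a fixed threshold."""
    return [i for i, v in enumerate(values) if v > threshold]


def dynamic_alerts(values, window=5, k=3.0):
    """Return the indices of values lying more than k standard deviations
    from the mean of the preceding `window` values."""
    alerts = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(values[i] - mu) > k * sigma:
            alerts.append(i)
    return alerts
```

With this sample, a static threshold of 99 flags most of the ordinary readings, a threshold of 200 misses the spike altogether, and only a well-chosen value such as 150 isolates it. The rolling check finds the spike without a hand-picked threshold, but the spike then inflates the statistics of the windows that follow it, which is one reason simple deviation checks are fragile.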

The Engine API automatically learns the historical behavior of each metric and calculates the probability that a given value is anomalous. This enables accurate alerting that highlights only the subset of relevant metrics that have changed. These alerts provide actionable insight into what is a growing mountain of data.
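The idea of scoring a value by its probability given historical behavior can be sketched as below. The normal distribution is a stand-in chosen for brevity, not the model the Engine actually uses, and the function name anomaly_probability is hypothetical.

```python
from statistics import NormalDist, mean, stdev


def anomaly_probability(history, value):
    """Two-sided tail probability of `value` under a normal distribution
    fitted to `history`. Smaller results mean a more unusual value.

    ASSUMPTION: a Gaussian fit is used purely for illustration.
    """
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        # Constant history: anything other than the constant is maximally unusual.
        return 1.0 if value == mu else 0.0
    z = abs(value - mu) / sigma
    return 2.0 * (1.0 - NormalDist().cdf(z))
```

A value equal to the historical mean scores 1.0, and the score falls toward 0 as the value moves further from what has been seen before. Ranking metrics by such a score, rather than comparing each against its own threshold, is what allows only the most unusual changes to be surfaced.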

We have included a set of worked examples that will show how the Engine API can be used for data analysis.

Additional material is available to download from GitHub.

  • Flight Comparison Website Tutorial using cURL - A great place to start, this introductory tutorial provides sample data to analyze web response times using cURL.
  • Application Performance Management Tutorial using Python - An example of using a data stream to analyze performance anomalies in APM data using Python.
  • Using Windows PowerShell Tutorial - If you are a Windows user, this is a quick way to get started with anomaly detection, using Windows PowerShell to analyze sample data.