Categorization

Data for analysis may be unstructured or semi-structured. For example, the log messages below carry useful information in the message text. Each line contains a different message, but some messages are very similar without being identical, and such messages belong to the same category. You would expect three categories for the four messages below, because the middle two belong to the same category.

2015-04-30 10:00:00 EST INFO Server started on localhost
2015-04-30 10:00:01 EST INFO User BOB connection attempt accepted
2015-04-30 10:04:15 EST INFO User FRED connection attempt accepted
2015-04-30 10:05:24 EST ERROR User MARY connection failed: Invalid password

In this example, it is desirable to analyze records by the category of the log message rather than by the exact message value. An analysis that detects anomalous rates for a category of messages, or that detects rare categories, could then reveal useful insight into a system’s operation.
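To make the idea of a rare category concrete, here is a minimal Python sketch. It assumes each record has already been given a category label, and it flags categories whose share of all records falls at or below a fixed threshold. The function name `rare_categories`, the `max_share` parameter, and the labels are hypothetical and chosen for illustration; a fixed threshold is a deliberate simplification and is not how the product's anomaly detection works.

```python
from collections import Counter


def rare_categories(categories, max_share=0.05):
    """Return the category labels whose share of all records is <= max_share.

    Illustration only: a fixed share threshold stands in for real
    anomaly detection, which this sketch does not attempt to reproduce.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    # An empty input yields an empty Counter, so no division takes place.
    return sorted(c for c, n in counts.items() if n / total <= max_share)


# Hypothetical labels: 97 accepted connections, 2 server starts, 1 failure.
labels = (
    ["connection accepted"] * 97
    + ["server started"] * 2
    + ["connection failed"]
)
rare = rare_categories(labels)
```

With these made-up labels, the two low-volume categories are flagged while the dominant one is not.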

Categorization uses a proprietary algorithm to “categorize” unstructured events. It automatically derives categories and assigns each data record to the most suitable one, prior to performing anomaly detection.
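The actual algorithm is proprietary and is not described here. Purely to illustrate what deriving categories from similar messages can look like, the Python sketch below masks tokens that look variable (all-uppercase words such as user names, and tokens containing digits) and groups messages that share the resulting template. The helper names `template` and `categorize`, the `LEVELS` set, and the masking heuristic are all assumptions made for this example, not the product's method.

```python
from collections import defaultdict

# Assumption: these log-level keywords are kept as-is rather than masked.
LEVELS = {"DEBUG", "INFO", "WARN", "ERROR"}


def template(line):
    """Reduce a log line to a template by masking variable-looking tokens."""
    # Assumption: every line starts with "date time timezone", which is dropped.
    tokens = line.split()[3:]
    masked = []
    for tok in tokens:
        if tok in LEVELS:
            masked.append(tok)
        elif tok.isupper() or any(ch.isdigit() for ch in tok):
            masked.append("*")  # likely a user name, ID, or number
        else:
            masked.append(tok)
    return " ".join(masked)


def categorize(lines):
    """Group log lines by template; each distinct template is one category."""
    groups = defaultdict(list)
    for line in lines:
        groups[template(line)].append(line)
    return dict(groups)


sample = [
    "2015-04-30 10:00:00 EST INFO Server started on localhost",
    "2015-04-30 10:00:01 EST INFO User BOB connection attempt accepted",
    "2015-04-30 10:04:15 EST INFO User FRED connection attempt accepted",
    "2015-04-30 10:05:24 EST ERROR User MARY connection failed: Invalid password",
]
categories = categorize(sample)
```

Applied to messages like the four samples above, this heuristic yields three groups, with the BOB and FRED messages sharing one template. Real log data is far less regular, which is why a fixed masking rule like this one is only a teaching aid.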

Syntax

Categorization is applied during the analysis. The field to be categorized should be passed in as input.

Input Search:

index=<application_log>

Detector:

rare by prelertcategory [categorizationfield=<message>]
count by prelertcategory [categorizationfield=<message>]

where:

prelertcategory:
 A special field that invokes categorization using the prelertcategorize command.
categorizationfield=<message>:
 Optional. Defines which field to perform categorization on. If omitted, _raw is used.

From here you may create an Anomaly Search for ongoing analysis by clicking on Create Anomaly Search.

See also

Categorization field
