Anomaly Search Configuration

To create a new Anomaly Search

  1. In the top-level menu, click Configure, then Anomaly Searches.
  2. Select New Search. A tabbed dialog opens.

Search Details

This tab allows you to configure the general search settings.

Field - Description
Search name - Unique name for the Anomaly Search. Must not contain whitespace or non-standard characters. Required.
Display name - Friendly name for display. Choose something that describes the analysis, e.g. website traffic monitor.
Search group - Groups the results of different searches on the same system so they can be viewed together, e.g. security, CRM, messaging. A search group may contain one or more searches, but a search may belong to only one search group.
Search

Contains the Splunk search string used as input to anomaly detection. This can be any Splunk search, but do not include time bounds, e.g. index=_internal

Click on Validate search to view the first 100 results in a separate browser tab.

Action menu link

Provide a link to a custom dashboard or external website.

Label - Text that will be displayed in the action menu

URL - URL containing tokens for string substitution. A token can be any top-level result field, wrapped in dollar signs.

e.g. /app/search/mydashboard?id=$host$&earliest=$earliest$&latest=$latest$

e.g. www.myintranet.com/hr?ref=$user$
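
The token substitution described above can be pictured with a short Python sketch. It is an illustration only, not the product's implementation: the fill_url helper, the sample result values and the choice to URL-encode each value are all assumptions made for the example.

```python
import re
from urllib.parse import quote

# A token is a field name wrapped in dollar signs, e.g. $host$
TOKEN = re.compile(r"\$(\w+)\$")


def fill_url(template, result):
    """Replace each $field$ token with the value of that result field.

    Hypothetical helper: values are URL-encoded (an assumption), and
    tokens with no matching field are left untouched.
    """
    def replace(match):
        field = match.group(1)
        if field not in result:
            return match.group(0)
        return quote(str(result[field]), safe="")

    return TOKEN.sub(replace, template)


# Made-up result row for illustration
result = {"host": "web-01", "earliest": 1400000000, "latest": 1400003600}
template = "/app/search/mydashboard?id=$host$&earliest=$earliest$&latest=$latest$"
print(fill_url(template, result))
# /app/search/mydashboard?id=web-01&earliest=1400000000&latest=1400003600
```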

Analysis Configuration

This tab defines which anomaly detection analysis is to be performed.

Field - Description
bucketSpan - Sets the time window over which time series analysis is performed. Typically between 5 minutes and 1 hour, although this will vary depending on the data. When setting the bucket span, take into account the granularity at which you want to analyze, the typical duration of an anomaly, the frequency of the input data and the frequency at which alerting is required. This setting may only be changed for new and reset jobs.
Field analysis configuration

An anomaly search must have one or more detectors defined.

Detector - The analysis to be performed e.g. sum(bytes) over clientip

Desc - A friendly name for the analysis e.g. Data exfiltration

Sourcetype - Default is blank. An advanced configuration option that allows different analyses to be performed on different sourcetypes at the same time.

Click on Add Detector to add more detectors.
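
To make the relationship between bucketSpan and a detector such as sum(bytes) over clientip more concrete, the following Python sketch groups events into fixed-width buckets and sums bytes per client in each bucket. It only shows the kind of aggregated series such a detector would be looking at, assuming a 5-minute bucketSpan. It does not model the anomaly detection itself, and the bucket_sums helper and the sample events are invented for the example.

```python
from collections import defaultdict

BUCKET_SPAN = 300  # seconds; assumes a 5-minute bucketSpan


def bucket_sums(events, bucket_span=BUCKET_SPAN):
    """Sum `bytes` per (bucket start time, clientip)."""
    totals = defaultdict(int)
    for event in events:
        # Round the event time down to the start of its bucket
        bucket_start = event["_time"] - event["_time"] % bucket_span
        totals[(bucket_start, event["clientip"])] += event["bytes"]
    return dict(totals)


# Made-up events with epoch-second timestamps
events = [
    {"_time": 1000, "clientip": "10.0.0.1", "bytes": 200},
    {"_time": 1100, "clientip": "10.0.0.1", "bytes": 300},
    {"_time": 1250, "clientip": "10.0.0.2", "bytes": 50},
    {"_time": 1300, "clientip": "10.0.0.1", "bytes": 900},
]

for (bucket_start, clientip), total in sorted(bucket_sums(events).items()):
    print(bucket_start, clientip, total)
```

A shorter bucket span gives finer-grained series but fewer events per bucket, which is one reason the choice depends on the frequency of the input data.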

Influencers

Sets the persons or entities to blame for the anomaly. There can be one or more influencer types, for example user and clientip. Setting an Influencer is strongly recommended as most results views were designed around the influencer.

An influencer can be part of the detector configuration or another relevant field in the input data.

For example, if analyzing total data sent by client machines, you may additionally add user as an influencer. This will highlight any user that significantly contributed to the anomaly, even though the data was not specifically modeled by user.

If multiple detectors are configured, the influencer severity will be an aggregated view of their anomalousness.

Click on Add by/over/partition fields to influencer to quickly add all detector fields.
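
As a rough illustration of what this shortcut collects, the Python sketch below picks out the field names that follow the by and over keywords in each detector string. It is a simplification under stated assumptions: detector_fields and suggest_influencers are hypothetical helpers, the second detector string is an invented example, and partition fields are left out because their syntax is not shown here.

```python
def detector_fields(detector):
    """Return the field names that follow `by` or `over` in a detector string."""
    tokens = detector.split()
    fields = []
    # Walk adjacent token pairs: a keyword followed by its field name
    for keyword, value in zip(tokens, tokens[1:]):
        if keyword.lower() in ("by", "over") and value not in fields:
            fields.append(value)
    return fields


def suggest_influencers(detectors):
    """Collect the by/over fields of all detectors, without duplicates."""
    influencers = []
    for detector in detectors:
        for field in detector_fields(detector):
            if field not in influencers:
                influencers.append(field)
    return influencers


print(suggest_influencers([
    "sum(bytes) over clientip",        # detector from the text above
    "count by status over clientip",   # invented detector for illustration
]))
# ['clientip', 'status']
```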

Run Mode

Sets the modes in which to run anomaly detection.

Field - Description
Continuous

Select this to run the analysis continuously, analyzing new data as it becomes available.

Note: This is implemented as a scheduled search and does not use the Splunk real-time search functionality.

Run LookBack for - Select the length of time, looking back from today, for which to include training data for analysis. This will happen once, prior to running continuously.

Start running continuously after LookBack - Select this to run ongoing anomaly detection.

Historical

Select this to run the analysis as a one-off batch, analyzing over a fixed time range.

Earliest - Set the start time for analysis

Latest - Set the end time for analysis

Click Calculate time span to automatically populate Earliest and Latest based on the data defined by the Search string. This may take some time for larger Splunk indexes.

StatsReduce mode

Recommended for performance, especially if you have multiple indexers and expect your search to return a large number of events.

Not supported for use with categorization or sourcetype-specific detector configurations. Read more on Using StatsReduce in distributed environments.

Advanced Settings

It is unlikely that these advanced settings will need to be changed. Please contact support@prelert.com before changing them.

Field - Description
persisteIndex - The KV store collection in which model state is persisted. Defaults to prelertstate.
quantileIndex - The KV store collection in which anomaly quantiles, which are used for normalization, are persisted. Defaults to prelertquantiles.
maxSearchBuckets - The maximum number of buckets to be retrieved before the current time. Increasing this value for historical analysis can improve performance, but incurs extra memory overhead. The default value depends on the bucketSpan; for a 5-minute bucketSpan, it is 12.
bufferSpan - For use when running continuously. Sets the time delay, in seconds, between the current time and the latest search time. This parameter may need to be increased if Splunk takes a long time to index the data. It can also be used to stagger job start times if required. Default is 120.
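
The arithmetic behind the last two settings can be sketched in Python. This is only an illustration using the defaults quoted above (bufferSpan of 120 seconds, and maxSearchBuckets of 12 for a 5-minute bucketSpan). The function names are hypothetical, and how the product combines these values internally is not specified here.

```python
from datetime import datetime, timedelta, timezone


def latest_search_time(now, buffer_span_s=120):
    """Latest search time: the current time minus the bufferSpan delay."""
    return now - timedelta(seconds=buffer_span_s)


def max_search_span(bucket_span_s=300, max_search_buckets=12):
    """Total time covered by maxSearchBuckets buckets of bucketSpan seconds."""
    return timedelta(seconds=bucket_span_s * max_search_buckets)


# Made-up current time for illustration
now = datetime(2015, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(latest_search_time(now).isoformat())  # 2015-01-01T11:58:00+00:00
print(max_search_span())                    # 1:00:00
```

With these defaults, 12 buckets of 5 minutes cover one hour of data, and the most recent two minutes are held back to allow for indexing delay.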