Glossary A-Z

A-E

Anomaly Score - To provide a sensible view of the results, we calculate an Anomaly Score for each bucket time interval. An interval with a high Anomaly Score is significant and requires investigation. The score is a sophisticated aggregation of the anomaly records. The calculation is optimized for high throughput, gracefully ages historical data and improves the signal-to-noise ratio. It adjusts for variations in event rate, takes into account the frequency and the level of anomalous activity, and is adjusted relative to past anomalous behavior. In addition, it is boosted if anomalous activity occurs for related entities, for example if disk IO and CPU are both behaving unusually for a host.

Anomaly Search - This contains the configuration information required for an Anomaly Detection analysis. It may be run “continuously”, analyzing new data as it becomes available in Splunk, or “historically”, analyzing past data over a fixed time range. The results from an Anomaly Search are stored in a Splunk index called prelertresults. Insights are created from these results.

Bucket - This is the window for time series analysis, typically between 5 minutes and 1 hour, although it will vary depending on the data. When setting the bucket span, take into account the granularity at which you want to analyze, the typical duration of an anomaly, the frequency of the input data, and how frequently alerting is required.

Clipboard - This is a place to temporarily store anomalies of interest whilst navigating results. Anomalies held in the Clipboard may be used to create Insights.

Categorization - A data transform that uses a proprietary algorithm to automatically categorize unstructured events.

Detectors - These are the analysis functions to be performed, for example sum(bytes) and count. There can be one or more detectors. Further information can be found in Detector Configuration.
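
As a purely illustrative sketch (the index, sourcetype and field names here are assumptions), an Anomaly Search analyzing web access logs might define the detectors count and sum(bytes), fed by a search string along the lines of:

    index=web sourcetype=access_combined | fields _time, clientip, user, bytes

Here count would model the event rate per bucket and sum(bytes) would model the volume of data transferred per bucket.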

Entity - This is an object that appears in our results. An entity may be an influencer or a by, over or partitionfield.

Evaluation Mode - This runs the analysis using the prelertautodetect Splunk command. It is a useful way to quickly check whether your analysis functions are relevant before creating your Anomaly Searches. There are limitations to Evaluation Mode: results are not persisted, Insights cannot be created, results cannot be viewed in the Entity View, and the data is analyzed in reverse time order. This was previously known as AutoDetect.

excludefrequent - A detector configuration option. If set to true, it will automatically identify and exclude frequently occurring entities that might otherwise dominate the results.

F-J

Individual Analysis - For detectors that use a by field and not an over field. Describes an analysis approach where anomalies are detected relative to an entity’s own past behavior, as opposed to the behavior of the population. Individual analysis models the historical behavior of each entity and therefore has a higher memory requirement. The memory required grows as the number of entities increases, so please ensure you have sufficient resources available. May also be referred to as “temporal analysis”.

Influencers - These are the persons or entities to blame for the anomaly. There can be one or more influencer fields defined in an Anomaly Search, for example user and clientip. An overall anomaly score for each influencer across all detectors is calculated per bucket. Influencers are optional but are strongly recommended.

Insights - An Insight is a collection of anomalies that tell the story of your data. They can be manually or automatically created.

K-O

LookBack - When configuring an Anomaly Search to run historically, you can specify a start and end time for the analysis. This is known as a LookBack. Also, before starting an Anomaly Search that will run continuously, you may wish to train the model. Provided your data can be read using the same search, you can do this by running a LookBack over the preceding x days.

Normalized Probability - We calculate a Normalized Probability for each anomaly record, which is a number between 0 and 100. It is a statistically valid and “friendly” representation of the probability of that record being unusual, with 100 being the most anomalous. It is normalized across the period of the model.

P-T

Partition/By/Over fields - These are methods by which to group your analysis in different ways. Further information can be found in Detector Configuration.

Population Analysis - For detectors that use an over field, or both an over and a by field. Describes an analysis approach where anomalies are detected relative to the behavior of the population, as opposed to an entity’s own past behavior. Population analysis scales massively due to its lower memory requirement, as population behavior is modeled rather than each individual’s. It is often used in conjunction with a partitionfield to segment the modeling into logical groups within the population, e.g. by user_department.
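
As a hedged sketch (the index, sourcetype and field names are assumptions; the exact configuration syntax is described in Detector Configuration), data suited to Population Analysis of users’ data transfer might come from a search such as:

    index=proxy sourcetype=squid | fields _time, user, bytes, user_department

with a detector such as sum(bytes), user as the over field, and user_department as a partitionfield, so that each user’s traffic in a bucket is compared against other users in the same department.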

Search Group - This is a way to group Anomaly Searches. An Anomaly Search may only be a member of one Search Group.

Search String - This is the Splunk search that defines what data is fed into the analysis engine. It is part of the Anomaly Search configuration.
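
As a hedged example (the index and field names are assumptions), the search string is ordinary SPL that returns the fields referenced by the detectors and influencers, for instance:

    index=firewall action=blocked | fields _time, src_ip, dest_ip, bytes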

StatsReduce - Takes advantage of the Splunk architecture to automatically distribute workload and pre-summarize data on indexers.

summarycountfield - A detector configuration option. It should be used if your input data is already pre-summarized, e.g. into a count_per_min field.
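
For example (a hedged sketch; the index, span and field names are assumptions), pre-summarized input could be produced by a search such as:

    index=web sourcetype=access_combined | bin _time span=1m | stats count AS count_per_min BY host

Supplying count_per_min as the summarycountfield indicates that each input event already represents that many raw events.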

Timechart Mode - This runs Anomaly Detection on data from a standard Splunk timechart. If you have existing timecharts that you regularly use as KPIs, try using these timechart commands with Anomaly Detection. From here you can create an Anomaly Search for on-going analysis. This was previously known as QuickMode.
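
For example (hedged; the index, span and split-by field are assumptions), an existing KPI search such as:

    index=web sourcetype=access_combined | timechart span=10m count BY host

could be tried in Timechart Mode first and then turned into an Anomaly Search for on-going analysis.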

U-Z

usenull - A detector configuration option. If set to true, events where the by/over/partition field is null will be modeled; otherwise such events are ignored by default.
