Top Splunk Interview Questions with Answers

Question: What is Splunk?

Splunk is a platform that gives you visibility into machine data generated by networks, servers, devices, and other hardware.

Because it analyzes machine data, it can provide insight into application management, threat visibility, compliance, security, and similar areas. A forwarder collects the data from its source and forwards it to an indexer, which stores it either locally on a host machine or in the cloud.

The search head then searches, analyzes, and visualizes the data stored in the indexer, and performs various other functions on it.

Question: What Are the Components Of Splunk?

The main components of Splunk are Forwarders, Indexers and Search Heads. A Deployment Server (or Management Console Host) comes into the picture in larger environments.

A deployment server works much like an antivirus policy server: you set up Exceptions and Groups so that a different set of data collection policies can be mapped to Windows-based, Linux-based, or Solaris-based servers.

Together, these make up the four important components of Splunk:

Indexer – Indexes the machine data
Forwarder – A Splunk instance that forwards data to the remote indexers
Search Head – Provides the GUI for searching
Deployment Server – Manages Splunk components such as indexers, forwarders, and search heads in the computing environment

Question: What are alerts in Splunk?

An alert is an action that a saved search triggers at regular intervals over a set time range, based on the results of the search.

When an alert is triggered, one or more actions follow. For instance, an email can be sent to a predefined list of people when the search triggers.

There are three types of alerts:

1. Per-result alerts: The most commonly used alert type. They run in real time over an all-time span and are triggered whenever the search returns a result.

2. Scheduled alerts: The second most common type. They evaluate the results of a historical search that runs over a set time range on a regular schedule. You can define the time range, the schedule, and the trigger condition for the alert.

3. Rolling-window alerts: A hybrid of per-result and scheduled alerts. Like per-result alerts, they are based on a real-time search, but they do not trigger each time the search returns a matching result. Instead, they examine all events that pass through a rolling time window in real time and trigger when the events in that window meet a specific condition, much as a scheduled alert triggers on a scheduled search.
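The difference between triggering on every result and triggering on a rolling window can be sketched in Python. This is a toy simulation rather than Splunk's alerting engine: the function names, the 60-second window, and the threshold of 3 events are all assumptions chosen for illustration.

```python
from collections import deque


def per_result_triggers(event_times):
    """Trigger once for every matching event, as soon as it arrives."""
    return list(event_times)


def rolling_window_triggers(event_times, window_seconds=60, threshold=3):
    """Trigger only when at least `threshold` matching events fall inside
    the trailing `window_seconds` window (both values are hypothetical)."""
    window = deque()
    triggers = []
    for t in sorted(event_times):
        window.append(t)
        # Drop events that have slid out of the rolling window.
        while window and window[0] <= t - window_seconds:
            window.popleft()
        if len(window) >= threshold:
            triggers.append(t)
    return triggers


# Timestamps (in seconds) of events that match the alert's search.
matching_events = [0, 10, 20, 200, 210, 400]
print(per_result_triggers(matching_events))      # one trigger per event
print(rolling_window_triggers(matching_events))  # [20]
```

Only the burst of three events in the first 20 seconds satisfies the window condition, so the rolling-window version triggers once while the other version triggers six times.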

Question: What are the Categories Of SPL Commands?

SPL commands are divided into five categories:
1. Sorting Results – Ordering results and (optionally) limiting the number of results.
2. Filtering Results – Taking a set of events or results and filtering them into a smaller set of results.
3. Grouping Results – Grouping events so you can see patterns.
4. Reporting Results – Taking search results and generating a summary for reporting.
5. Filtering, Modifying and Adding Fields – Filtering out some fields to focus on the ones you need, or modifying or adding fields to enrich your results or events.

Question: What Happens If the License Master Is Unreachable?

If the license master is unreachable, it is not possible to search the data; search is blocked once the indexer has been unable to reach the license master for 72 hours. The data coming in to the indexer is not affected, however, and continues to flow into your Splunk deployment.

The indexers continue to index the data as usual, but you will see a warning message at the top of your search head or web UI saying that you have exceeded the indexing volume.

The message tells you either to reduce the amount of data coming in or to buy a higher-capacity license. Basically, the candidate is expected to answer that indexing does not stop; only searching is halted.

Question: Explain ‘license violation’ from Splunk’s perspective?

If you exceed the licensed data limit, you are shown a ‘license violation’ error. The license warning that is raised persists for 14 days. With a commercial license you can have 5 warnings within a 30-day rolling window before your indexer’s search results and reports stop triggering. With the free version, however, the limit is only 3 warnings.
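The rolling-window count can be illustrated with a short Python sketch. It is not Splunk's licensing code: the function name `in_violation` is hypothetical, the thresholds are the ones quoted in this answer, and treating the 5th (or 3rd) warning as the one that tips the license into violation is an assumption.

```python
from datetime import date, timedelta


def in_violation(warning_dates, today, max_warnings=5, window_days=30):
    """Return True when the warnings inside the rolling window reach the limit.

    max_warnings=5 models the commercial license described above;
    pass max_warnings=3 to model the free version.
    """
    window_start = today - timedelta(days=window_days)
    # Only warnings raised inside the trailing window count.
    recent = [d for d in warning_dates if window_start < d <= today]
    return len(recent) >= max_warnings


today = date(2024, 3, 31)
warnings = [date(2024, 3, d) for d in (2, 5, 9, 14, 20)]
print(in_violation(warnings, today))                       # True
print(in_violation(warnings[:3], today))                   # False
print(in_violation(warnings[:3], today, max_warnings=3))   # True
```

Warnings older than the window simply drop out of the count, which is why the window is described as rolling.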

Question: What Are Splunk Buckets? Explain the Bucket Lifecycle?

A Splunk bucket is a directory that contains indexed data, holding the events from a certain period. The bucket lifecycle includes the following stages:

Hot – Contains newly indexed data and is open for writing. Each index has one or more hot buckets.

Warm – Data rolled from hot

Cold – Data rolled from warm

Frozen – Data rolled from cold. The indexer deletes frozen data by default, but users can also archive it.

Thawed – Data restored from an archive. If you archive frozen data, you can later return it to the index by thawing (defrosting) it.
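The lifecycle above behaves like a small state machine, which can be sketched in Python. The names `BucketState`, `roll`, and `thaw` are hypothetical and exist only for this illustration; they are not Splunk APIs, and the conditions that cause a real bucket to roll are not modelled.

```python
from enum import Enum


class BucketState(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"
    FROZEN = "frozen"
    THAWED = "thawed"


# Each stage rolls to exactly one next stage.
NEXT_STATE = {
    BucketState.HOT: BucketState.WARM,
    BucketState.WARM: BucketState.COLD,
    BucketState.COLD: BucketState.FROZEN,
}


def roll(state):
    """Move a bucket to the next stage of its lifecycle."""
    if state not in NEXT_STATE:
        raise ValueError(f"a {state.value} bucket cannot roll any further")
    return NEXT_STATE[state]


def thaw(state, archived):
    """Restore a frozen bucket; only possible if it was archived, not deleted."""
    if state is not BucketState.FROZEN or not archived:
        raise ValueError("only archived frozen data can be thawed")
    return BucketState.THAWED


state = BucketState.HOT
while state is not BucketState.FROZEN:
    state = roll(state)
print(state.value)                       # frozen
print(thaw(state, archived=True).value)  # thawed
```

The `archived` flag reflects the point made above: frozen data is deleted by default, so thawing is only an option when an archive was kept.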

Question: Explain Data Models and Pivot?

Data models are used to create a structured, hierarchical model of data. They are useful when you have a large amount of unstructured data and want to make use of that information without writing complex search queries.

A few use cases of data models are:

Create Sales Reports: For a sales report, you can easily model the total number of successful purchases and, below that, create a child object containing the list of failed purchases, along with other views.

Set Access Levels: If you want a structured view of users and their various access levels, you can use a data model.

Pivots, on the other hand, give you the flexibility to create front-end views of your results and then pick and choose the most appropriate filter for a better view of those results.

Question: Explain Search Factor (SF) & Replication Factor (RF)?

The search factor determines the number of searchable copies of data maintained by the indexer cluster. Its default value is 2. The replication factor, in the case of an indexer cluster, is the number of copies of data the cluster maintains; in the case of a search head cluster, it is the minimum number of copies of each search artifact the cluster maintains.

A search head cluster has only a Replication Factor, whereas an indexer cluster has both a Search Factor and a Replication Factor.

An important point to note is that the search factor must be less than or equal to the replication factor.
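The constraint between the two factors is easy to express as a check. The following is a minimal Python sketch with a hypothetical function name, not a Splunk API; it only encodes the rule that the search factor cannot exceed the replication factor.

```python
def validate_cluster_factors(replication_factor, search_factor):
    """Check indexer-cluster factors and return the number of
    non-searchable copies the cluster would keep."""
    if replication_factor < 1 or search_factor < 1:
        raise ValueError("both factors must be at least 1")
    if search_factor > replication_factor:
        raise ValueError("search factor must be <= replication factor")
    # Copies that exist for redundancy but are not searchable.
    return replication_factor - search_factor


print(validate_cluster_factors(replication_factor=3, search_factor=2))  # 1
```

With three copies of the data and two of them searchable, one copy is held purely for redundancy.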

Question: What Is File Precedence In Splunk?

File precedence is an important aspect of troubleshooting in Splunk for an administrator, developer, as well as an architect.

All of Splunk’s configurations are written in .conf files. There can be multiple copies present for each of these files, and thus it is important to know the role these files play when a Splunk instance is running or restarted. To determine the priority among copies of a configuration file, Splunk software first determines the directory scheme.

The directory scheme is either a) global or b) app/user.

When the context is global (that is, where there’s no app/user context), directory priority descends in this order:

1. System local directory — highest priority
2. App local directories
3. App default directories
4. System default directory — lowest priority

When the context is app/user, directory priority descends from user to app to system:

1. User directories for current user — highest priority
2. App directories for currently running app (local, followed by default)
3. App directories for all other apps (local, followed by default) — for exported settings only.
4. System directories (local, followed by default) — lowest priority
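The global-context ordering can be sketched as a lookup over layered settings. This Python example is a simplification under stated assumptions: the layer names and the `resolve_setting` and `merge_layers` helpers are hypothetical, each layer is flattened to a single dictionary, and the ordering among multiple apps is ignored.

```python
# Global-context priority, highest first, as listed above.
GLOBAL_ORDER = ["system_local", "app_local", "app_default", "system_default"]


def resolve_setting(key, layers, order=GLOBAL_ORDER):
    """Return (value, layer) from the highest-priority layer that defines key."""
    for layer in order:
        settings = layers.get(layer, {})
        if key in settings:
            return settings[key], layer
    return None, None


def merge_layers(layers, order=GLOBAL_ORDER):
    """Build the effective configuration: apply the lowest priority first
    so that higher-priority layers overwrite it."""
    merged = {}
    for layer in reversed(order):
        merged.update(layers.get(layer, {}))
    return merged


# Hypothetical settings found in copies of the same .conf file.
layers = {
    "system_default": {"maxKBps": 256, "compressed": False},
    "app_default": {"maxKBps": 512},
    "system_local": {"compressed": True},
}

print(resolve_setting("maxKBps", layers))  # (512, 'app_default')
print(merge_layers(layers))                # {'maxKBps': 512, 'compressed': True}
```

For the app/user context, the same functions could be called with a different `order` list that starts with the user directories.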

Question: Difference Between Search Time And Index Time Field Extractions?

Search time field extraction refers to fields that are extracted while a search is being performed, whereas index time field extraction refers to fields that are extracted when the data arrives at the indexer.

You can set up index time field extraction either at the forwarder level or at the indexer level.

Another difference is that fields extracted at search time are not part of the metadata, so they do not consume disk space.

Fields extracted at index time, by contrast, are part of the metadata and therefore consume disk space.
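The storage trade-off can be illustrated with a toy Python sketch. It does not reflect how Splunk stores data internally: the `status=` pattern and the `index_event` and `search_status` helpers are assumptions made up for this example.

```python
import re

# Hypothetical extraction rule for an HTTP status code.
STATUS_PATTERN = re.compile(r"status=(?P<status>\d{3})")


def index_event(raw, extract_at_index_time):
    """Store an event; optionally persist the extracted field alongside it."""
    stored = {"_raw": raw}
    if extract_at_index_time:
        match = STATUS_PATTERN.search(raw)
        if match:
            stored["status"] = match.group("status")  # extra data kept on disk
    return stored


def search_status(stored):
    """Return the status field, extracting it from the raw text if it was not stored."""
    if "status" in stored:
        return stored["status"]
    match = STATUS_PATTERN.search(stored["_raw"])
    return match.group("status") if match else None


raw = "GET /index.html status=404"
at_index = index_event(raw, extract_at_index_time=True)
at_search = index_event(raw, extract_at_index_time=False)

print(at_index)                  # the field is stored with the event
print(at_search)                 # only the raw text is stored
print(search_status(at_search))  # 404, computed when the search runs
```

Both approaches return the same value; they differ in whether the cost is paid in storage when the event is indexed or in computation when it is searched.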

Question: What Is Source Type In Splunk?

Source type is a default field that identifies the data structure of an incoming event. It determines how Splunk Enterprise formats the data during the indexing process.

Source type can be set at the forwarder level so that the indexer can identify different data formats during extraction.

Question: What is Splunk App? What is the difference between Splunk App and Add-on?

A Splunk App is the complete collection of reports, dashboards, alerts, field extractions and lookups.

A Splunk Add-on is a Splunk App minus the visual components such as reports or dashboards. Lookups, field extractions, etc. are examples of what a Splunk Add-on provides.

Question: What is SOS?

SOS stands for Splunk on Splunk. It is a Splunk app that provides a graphical view of your Splunk environment's performance and issues.

It serves the following purposes:

Diagnostic tool to analyze and troubleshoot problems
Examine Splunk environment performance
Solve indexing performance issues
Observe scheduler activities and issues
See the details of scheduler and user-driven search activity
Search, view and compare Splunk configuration files

Question: What Is Splunk Indexer and Explain Its Stages?

The indexer is a Splunk Enterprise component that creates and manages indexes. The main functions of an indexer are:

Indexing incoming data
Searching indexed data

The Splunk indexer has the following stages:

Input : Splunk Enterprise acquires the raw data from various input sources, breaks it into 64K blocks, and assigns metadata keys to each block. These keys include the host, source and source type of the data.

Parsing : Also known as event processing. During this stage, Splunk Enterprise analyzes and transforms the data: it breaks the data stream into events, identifies, parses and sets timestamps, and performs metadata annotation and transformation of the data.

Indexing : In this phase, the parsed events are written to the index on disk, including both the compressed data and the associated index files.

Searching : The ‘Search’ function plays a major role during this phase, as it handles all searching aspects (interactive and scheduled searches, reports, dashboards, alerts) on the indexed data and stores saved searches, events, field extractions and views.

Question: State the Difference Between Stats and Eventstats commands?

Stats – This command produces summary statistics of all existing fields in your search results and stores them as values in new fields.

Eventstats – It is the same as the stats command, except that the aggregation results are added inline to every event, and only if the aggregation is applicable to that event. It computes the requested statistics as stats does, but adds them to the original raw data instead of replacing it.
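The difference in output shape can be mimicked in a few lines of Python. This is an analogy rather than Splunk's implementation: the sample events and the `stats_avg` and `eventstats_avg` helpers are made up for illustration, and they correspond roughly to `stats avg(bytes) by host` and `eventstats avg(bytes) by host`.

```python
from collections import defaultdict

# Hypothetical search results.
events = [
    {"host": "web1", "bytes": 100},
    {"host": "web1", "bytes": 300},
    {"host": "web2", "bytes": 50},
]


def stats_avg(events, field, by):
    """Like stats: replace the events with one summary row per group."""
    groups = defaultdict(list)
    for event in events:
        groups[event[by]].append(event[field])
    return [
        {by: key, f"avg({field})": sum(values) / len(values)}
        for key, values in groups.items()
    ]


def eventstats_avg(events, field, by):
    """Like eventstats: keep every event and add the group average to it."""
    averages = {row[by]: row[f"avg({field})"] for row in stats_avg(events, field, by)}
    return [{**event, f"avg({field})": averages[event[by]]} for event in events]


print(stats_avg(events, "bytes", "host"))
# [{'host': 'web1', 'avg(bytes)': 200.0}, {'host': 'web2', 'avg(bytes)': 50.0}]

for row in eventstats_avg(events, "bytes", "host"):
    print(row)
# {'host': 'web1', 'bytes': 100, 'avg(bytes)': 200.0}
# {'host': 'web1', 'bytes': 300, 'avg(bytes)': 200.0}
# {'host': 'web2', 'bytes': 50, 'avg(bytes)': 50.0}
```

The stats-style function returns two rows for three input events, while the eventstats-style function returns all three events with the average attached, which makes it possible to compare each event against its group.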

