Let’s say you want a powerful tool optimized to search, monitor and analyze data. Splunk Enterprise is the answer. Splunk can process unstructured, structured and complex multi-line data. Its three main components are data ingestion, data search and data visualization. Features include easy search and navigation, real-time visibility, historical analytics, reports, alerts, dashboards and visualizations. Note: reports are repeatable searches that can be embedded in a dashboard, so a dashboard is simply a collection of reports, typically in visual form. Splunk can also send proactive alerts.
Example dashboard showing the number of B&B properties per neighbourhood in New York City.
To use Splunk, simply write your log data to a log file. Make sure every line in the log file includes a datetime stamp and a set of index-identifying fields. Based on those fields and its input configuration, Splunk adds the log data to a specific index; you don’t specify the index in the log line itself.
An index is a repository for log data. You typically create different indexes for different applications and different types of logs: think of audit logs, application logs and business logs. Splunk transforms incoming data into events, based on their timestamps. Splunk uses indexes to enable flexible searching and fast data retrieval, eventually archiving events according to a user-configurable schedule. Splunk stores everything in flat files, so it doesn’t require any third-party database software running in the background.
Splunk has a three-tier architecture:
- Splunk Forwarder: gets data or log files into Splunk.
- Splunk Indexer: stores your data or log files in a searchable form.
- Splunk Search Head: your user interface for searching.
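On the Search Head you query indexed events with Splunk’s Search Processing Language (SPL). A small sketch, assuming a hypothetical index named application_logs:

```
index=application_logs level=ERROR
| stats count BY host
| sort - count
```

This counts error events per host and lists the noisiest hosts first, the kind of repeatable search that can be saved as a report and placed on a dashboard.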
As data volumes grow, you will need more forwarders and more indexers. Stand-alone indexers are not practical at that point. Instead you set up an indexer cluster, so that indexed data is replicated across multiple indexers. Forwarders and the Search Head then target the indexer cluster, not individual indexers.
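As a hedged sketch of what pointing a forwarder at a cluster can look like: Splunk supports indexer discovery in outputs.conf, where the forwarder asks the cluster manager for the current list of peer indexers. Host names and the secret below are placeholders, and attribute names vary somewhat by Splunk version:

```ini
# outputs.conf on a forwarder (hypothetical hosts and secret)
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
indexerDiscovery = cluster1

[indexer_discovery:cluster1]
master_uri = https://cluster-manager.example.com:8089
pass4SymmKey = <secret>
```

The forwarder never needs a hard-coded list of indexers; adding an indexer to the cluster is enough.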
In addition to the above, you can also send data directly to Splunk via the so-called HTTP Event Collector (HEC). It’s a simple HTTP call. For security you send an authentication token in a header, and you can use SSL/HTTPS. Advantage: events are sent directly to the indexer; you don’t need a forwarder that first translates log files into events and then has the events indexed. Disadvantage: you have to call Splunk yourself, which probably requires a lot more coding.
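A minimal sketch of such a call using only the Python standard library. The host, port and token are placeholders; the /services/collector/event endpoint and the "Splunk &lt;token&gt;" Authorization header follow the standard HEC conventions. The request is only built here, not sent:

```python
# Sketch of an HTTP Event Collector (HEC) call. Host and token are
# placeholders; replace them with your own Splunk instance and HEC token.
import json
import urllib.request

def build_hec_request(host, token, event, index=None):
    """Build (but don't send) an HTTPS request for Splunk HEC."""
    payload = {"event": event}
    if index:
        payload["index"] = index
    return urllib.request.Request(
        url=f"https://{host}:8088/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_hec_request(
    "splunk.example.com",                      # placeholder host
    "00000000-0000-0000-0000-000000000000",    # placeholder token
    {"app": "order-service", "message": "order created"},
    index="app_logs",
)
# urllib.request.urlopen(req) would perform the actual call
```

Note that the extra coding mentioned above quickly adds up once you handle retries, batching and error responses yourself.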
Another option I came across is Node-RED. Node-RED is a visual tool that can be configured to insert log data via one tool. A colleague raised as a disadvantage that everyone can read and manipulate Node-RED, which would make it an insecure solution. That seems unlikely, but I don’t know enough about Node-RED to confirm or refute this statement.
Alternatives to Splunk include:
- Sumologic (SaaS)
- Elastic Stack (open source, but more complex)
- Datadog
- New Relic
Furthermore, it’s interesting to compare Splunk to Azure Monitor. I would say Splunk is interesting in a cloud-native scenario where you start with an on-prem container platform, while Azure Monitor is strongly focused on an Azure cloud scenario. Azure Monitor is also very developer-focused, not aimed at the business. Azure Monitor is good at application monitoring and audit logging. For business logging, you could use tracked properties in Azure. Next you can use Kusto queries to select entities based on tracked properties and answer questions like: “What is the status of my order?” or “Why is my customer data not updated?”. To dive further into the topic, use the following Gartner resource: Splunk vs Azure Monitor: Gartner Peer Insights 2022
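As a hedged sketch of such a Kusto query: the table and column names below depend entirely on how diagnostics are configured; AzureDiagnostics with trackedProperties_* columns is one common shape for Logic Apps tracked properties, and orderId is a hypothetical tracked property:

```kusto
// Hypothetical example: answer "What is the status of my order?"
// Table and column names depend on your diagnostics configuration.
AzureDiagnostics
| where Category == "WorkflowRuntime"
| where trackedProperties_orderId_s == "12345"
| project TimeGenerated, resource_workflowName_s, status_s
| order by TimeGenerated desc
```

The query selects all workflow runs tracked for one order and shows their status over time.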