How to read data from Splunk in SSIS

Introduction

Splunk LogoSplunk is commonly used for searching, monitoring, and analyzing machine-generated big data, via a Web-style interface. In this post, you will learn how to implement Splunk API Integration with SQL Server or any other RDBMS (e.g. Oracle, MySQL, Postgresql) using SSIS in a few clicks. We will use SSIS XML Source Connector to Read data from Splunk and Load into SQL Server / other targets (Using HTTP Connection).

We will discuss on How to Create an Intuit Developer Account, How to Create QuickBooks Online App for OAuth, We will also discuss reverse scenario to Write data to Splunk (API POST for Insert or Update in Splunk) using SSIS REST API Task

In nutshell, this post will focus on how to call Splunk API using SSIS.
So let’s get started.

Prerequisites

Before we perform the steps listed in this article, you will need to make sure the following prerequisites are met:
  1. SSIS designer installed. Sometimes it is referred to as BIDS or SSDT (download it from the Microsoft site).
  2. Basic knowledge of SSIS package development using Microsoft SQL Server Integration Services.
  3. Make sure ZappySys SSIS PowerPack is installed (download it, if you haven't already).
  4. (Optional step). Read this article, if you are planning to deploy packages to a server and schedule their execution later.

What is Splunk

Splunk is a software platform to search, analyze and visualize the machine-generated data gathered from the websites, applications, sensors, devices etc. which make up your IT infrastructure and business.

The other benefits of implementing Splunk are:

  • Input data can be in any format for e.g. .csv, or JSON or other formats
  • Give Alerts / Events notification at the onset of a machine state
  • Accurately predict the resources needed for scaling up the infrastructure
  • Create knowledge objects for Operational Intelligence

Why Use Splunk REST API

So there will be a time when you want to automate certain things without using Splunk Web Portal. Here are some example scenarios which can be solved using the Splunk REST API.

  • Reading/Update Splunk configuration files
  • Creating Splunk searches using the REST API
  • Create a new Splunk object for a specific context
  • Edit a Splunk object

Step-By-Step – Import Splunk data into SQL Server

In order to start, we will show several examples. ZappySys includes an SSIS XML Source Connector that will help you to call Splunk API, read search job results data from Splunk with SSIS, create a new Saved Search on Splunk with SSIS and do many more actions using REST API. To learn more about Splunk REST API check this help file.

To get Splunk data using the REST API call, you need to have a Free or Licensed Splunk Account. You can Download Splunk Enterprise or Register with Splunk see this link. You can find the videos for How to Install and Getting Data in on this link: Splunk Enterprise Videos.

Create / Configure Splunk Dataset

So first step in our demo would be make sure we have at least one Splunk Dataset which we can query using REST API. If you have already configured Dataset then skip this section.

  1. First of all, let’s start the Splunk on your machine from the program menu.
    ssis-start-splunk

    Start Splunk

  2. Hence, Splunk opens into the browser. So let’s logged in and let’s Add Data for WinEvents to be searched in our next phase.
    ssis-splunk-add-data-click

    Add Data

  3. To start adding data first, choose a data source. For example, you can monitor WinEvents.
    ssis-splunk-moniter-data-click

    Select Data Source

  4. Select Search & Reporting in App Context and click next for review and click next to finish.
    ssis-splunk-select-search-and-reporting

    Select Search & Reporting

  5. That’s it local event logs input has been created successfully for searching.
    ssis-splunk-start-searching-local-source

    Start searching for local events

  6. Therefore you will redirect to search page.
    ssis-splunk-search-page

    Splunk search page

  7. So, let’s import WinEvents search data into SQL Server in the next few sections.

 

Connect to Splunk in SSIS / Create Search JOB / Obtain JobID

Now once you have Splunk Dataset configured, next step is Call Splunk API. So most common thing you will do with splunk API is to call splunk search API. In splunk Search API is Job Style API means you perform following steps

  1. First call /services/search/jobs/ endpoint with Search query (in POST) to create search Job. It returns you Job ID (also known as SID)
  2. Wait until Job is done (This part may be tricky) … SO you may have to add timer task after first step and in the 3rd step enable Retry option on HTTP connection just incase it didnt finish within supplied delay.
  3. Once Job finished you can read by calling below API endpoint.
    /services/search/jobs/{{your Job Id}}/results?output_mode=json

Thats it now let’s look at actual steps

  1. Now let’s make the first call to search job using a POST method. If you are using a custom time range, pass it in with the POST request. Refer this link for more information: Export data using the Splunk REST API. If your URL is different than below HTTP connection then Check Use direct URL option on REST API Task and enter custom URL.
    ssis-splunk-search-post-call

    Splunk POST Search request

  2. So let’s get Response SID(Search job ID) into the variable to be used in the next call to get results.
    ssis-splunk-get-sid-in-variable

    Store Splunk Response Search Job ID in a variable

  3. Now add the Data Flow Task to retrieve Splunk search result data using JSON Source (REST API or File) in it. In below URL we are specifying count=5 but in real world change it to 100 or higher.
    ssis-splunk-get-results-meta

    Splunk Search JSON Results Data

  4. Furthermore, Let’s set pagination to get all the results data.
    NOTE: In below screenshot we are specifying count=5 and Increment=5 just to show demo but in real world set both items to higher value (e.g. 100) so you request more records per page.
    ssis-splunk-set-pagination

    Set Pagination

  5. Let’s preview the results data.
    Preview Splunk Search Data

    Preview Splunk Search Data

  6. Edit JSON Source again and replace the below URL with it to get results By SID(Search Job ID) variable. And create the new HTTP connection as if the search job(SID) is newly created, it is possible that we will #get 204s(No Content) until the job is ready to respond.
    ssis-splunk-204-handling

    204 No Content response error handling

  7. Furthermore, click on OK button and make sure we need to click on NO.
    ssis-splunk-set-variable-url

    Splunk URL With variable

  8. Finally, we are ready to Load Splunk data into the SQL Server.

Load Splunk data into SQL Server

Now let's look at how to load data into target like SQL Server, Oracle or Flat File. In below example we will see loading data into SQL Server database but steps may remain same for other targets which can be accessed using OLEDB Drivers (e.g. Oracle).
  1. Inside Data Flow, Drag and drop Upsert Destination Component from SSIS Toolbox
  2. Connect our Source component to Upsert Destination
  3. Double click Upsert Destination to configure it
  4. Select Target Connection or click NEW to create new connectionConfigure SSIS Upsert Destination Connection - Loading data (REST / SOAP / JSON / XML /CSV) into SQL Server or other target using SSIS Configure SSIS Upsert Destination Connection - Loading data (REST / SOAP / JSON / XML /CSV) into SQL Server or other target using SSIS
  5. Select Target Table or click NEW to create new table based on source columns
  6. Click on Mappings Tab to Auto map columns by name. You can change mappings as you need SSIS Upsert Destination - Columns Mappings SSIS Upsert Destination - Columns Mappings
  7. Click OK to Save Upsert Destination Settings
  8. That's it, You are now ready to run data flow. NOTE: If you wish to debug data flow and see records when you run, add data viewer by right click on blue arrow > Click Enable Data Viewer
  9. To execute data flow, Right click anywhere inside Data Flow Surface and click Execute Task
 

Conclusion

Above all, in this blog, we learned how to read Splunk data in SSIS. Furthermore we used XML Source Component and SSIS REST API Task to call Splunk REST API and load data into SQL server. You can download SSIS PowerPack here to try many other scenarios not discussed in this blog along with 70+ other components.

References

Finally, you can use the following links for more information about the use of Splunk Online REST API with our tools:

Posted in HTTP Connection, REST API, REST API Integration, SSIS Connection Manager, SSIS JSON Source (File/REST) and tagged , , , , .