How to Read API data in SSIS and Load into SQL Table

Introduction

If you have need to integrate REST API in SSIS (i.e. fetch JSON, XML, CSV via HTTP call) and wondering how to Read API data in SSIS and Load into SQL DB (i.e. SQL Server, MySQL, Oracle) then you are at the right place.

In this article we will cover simple step by step approach how to achieve API integration in few clicks. For detailed introduction please refer to our previous article.

In next few sections we will cover the following SSIS Components to read data from API like a Table (Parse in Rows / Column). Our instructions will be mostly for JSON Format but concepts are same for other formats (i.e. XML / CSV).

Custom SSIS Components - Json Source JSON Source Connector (Read from REST API, JSON File or OData Service): Use this dataflow component when you have to fetch data from REST API webservice like a table. This component allows you to extract JSON data from webservice and de-normalize nested structure so you can save to Relational database such as SQL Server or any other target (Oracle, FlatFile, Excel, MySQL). This component also supports reading local JSON files or direct JSON string (Wildcard pattern supported too e.g. c:\data\file*.json).
Custom SSIS Components - XML Source (Read File/SOAP/REST Web Service) XML Source Connector (SOAP, File, REST) : Use this dataflow component when you have to fetch data from XML or SOAP webservice and consume data like a table. This component allows you to extract data from webservice and save to SQL Server or any other target (Oracle, FlatFile, Excel, MySQL). This component also supports reading local XML files or direct XML string.
Custom SSIS Components - CSV Source (Read File/REST API) CSV Source Connector (File, REST) : Use this dataflow component when you have to fetch data from File or REST API which returns data in CSV format . This component allows you to extract data from webservice and save to SQL Server or any other target (Oracle, FlatFile, Excel, MySQL). This component also supports reading local CSV files or direct CSV string.

There are few more API components not discussed in this articles. They are REST API Task (no Parser – Just raw API calls) and Web API Destination (Write / Send Data To API / Delete rows). They both have different use case to check them see it fits in your need.

Prerequisites

Before we perform steps listed in this article, you will need to make sure following prerequisites are met:
  1. SSIS designer installed. Sometimes it is referred as BIDS or SSDT (download it from Microsoft site).
  2. Basic knowledge of SSIS package development using Microsoft SQL Server Integration Services.
  3. Make sure ZappySys SSIS PowerPack is installed (download it).
  4. Optional (If you want to Deploy and Schedule ) - Deploy and Schedule SSIS Packages

Video Tutorial

Step-By-Step – Read API data in SSIS using JSON / XML / CSV Source

In this section you will learn how to use JSON Source Adapter to Read data from API URL.

  1. Open Visual Studio and Create New SSIS Package Project.
  2. Now, Drag and Drop SSIS Data Flow Task from SSIS Toolbox.
    SSIS Data Flow Task - Drag and Drop
  3. Double click on the Data Flow task to see Data Flow designer surface.
  4. From the SSIS toolbox drag and drop JSON Source on the dataflow designer surface. If your data format is XML then use XML Source, if your data format is CSV then use CSV Source.
    SSIS JSON Source - Drag and Drop
  5. Double click JSON Source to configure it.
  6. From the Access Mode dropdown select [File path or web Url] and paste the following Url for this example .
  7. Now next step is to select Filter (Click Browse button next to Array Filter) or just type below expression in the text box to define which Hierarchy you like to Flatten (i.e. Array).
    JSON Source Select Filter

    JSON Source Select Filter

  8. JSON / XML can have many Arrays in a single document (like many Tables inside one Database). So you have to define which branch you like to extract.  Enter JSONPath expression in the Filter textbox to extract only specific part of JSON file as below ($.value[*] will get content of value property from JSON document. Value property is an array of JSON documents so we have to use [*] to indicate we want all records of that array)
    Read JSON File data from Web Url Example in SSIS
  9. Click preview to see our settings are OK.
  10. Click OK to save settings.
  11. How to pass credentials to service (Basic Authorization Header) Check this article for more information

After this section we will see how to connect Source to Target like SQL Server and Load API data into SQL Server Table.

Load API data into SQL Server Table / Other Target

Now let's look at how to load data into target like SQL Server, Oracle or Flat File. In below example we will see loading data into SQL Server database but steps may remain same for other targets which can be accessed using OLEDB Drivers (e.g. Oracle).
  1. Inside Data Flow, Drag OLEDB Destination from SSIS Toolbox
  2. Connect our Source component to OLEDB Destination
  3. Double click OLEDB Destination to configure it
  4. Select Target Connection or click NEW to create new connection
    Configure SSIS OLEDB Destination Connection - Loading data (REST / SOAP / JSON / XML /CSV) into SQL Server or other target using SSIS

    Configure SSIS OLEDB Destination Connection - Loading data (REST / SOAP / JSON / XML /CSV) into SQL Server or other target using SSIS

  5. Select Target Table or click NEW to create new table based on source columns
  6. Click on Mappings Tab to Auto map columns by name. You can change mappings as you need
    Create NEW Table SSIS OLEDB Destination

    Create NEW Table SSIS OLEDB Destination

  7. Click OK to Save OLEDB Destination Settings
  8. That's it, You are now ready to run data flow. NOTE: If you wish to debug data flow and see records when you run, add data viewer by right click on blue arrow > Click Enable Data Viewer
  9. To execute data flow, Right click anywhere inside Data Flow Surface and click Execute Task
 

API Authentication

In our step by step example we did not cover Authentication to keep things simple but in real world you have to pass some sort of authentication details to make API calls (e.g. UserID/Password or Secure Token). There are many ways to authenticate with API Services. ZappySys offer mainly 2 Connection Managers to cover most Authentication scenarios.

  1. HTTP Connection Manager
  2. OAuth Connection Manager

Using HTTP Connection Manager

If you have use case of using UserID / Password or any other method supported by HTTP Connection Manager then use below option (Below example is for Basic Authentication)

SSIS JSON Source - Passing Basic Credentials (Supply UserID / Password) using HTTP Connection

SSIS JSON Source – Passing Basic Credentials (Supply UserID / Password) using HTTP Connection

Using OAuth Connection Manager

If your API support OAuth Standard for Authorization then use OAuth Connection Type as below.

Generate Token for Zoom API using OAuth 2.0

Generate Token for Zoom API using OAuth 2.0

API Pagination

Pagination is another important concept to understand when you call API using ZappySys Connector. If you ready many records from your API then by default API might not return all rows so you have to paginate until all records are read (Like a Loop in Programming). ZappySys makes it easy to paginate. Check this article to configure many different Pagination Modes

Truncation related error

The most common error you may face when you run an SSIS package is truncation error. During the design time only 300 rows are scanned from a source (a file or a REST API call response) to detect datatypes but at runtime, it is likely you will retrieve far more records. So it is possible that you will get longer strings than initially expected. For detailed instructions on how to fix common metadata related errors read an article "How to handle SSIS errors (truncation, metadata issues)".

Authentication related error

Another frequent error you may get is an authentication error, which happens when you deploy/copy a package to another machine and run it there. Check the paragraph below to see why it happens and how to solve this problem.

Things have gone bad: Error handling & debugging

Incidentally, bad things can happen. A remote server may go offline or your server may go out of memory. In any case, you may want to know when that happens and take actions accordingly. For that purpose, you have to redirect bad rows to some other destination. For this example, we will take and use Web API Destination, but basically, you can use any SSIS component:

Handling errors

  1. Add a Derived Column above Web API Destination with expression "(DT_WSTR,4000)ZS_JSON_OUT" and name it "JsonAsString". This will let you see what JSON you are actually passing.
  2. Then add a database or file destination or use another Trash Destination for debugging purposes and redirect the bad rows (red arrow) from Web API Destination into it. Don't forget to set Redirect row option for both, Error and Truncation columns:
    Redirect bad rows from <em>Web API Destination</em> to <em>Trash Destination</em> when load from SQL Server to Elasticsearch is failing. Add derived column JsonAsString to be able to read JSON you are using.

    Redirected failed requests from Web API Destination to a desired destination when loading from SQL Server to REST API Service is failing. Derived Column JsonAsString added to be able to read JSON which was passed to Elasticsearch

  3. Finally, add a Data Viewer for the red path, if you want to debug the flow. You will be able to see URL, JSON and the error message for each record. You may want to copy-paste ErrorMessage to Notepad if you want it to be more readable:
    Use Data Viewer to view HTTP requests that failed to be fulfilled in Elasticsearch

    Use Data Viewer to view HTTP requests that failed to be fulfilled.

NOTE: You can read more about redirecting rows in SSIS Error Handling (Redirect bad rows) article.

Debugging HTTP requests

A common thing you have to do when working with HTTP requests is to debug those requests; e.g. to check what headers, body or URL was passed. To test how things look behind the scenes we strongly suggest to use Fiddler - a popular web debugging tool.

Inside it, you can double-click the URL entry (Right side) to see Request and Response Panels. The top panel is Request (URL, Headers, Body) and Bottom Panel is Response. For https:// (secure URL) make sure you enable HTTPS option in Fiddler (Tools > Options > HTTPS > Check Decrypt https request):

Debugging Web API call using Fiddler in SSIS

Debugging Web API call using Fiddler in SSIS

Conclusion

REST API is becoming more and more popular each day. With that Data integration from RESTful API services going to be challenge. Luckily ZappySys SSIS PowerPack provides a great way to integrate any API in SSIS via simple drag and drop approach without coding. Try SSIS PowerPack for FREE  see how much time / money you can save and to integrate virtually any REST API.

Posted in REST API Integration, SSIS Components, SSIS CSV Source, SSIS JSON Source (File/REST), SSIS XML Source (File / SOAP) and tagged .