Read Twitter data in SSIS using REST API Call (OAuth/JSON Source)

Introduction

In this article you will learn how to read Twitter data in SSIS using SSIS JSON Source and SSIS REST API Web Service Task. You will also learn about the OAuth 2.0 protocol, which simplifies REST API access.

Twitter REST API Authentication

In order to fetch any data from Twitter using OAuth REST API calls you have to obtain two keys: a Consumer Key (like a user ID) and a Consumer Secret (like a password). There are mainly two methods to read data from Twitter: Application-user authentication and Application-only authentication. Use the method that matches the type of information you want to pull from Twitter. The most common method is Application-user authentication. To read more about the other methods, see Twitter's developer documentation on authentication.

Method 1 – Read Twitter data in SSIS using Application-user authentication

In this method you need a Twitter user account to connect to the Twitter API. Once the first-time authorization is done you don't have to re-authenticate. Certain types of API calls are only allowed with this method (such as posting a new tweet through the API). For any API call in Twitter, the very first step is to create an OAuth App (i.e. register a new App in the Twitter developer portal).

Using Default OAuth App Created by ZappySys

To make your life easy, ZappySys provides a default App for certain OAuth providers (e.g. Google, Twitter). If you decide to create your own app for whatever reason, check the next section on how to register a Twitter OAuth Application. The screenshot below shows how to set up a Twitter OAuth connection using the Default App. Once you click Generate Token, you are prompted to log in with your Twitter account, and then you can grant permission.

Step-By-Step

  1. Download and install the SSIS PowerPack FREE Trial.
  2. Create a new SSIS Project and add a new package.
  3. Open the package. Go to Control Flow. Drag and drop a Data Flow Task from the SSIS Toolbox.
  4. Go to the Data Flow designer. Drag and drop the ZS JSON Source component.
  5. Enter the Twitter API URL you want to call (see the screenshot below).
  6. Check the Use Credentials option.
  7. Select new ZS-OAuth connection from the dropdown.
  8. When the new connection dialog box pops up, select Twitter from the providers dropdown. Select the Default App option for now.
  9. Click Generate Token. It may ask you to log in with your Twitter account credentials. If prompted, click Approve.
  10. If the Twitter OAuth grant is approved, you will see both Access Token and Access Token Secret populated.
  11. Click Test Connection and, if it works, click OK to close the connection dialog.
  12. On JSON Source you can now click Preview to see data from Twitter.
  13. You can optionally specify a Filter expression (click the Select Filter button) to select results from a specific JSON node. For example, if you want to select all hashtags used inside each tweet, select a filter like $.entities.hashtags[*]
  14. Click OK to close the UI.
  15. Drag ZS Trash Destination from the Toolbox (or use OLEDB Destination if you want to load the data into SQL Server).
  16. Connect JSON Source to ZS Trash Destination.
  17. Execute the SSIS package.

Call Twitter REST API in SSIS – using OAuth Connection (Default App)
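
To make the effect of a filter such as $.entities.hashtags[*] more concrete, here is a rough Python sketch of the same idea outside SSIS. It is only an illustration: the sample response is hypothetical, trimmed-down data, and the helper name select_hashtags and the output column names are assumptions made for this example, not part of the JSON Source component.

```python
import json

# Hypothetical, trimmed-down sample of a timeline response (not real Twitter data).
sample_response = json.loads("""
[
  {"id": 1, "text": "Loading #JSON with #SSIS",
   "entities": {"hashtags": [{"text": "JSON"}, {"text": "SSIS"}]}},
  {"id": 2, "text": "No tags here", "entities": {"hashtags": []}}
]
""")


def select_hashtags(tweets):
    """Return one row per hashtag, similar in spirit to $.entities.hashtags[*]."""
    rows = []
    for tweet in tweets:
        # Tweets without an "entities" or "hashtags" node simply produce no rows.
        for tag in tweet.get("entities", {}).get("hashtags", []):
            rows.append({"tweet_id": tweet["id"], "hashtag": tag["text"]})
    return rows


hashtag_rows = select_hashtags(sample_response)
for row in hashtag_rows:
    print(row)
```

Each hashtag becomes its own row, which is why a tweet with two hashtags yields two rows and a tweet with none yields no rows.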

Using Custom OAuth App Created by you

If for some reason you don't want to use the Default Twitter App, you can register a Custom App. It requires a few extra steps, listed in the next section, but it won't take more than a few minutes. With the Custom App option you have to specify the Client Id and Client Secret (see the screenshot above). Once you click Generate Token, you are prompted to log in with your Twitter account, and then you can grant permission.

Register/Create OAuth Application for twitter API Access

Now let's look at how to register an OAuth Application for Twitter API access.

Go to https://apps.twitter.com/app/new and create a new app by providing the necessary information. Don't get confused by the term App; it's just Twitter's way of creating multiple API access keys so you can grant different levels of access to different users. Once the app is created you will be taken to a page where your Consumer Key and Consumer Secret are listed.

Create Twitter Application – REST API Access for Developer

Obtaining Consumer Key and Consumer Secret from Twitter

Once the app is created, go to the Keys and Access Tokens tab. Here you will find the Consumer Key and Consumer Secret.
The Consumer Key can be public, but the Consumer Secret must not be shared (think of it like a password).

Obtain Twitter API Consumer Key and Consumer Secret – REST API Access for Developer

Method 2 – Read Twitter data in SSIS – Application-only authentication

If for some reason you don't want to use a Twitter user account to access data (e.g. when giving a consultant access to your company Twitter account via API), use the Application-only method. In this method you don't authorize the application through a login form.

Here is the basic flow to access Twitter data.

  1. Get a Bearer Access Token by calling the https://api.twitter.com/oauth2/token service (use the POST method and pass the BASE64-encoded Consumer Key and Consumer Secret).
  2. Call any Twitter service by passing the Bearer Access Token retrieved in the previous step.

NOTE: All access to Twitter services is over HTTPS, so the tokens passed along with each request are encrypted before they go over the wire, unless someone knows how to hack SSL :)

Application Only Authentication using REST API task – Get Access Token

Once the SSIS package is created, the first step to access any Twitter data is to get an Access Token. As you can see in the screenshot below, we call the https://api.twitter.com/oauth2/token service URL with the POST method. Notice how we supply POST data and 2 headers. The Authorization header contains the BASE64-encoded value of YourConsumerKey:YourConsumerSecret

SSIS Twitter REST API Get Token
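
For readers who want to see what this token request looks like outside the REST API Task, the following is a minimal Python sketch that only builds the request (it does not send it). Some details are assumptions, since they are visible only in the screenshot: the POST body grant_type=client_credentials and the Content-Type value follow Twitter's published application-only flow as I understand it, and the URL-encoding of the key and secret before joining them is likewise taken from that documentation. The function names are hypothetical.

```python
import base64
import urllib.parse
import urllib.request

TOKEN_URL = "https://api.twitter.com/oauth2/token"


def build_basic_credentials(consumer_key, consumer_secret):
    """BASE64-encode 'key:secret' for the Authorization header."""
    # Assumption: key and secret are URL-encoded before joining, as described
    # in Twitter's application-only documentation.
    key = urllib.parse.quote(consumer_key, safe="")
    secret = urllib.parse.quote(consumer_secret, safe="")
    raw = f"{key}:{secret}".encode("ascii")
    return base64.b64encode(raw).decode("ascii")


def build_token_request(consumer_key, consumer_secret):
    """Build (but do not send) the POST request that asks for a bearer token."""
    credentials = build_basic_credentials(consumer_key, consumer_secret)
    return urllib.request.Request(
        TOKEN_URL,
        data=b"grant_type=client_credentials",  # assumed POST body
        headers={
            "Authorization": "Basic " + credentials,
            "Content-Type": "application/x-www-form-urlencoded;charset=UTF-8",
        },
        method="POST",
    )


request = build_token_request("YourConsumerKey", "YourConsumerSecret")
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

Passing the request to urllib.request.urlopen with real credentials would be the step that actually returns the JSON containing the access token; it is left out here so the sketch runs without network access or secrets.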

Fetch data from Twitter using JSON Source – Denormalize nested JSON

Once we have the Authentication Token we are ready to pull Twitter data using JSON Source. The screenshot below shows how we supply the token in the Authorization header. JSON Source can make your JSON look like a normal table (it also denormalizes nested JSON into a flat dataset). If you want to extract a subset of the JSON, simply specify a JSON Path expression, e.g. $.data.users[*]

Read JSON response of Twitter REST API in SSIS
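
As a rough illustration of the two ideas above (passing the bearer token in the Authorization header and flattening nested JSON into table-like rows), here is a small Python sketch. It is not how JSON Source is implemented; the helper names, the sample record, and the dotted column naming (user.id, user.screen_name) are all assumptions chosen for this example.

```python
def bearer_headers(access_token):
    """Headers for an API call that uses the token from the previous step."""
    return {"Authorization": "Bearer " + access_token}


def flatten(record, prefix=""):
    """Denormalize nested dictionaries into one flat row.

    Nested keys are joined with a dot (an assumed naming scheme for this sketch).
    """
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical sample record, not real Twitter data.
sample_tweet = {
    "id": 1,
    "text": "Hello",
    "user": {"id": 42, "screen_name": "example_user"},
}

flat_row = flatten(sample_tweet)
print(bearer_headers("YourAccessToken"))
print(flat_row)
```

The flattened row has one column per leaf value, which is the shape a relational destination expects.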

Load Twitter JSON data to SQL Server

You can easily connect your SSIS JSON Source to an OLEDB Destination if you want to load Twitter data into an RDBMS such as SQL Server or MySQL.

Handling paging of large REST API result set with twitter data – looping/cursoring

Most REST APIs limit the total data sent in a single response, so if you wish to get all records you have to loop through multiple responses. Twitter provides a looping mechanism using cursors; see Twitter's documentation on cursoring to read more.

In our case SSIS JSON Source supports paging very well, so we are covered. To loop through multiple result sets of Twitter data, simply configure the following 3 properties. See the screenshot below.

Twitter REST API – Paging Example -Loop through resultset using cursor
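
The cursor loop itself can be sketched in a few lines of Python. This is a generic illustration, not the JSON Source implementation. It assumes Twitter's usual cursoring convention, in which the first request uses cursor -1, each response carries a next_cursor value, and a next_cursor of 0 means there are no more pages. The fetch function here is a fake that returns canned pages so the sketch runs without network access; in practice it would call the API with the cursor as a query parameter.

```python
def iterate_cursor_pages(fetch_page, first_cursor=-1):
    """Yield pages until the API reports next_cursor == 0."""
    cursor = first_cursor
    while cursor != 0:
        page = fetch_page(cursor)
        yield page
        # A missing next_cursor is treated as "no more pages".
        cursor = page.get("next_cursor", 0)


# Hypothetical canned responses standing in for real API calls.
FAKE_PAGES = {
    -1: {"ids": [1, 2], "next_cursor": 100},
    100: {"ids": [3, 4], "next_cursor": 200},
    200: {"ids": [5], "next_cursor": 0},
}


def fake_fetch_page(cursor):
    return FAKE_PAGES[cursor]


all_ids = [
    item
    for page in iterate_cursor_pages(fake_fetch_page)
    for item in page["ids"]
]
print(all_ids)
```

Keeping the loop separate from the fetch function makes it easy to add a delay or retry logic around each request without touching the paging logic.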

Word of caution about too many requests

Twitter does not allow you to request too much data too quickly, so be careful how many requests you make :). Check their official page on Twitter rate limits for the REST API. If you are paginating, you can add a delay after each request. Go to the Throttling tab of JSON/XML Source and change the settings there based on the API restriction. For example, if the API allows only 30 requests per minute, adding a 2-second delay after each request ensures you won't exceed 30 requests in 1 minute.
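
The arithmetic behind that delay is simply the window length divided by the number of allowed requests. A small Python sketch of the calculation and of a throttled loop follows; the function names are hypothetical, and the sleep function is injectable only so the example can be exercised without actually waiting.

```python
import time


def min_delay_seconds(max_requests, window_seconds=60):
    """Smallest delay between requests that stays within the limit."""
    if max_requests <= 0:
        raise ValueError("max_requests must be positive")
    return window_seconds / max_requests


def throttled(calls, delay_seconds, sleep=time.sleep):
    """Run each call in order, pausing between consecutive calls."""
    results = []
    for index, call in enumerate(calls):
        if index > 0:
            sleep(delay_seconds)  # wait before every call except the first
        results.append(call())
    return results


delay = min_delay_seconds(30)  # 30 requests per 60 seconds
print(delay)  # 2.0
```

In practice the time each request takes adds to the spacing, so a delay calculated this way errs on the safe side.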

To learn more about rate limit of Twitter API check this table

Handling Twitter API Date Format (Parse Twitter Date/Time)

Twitter returns dates in the following format, e.g. Fri May 03 15:22:09 +0000 2013. To parse this into a correct date/time datatype, use the Date/Time Handling tab of JSON/XML Source.

Change Custom Date Format as below and preview Twitter data.

Parsing Twitter API date/time format
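
For comparison, the same date string can be parsed in Python with a custom format. This is an illustration only: the format string below uses Python's strptime directives, which are not the same syntax as the Custom Date Format setting in JSON/XML Source, and the day and month names are matched using the current locale (assumed here to be English).

```python
from datetime import datetime

# Python strptime pattern for values like "Fri May 03 15:22:09 +0000 2013".
TWITTER_DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_twitter_date(value):
    """Parse a Twitter created_at string into a timezone-aware datetime."""
    return datetime.strptime(value, TWITTER_DATE_FORMAT)


created_at = parse_twitter_date("Fri May 03 15:22:09 +0000 2013")
print(created_at.isoformat())  # 2013-05-03T15:22:09+00:00
```

Because the string carries a UTC offset (+0000), the result is timezone-aware and can be converted to local time before loading if needed.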

Download related files

Below is sample SSIS package
Twitter Demo SSIS 2012

Conclusion

You have now seen how easy it is to access Twitter data with OAuth using SSIS JSON Source and SSIS REST API Web Service Task. We have also seen how to loop through a large result set using the built-in paging support of JSON Source.
