Get data from Google SpreadSheet using SSIS

Introduction

In this post you will learn how to get data from Google SpreadSheet (REST API) using SSIS. We will use drag and drop REST API connectors from SSIS PowerPack. No need to download any SDK or learn programming language (e.g. JAVA, C#, Ruby, Python) when you use SSIS PowerPack Connectors. We will use Google Drive API to get file list and export SpreadSheet as CSV file in few clicks (Using OAuth 2.0 connection in SSIS).

In this tutorial we will use REST API Task to call some ad-hoc API (e.g. get File List from Google Drive) and save output into Variable or File. We will use OAuth connection along with ZappySys JSON Source connector to read data from Google SpreadSheet (Use Drive API) and load into SQL Server (Export Google SpreadSheet to CSV).

Create Google API Project

First step to access any Google API is create an API Project in Google Console. If you don’t want to go through this then Skip this Step-1 and in the next section select Default OAuth App option on OAth Connection Manager (This is the most easiest option for now unless you want to use your own OAuth App).

Check this article for step-by-step instructions. When you follow these instructions make sure you enable Google Drive API (In the article it shows how to enable YouTube API as an example but you will need to enable Drive API for this article).

Create OAuth Connection Manager in SSIS

Once you create Google API project and obtained Client ID and Client Secret your next step is to create OAuth Connection Manager in SSIS. ZappySys OAuth connection manager comes with many predefined OAuth Providers (e.g. Facebook, Twitter, Google etc) but you can also define custom OAuth settings for any OAuth enabled API.

To create SSIS OAuth 2.0 Connection for Google API perform following steps.

  1. Download and Install SSIS PowerPack
  2. Create new SSIS Package
  3. Right click in Connection Manager Area and Click “New Connection..”
  4. When prompted select ZS-OAUTH connection type
  5. On the OAuth Connection Manager Select Provider=Google.
  6. In the Scopes enter or select permission as below. This will allow read only access to Drive Files (E.g. Reading file content / Export file to CSV). You can use Scope browser to see many other available permissions.
    Click Generate Token button.

    SSIS OAuth Connection Manager - Access Google Drive API using OAuth 2.0

    SSIS OAuth Connection Manager – Access Google Drive API using OAuth 2.0

  7. You will see UI as below… Click Accept (You may have to scroll to see that button sometimes)
  8. Click Test to see connection is working.

NOTE: If you don’t want to use Default OAuth App provided by ZappySys then select “Use Custom OAuth App” option (App created in previous section) specify your ClientID, ClientSecret.

Get File List from Google Drive using REST API call

Once OAuth connection is defined you can now call any valid API from Google based on Permission you requested in OAuth Connection. OAuth connection takes care of renewing you Token and passing Authorization as per OAuth standard. All these protocol complexity is hidden from you.

First step to get data from Google SpreadSheet using REST API call is get fileId of the Google Drive File you want to read. The easiest way to know file ID is call following API. Below API call returns name, ID and other information about file(s). You can get only fileID and use that in Next Section to export Google SpreadSheet to CSV format.

OR

NOTE: For more information about searching files in Google drive / getting list of files using search expression language check this link. Make sure expression is URL encoded (e.g. name=’Customers’ should be passed as name%3D’Customers’)

Above API returns response as below. Notice the name and fileId. We will use fileId when calling export API in next section.

If you want to automate retrieval of fileId from document name then perform the following steps.

  1. In the SSIS package Control flow. Drag and drop ZS REST API Task from SSIS toolbox
  2. In the Request Access mode, select “URL from Connection” option
  3. Select OAuth connection manager (Created in Section-2)
  4. Enter URL as below (It will return any file name equals to Customers (%3D is encoded value for = sign)
  5. Click Test (You will see file information)
    Get Google Drive File List in SSIS - Call Google Drive API. Search file by name or expression

    Get Google Drive File List in SSIS – Call Google Drive API to search file by name or expression

  6. If you want to extract only fileID from full JSON then go to response tab
  7. Select Format=JSON, Enter following JSONPath expression to extract only fileId from first record
  8. Check save response content, select SSIS variable where you want to save fileId as below. Now click Test to see preview of response. You will notice now only fileId will be returned from full JSON response.
    Get Google Drive fileid from file name using api call in SSIS

    Get Google Drive fileid from file name using API call in SSIS

  9. Click OK to save task

Export Google SpreadSheet data to File using SSIS (REST API Call)

Once we have Google Drive fileId we can perform following actions to export data into CSV file format. After that you can easily read CSV using SSIS FlatFile Source and load into SQL Server or other target.

Here are the steps to export Google Drive SpreadSheet to CSV file

  1. Just like previous section configure REST API Task
  2. Enter following URL (See we used FileId from SSIS variable. You can hardcode too.
  3. Goto response tab. Check Save response and select “Save to file option”.
  4. Enter file path.
  5. Response character set select utf-8 (This will allow Unicode characters in response)
  6. Click Test to see preview
    Export Google SpreadSheet to CSV file format using SSIS - Call Google Drive API to Export

    Export Google SpreadSheet to CSV file format using SSIS – Call Google Drive API to Export

  7. Click OK to save

 

Load Google SpreadSheet data into SQL Server using SSIS (REST API Call)

Once you export SpreadSheet to CSV file you can easily consume it to load into SQL Server or any Target (e.g. Oracle, MySQL) using Native SSIS FlatFile Source and OLEDB/ADO.net Destination.

NOTE: Using FlatFile Source is not covered in Article. You can find many articles online which shows how to use SSIS FlatFile Source.

Conclusion

Google Drive API provides great way to automate file related functionality. However to call Google API  you have to use SDK / coding approach (e.g. C#, Java, Python, Ruby). Luckily ZappySys SSIS PowerPack provides great way to integrate any Google API call via simple drag and drop approach without coding. Try out yourself see how long it takes to call virtually any REST API.

Posted in Google API, SSIS REST API Web Service Task and tagged , , , , , , , , .