Introduction
Please visit this page to see all Pre-Configured ready to use API connectors which you can use in SSIS API Source / SSIS API Destination OR API ODBC Driver (for non-SSIS Apps such as Excel, Power BI, Informatica).
In this tutorial we will use REST API Task to call some ad-hoc API (e.g. get File List from Google Drive) and save output into Variable or File. We will use OAuth connection along with ZappySys JSON Source connector to read data from Google SpreadSheet (Use Drive API) and load into SQL Server (Export Google SpreadSheet to CSV).
Create Google API Project
First step to access any Google API is create an API Project in Google Console. If you don’t want to go through this then Skip this Step-1 and in the next section select Default OAuth App option on OAth Connection Manager (This is the most easiest option for now unless you want to use your own OAuth App).
Check this article for step-by-step instructions. When you follow these instructions make sure you enable Google Drive API (In the article it shows how to enable YouTube API as an example but you will need to enable Drive API for this article).
Create OAuth Connection Manager in SSIS
Once you create Google API project and obtained Client ID and Client Secret your next step is to create OAuth Connection Manager in SSIS. ZappySys OAuth connection manager comes with many predefined OAuth Providers (e.g. Facebook, Twitter, Google etc) but you can also define custom OAuth settings for any OAuth enabled API.
To create SSIS OAuth 2.0 Connection for Google API perform following steps.
- Download and Install SSIS PowerPack
- Create new SSIS Package
- Right click in Connection Manager Area and Click “New Connection”
- When prompted select ZS-OAUTH connection type
- On the OAuth Connection Manager Select Provider=Google Sheets + Drive.
- You can keep “Use Default App” selected or choose Custom App. Click here to learn – how to register Custom Google App
- If you choose the Custom App option then enter the below scopes (or click Select Scopes). If you choose the default app then it’s not required.
This will allow read / write access to Drive Files or Sheet (E.g. Reading file content / Export file to CSV). You can use Scope browser to see many other available permissions.
12345https://www.googleapis.com/auth/drivehttps://www.googleapis.com/auth/drive.filehttps://www.googleapis.com/auth/drive.readonlyhttps://www.googleapis.com/auth/spreadsheetshttps://www.googleapis.com/auth/spreadsheets.readonlyClick Generate Token button.
- You will see UI as below… Click Accept (You may have to scroll to see that button sometimes)
- Click Test to see connection is working.
NOTE: If you don’t want to use Default OAuth App provided by ZappySys then select “Use Custom OAuth App” option (App created in previous section) specify your ClientID, ClientSecret.
Find Google Sheet ID for API call
Before we can read or write Sheet data in SSIS we need to know Google Driver File ID. We will use this ID in next few sections.
Easiest way to find File ID is look at the URL like below. If you need to know ID at runtime dynamically then refer last section.
Reading Google SpreadSheet Data in SSIS
Basically there are two ways you can export / read Google SpreadSheet data.
- Method-1 : Read Google SpreadSheet using SSIS JSON / REST API Source
- In this approach we directly pull data from SpreadSheet using Core Google SpreadSheet.Values API to read particular Tab from Sheet and you can also specify Range. This is most effective way to read. Make sure you have v2.5.0.10807 or higher version (released after Aug 03 2017). New version added 2D array transformation options to parse multi-dimensional arrays found in SpreadSheet API JSON response.
- Method-2 : Read Google SpreadSheet using Flat File Source (as CSV)
- Another approach is export SpreadSheet as CSV format. Then use Flat File Source to read that data and load into Target (E.g. SQL Server)
Method-1 : Read Google SpreadSheet using SSIS JSON / REST API Source
Now lets look at real steps to read Google SpreadSheet. To get values from specific tab and specific cell-row range you have to use Google SpreadSheet.Values API
In this example we will load below Google Spreadsheet into SQL Server
Step-By-Step : Loading Google SpreadSheet into SQL Server using SSIS
- Drag and drop data flow on the control flow designer surface
- Double click data flow and you will see SSIS Toolbox refreshed with new components
- Drag ZS JSON Source from SSIS Toolbox onto Data flow designer surface
- Double click JSON SOurce and configure below setting
- Enter your Spreadsheet URL in the following format
1https://sheets.googleapis.com/v4/spreadsheets/{your-file-id}/values/{cell-range}
Here is the actual URL we will use for demo
1https://sheets.googleapis.com/v4/spreadsheets/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms/values/'Class Data'!A2:F
You can also enter like below (If you omit Tab name then first table is used)
12345https://sheets.googleapis.com/v4/spreadsheets/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms/values/A1:F--OR-- Use below URL for to supply Column names manually (Start from A2)https://sheets.googleapis.com/v4/spreadsheets/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms/values/A2:F
1https://sheets.googleapis.com/v4/spreadsheets/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms/values/A1:F101 - Check Use credentials and select OAuth connection we created in previous section
- Click on Select Filter and select values node and click OK or you can directly enter following filter.
1$.values[*] - Now click on array Transformation tab. Now we will show you two different ways to configure column names.
- Method-1: Auto Detect Column Names (added in version 3.1.2)
In recent version we added column less array transformation where we have option to detect first record as column name.- Select Transform Type as Column less array
- Check option First line has column names like below example (Ignore direct Sample JSON – We provided for demo only. In your case it will be URL)
- Method-2: Manually Enter Column Names (For old version)
- Select Transform simple 2D array option from dropdown
- Check Specify columns list manually
- Enter column names (Must match the order and count based on range you selected)
1Student Name,Gender,Class Level,Home State,Major,Extracurricular Activity
- Now click Preview to see the data
- Click OK to save UI
- Attach your JSON Source to target (e.g. OLEDB Destination for SQL Server Table).
- Configure Destination
- Execute SSIS Package to load Google SpreadSheet data int SQL Server
- Enter your Spreadsheet URL in the following format
Method-2 : Read Google SpreadSheet using Flat File Source (as CSV)
Once we have Google Drive fileId we can perform following actions to export data into CSV file format. After that you can easily read CSV using SSIS FlatFile Source and load into SQL Server or other target.
Export Google SpreadSheet as CSV file using SSIS REST API Task
Here are the steps to export Google Drive SpreadSheet to CSV file
- Just like previous section configure REST API Task
- Enter following URL (See we used FileId from SSIS variable. You can hardcode too.
1https://www.googleapis.com/drive/v3/files/{{User::varFileId}}/export?mimeType=text/csv - Goto response tab. Check Save response and select “Save to file option”.
- Enter file path.
- Response character set select utf-8 (This will allow Unicode characters in response)
- Click Test to see preview
- Click OK to save
Load Google SpreadSheet data into SQL Server using SSIS (REST API Call)
Once you export SpreadSheet to CSV file you can easily consume it to load into SQL Server or any Target (e.g. Oracle, MySQL) using Native SSIS FlatFile Source and OLEDB/ADO.net Destination.
Write data to Google Spread Sheet
Now lets look at how to write to Google SpreadSheet. To write multiple rows in a single request you may use batchUpdate API. Also check this link to learn more.
Below is sample request to write data to columns and some data to rows (set majorDimension = COLUMNS to write vertical and use ROWS to write horizontally. You can mix multiple ways just like below example)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
POST https://sheets.googleapis.com/v4/spreadsheets/1lkMEgu0zm2Q-cnPeoNZp4hqeIuASxXSpsrxtDDV89tg/values:batchUpdate Authorization: Bearer ya29.Gl3WBFb3xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Content-Type: application/json Accept: */* { "valueInputOption": "USER_ENTERED", "data": [ { "range": "Sheet2!A1:A4", "majorDimension": "COLUMNS", "values": [ ["Item", "Wheel", "Door", "Engine"] ] }, { "range": "Sheet2!B1:D2", "majorDimension": "ROWS", "values": [ ["Cost", "Stocked", "Ship Date"], ["$20.50", "4", "3/1/2016"] ] } ] } |
Update Google Sheet Cells – Single Range (Set Values for multiple cells)
Now let’s look at the below example of updating multiple cells in google Sheet. For more information, you can see the example here. For parameter detail review this API documentation.
You can use SSIS REST API Task or Web API Destination to achieve Spread Sheet Update scenario. For simple example lets use REST API Task.
- Drag and Drop ZS REST API Task on the control flow designer
- Edit Task and select URL from Connection Method
- Select OAuth Connection from the dropdown (Same connection we created in the previous section)
- Enter the URL as below. Change Sheet ID, Range designator and API URL Parameters as per your need.
https://sheets.googleapis.com/v4/spreadsheets/[Sheet-ID-Here]/values/[Range-Here]?[Parameters-Here] Example URL (some part masked with xxxxxxxx):
1https://sheets.googleapis.com/v4/spreadsheets/1tuGO3_-2JlSmyiHwX6xxxxxxxxCHrORJc/values/Sheet1!A2:B4?valueInputOption=USER_ENTERED - Change Request Method to PUT
- Change Request Content type to application/json
- Enter Request Body as below (Assuming we need to update 4 cells)
1234567{"majorDimension": "ROWS","values": [["row1_cellA","row1_cellB"],["row2_cellA","row2_cellB"],]} - Click Test Request see it works.
Update Google Sheet Cells – Multiple Ranges
If you like to update values which needs to define multiple ranges then you need to use batchUpdate API. Import changes are you can’t include Range in the URL like previous example of Single Range Update. Also you have to use POST method and body with multiple ranges.
Example of sheet update with multiple range
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 |
POST https://sheets.googleapis.com/v4/spreadsheets/1tuGO3_-2JlSmyiHxxxxxxxxxxxxxxxc/values:batchUpdate?valueInputOption=USER_ENTERED Content-Type: application/json >>>> BODY <<<<< { "data": [ { "range": "Sheet1!A2:B3", "majorDimension": "ROWS", "values": [ [ "row1_cellA", "row1_cellB" ], [ "row2_cellA", "row2_cellB" ] ] }, { "range": "Sheet1!D2:E3", "majorDimension": "ROWS", "values": [ [ "row1_cellD", "row1_cellE" ], [ "row2_cellD", "row2_cellE" ] ] } ] } |
Write SQL Server data to Google Sheet
Now let’s look at real world example. How to write data from SQL Server table or other source and send to Google Sheet.
- Drag and Drop SSIS Data Flow Task from SSIS Toolbox
- Drag source (e.g. OLEDB Source) and configure it to read from source like SQL Server Table
- Drag ZS JSON Generator.
- Connect Source to JSON Generator (blue arrow)
- Double click it to configure
- Select Output Mode as Single Dataset Array option for Mode. Enter Some Batch value (e.g. 500) this will make sure we send 500 rows at a time to Google API call. This way if we have many rows its not rejected by API call.
- Click Add element (Select Static Element) . Name: majorDimension and Value: ROWS
- Now click Add Document Array icon, name as values
- check Treat as 2D array option and click OK to save
- Now under values node click Add elements > Select Multiple Columns and Add columns you like to add.
- Click OK to save JSON GeneratorBelow is just an example How you can use Batch Settings to create JSON with N rows in each JSON document.
- Now drag ZS Web API destination
- Connect JSON Generator to Web API destination (blue arrow)
- Double click it to configure
- Select same OAuth connection manager created in earlier section (Assuming you had all scopes with write permission)
- Now select Input column for Body (JSON Generator Output)
- Enter API call URL as below (Replace ID with your own id)
1https://sheets.googleapis.com/v4/spreadsheets/{your-file-id}/values/Sheet1!A2?valueInputOption=USER_ENTERED - Select Request Method as PUT
- Select content type as application/json
- You can enter sample Body As below.
1234567{"majorDimension": "ROWS","values": [["Bob","bob@mycompany.com",55],["Sam","sam@mycompany.com",57]]} - Click Test Request / Response button. Check your Sheet in browser after it see it updates?
- If you see 200 OK response means we are good to go.
- Run package and see your real data updated in Sheet.
Format Cells using Google Sheet API
Now let’s look at an example to format cell using Google Sheet API (e.g. Change Font, Color, Border, Background). For more information review this API documentation.
To format Sheet you first need to get SheetID (keep in mind this is not same as spreadsheetId which you used in URL). SheetId is basically numeric ID of your Tab in WorkSheet.
Get SheetID
Request
1 |
GET https://sheets.googleapis.com/v4/spreadsheets/1tuGO3_-2JlSmyiHtxxxxxxxxxxxxxx |
Response
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 |
{ "spreadsheetId": "1tuGO3_-2JlSmyiHwX6bRFUWqqrXlt4BRoX8rCHrORJc", "properties": { "title": "Test.xlsx", "locale": "en", "autoRecalc": "ON_CHANGE", "timeZone": "America/Los_Angeles", "defaultFormat": { "backgroundColor": { "red": 1, "green": 1, "blue": 1 }, "padding": { "right": 3, "left": 3 }, "verticalAlignment": "BOTTOM", "wrapStrategy": "OVERFLOW_CELL", "textFormat": { "foregroundColor": {}, "fontFamily": "Calibri", "fontSize": 11, "bold": false, "italic": false, "strikethrough": false, "underline": false } } }, "sheets": [ { "properties": { "sheetId": 1896331083, "title": "Sheet1", "index": 0, "sheetType": "GRID", "gridProperties": { "rowCount": 1000, "columnCount": 26, "rowGroupControlAfter": true, "columnGroupControlAfter": true } } } ], "spreadsheetUrl": "https://docs.google.com/spreadsheets/d/1tuGO3_-2JlSmyiHwX6bRFUWqqrXlt4BRoX8rCHrORJc/edit" } |
Notice that in above response we see “sheetId”: 1896331083. This ID we will use in the Update API to format Cells.
Update Borders Example
Here is the example of Google Sheet Formatting API updateBorders Request
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 |
POST https://sheets.googleapis.com/v4/spreadsheets/1tuGO3_-2JlSxxxxlt4BRoX8rCHrORJc:batchUpdate Content-Type: application/json >>>> BODY <<<<< { "requests": [ { "updateBorders": { "range": { "sheetId": 1896331083, "startRowIndex": 0, "endRowIndex": 10, "startColumnIndex": 0, "endColumnIndex": 6 }, "top": { "style": "DASHED", "width": 1, "color": { "blue": 1.0 }, }, "bottom": { "style": "DASHED", "width": 1, "color": { "blue": 1.0 }, }, "innerHorizontal": { "style": "DASHED", "width": 1, "color": { "blue": 1.0 }, }, } } ] } |
Response
1 2 3 4 5 6 |
{ "spreadsheetId": "1tuGO3_-2JlSmyiHwX6bRFUWqqrXlt4BRoX8rCHrORJc", "replies": [ {} ] } |
Update Cell Font Style (Bold, Style, Color, Background)
Now let’s look at how to use repeateCellRequest to apply certain type of formatting to multiple cells. This may be color, font style, background color.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 |
POST https://sheets.googleapis.com/v4/spreadsheets/1tuGO3xxxxxx4BRoX8rCHrORJc:batchUpdate Content-Type: application/json >>>> BODY <<<<< { "requests": [ { "repeatCell": { "range": { "sheetId": 1896331083, "startRowIndex": 0, "endRowIndex": 1 }, "cell": { "userEnteredFormat": { "backgroundColor": { "red": 0, "green": 0.5, "blue": 0.7, "alpha": 1.3 }, "textFormat": { "foregroundColor": { "red": 1 }, "bold": true, "italic": true, "strikethrough": true, "underline": true } } }, "fields": "userEnteredFormat" } } ] } |
In above example we have applied textFormat and backgroundColor but you can use any of below sections.
“numberFormat”: {object(
NumberFormat
) },“backgroundColor”: { object(
Color
) },“borders”: { object(
Borders
) },“padding”: { object(
Padding
) },“horizontalAlignment”: enum(
HorizontalAlign
),“verticalAlignment”: enum(
VerticalAlign
),“wrapStrategy”: enum(
WrapStrategy
),“textDirection”: enum(
TextDirection
),“textFormat”: { object(
TextFormat
) },“hyperlinkDisplayType”: enum(
HyperlinkDisplayType
),“textRotation”: { object(
TextRotation
) }}
Set Currency Formatting
Here is an example request to set currency formatting
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
POST https://sheets.googleapis.com/v4/spreadsheets/1tuGO3_-2JlSmyiHwX6bRFUWqqrXlt4BRoX8rCHrORJc:batchUpdate Content-Type: application/json >>>> BODY <<<<< { "requests": [ { "repeatCell": { "range": { "sheetId": 1896331083, "startRowIndex": 1, "startColumnIndex": 4, "endColumnIndex": 5 }, "cell": { "userEnteredFormat": { "numberFormat": { "type": "CURRENCY", "pattern": "$#,##0.00" } } }, "fields": "userEnteredFormat.numberFormat" } } ] } |
Set Cell Validation
You can also set cell validation by calling following API call. This will restrict user’s ability to enter wrong data in certain cells.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 |
POST https://sheets.googleapis.com/v4/spreadsheets/1tuGO3_-2JlSmyiHwX6bRFUWqqrXlt4BRoX8rCHrORJc:batchUpdate Content-Type: application/json >>>> BODY <<<<< { "requests": [ { "setDataValidation": { "range": { "sheetId": 1896331083, "startRowIndex": 1, "startColumnIndex": 0, "endColumnIndex": 6 }, "rule": { "condition": { "type": "ONE_OF_LIST", "values": [ { "userEnteredValue": "Red" }, { "userEnteredValue": "Green" }, { "userEnteredValue": "Yellow" } ] }, "inputMessage": "Color must be Red, Yellow or Green", "strict": true, "showCustomUi": true } } } ] } |
Dynamically obtain File ID for Google Drive REST API call
Once OAuth connection is defined you can now call any valid API from Google based on Permission you requested in OAuth Connection. OAuth connection takes care of renewing you Token and passing Authorization as per OAuth standard. All these protocol complexity is hidden from you.
First step to get data from Google SpreadSheet using REST API call is get fileId of the Google Drive File you want to read. The easiest way to know file ID is call following API. Below API call returns name, ID and other information about file(s). You can get only fileID and use that in Next Section to export Google SpreadSheet to CSV format.
1 |
https://www.googleapis.com/drive/v3/files |
OR
1 |
https://www.googleapis.com/drive/v3/files/?q=name%3D'Customers' |
NOTE: For more information about searching files in Google drive / getting list of files using search expression language check this link. Make sure expression is URL encoded (e.g. name=’Customers’ should be passed as name%3D’Customers’)
Above API returns response as below. Notice the name and fileId. We will use fileId when calling export API in next section.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
{ "kind": "drive#fileList", "files": [ { "kind": "drive#file", "id": "1lkMEgu0zm2Q-cnPeoNZp4hqeIuASxXSpsrxtDDV89tg", "name": "Customers", "mimeType": "application/vnd.google-apps.spreadsheet" }, { "kind": "drive#file", "id": "0B6UyXr6ZlIFTc3RhcnRlcl9maWxl", "name": "How to get started with Drive", "mimeType": "application/pdf" } ] } |
If you want to automate retrieval of fileId from document name then perform the following steps.
- In the SSIS package Control flow. Drag and drop ZS REST API Task from SSIS toolbox
- In the Request Access mode, select “URL from Connection” option
- Select OAuth connection manager (Created in Section-2)
- Enter URL as below (It will return any file name equals to Customers (%3D is encoded value for = sign)
1https://www.googleapis.com/drive/v3/files/?q=name%3D'Customers' - Click Test (You will see file information)
- If you want to extract only fileID from full JSON then go to response tab
- Select Format=JSON, Enter following JSONPath expression to extract only fileId from first record
1$.files[0].id - Check save response content, select SSIS variable where you want to save fileId as below. Now click Test to see preview of response. You will notice now only fileId will be returned from full JSON response.
- Click OK to save task
Conclusion
Google Drive API and Google Sheets API provides a great way to automate file related functionality. However, to call Google API you have to use SDK / coding approach (e.g. C#, Java, Python, Ruby). Luckily ZappySys SSIS PowerPack provides a great way to integrate any Google API call via simple drag and drop approach without coding. Try out yourself see how long it takes to call virtually any REST API.