How to read YouTube API data in SSIS (Videos, Channels, Playlists)

Introduction

In last few articles we saw how to read data from various Google Services. In this article we will see how to read YouTube API data in SSIS. This blog mainly focus on SSIS approach but steps mentioned to call Google APIs can be useful for any developer regardless which programming language or toolset you use.

 

 

Prerequisites

Before we perform steps listed in this article, you will need to make sure following prerequisites are met:
  1. SSIS designer installed. Sometimes it is referred as BIDS or SSDT (download it from Microsoft site).
  2. Basic knowledge of SSIS package development using Microsoft SQL Server Integration Services.
  3. Make sure ZappySys SSIS PowerPack is installed (download it).
  4. Optional (If you want to Deploy and Schedule ) - Deploy and Schedule SSIS Packages

Connect to YouTube API in SSIS (OAuth Connection)

Very first step to call any Google API is to create an API project for Google Service and then enable APIs you like to call in Console. In our case we need to enable YouTube API.

Check this article to learn how to create API Project and obtain Client ID and Client Secret to connect to YouTube API. For ease of use use Default Application provided by ZappySys but we still recommend using your own Custom Application  on OAuth Connection.

Before we try our first YouTube API example, we recommend you to get familiar with YouTube API here. Once you done with above steps you can perform the following steps to call YouTube API in SSIS.

  1. Open existing SSIS Project or Create new One in Visual Studio (i.e. SSDT)
  2. Create a new SSIS Package. Right click in the Connection Manager Panel and click “New Connection…”.
    SSIS Create New Connection
  3. When Dialog box opens select ZS-OAUTH connection type.
    SSIS Connection Maangers List: Select O-AUTH Connection Manager

    SSIS Connection Managers List: Select O-AUTH Connection Manager

  4. Once OAuth UI opens, Configure Google connection like below.
    1. Select Service Type as Google
    2. You can select Default App for ease of use or Select Custom Application (Enter Client Id and Secret Obtained using these steps )
    3. In the Scope Enter below – One scope per line for calling correct YouTube API. Refer to YouTube API documentation incase you need extra permissions. For read only operation we just need below scopes. Last scope is only needed if you need to use YouTube Analytics Reports
  5. Click Generate Token to obtain Refresh Token (You may see Login Prompt, and then Grant Permission Confirmation Screen)
    Connect to Google API using OAuth 2.0 (Youtube API Example)

    Connect to Google API using OAuth 2.0 (YouTube API Example)

  6. Click OK to save UI.

Call YouTube REST API Example

Now lets look at how to call YouTube API REST API Task this task is more suitable to call GET, POST, DELETE, PUT requests without parsing data in Rows and Columns. If you need to parse data and load into Database table then see next section (Use JSON Source)

  1. From SSIS Toolbox look for items starting with “ZS”. Drag and Drop [ZS Rest API Task] to Designer Surface.
    SSIS REST Api Task - Drag and Drop
  2. Double click it to configure like below.
    1. Select URL From Connection option
    2. Select OAuth connection created in previous section.
    3. Enter URL like below (For example get all playlists). Replace xxxxxxxxxxx with your own channel ID (Its usually found in URL when you visit Youtube Channel Homepage)
    4. Click Test Request Button
  3. Here is how it looks like
    Call Youtube API in SSIS Example (Dynamic URL)

    Call Youtube API in SSIS Example (Dynamic URL)

That’s it we have successfully configured Connection for YouTube API in SSIS. In the next section we will see how to use this connection and read various data from YouTube API.

Read YouTube Playlists in SSIS

Once we done creating OAuth Connection Manager we can move forward to read YouTube data inside Data Flow.

Configure SSIS JSON / REST API Source

  1. Now, Drag and Drop SSIS Data Flow Task from SSIS Toolbox.
    SSIS Data Flow Task - Drag and Drop
  2. Double click on the Data Flow task to see Data Flow designer surface.
  3. From the SSIS toolbox drag and drop JSON Source on the dataflow designer surface.
    SSIS JSON Source - Drag and Drop
  4. Double click JSON Source and enter URL as below (Use variable or Hardcode Channel Id). maxResults controls how many rows you want to get in each response. Rep
  5. Check Use Credentials and select existing OAuth connection or Create New one
  6. Select Array Filter (Items node) or type $.items[*]  as below.
    Get Playlists for YouTube Channel

    Get Playlists for YouTube Channel

  7. That’s it you can now Click Preview to see sample data. In the next section we will configure Pagination to read all records if you have more than 100 rows.

Configure YouTube REST API Pagination

Most REST APIs dont return data in one big response. So you have to make sure you implement looping / pagination to read more records. Luckily ZappySys supports many different Pagination Settings. Here is how to configure YouTube REST API Pagination.

  1. Go to Pagination Tab found on JSON Source and select following 3 settings
  2. Select Pagination Mode as Response Attribute Mode
  3. Enter Next Link as $.nextPageToken
  4. Enter Suffix for Next URL as &pageToken=<%nextlink%>
YouTube REST API Pagination Settings

YouTube REST API Pagination Settings

Read YouTube Playlist Videos in SSIS

So in previous section we saw simple API call to read all playlists from YouTube. Now lets see how to get information about Playlist Videos.

For that everything should remain same as previous section except change URL as below. Where xxxxxxxxxxxxxxxxxxxxx is your Playlist ID obtained from Previous Call.

Get all Videos for YouTube Channel

If you want to get all video name and ID for specific channel then use below URL. Click here to read more. Replace xxxxxxxxxxxxxxxxxxx with your Channel ID.

 

Read YouTube Video Views, Likes, Dislikes (Extended Videos)

Another common request from YouTube API is get Video Information such as Views, Likes, Dislikes, Comment Count etc. For that you can use this api . However the issue is this endpoint requires you to pass Comma separated Ids of Video. You can only pass 25 Ids in one requests so if you have 200 videos for which you need to get information then you have to make 8 requests (25 Ids in each Request). In below screenshot you can see we used Fiddler to debug web requests.

So here is how high level process to fetch extended information for all Videos in your channel. Full SSIS Package is attached here (SSIS 2012 Format).

  1. Get all videos for your channel by calling (Assuming you will paginate in below request.
  2. Build Video Id groups for 25 Ids from above result and submit requests. For example if above request returns 200 Videos then your request to fetch extended information may look like below.
  3. Parse response coming from each Request Above…. Each Response will give you details about 25 videos you requesting.

Here is the flow of our complete data flow to read extended information for all videos. You can read extended information one by one too but it wont be good idea if you have many videos.

We have used Following Components to achieve Bulk Mode

Get YouTube Video View Clicks, Likes, Dislikes (Bulk Mode for all videos)

Get YouTube Video View Clicks, Likes, Dislikes (Bulk Mode for all videos)

Read data from YouTube Analytics / Reporting API (Metrics, Dimensions Report)

So far we have seen basic YouTube APIs to extract your Channel information. However there is another very powerful API endpoint to query custom reports from YouTube. Check this YouTube Analytics API endpoint. In this section we will learn how to extract useful information from YouTube using Analytics API.

Step-By-Step

Let’s look at an example to read total views, estimated watch time, average view duration and some other metrics by date for channel you have access.

  1. Go to Data Flow, Drag ZS JSON Source from SSIS Toolbox
  2. Double click to configure.
  3. Check Use Credentials. Select OAuth connection we created in previous section.
  4. In the URL field, Enter Below Sample URL. Change Parameters as per your need or leave it default.
    For example:
    ids=channel==MINE   (where MINE can be replaced by other channel id or keep it as MINE to get data for your own channel)
    metrics (can be changed from any valid list here)
    dimensions (can be changed to any valid list from here)
    startDate (Must be YYYY-MM-DD format)
    endDate (Must be YYYY-MM-DD format)
  5. Change Filter to $.rows[*]
  6. Go to 2D Array Transform tab.
    Select Method as Simple 2-dimensional array EnterColumn Filter as  $.columnHeaders[*].name
    YouTube Analytics API Advanced Transformation

    YouTube Analytics API Advanced Transformation

  7. That’s it now Preview your data. See below over all steps.
    Read data from YouTube Analytics API (Custom Reports using Dimensions, Metrics Query)

    Read data from YouTube Analytics API (Custom Reports using Dimensions, Metrics Query)

Build YouTube Analytics / Reporting API Query (API Playground)

With Analytics API You can extract many interesting data. You can use Test Tool found on this page to build some interesting queries. Scroll to Common Use Cases section on that link. On the right side you can select Credentials Type as OAuth and click Execute.

https://developers.google.com/youtube/analytics/reference/reports/query

YouTube Analytics / Reporting API Explorer (Testing API Calls)

YouTube Analytics / Reporting API Explorer (Testing API Calls)

Loading YouTube API Data into SQL Server

Now let's look at how to load data into target like SQL Server, Oracle or Flat File. In below example we will see loading data into SQL Server database but steps may remain same for other targets which can be accessed using OLEDB Drivers (e.g. Oracle).
  1. Inside Data Flow, Drag OLEDB Destination from SSIS Toolbox
  2. Connect our Source component to OLEDB Destination
  3. Double click OLEDB Destination to configure it
  4. Select Target Connection or click NEW to create new connection
    Configure SSIS OLEDB Destination Connection - Loading data (REST / SOAP / JSON / XML /CSV) into SQL Server or other target using SSIS

    Configure SSIS OLEDB Destination Connection - Loading data (REST / SOAP / JSON / XML /CSV) into SQL Server or other target using SSIS

  5. Select Target Table or click NEW to create new table based on source columns
  6. Click on Mappings Tab to Auto map columns by name. You can change mappings as you need
    Create NEW Table SSIS OLEDB Destination

    Create NEW Table SSIS OLEDB Destination

  7. Click OK to Save OLEDB Destination Settings
  8. That's it, You are now ready to run data flow. NOTE: If you wish to debug data flow and see records when you run, add data viewer by right click on blue arrow > Click Enable Data Viewer
  9. To execute data flow, Right click anywhere inside Data Flow Surface and click Execute Task
 

Download Sample SSIS Package

SSIS Package is attached here (SSIS 2012 Format).

Conclusion

In this article we saw how to extract information from REST API such as YouTube REST API using OAuth 2.0. We also learned techniques like Pagination, JSON Parsing, Request Batching using multiple components to achieve full scale API integration. To explore many other scenarios not discussed in this article download SSIS PowerPack from here (includes 70+ Components).

Posted in Google API, REST API, REST API Integration, SSIS CSV Generator Transform, SSIS JSON Parser Transform, SSIS JSON Source (File/REST), SSIS OAuth Connection, SSIS REST API Task, SSIS WEB API Destination and tagged , , , , .