<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>big data Archives | ZappySys Blog</title>
	<atom:link href="https://zappysys.com/blog/tag/big-data/feed/" rel="self" type="application/rss+xml" />
	<link>https://zappysys.com/blog/tag/big-data/</link>
	<description>SSIS / ODBC Drivers / API Connectors for JSON, XML, Azure, Amazon AWS, Salesforce, MongoDB and more</description>
	<lastBuildDate>Wed, 19 Mar 2025 05:33:17 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.4.4</generator>

<image>
	<url>https://zappysys.com/blog/wp-content/uploads/2023/01/cropped-zappysys-symbol-large-32x32.png</url>
	<title>big data Archives | ZappySys Blog</title>
	<link>https://zappysys.com/blog/tag/big-data/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>How to read / write data in Google BigQuery using SSIS</title>
		<link>https://zappysys.com/blog/get-data-google-bigquery-using-ssis/</link>
		
		<dc:creator><![CDATA[ZappySys]]></dc:creator>
		<pubDate>Sun, 16 Jul 2017 01:10:44 +0000</pubDate>
				<category><![CDATA[Google API]]></category>
		<category><![CDATA[REST API Integration]]></category>
		<category><![CDATA[SSIS JSON Generator Transform]]></category>
		<category><![CDATA[SSIS JSON Source (File/REST)]]></category>
		<category><![CDATA[SSIS WEB API Destination]]></category>
		<category><![CDATA[API Integration]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[fiddler]]></category>
		<category><![CDATA[google api]]></category>
		<category><![CDATA[Google BigQuery]]></category>
		<category><![CDATA[rest api]]></category>
		<category><![CDATA[ssis json source]]></category>
		<category><![CDATA[SSIS PowerPack]]></category>
		<category><![CDATA[ssis web api destination]]></category>
		<guid isPermaLink="false">http://zappysys.com/blog/?p=1563</guid>

					<description><![CDATA[<p>Introduction Google BigQuery is a fully managed Big Data platform to run queries against large scale data. In this article you will learn how to integrate Google BigQuery data into Microsoft SQL Server using SSIS. We will leverage highly flexible JSON based REST API Connector and OAuth Connection to import / export data from Google [&#8230;]</p>
<p>The post <a href="https://zappysys.com/blog/get-data-google-bigquery-using-ssis/">How to read / write data in Google BigQuery using SSIS</a> appeared first on <a href="https://zappysys.com/blog">ZappySys Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2>Introduction</h2>
<div class="su-note"  style="border-color:#e5de9d;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><div class="su-note-inner su-u-clearfix su-u-trim" style="background-color:#FFF8B7;border-color:#ffffff;color:#333333;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><p><strong>UPDATE:</strong> ZappySys has released a brand new <a href="https://zappysys.com/api/integration-hub/google-bigquery-connector/">API Connector for BigQuery Online</a> which makes it much simpler to <strong>Read/Write BigQuery Data in SSIS</strong> compared to the steps listed in this article. You can still use the steps from this article, but if you are new to APIs or want to avoid the learning curve, use the newer approach.</p>
<p>Please visit <a href="https://zappysys.com/api/integration-hub/">this page to see all</a> pre-configured, ready-to-use API connectors which you can use in <a href="https://zappysys.com/products/ssis-powerpack/ssis-api-source/">SSIS API Source</a> / <a href="https://zappysys.com/products/ssis-powerpack/ssis-api-destination/">SSIS API Destination</a> or the <a href="https://zappysys.com/products/odbc-powerpack/odbc-api-driver/">API ODBC Driver</a> (for non-SSIS apps such as Excel, Power BI, Informatica).</p>
</div></div>
<p style="text-align: left;"><a href="//zappysys.com/blog/wp-content/uploads/2017/07/google-big-query-integration-1.png"><img loading="lazy" decoding="async" class="alignleft wp-image-1959" src="//zappysys.com/blog/wp-content/uploads/2017/07/google-big-query-integration-1-150x150.png" alt="Google BigQuery API Integration" width="91" height="91" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/google-big-query-integration-1-150x150.png 150w, https://zappysys.com/blog/wp-content/uploads/2017/07/google-big-query-integration-1.png 257w" sizes="(max-width: 91px) 100vw, 91px" /></a><a href="https://cloud.google.com/bigquery/" target="_blank" rel="noopener">Google BigQuery</a> is a fully managed <strong>Big Data platform</strong> to run queries against large scale data. In this article you will learn how to integrate <strong>Google BigQuery</strong> data into <strong>Microsoft SQL Server</strong> using <strong>SSIS</strong>. We will leverage the highly flexible <a href="https://zappysys.com/products/ssis-powerpack/" target="_blank" rel="noopener">JSON based REST API Connector</a> and <strong>OAuth Connection</strong> to import / export data from the <a href="https://cloud.google.com/bigquery/docs/reference/rest/v2/" target="_blank" rel="noopener">Google BigQuery API</a> in just a few clicks.</p>
<p style="text-align: left;">If you are looking for a similar product inside Amazon AWS Cloud then <a href="https://zappysys.com/blog/import-export-data-amazon-athena-using-ssis/" target="_blank" rel="noopener">check an article about Amazon Athena</a>.</p>
<h2><span id="Prerequisites">Prerequisites</span></h2>
<p>Before we get into the step-by-step instructions for extracting and loading data from BigQuery to SQL Server, let&#8217;s make sure you meet the following requirements:</p>
<ol>
<li>SSIS designer installed. Sometimes it is referred to as BIDS or SSDT (<a href="https://docs.microsoft.com/en-us/sql/ssdt/download-sql-server-data-tools-ssdt" target="_blank" rel="noopener">download it from Microsoft site</a>).</li>
<li>Basic knowledge of SSIS package development using <em>Microsoft SQL Server Integration Services</em>.</li>
<li><a href="//zappysys.com/products/ssis-powerpack/" target="_blank" rel="noopener"><em>ZappySys SSIS PowerPack</em> installed</a>. Click on the link to download a FREE trial.</li>
<li>You have basic familiarity with REST API concepts and Google BigQuery API. This post uses <a href="https://cloud.google.com/bigquery/public-data/" target="_blank" rel="noopener">free Public Dataset</a> to query BigQuery table so <em>no billing</em> is required to get your hands dirty 🙂 However, if you like to query your own dataset then make sure you have at least one BigQuery dataset and one table with some sample data created (this part <em>does require</em> billing enabled). Read a <a href="https://cloud.google.com/bigquery/quickstart-web-ui" target="_blank" rel="noopener">Google Quickstart article</a> for more information on how to create a new BigQuery dataset and a table.</li>
</ol>
<h2>Understanding Google BigQuery Object Hierarchy</h2>
<p>Google BigQuery is organized around the three main concepts below.</p>
<ul style="list-style-type: circle;">
<li>Project
<ul style="list-style-type: circle;">
<li>Dataset
<ul style="list-style-type: circle;">
<li>Table
<ul style="list-style-type: circle;">
<li>Query requests (each query creates a unique JobId, valid for 24 hours, from which you can read data)</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>So in order to create a BigQuery table you always need a Project and then a Dataset under that project; Projects and Datasets are how you group your objects. When you query, you can supply the fully qualified name of your table in the FROM clause (e.g. <pre class="crayon-plain-tag">select count(*) from `bigquery-public-data.usa_names.usa_1910_2013`</pre>  Here <strong>bigquery-public-data</strong> is the project name, <strong>usa_names</strong> is the dataset and <strong>usa_1910_2013</strong> is the table). We will cover the Read operation first and then the Write operation. For the Read operation we will use a public dataset so we can show you a quick demo without too many steps; later we will cover how to automate creating / deleting Datasets / Tables using API calls.</p>
<p>So let&#8217;s get started.</p>
<h2>Read data from Google BigQuery using SSIS</h2>
<p>Basically you can query Google BigQuery data in two ways (in this article we will not cover the 2nd method):</p>
<ul>
<li><strong>Method-1</strong>: Query data using the <a href="https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query" target="_blank" rel="noopener"><em>jobs/query</em> method in the BigQuery API</a>. Use it if you expect to get a result in a fairly short amount of time. This API method generates a temp table which gets deleted after 24 hours. You can read data within that time frame using the newly created <em>JobId</em> reference.<br />
<div class="su-note"  style="border-color:#e5dd9d;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><div class="su-note-inner su-u-clearfix su-u-trim" style="background-color:#FFF7B7;border-color:#ffffff;color:#333333;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><strong>NOTE:</strong> If you have records that sum up to more than 10MB of data or ~10,000 rows (assuming 1 row uses 1KB of data) then you need to proceed with a two-step process as explained below.</div></div></li>
<li><strong>Method-2</strong>: Export SQL query result using <a href="https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/insert" target="_blank" rel="noopener"><em>jobs/insert</em> method in BigQuery API</a>. Use this method if you expect a query to take a long time to finish.</li>
</ul>
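Conceptually, Method-1 boils down to a single authenticated POST. Below is a minimal Python sketch of how such a request could be assembled; the `build_query_request` helper is our own illustration (not part of any SDK or of the SSIS tooling), and the authenticated send is only indicated in a comment:

```python
# Hypothetical sketch of the jobs/query call behind Method-1.
# build_query_request is our own illustrative helper, not part of any SDK.

def build_query_request(project_id, sql, timeout_ms=100000, max_results=10):
    """Return the URL and JSON body for a BigQuery jobs/query POST."""
    url = ("https://www.googleapis.com/bigquery/v2/projects/"
           f"{project_id}/queries")
    body = {
        "timeoutMs": timeout_ms,    # how long the API waits for jobComplete
        "maxResults": max_results,  # rows returned with the first response
        "query": sql,
    }
    # The actual send would be an authenticated POST, for example:
    # requests.post(url, json=body, headers={"Authorization": f"Bearer {token}"})
    return url, body

url, body = build_query_request(
    "my-project",
    "SELECT title FROM [bigquery-public-data:samples.wikipedia] LIMIT 10")
```

The same URL and body shapes appear again in the REST API Task configuration later in this article.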
<p>Now let&#8217;s see how to query Google BigQuery data using SSIS 🙂</p>
<h3>Create Google API Project and Register OAuth App</h3>
<p>To use any <a href="https://developers.google.com/apis-explorer/#p/" target="_blank" rel="noopener">Google API</a> you first have to finish these tasks:</p>
<ol>
<li><a href="//zappysys.com/blog/register-google-oauth-application-get-clientid-clientsecret/" target="_blank" rel="noopener">Create Google API Project</a> and obtain projectId (see next section).</li>
<li><a href="https://console.cloud.google.com/apis/library/bigquery.googleapis.com" target="_blank" rel="noopener">Enable BigQuery API</a></li>
<li>Register your own Google OAuth App and obtain <strong>ClientId</strong> and <strong>Client Secret</strong>.</li>
</ol>
<p>Check step-by-step instructions on <a href="//zappysys.com/blog/register-google-oauth-application-get-clientid-clientsecret/" target="_blank" rel="noopener">how to create Google API Project and create OAuth App for Google</a>. While following those instructions make sure you enable the BigQuery API (screenshots may differ).</p>
<div class="su-note"  style="border-color:#e5de9d;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><div class="su-note-inner su-u-clearfix su-u-trim" style="background-color:#fff8b7;border-color:#ffffff;color:#333333;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;">NOTE: For the BigQuery API you must use the Custom App option on the OAuth connection. Google used to allow the Default App, but recent policy changes require BigQuery to use a Custom App with your own ClientId and Client Secret (some older screenshots may still show the Default App for the time being, so please adjust your settings to use a Custom App).</div></div>
<h3>Get Google API Project ID</h3>
<p>Once you have the Google API Project created you can grab the <a href="https://console.cloud.google.com/project?_ga=1.106484547.1991223081.1500069328">Project ID</a> (see screenshot below). You will need the Project ID in the next step when building the API URL.</p>
<div id="attachment_1567" style="width: 583px" class="wp-caption alignnone"><a href="//zappysys.com/blog/wp-content/uploads/2017/07/how-to-find-google-api-project-id.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1567" class="size-full wp-image-1567" src="//zappysys.com/blog/wp-content/uploads/2017/07/how-to-find-google-api-project-id.png" alt="How to find Google API Project ID" width="573" height="379" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/how-to-find-google-api-project-id.png 573w, https://zappysys.com/blog/wp-content/uploads/2017/07/how-to-find-google-api-project-id-300x198.png 300w" sizes="(max-width: 573px) 100vw, 573px" /></a><p id="caption-attachment-1567" class="wp-caption-text">How to find Google API Project ID?</p></div>
<h3>Configure OAuth Connection for Google BigQuery API</h3>
<p>In order to call most Google APIs you will need an OAuth App. Refer to the <a href="//zappysys.com/blog/register-google-oauth-application-get-clientid-clientsecret/" target="_blank" rel="noopener">how to register Google OAuth Application (Get ClientID and ClientSecret)</a> article to create your own app. The steps below show the <em>Default OAuth App</em> for simplicity, but remember that BigQuery now requires a Custom OAuth App:</p>
<ol style="margin-left: 0;">
<li>Right click inside Connection Managers area and click &#8220;New Connection&#8230;&#8221;</li>
<li>From the connection type list select &#8220;ZS-OAUTH&#8221; connection type.
<div id="attachment_1569" style="width: 687px" class="wp-caption alignnone"><a href="//zappysys.com/blog/wp-content/uploads/2017/07/ssis-oauth-create-new-connection.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1569" class="size-full wp-image-1569" src="//zappysys.com/blog/wp-content/uploads/2017/07/ssis-oauth-create-new-connection.png" alt="Create new SSIS OAuth API Connection Manager" width="677" height="220" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-oauth-create-new-connection.png 677w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-oauth-create-new-connection-300x97.png 300w" sizes="(max-width: 677px) 100vw, 677px" /></a><p id="caption-attachment-1569" class="wp-caption-text">Create new SSIS OAuth API Connection Manager</p></div></li>
<li>For OAuth Provider select &#8220;Google&#8221;.</li>
<li>For the OAuth App type option, select &#8220;Use Custom OAuth App&#8221; if you <a href="//zappysys.com/blog/register-google-oauth-application-get-clientid-clientsecret/" target="_blank" rel="noopener">created a custom OAuth App</a> (required for BigQuery, as noted earlier); otherwise leave &#8220;Use Default OAuth App&#8221; selected.</li>
<li>For Scopes enter or select the following URLs (a URL is just a permission name). Each URL must be on a separate line. For demo purposes we use only the first 3 scopes, but we included a few more in case you would like to test the API in depth with a different permission set:<br />
<pre class="crayon-plain-tag">https://www.googleapis.com/auth/bigquery
https://www.googleapis.com/auth/cloud-platform
https://www.googleapis.com/auth/cloud-platform.read-only
https://www.googleapis.com/auth/bigquery.insertdata
https://www.googleapis.com/auth/devstorage.full_control
https://www.googleapis.com/auth/devstorage.read_only
https://www.googleapis.com/auth/devstorage.read_write</pre>
</li>
<li>Click Generate Token. Log in using the correct account if needed; you will then be prompted to click &#8220;Approve&#8221; to authorize the OAuth App.</li>
<li>Once you click OK on the login screen you will see new tokens populated on the connection screen.</li>
<li>Click Test to make sure the connection / tokens are valid, and then click OK to save the connection.<div class="su-note"  style="border-color:#e5de9d;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><div class="su-note-inner su-u-clearfix su-u-trim" style="background-color:#fff8b7;border-color:#ffffff;color:#333333;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><strong>NOTE:</strong> The screenshot below uses the Default App, but we recommend you use a Custom OAuth App (your own ClientId / Secret, obtained in the previous section).</div></div>
<div id="attachment_1568" style="width: 685px" class="wp-caption alignnone"><a href="//zappysys.com/blog/wp-content/uploads/2017/07/ssis-google-bigquery-integration-oauth-api-connection.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1568" class="size-full wp-image-1568" src="//zappysys.com/blog/wp-content/uploads/2017/07/ssis-google-bigquery-integration-oauth-api-connection.png" alt="Create SSIS OAuth API Connection for Google BigQuery API" width="675" height="654" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-google-bigquery-integration-oauth-api-connection.png 675w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-google-bigquery-integration-oauth-api-connection-300x291.png 300w" sizes="(max-width: 675px) 100vw, 675px" /></a><p id="caption-attachment-1568" class="wp-caption-text">Create SSIS OAuth API Connection for Google BigQuery API</p></div></li>
</ol>
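For readers curious what the OAuth connection does behind the scenes when you click Generate Token, here is a rough Python sketch of building the Google consent URL with the first three scopes above. The `client_id` / `redirect_uri` values are placeholders and the helper name is our own; the connection manager handles this exchange for you:

```python
from urllib.parse import urlencode

# Rough sketch of the consent URL behind "Generate Token". The client_id and
# redirect_uri below are placeholders; the connection manager uses its own.
SCOPES = [
    "https://www.googleapis.com/auth/bigquery",
    "https://www.googleapis.com/auth/cloud-platform",
    "https://www.googleapis.com/auth/cloud-platform.read-only",
]

def consent_url(client_id, redirect_uri):
    """Build the Google OAuth 2.0 authorization URL for the scopes above."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",    # authorization-code flow
        "scope": " ".join(SCOPES),  # scopes go space-separated in one field
        "access_type": "offline",   # also request a refresh token
    }
    return "https://accounts.google.com/o/oauth2/v2/auth?" + urlencode(params)
```

After the user approves, the returned authorization code is exchanged for access and refresh tokens, which is what you see populated on the connection screen.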
<h3>Start BigQuery Job and get JobId (Submit Query)</h3>
<p>Once you have the SSIS OAuth connection created for the BigQuery API it&#8217;s time to read data from BigQuery. Basically, there are two ways you can read BigQuery data: using the <em>query </em>or <em>insert </em>method. For demo purposes we will use the <a href="https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query" target="_blank" rel="noopener"><em>jobs/query</em> method</a>. If you want to fire complex queries which can run for many minutes then refer to the <a href="https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/insert" target="_blank" rel="noopener"><em>jobs/insert</em> method</a>.</p>
<p>If you are expecting more than 10MB of data (~10K records) then you have to split the data extract into the following steps:</p>
<ol>
<li>Step-1 : Call the <em>jobs/query</em> method using the REST API Task to submit a SQL query. This step returns <strong>jobComplete: true</strong> if the supplied query finishes within the timeoutMs parameter you included. By default it is 10 seconds, so if your query is going to take longer than that it is a good idea to set a bigger timeout; that way you do not need <strong>step #2</strong>. Once the API call is done you will get a JobId in the response. Save that JobId in a variable for later use. This job result is valid for 24 hours only.</li>
<li>Step-2 <strong>(Optional)</strong> &#8211; You can add another optional <strong>REST API Task</strong> after the previous step to wait until the Job is completed. For simplicity this article omits the full Status Check setup, but it is very simple. Use the settings below:
<ol>
<li>Drag a new REST API Task from the toolbox. Connect it to Step-1 and double click it to configure as below.</li>
<li>Select the URL From Connection mode.</li>
<li>Select the OAuth connection.</li>
<li>Enter the URL <strong>https://www.googleapis.com/bigquery/v2/projects/{{User::ProjectId}}/jobs/{{User::JobId}}</strong>, Method: <strong>GET</strong>.</li>
<li>On the Response Settings tab, select Response Content Type <strong>JSON</strong> and for the content filter enter <strong>$.status.state</strong>.</li>
<li>On the Status Check tab, check Enable Status Check Loop and enter <strong>DONE</strong> in the SuccessValue field.</li>
<li>This setup makes sure we do not query data until the Job status is DONE (the system keeps checking every 5 seconds).</li>
</ol>
</li>
<li>Step-3 : Read the Job result using JSON Source in a Data Flow Task.</li>
</ol>
<div class="su-note"  style="border-color:#e5dd9d;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><div class="su-note-inner su-u-clearfix su-u-trim" style="background-color:#FFF7B7;border-color:#ffffff;color:#333333;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><strong>NOTE</strong> : If you are expecting less than ~10K rows then you can skip this step and just go to the next section &#8211; <a href="#Read_BigQuery_SQL_result_Method1_or_Method2">Read BigQuery SQL result (Method#1 or Method#2)</a>.</div></div>
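The optional status-check loop from Step-2 can be sketched in a few lines of Python. `fetch_state` is a hypothetical stand-in for the GET call the REST API Task issues against `.../jobs/{jobId}`:

```python
import time

def wait_for_done(fetch_state, poll_seconds=5, max_polls=120):
    """Poll the job resource until $.status.state is DONE (sketch only).

    fetch_state is a hypothetical stand-in for
    GET https://www.googleapis.com/bigquery/v2/projects/{proj}/jobs/{jobId}
    and returns the parsed job resource as a dict.
    """
    for _ in range(max_polls):
        if fetch_state()["status"]["state"] == "DONE":
            return True
        time.sleep(poll_seconds)  # the REST API Task checks every 5 seconds
    return False
```

The REST API Task's Status Check Loop does exactly this comparison of the filtered response value against the SuccessValue you enter.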
<p>Now let&#8217;s see how to configure the REST API Task to submit the query and extract the JobId (which will be used in the next section).</p>
<ol style="margin-left: 0;">
<li>In the Control Flow designer drag and drop <a href="https://zappysys.com/products/ssis-powerpack/ssis-json-file-source/" target="_blank" rel="noopener">ZS REST API Task</a> from the SSIS toolbox.</li>
<li>Double click the task to configure it.</li>
<li>Change Request URL Access Mode to [Url from connection].</li>
<li>From the connection dropdown select OAuth connection created in the previous section.</li>
<li>Enter URL as below (replace YOUR-PROJECT-ID with a valid Project ID obtained in the previous section):<br />
<pre class="crayon-plain-tag">https://www.googleapis.com/bigquery/v2/projects/YOUR-PROJECT-ID/queries</pre>
</li>
<li>Select Method as POST.</li>
<li>Enter the Body (Request Data) as below (change the query if needed; we used a Public dataset for the demo). If you do not specify a timeout then the default is only 10 seconds. Also, in this call we only care about the JobId from the response, so we just set maxResults=10; in the 2nd step we will get all rows by doing pagination:<br />
<pre class="crayon-plain-tag">{
   "timeoutMs": 100000, 
   "maxResults": 10, 
   "query": "SELECT title id,language,wp_namespace,reversion_id ,comment ,num_characters FROM [bigquery-public-data:samples.wikipedia] LIMIT 100000"
}</pre>
To use a <a href="https://cloud.google.com/bigquery/docs/reference/standard-sql/" target="_blank" rel="noopener">Standard SQL</a> query instead of <a href="https://cloud.google.com/bigquery/docs/reference/legacy-sql" target="_blank" rel="noopener">Legacy SQL</a>, add the <strong>useLegacySql</strong> property and set it to <strong>false</strong>:<br />
<pre class="crayon-plain-tag">{
   "timeoutMs": 100000, 
   "maxResults": 10, 
   "query": "SELECT title id,language,wp_namespace,reversion_id ,comment ,num_characters FROM [bigquery-public-data:samples.wikipedia] LIMIT 100000",
   "useLegacySql": false
}</pre>
For all possible parameters refer to <a href="https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query" target="_blank" rel="noopener"><em>jobs/query</em> method documentation</a>.</p>
<div class="su-note"  style="border-color:#e5de9d;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><div class="su-note-inner su-u-clearfix su-u-trim" style="background-color:#FFF8B7;border-color:#ffffff;color:#333333;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><strong>NOTE:</strong> You can also supply the location parameter in the Body JSON to indicate where the job should run. For non-US / EU datacenters we suggest you supply this parameter. See details at <a href="https://cloud.google.com/bigquery/docs/locations#specifying_your_location" target="_blank" rel="noopener">https://cloud.google.com/bigquery/docs/locations#specifying_your_location</a>.<br />
Example use of location:<br />
<pre class="crayon-plain-tag">{ &quot;location&quot;:&quot;us-east1&quot;,&nbsp; &quot;maxResults&quot;: 10, &quot;query&quot;: &quot;SELECT title FROM [bigquery-public-data:samples.wikipedia] LIMIT 10&quot; }</pre>
</div></div></li>
<li>Select Body Content Type as <strong>application/json</strong>.</li>
<li>Click Test Request/Response. If things go well you will see JSON content. Then just <strong>copy the JobId from the response</strong> and save it for later use:<br />
<a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-rest-api-get-jobid.png"><img loading="lazy" decoding="async" class="alignnone size-full wp-image-2213" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-rest-api-get-jobid.png" alt="" width="677" height="488" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-rest-api-get-jobid.png 677w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-rest-api-get-jobid-300x216.png 300w" sizes="(max-width: 677px) 100vw, 677px" /></a></li>
<li>Now go to Response Settings tab.</li>
<li>Select ContentType=Json.</li>
<li>Enter below expression to extract JobId from the response. You will need this Id to read SQL query output (this JobId is valid for 24 hrs):<br />
<pre class="crayon-plain-tag">$.jobReference.jobId</pre>
</li>
<li>Choose to save the response content to a variable. Select the &lt;New Variable&#8230;&gt; option. When prompted, give your variable a name (e.g. vJobId) and for the value field paste the JobId copied in the step above:<br />
<a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-response-save-json-value-variable.png"><img loading="lazy" decoding="async" class="alignnone size-full wp-image-2214" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-response-save-json-value-variable.png" alt="" width="625" height="531" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-response-save-json-value-variable.png 625w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-response-save-json-value-variable-300x255.png 300w" sizes="(max-width: 625px) 100vw, 625px" /></a><br />
<div class="su-note"  style="border-color:#e5dd9d;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><div class="su-note-inner su-u-clearfix su-u-trim" style="background-color:#FFF7B7;border-color:#ffffff;color:#333333;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><strong>NOTE</strong>: This JobId expires after 24 hours. While you are designing a package (not running it), the design-time value is used for testing/previewing, so make sure your JobId is still valid. If needed, click Test Request/Response to grab a new JobId and update the variable with the new value so you can preview data without running the full package.</div></div></li>
<li>Click OK to save the UI.</li>
</ol>
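To see what the `$.jobReference.jobId` filter extracts, here is a Python sketch against a trimmed-down sample response (the jobId and project values are made up for illustration):

```python
import json

# Trimmed-down jobs/query response; the values are made up.
sample_response = json.loads("""
{
  "kind": "bigquery#queryResponse",
  "jobReference": {"projectId": "my-project", "jobId": "job_abc123"},
  "jobComplete": true
}
""")

def extract_job_id(response):
    """Equivalent of the JSONPath filter $.jobReference.jobId."""
    return response["jobReference"]["jobId"]

job_id = extract_job_id(sample_response)  # -> "job_abc123"
```

The REST API Task applies the same path to the live response and stores the result in your vJobId variable.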
<h3>Read BigQuery SQL result (for specified JobID)</h3>
<p>Now let&#8217;s look at how to read the data from BigQuery.</p>
<p>Configure Google BigQuery Web Request (URL, Method, ContentType, Body etc.)</p>
<ol>
<li>In the Control Flow designer drag and drop Data Flow Task from SSIS toolbox.<br />
<img decoding="async" class="figureimage" title="SSIS Data Flow Task - Drag and Drop" src="https://zappysys.com/onlinehelp/ssis-powerpack/scr/images/drag-and-drop-data-flow-task.png" alt="SSIS Data Flow Task - Drag and Drop" /></li>
<li>Double click Data Flow Task and drag and drop <a href="//zappysys.com/products/ssis-powerpack/ssis-json-file-source/" target="_blank" rel="noopener">ZS JSON Source (For API/File)</a> from SSIS toolbox.<br />
<img decoding="async" class="figureimage" title="SSIS JSON Source - Drag and Drop" src="https://zappysys.com/onlinehelp/ssis-powerpack/scr/images/json-source/ssis-json-source-adapter-drag.png" alt="SSIS JSON Source - Drag and Drop" /></li>
<li>Double click the JSON Source to edit and configure it as below.</li>
<li>Enter URL as below. Replace <strong>YOUR-API-PROJECT-ID</strong> with the API Project ID obtained in the <a href="//zappysys.com/blog/wp-content/uploads/2017/07/how-to-find-google-api-project-id.png" target="_blank" rel="noopener">previous section</a>.<br />
<pre class="crayon-plain-tag">https://www.googleapis.com/bigquery/v2/projects/YOUR-API-PROJECT-ID/queries/{{User::vJobId}}?maxResults=10000</pre>
&#8211;OR&#8211; (use the URL below if your Job ran <strong>outside a US / EU data center</strong>)<br />
<pre class="crayon-plain-tag">https://www.googleapis.com/bigquery/v2/projects/YOUR-API-PROJECT-ID/queries/{{User::vJobId}}?location=YOUR_REGION_ID&amp;maxResults=10000</pre>
<div class="su-note"  style="border-color:#e5de9d;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><div class="su-note-inner su-u-clearfix su-u-trim" style="background-color:#FFF8B7;border-color:#ffffff;color:#333333;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><p><strong>NOTE:</strong> The location parameter indicates the geographic location of the job. It is a <span style="text-decoration: underline;"><strong>required parameter except for US and EU</strong></span>. See details at <a href="https://cloud.google.com/bigquery/docs/locations#specifying_your_location">https://cloud.google.com/bigquery/docs/locations#specifying_your_location</a>. You can supply the same location parameter in the first step when you <a href="https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query" target="_blank" rel="noopener">submit the query</a>.</p>
<p>Failure to supply the location parameter for non-US / EU regions may result in a <strong>404 NotFound Error</strong>.<br />
</div></div></li>
<li>Check the <strong>Use Credentials</strong> option and select the <strong>OAuth Connection Manager</strong> created in the previous section.</li>
<li>For HTTP Request Method select <strong>GET</strong>.<br />
<a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-read-data-from-google-bigquery-job-use-json-source.png"><img loading="lazy" decoding="async" class="size-full wp-image-2215" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-read-data-from-google-bigquery-job-use-json-source.png" alt="Read data from Google BigQuery from Temp Job result" width="1004" height="661" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-read-data-from-google-bigquery-job-use-json-source.png 1004w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-read-data-from-google-bigquery-job-use-json-source-300x198.png 300w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-read-data-from-google-bigquery-job-use-json-source-768x506.png 768w" sizes="(max-width: 1004px) 100vw, 1004px" /></a></li>
</ol>
<h4>Configure Filter</h4>
<p>We need to transform the single JSON response into multiple rows, so we need to apply the correct filter.</p>
<ol>
<li>On JSON Source go to Filter Options tab.</li>
<li>In the Filter field enter <strong>$.rows[*]</strong> or click the [Select Filter] button to browse to the hierarchy you want to extract.</li>
</ol>
<div id="attachment_1573" style="width: 685px" class="wp-caption alignnone"><a href="//zappysys.com/blog/wp-content/uploads/2017/07/ssis-get-data-google-bigquery-select-json-filter.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1573" class="wp-image-1573 size-full" src="//zappysys.com/blog/wp-content/uploads/2017/07/ssis-get-data-google-bigquery-select-json-filter.png" alt="Select Filter for JSON Response" width="675" height="363" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-get-data-google-bigquery-select-json-filter.png 675w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-get-data-google-bigquery-select-json-filter-300x161.png 300w" sizes="(max-width: 675px) 100vw, 675px" /></a><p id="caption-attachment-1573" class="wp-caption-text">Select Filter for JSON Response</p></div>
<h4>Configure BigQuery API Pagination Settings</h4>
<p>Most modern APIs implement some sort of pagination technique, so you get part of the data in each request rather than all of it in one go; you then paginate through the pages until the last page is reached. You can also read the <a href="//zappysys.com/blog/ssis-rest-api-looping-until-no-more-pages-found/" target="_blank" rel="noopener">Understanding REST API Pagination in SSIS</a> article to learn more about pagination in SSIS.</p>
<p>The BigQuery API returns a <em>pageToken</em> attribute in the response JSON if more data is available for the requested query result. You can then pass the <em>pageToken</em> in the next URL in this format: <strong>http://my-api-url/?pageToken=xxxxxxxxxx</strong></p>
<p>Now let&#8217;s configure the JSON Source to automate this pagination for us. On the JSON Source go to the Pagination tab and enter the following two settings:</p>
<ol>
<li>Set Next Link as <strong>$.pageToken</strong>.</li>
<li>Set Suffix for Next URL as <strong>&amp;pageToken=&lt;%nextlink%&gt;</strong>.</li>
</ol>
<p>See the screenshot below for more clarity:</p>
<div id="attachment_1572" style="width: 815px" class="wp-caption alignnone"><a href="//zappysys.com/blog/wp-content/uploads/2017/07/ssis-google-bigquery-api-pagination-settings.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1572" class="size-full wp-image-1572" src="//zappysys.com/blog/wp-content/uploads/2017/07/ssis-google-bigquery-api-pagination-settings.png" alt="Configure BigQuery API Pagination on SSIS JSON Source" width="805" height="576" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-google-bigquery-api-pagination-settings.png 805w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-google-bigquery-api-pagination-settings-300x215.png 300w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-google-bigquery-api-pagination-settings-768x550.png 768w" sizes="(max-width: 805px) 100vw, 805px" /></a><p id="caption-attachment-1572" class="wp-caption-text">Configure BigQuery API Pagination on SSIS JSON Source</p></div>
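Those two pagination settings boil down to: append `&pageToken=<last token>` to the URL and keep requesting until a response carries no `pageToken`. A Python sketch, where `fetch_page` is a hypothetical stand-in for the HTTP GET the JSON Source performs:

```python
def read_all_pages(fetch_page):
    """Follow BigQuery pageToken pagination until the last page (sketch).

    fetch_page(token) is a hypothetical stand-in for the HTTP GET the JSON
    Source issues; token is None for the first request, and the suffix
    &pageToken=<token> is appended for every subsequent request.
    """
    rows, token = [], None
    while True:
        page = fetch_page(token)
        rows.extend(page.get("rows", []))
        token = page.get("pageToken")  # the $.pageToken "next link"
        if not token:                  # no token -> last page reached
            return rows
```

The JSON Source runs this loop for you once Next Link and Suffix for Next URL are configured as above.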
<h4>Configure Array Transformation</h4>
<p>The last thing we have to configure is the special 2-dimensional JSON array format used by the BigQuery API:</p>
<ol>
<li>On the JSON Source UI go to the 2D Array Transformation tab.</li>
<li>Enter the following settings:
<ol>
<li>For Transformation Type select <strong>Transform complex 2-dimensional array</strong>.</li>
<li>For Column Name filter enter <strong>$.schema.fields[*].name</strong>.</li>
<li>For Row Values Filter enter <strong>$.f[*].v</strong>.</li>
</ol>
</li>
</ol>
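<p>For clarity, here is a small Python sketch (with made-up sample data) of what the 2D Array Transformation does: it pairs the column names found at <strong>$.schema.fields[*].name</strong> with each row&#8217;s values at <strong>$.f[*].v</strong>.</p>

```python
# Hypothetical response fragment shaped like BigQuery's 2-dimensional format
response = {
    "schema": {"fields": [{"name": "word"}, {"name": "word_count"}]},
    "rows": [
        {"f": [{"v": "hello"}, {"v": "100"}]},
        {"f": [{"v": "world"}, {"v": "250"}]},
    ],
}

# Pair schema field names with each row's f[*].v cells
columns = [f["name"] for f in response["schema"]["fields"]]
records = [dict(zip(columns, (cell["v"] for cell in row["f"])))
           for row in response["rows"]]
print(records[0])  # {'word': 'hello', 'word_count': '100'}
```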
<div id="attachment_1575" style="width: 819px" class="wp-caption alignnone"><a href="//zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-filter-extract-2d-array-get-data-from-google-bigquery.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1575" class="size-full wp-image-1575" src="//zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-filter-extract-2d-array-get-data-from-google-bigquery.png" alt="JSON Array Transformation Options" width="809" height="317" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-filter-extract-2d-array-get-data-from-google-bigquery.png 809w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-filter-extract-2d-array-get-data-from-google-bigquery-300x118.png 300w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-filter-extract-2d-array-get-data-from-google-bigquery-768x301.png 768w" sizes="(max-width: 809px) 100vw, 809px" /></a><p id="caption-attachment-1575" class="wp-caption-text">JSON Array Transformation Options</p></div>
<h4>Preview Data and Save UI</h4>
<p>That&#8217;s it. Click Preview to see some data. If you entered a sample JobID for your User::vJobID variable then you will see some rows. In the next section we will see how to load Google BigQuery data into SQL Server. You can click the Columns tab to review the metadata.</p>
<p>Click OK to save UI and generate Metadata.</p>
<p>&nbsp;</p>
<h3>Configure Target &#8211; Load Google BigQuery data into SQL Server</h3>
<p>Now you can connect your JSON Source to any target such as ZS Trash Destination or a real database destination such as OLEDB Destination.</p>
<p>If you wish to dump data from Google BigQuery to a SQL Server table then just perform the following steps:</p>
<ol>
<li>Drag and drop OLEDB Destination from SSIS Toolbox. Rename it to something like SQL Server Table.</li>
<li>Double click on OLEDB Destination.</li>
<li>Click New to create a new connection &gt; Configure connection &gt; Click OK.</li>
<li>Click on New to create a new table &gt; Rename default table name &gt; Click OK.</li>
<li>Click on Mappings tab and click OK to save UI with default mappings.</li>
</ol>
<div id="attachment_1580" style="width: 918px" class="wp-caption alignnone"><a href="//zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-load-data-google-bigquery-to-sqlserver-table-json-connector.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1580" class="size-full wp-image-1580" src="//zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-load-data-google-bigquery-to-sqlserver-table-json-connector.png" alt="Configure SSIS OLEDB Destination - Google BigQuery to SQL Server Import" width="908" height="558" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-load-data-google-bigquery-to-sqlserver-table-json-connector.png 908w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-load-data-google-bigquery-to-sqlserver-table-json-connector-300x184.png 300w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-load-data-google-bigquery-to-sqlserver-table-json-connector-768x472.png 768w" sizes="(max-width: 908px) 100vw, 908px" /></a><p id="caption-attachment-1580" class="wp-caption-text">Configure SSIS OLEDB Destination &#8211; Google BigQuery to SQL Server Import</p></div>
<h3>Execute Package &#8211; Loading BigQuery data into SQL Server</h3>
<div id="attachment_1578" style="width: 596px" class="wp-caption alignnone"><a href="//zappysys.com/blog/wp-content/uploads/2017/07/ssis-package-execute-rest-api-loading-data-from-bigquery-to-sqlserver.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1578" class="size-full wp-image-1578" src="//zappysys.com/blog/wp-content/uploads/2017/07/ssis-package-execute-rest-api-loading-data-from-bigquery-to-sqlserver.png" alt="SSIS Package Execution - Loading Google BigQuery Data into SQL Server" width="586" height="296" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-package-execute-rest-api-loading-data-from-bigquery-to-sqlserver.png 586w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-package-execute-rest-api-loading-data-from-bigquery-to-sqlserver-300x152.png 300w" sizes="(max-width: 586px) 100vw, 586px" /></a><p id="caption-attachment-1578" class="wp-caption-text">SSIS Package Execution &#8211;<br />Loading Google BigQuery Data into SQL Server</p></div>
<h2>Create / Delete Google BigQuery Dataset using API call</h2>
<p>As we mentioned before, you need a dataset before you can create a table. The most common way to create a dataset is via the user interface, but what if you would like to automate it from your SSIS package or another workflow? Basically, you can use REST API Task to send a create or delete request for the Dataset object.</p>
<h3><strong>Create Google Dataset</strong></h3>
<p>To create a dataset, configure REST API Task using the settings below (we will <a href="https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert" target="_blank" rel="noopener">call this API</a>):</p>
<ol>
<li>Drag REST API Task from toolbox<br />
<img decoding="async" class="figureimage" title="SSIS REST Api Web Service Task - Drag and Drop" src="https://zappysys.com/onlinehelp/ssis-powerpack/scr/images/rest-api-task/ssis-rest-api-web-service-task-drag.png" alt="SSIS REST Api Task - Drag and Drop" /></li>
<li>Select URL from Connection mode and select connection as <strong>OAuth connection</strong> (Created in previous section)</li>
<li>Enter the below URL (assuming you stored your ProjectID in a variable called ProjectId):<br />
<pre class="crayon-plain-tag">https://bigquery.googleapis.com/bigquery/v2/projects/{{User::ProjectId}}/datasets</pre>
</li>
<li>Request Method as <strong>POST</strong></li>
<li>Enter Request Body as below (change YOUR_DATASET_NAME &#8211; e.g. TestDataset )<br />
<pre class="crayon-plain-tag">{"datasetReference": { "datasetId": "YOUR_DATASET_NAME", "projectId": "{{User::ProjectId}}"} }</pre>
</li>
<li>Select Request Content Type as <strong>application/json</strong> from the dropdown</li>
<li>(Optional) &#8211; If you want the task to continue when the Dataset already exists, go to the Error Handling tab (see the next section for a screenshot), check the <strong>Continue On Error Code</strong> option and set the <strong>409 status code</strong></li>
<li>That&#8217;s it. Click Test Request/Response to verify it works.</li>
</ol>
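<p>If you prefer to script the same call outside SSIS, here is a rough Python sketch of the steps above (the project ID, dataset name and OAuth token are placeholders you must supply). It treats HTTP 409 the same way as the Continue On Error Code option.</p>

```python
import json
import urllib.error
import urllib.request

def dataset_body(project_id, dataset_id):
    # Same body shape as the Request Body in step 5 above
    return {"datasetReference": {"datasetId": dataset_id,
                                 "projectId": project_id}}

def create_dataset(project_id, dataset_id, token):
    """POST .../datasets; treat HTTP 409 (dataset already exists) as success."""
    url = (f"https://bigquery.googleapis.com/bigquery/v2/"
           f"projects/{project_id}/datasets")
    req = urllib.request.Request(
        url,
        data=json.dumps(dataset_body(project_id, dataset_id)).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as e:
        if e.code == 409:  # same effect as Continue On Error Code 409
            return {"status": "already exists"}
        raise
```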
<h3>Delete Google BigQuery Dataset</h3>
<p>To delete a dataset, follow the same steps as above except for three things:</p>
<ol>
<li>Append the dataset name to the URL (i.e. <strong>&#8230;/datasets/YOUR_DATASET_NAME</strong>)</li>
<li>Set Request Method to <strong>DELETE</strong></li>
<li>Leave the Request Body blank</li>
</ol>
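<p>As a sketch, the equivalent raw request looks like this in Python (placeholder project, dataset and token values; the request is only built here, not sent):</p>

```python
import urllib.request

# Hypothetical sketch: build the DELETE request for a dataset. Note that the
# dataset name is appended to the URL and the body stays empty.
def delete_dataset_request(project_id, dataset_id, token):
    url = (f"https://bigquery.googleapis.com/bigquery/v2/"
           f"projects/{project_id}/datasets/{dataset_id}")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="DELETE",  # Request Method: DELETE, Request Body: blank
    )

req = delete_dataset_request("my-project", "TestDataset", "YOUR_TOKEN")
print(req.method, req.full_url)
```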
<h2>Create / Delete Google BigQuery Table using API call</h2>
<p>Now let&#8217;s look at how to create / drop a table in Google BigQuery using API calls.</p>
<h3>Create Google BigQuery Table</h3>
<p>To send a <a href="https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#:~:text=%20%20%201%20Open%20the%20BigQuery%20web,table%20appears%20in%20the%20resources%20pane.%20More%20" target="_blank" rel="noopener">CREATE TABLE SQL statement (DDL)</a> we use the same approach as for a normal SQL query, <a href="https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query" target="_blank" rel="noopener">using this API call</a>. Notice that we use Standard SQL for this call by supplying <strong>useLegacySql: false</strong>. The DDL statement must be all on one line.</p><pre class="crayon-plain-tag">POST https://www.googleapis.com/bigquery/v2/projects/{{User::ProjectId}}/queries

Content-Type: application/json

&gt;&gt;&gt;&gt; BODY &lt;&lt;&lt;&lt;

{
 "query": "CREATE TABLE TestDataset.Table1 (RecordID INT64,CustomerID STRING,CustomerName STRING);",
 "useLegacySql": false,
 "timeoutMs": 100000 
}</pre><p>
To create the table, configure REST API Task using the settings below:</p>
<ol>
<li>Drag REST API Task from toolbox<br />
<img decoding="async" class="figureimage" title="SSIS REST Api Web Service Task - Drag and Drop" src="https://zappysys.com/onlinehelp/ssis-powerpack/scr/images/rest-api-task/ssis-rest-api-web-service-task-drag.png" alt="SSIS REST Api Task - Drag and Drop" /></li>
<li>Select URL from Connection mode and select connection as <strong>OAuth connection</strong> (Created in previous section)</li>
<li>Enter the below URL (assuming you stored your ProjectID in a variable called ProjectId):<br />
<pre class="crayon-plain-tag">https://bigquery.googleapis.com/bigquery/v2/projects/{{User::ProjectId}}/queries</pre>
</li>
<li>Request Method as <strong>POST</strong></li>
<li>Enter Request Body as below<br />
<pre class="crayon-plain-tag">{
 "query": "CREATE TABLE TestDataset.Table1 (RecordID INT64,CustomerID STRING,CustomerName STRING);",
 "useLegacySql": false,
 "timeoutMs": 100000
}</pre>
</li>
<li>Select Request Content Type as <strong>application/json</strong> from the dropdown</li>
<li>(Optional) &#8211; If you want the task to continue when the Table already exists, go to the Error Handling tab, check the <strong>Continue On Error Code</strong> option and set the <strong>409 status code</strong></li>
<li>That&#8217;s it. Click Test Request/Response to verify it works.</li>
</ol>
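<p>The request body from the steps above can also be generated programmatically. Here is a small Python sketch (the helper name is made up) that builds the same JSON, keeping the DDL statement on a single line:</p>

```python
import json

# Hypothetical helper that builds the jobs/query request body for a DDL
# statement. Note the single-line DDL and useLegacySql: false (Standard SQL).
def ddl_query_body(ddl_statement, timeout_ms=100000):
    return {
        "query": ddl_statement,   # must be a single line
        "useLegacySql": False,    # DDL requires Standard SQL
        "timeoutMs": timeout_ms,
    }

body = ddl_query_body(
    "CREATE TABLE TestDataset.Table1 "
    "(RecordID INT64,CustomerID STRING,CustomerName STRING);")
print(json.dumps(body))
```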
<div id="attachment_8986" style="width: 1008px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-create-bigquery-table-rest-api-call-skip-if-exists.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-8986" class="size-full wp-image-8986" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-create-bigquery-table-rest-api-call-skip-if-exists.png" alt="Create Google BigQuery Table - API Call (Continue On Error Setting - Skip CREATE if Table / Dataset Already Exists)" width="998" height="692" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-create-bigquery-table-rest-api-call-skip-if-exists.png 998w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-create-bigquery-table-rest-api-call-skip-if-exists-300x208.png 300w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-create-bigquery-table-rest-api-call-skip-if-exists-768x533.png 768w" sizes="(max-width: 998px) 100vw, 998px" /></a><p id="caption-attachment-8986" class="wp-caption-text">Create Google BigQuery Table &#8211; API Call (Continue On Error Setting &#8211; Skip CREATE if Table / Dataset Already Exists)</p></div>
<h3>Delete Google BigQuery Table</h3>
<p>To delete a table, follow the same steps as above except for three things:</p>
<ol>
<li>Point the URL at the table resource (i.e. <strong>&#8230;/datasets/YOUR_DATASET_NAME/tables/YOUR_TABLE_NAME</strong> instead of <strong>&#8230;/queries</strong>)</li>
<li>Set Request Method to <strong>DELETE</strong></li>
<li>Leave the Request Body blank</li>
</ol>
<h2>Write data to Google BigQuery using SSIS &#8211; 1 Million row insert test (FAST)</h2>
<p>Now let&#8217;s look at how easy it is to import data into Google BigQuery using SSIS. We will use the same OAuth connection we created before. To learn more about inserting data into BigQuery check <a href="https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll"><i>tabledata/insertAll </i>method documentation</a>.</p>
<h3>Make sure billing is enabled</h3>
<p>Make sure that billing is enabled for your Google Cloud project. <a href="https://cloud.google.com/billing/docs/how-to/modify-project" target="_blank" rel="noopener">Learn how to confirm billing is enabled for your project</a>.</p>
<p>Streaming is not available via the <a href="https://cloud.google.com/bigquery/pricing#free-tier">free tier</a>. If you attempt to use streaming without enabling billing, you receive the following error: <code translate="no" dir="ltr">BigQuery: Streaming insert is not allowed in the free tier.</code></p>
<p>As long as your API calls fall under the Free Tier limit you won&#8217;t be charged, but you still need to enable billing if you wish to call the streaming insertAll API (the write demo).</p>
<h3>Configure OAuth Connection / Permissions</h3>
<p>If you want to perform insert operations in BigQuery using API calls then include the following scopes in your OAuth Connection Manager and generate a token (see the first section of this article &#8211; we already included scopes for the write operation, but in case you didn&#8217;t, regenerate the token with the scopes below):</p><pre class="crayon-plain-tag">https://www.googleapis.com/auth/bigquery
https://www.googleapis.com/auth/cloud-platform
https://www.googleapis.com/auth/bigquery.insertdata</pre><p>
<h3>Create BigQuery Dataset (From UI)</h3>
<p>For this demo first create a test dataset and one table under it like shown below (<em>billing must be enabled on your Google API Project</em>). To do that go to <a href="https://bigquery.cloud.google.com/welcome" target="_blank" rel="noopener">https://bigquery.cloud.google.com/welcome</a> and configure them:</p>
<div class="su-note"  style="border-color:#e5dd9d;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><div class="su-note-inner su-u-clearfix su-u-trim" style="background-color:#FFF7B7;border-color:#ffffff;color:#333333;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><strong>NOTE:</strong> BigQuery now provides a <a href="https://cloud.google.com/bigquery/docs/sandbox">sandbox</a> if you do not want to provide a credit card or enable billing for your project. The steps in this topic work for a project whether or not your project has billing enabled. If you optionally want to enable billing, see <a href="https://cloud.google.com/billing/docs/how-to/modify-project" target="_blank" rel="noopener">Learn how to enable billing</a>. There are some restrictions in Sandbox mode (Write API calls will fail &#8211; check the Common Errors section later in this article).</div></div>
<p><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/google-bigquery-console-create-dataset-table-defination.png"><img loading="lazy" decoding="async" class="size-full wp-image-2354" src="https://zappysys.com/blog/wp-content/uploads/2017/07/google-bigquery-console-create-dataset-table-defination.png" alt="Create sample dataset and table for Google BigQuery Load" width="715" height="599" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/google-bigquery-console-create-dataset-table-defination.png 715w, https://zappysys.com/blog/wp-content/uploads/2017/07/google-bigquery-console-create-dataset-table-defination-300x251.png 300w" sizes="(max-width: 715px) 100vw, 715px" /></a></p>
<h3>Insert data into BigQuery using API call</h3>
<p>And here is a REST API example to insert data into a BigQuery Table:</p>
<p><strong>Request URL:</strong></p><pre class="crayon-plain-tag">https://www.googleapis.com/bigquery/v2/projects/MY_PROJECT_ID/datasets/MY_DATASET_ID/tables/MY_TABLE_ID/insertAll</pre><p>
<strong>Request Headers:</strong></p><pre class="crayon-plain-tag">Authorization: Bearer ya29.Gl0cBxxxxxxxxxxxxxxxxxxuEIhJIEnxE6GsQPHI
Content-Type: application/json</pre><p>
<strong>Request Body:</strong></p><pre class="crayon-plain-tag">{
  "kind": "bigquery#tableDataInsertAllRequest",
  "rows": [
     {"json": {"RowId": 1,"CustomerName": "AAA"} }, 
     {"json": {"RowId": 2,"CustomerName": "BBB"} }
   ]
}</pre><p>
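<p>Building the insertAll request body from plain records is straightforward. Here is a Python sketch (the helper name is made up) that produces the same JSON shape shown above:</p>

```python
import json

# Hypothetical helper that wraps plain records into an insertAll body.
# Column names are case-sensitive and must match the table exactly.
def insert_all_body(records):
    return {
        "kind": "bigquery#tableDataInsertAllRequest",
        "rows": [{"json": record} for record in records],
    }

body = insert_all_body([
    {"RowId": 1, "CustomerName": "AAA"},
    {"RowId": 2, "CustomerName": "BBB"},
])
print(json.dumps(body))
```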
<p>Here is our data flow setup to achieve very high throughput for the Google BigQuery data load. We will show you how to insert one million rows into Google BigQuery in less than a minute using the setup below. We will use the Multi Threading option and the new Compression option (added in v3.1.4):</p>
<ol>
<li>Dummy Data Source &#8211; Generate sample records</li>
<li>JSON Generator Transform &#8211; Generates JSON documents to send as POST request for above /insertAll API call.</li>
<li>Web API destination  &#8211; Call /insertAll API call to submit our data to BigQuery</li>
<li><strong>(Optional)</strong> JSON Parser Transform &#8211; Parse Error Message for any response</li>
<li><strong>(Optional)</strong> Trash Destination &#8211; Save any errors to text file for review</li>
</ol>
<div id="attachment_8973" style="width: 910px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/google-bigquery-fast-data-load-ssis-multi-threads-insert-compression-on.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-8973" class="size-full wp-image-8973" src="https://zappysys.com/blog/wp-content/uploads/2017/07/google-bigquery-fast-data-load-ssis-multi-threads-insert-compression-on.png" alt="Google BigQuery Data Load Demo in SSIS - 1 Million Rows Insert with Multi Threads and Compression ON (Fast Upload)" width="900" height="724" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/google-bigquery-fast-data-load-ssis-multi-threads-insert-compression-on.png 900w, https://zappysys.com/blog/wp-content/uploads/2017/07/google-bigquery-fast-data-load-ssis-multi-threads-insert-compression-on-300x241.png 300w, https://zappysys.com/blog/wp-content/uploads/2017/07/google-bigquery-fast-data-load-ssis-multi-threads-insert-compression-on-768x618.png 768w" sizes="(max-width: 900px) 100vw, 900px" /></a><p id="caption-attachment-8973" class="wp-caption-text">Google BigQuery Data Load Demo in SSIS &#8211; 1 Million Rows Insert with Multi Threads and Compression ON (Fast Upload)</p></div>
<p>&nbsp;</p>
<h3>Configure SSIS JSON Generator &#8211; Generate JSON for BigQuery Table Insert</h3>
<p>Now let&#8217;s see how to build an HTTP request with JSON body and send it to BigQuery:</p>
<ol style="margin-left: 0;">
<li>Drag Data Flow Task and double click on it.<br />
<img decoding="async" class="figureimage" title="SSIS Data Flow Task - Drag and Drop" src="https://zappysys.com/onlinehelp/ssis-powerpack/scr/images/drag-and-drop-data-flow-task.png" alt="SSIS Data Flow Task - Drag and Drop" /></li>
<li>Drag and configure your Source (for this demo we use Dummy Data Source with Customer Template). See previous section for configuration of Dummy Data Source.<br />
<img decoding="async" class="figureimage" title="SSIS DummyData Source - Drag and Drop" src="https://zappysys.com/onlinehelp/ssis-powerpack/scr/images/dummy-data-Source/ssis-dummy-data-source-adapter-drag.png" alt="SSIS DummyData Source - Drag and Drop" /></li>
<li>Drag and drop <a href="https://zappysys.com/products/ssis-powerpack/ssis-json-generator-transform/" target="_blank" rel="noopener">ZS JSON Generator Transform</a> to produce JSON from a database or file records. If your source is already sending a valid JSON then you can skip this step (e.g. SQL query is<br />
returning JSON). You can also read raw JSON from a very large file (new-line separated JSON) using <a href="https://zappysys.com/products/ssis-powerpack/ssis-json-file-source/" target="_blank" rel="noopener">JSON Source with Output as Raw Data</a> option checked.<img decoding="async" class="figureimage" title="SSIS JSON Generator - Drag and Drop" src="https://zappysys.com/onlinehelp/ssis-powerpack/scr/images/json-generator-transform/ssis-json-generator-transform-drag-2.png" alt="SSIS JSON Generator - Drag and Drop" /></li>
<li>Connect Source to JSON Generator. Double click JSON Generator Transform to start configuring it like below.</li>
<li>Select Output Mode as <strong>Single Dataset Array</strong> and enter Batch Size <strong>10000</strong> (This is Max limit allowed by Google BigQuery API <a href="https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll" target="_blank" rel="noopener">insertAll</a>)</li>
<li>First <strong>right click</strong> on Mappings node and select <strong>Add Static Element</strong>
<div id="attachment_8975" style="width: 862px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-insert-request-batch-json-generate.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-8975" class="size-full wp-image-8975" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-insert-request-batch-json-generate.png" alt="Generate JSON for Google BigQuery InsertAll API request - Batch 10000 rows in a single API call" width="852" height="391" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-insert-request-batch-json-generate.png 852w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-insert-request-batch-json-generate-300x138.png 300w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-insert-request-batch-json-generate-768x352.png 768w" sizes="(max-width: 852px) 100vw, 852px" /></a><p id="caption-attachment-8975" class="wp-caption-text">Generate JSON for Google BigQuery InsertAll API request &#8211; Batch 10000 rows in a single API call</p></div></li>
<li>Enter Name as <strong>kind</strong> and value as <strong>bigquery#tableDataInsertAllRequest<br />
</strong></p>
<div id="attachment_8976" style="width: 538px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-static-value.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-8976" class="size-full wp-image-8976" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-static-value.png" alt="JSON Generator - Add Static Element" width="528" height="373" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-static-value.png 528w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-static-value-300x212.png 300w" sizes="(max-width: 528px) 100vw, 528px" /></a><p id="caption-attachment-8976" class="wp-caption-text">JSON Generator &#8211; Add Static Element</p></div></li>
<li>Now right click on Mappings node and select <strong>Add Document Array</strong> option
<div id="attachment_8978" style="width: 464px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-document-array-option.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-8978" class="wp-image-8978 size-full" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-document-array-option.png" alt="JSON Generator - Add Document Array" width="454" height="349" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-document-array-option.png 454w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-document-array-option-300x231.png 300w" sizes="(max-width: 454px) 100vw, 454px" /></a><p id="caption-attachment-8978" class="wp-caption-text">JSON Generator &#8211; Add Document Array</p></div></li>
<li>Enter <strong>rows</strong> as array title and click OK
<div id="attachment_8979" style="width: 470px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-document-array.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-8979" class="size-full wp-image-8979" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-document-array.png" alt="JSON Generator - Name array" width="460" height="313" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-document-array.png 460w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-document-array-300x204.png 300w" sizes="(max-width: 460px) 100vw, 460px" /></a><p id="caption-attachment-8979" class="wp-caption-text">JSON Generator &#8211; Name array</p></div></li>
<li>Select newly added array node and right click &gt; Add <strong>unbound nested element</strong>  enter Output alias as <strong>json </strong>and click OK.
<div id="attachment_8980" style="width: 475px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-unbound-element.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-8980" class="size-full wp-image-8980" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-unbound-element.png" alt="JSON Generator - Add unbound nested element" width="465" height="471" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-unbound-element.png 465w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-unbound-element-296x300.png 296w" sizes="(max-width: 465px) 100vw, 465px" /></a><p id="caption-attachment-8980" class="wp-caption-text">JSON Generator &#8211; Add unbound nested element</p></div></li>
<li>Select <strong>json</strong> node and right click &gt; Select <strong>Add Elements below</strong> this node and select multiple columns you like to send to BigQuery. Click OK to save.
<div id="attachment_8981" style="width: 467px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-multiple-elements.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-8981" class="size-full wp-image-8981" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-multiple-elements.png" alt="JSON Generator - Add Multiple Elements" width="457" height="529" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-multiple-elements.png 457w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-json-generator-add-multiple-elements-259x300.png 259w" sizes="(max-width: 457px) 100vw, 457px" /></a><p id="caption-attachment-8981" class="wp-caption-text">JSON Generator &#8211; Add Multiple Elements</p></div></li>
<li>Now let&#8217;s preview our JSON (copy the preview JSON to try in the next step &#8211; Web API destination)<div class="su-note"  style="border-color:#e5da9d;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><div class="su-note-inner su-u-clearfix su-u-trim" style="background-color:#fff4b7;border-color:#ffffff;color:#333333;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;">NOTE: Table name and column names are case-sensitive, so make sure your JSON attributes match the exact same casing. </div></div>Here is the finished JSON structure for the next step
<div id="attachment_8982" style="width: 857px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/create-google-bigquery-request-json-insertall-api.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-8982" class="size-full wp-image-8982" src="https://zappysys.com/blog/wp-content/uploads/2017/07/create-google-bigquery-request-json-insertall-api.png" alt="Sample JSON Request body for Google BigQuery insertAll API request" width="847" height="671" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/create-google-bigquery-request-json-insertall-api.png 847w, https://zappysys.com/blog/wp-content/uploads/2017/07/create-google-bigquery-request-json-insertall-api-300x238.png 300w, https://zappysys.com/blog/wp-content/uploads/2017/07/create-google-bigquery-request-json-insertall-api-768x608.png 768w" sizes="(max-width: 847px) 100vw, 847px" /></a><p id="caption-attachment-8982" class="wp-caption-text">Sample JSON Request body for Google BigQuery insertAll API request</p></div></li>
<li>Click OK to save UI.</li>
</ol>
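<p>Conceptually, the Single Dataset Array mode with Batch Size 10000 splits your source rows into insertAll payloads of at most 10,000 rows each. A rough Python sketch (made-up helper name, dummy sample rows) of that batching:</p>

```python
# Hypothetical sketch of the batching behaviour: at most 10,000 rows per
# insertAll request (the API limit referenced in step 5 above).
def batched_payloads(records, batch_size=10000):
    for i in range(0, len(records), batch_size):
        yield {
            "kind": "bigquery#tableDataInsertAllRequest",
            "rows": [{"json": r} for r in records[i:i + batch_size]],
        }

# Dummy sample rows standing in for the Dummy Data Source output
sample = [{"RowId": n, "CustomerName": f"C{n}"} for n in range(25000)]
payloads = list(batched_payloads(sample))
print([len(p["rows"]) for p in payloads])  # [10000, 10000, 5000]
```

<p>Each yielded payload would then be POSTed to the insertAll URL, which is what Web API Destination does in the next section.</p>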
<h3>Configure SSIS Web API destination &#8211; Insert data into BigQuery Table</h3>
<p>Once you have the input JSON prepared, let&#8217;s configure the destination for BigQuery.</p>
<ol style="margin-left: 0;">
<li>Drag and drop <a href="https://zappysys.com/products/ssis-powerpack/ssis-web-api-destination-connector/" target="_blank" rel="noopener">ZS Web API Destination</a>.</li>
<li>Connect your JSON Generator Transform to Web API destination.</li>
<li>Configure general properties:
<div id="attachment_2349" style="width: 869px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-load-using-web-api-destination.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2349" class="size-full wp-image-2349" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-load-using-web-api-destination.png" alt="SSIS Web API Destination - Configure for BigQuery Data load" width="859" height="583" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-load-using-web-api-destination.png 859w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-load-using-web-api-destination-300x204.png 300w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-load-using-web-api-destination-768x521.png 768w" sizes="(max-width: 859px) 100vw, 859px" /></a><p id="caption-attachment-2349" class="wp-caption-text">SSIS Web API Destination &#8211;<br />Configure for BigQuery Data load</p></div></li>
<li>Make sure to enter URL in this format:<br />
<pre class="crayon-plain-tag">https://www.googleapis.com/bigquery/v2/projects/MY_PROJECT_ID/datasets/MY_DATASET_ID/tables/MY_TABLE_ID/insertAll</pre>
Make sure to replace the 3 parts in the above URL (MY_PROJECT_ID, MY_DATASET_ID, MY_TABLE_ID) with actual values from your Google project and BigQuery dataset/table configuration.</li>
<li>Now you can enable Compression and Multiple Threads for higher throughput as below.<br />
<strong>NOTE:</strong> The Compression property was added in v3.1.4, so you may not see it if you have an older version.</p>
<div id="attachment_8983" style="width: 1042px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-post-data-request-performance-optimization-gzip-compression-multi-threads.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-8983" class="size-full wp-image-8983" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-post-data-request-performance-optimization-gzip-compression-multi-threads.png" alt="Google BigQuery Data Loading Performance Optimization Options - Enable Multiple Threads and Compression Options" width="1032" height="347" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-post-data-request-performance-optimization-gzip-compression-multi-threads.png 1032w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-post-data-request-performance-optimization-gzip-compression-multi-threads-300x101.png 300w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-post-data-request-performance-optimization-gzip-compression-multi-threads-768x258.png 768w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-rest-api-post-data-request-performance-optimization-gzip-compression-multi-threads-1024x344.png 1024w" sizes="(max-width: 1032px) 100vw, 1032px" /></a><p id="caption-attachment-8983" class="wp-caption-text">Google BigQuery Data Loading Performance Optimization Options &#8211; Enable Multiple Threads and Compression Options</p></div></li>
<li>If you want to test an insert request from the UI without running the full package, go back to the first tab and edit Body (use the sample JSON generated by the previous JSON Generator Transform &#8211; you can grab it from the JSON Preview panel on the Generator Transform). Click Test Request / Response and confirm Success as below. You can go back to your BigQuery portal and check that one row was inserted after our test click. If everything looks good, run the full package to insert all records. Sample JSON for Body:<br />
<pre class="crayon-plain-tag">{
  "kind": "bigquery#tableDataInsertAllRequest",
  "rows": [
     {"json": {"RowId": 1,"CustomerName": "AAA"} }, 
     {"json": {"RowId": 2,"CustomerName": "BBB"} }
   ]
}</pre>
<div id="attachment_4919" style="width: 910px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-google-bigquery-test-insert-dataset-record-api.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-4919" class="size-full wp-image-4919" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-google-bigquery-test-insert-dataset-record-api.png" alt="Test Google BigQuery Table Insert - SSIS Web API Destination UI" width="900" height="730" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-google-bigquery-test-insert-dataset-record-api.png 900w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-google-bigquery-test-insert-dataset-record-api-300x243.png 300w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-google-bigquery-test-insert-dataset-record-api-768x623.png 768w" sizes="(max-width: 900px) 100vw, 900px" /></a><p id="caption-attachment-4919" class="wp-caption-text">Test Google BigQuery Table Insert &#8211; SSIS Web API Destination UI</p></div></li>
<li>Hit OK to save UI.</li>
<li>Run the package and verify data in Google BigQuery Console:
<div id="attachment_2351" style="width: 1206px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-import-example.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2351" class="size-full wp-image-2351" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-import-example.png" alt="Loading data into Google BigQuery using SSIS" width="1196" height="625" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-import-example.png 1196w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-import-example-300x157.png 300w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-import-example-768x401.png 768w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-import-example-1024x535.png 1024w" sizes="(max-width: 1196px) 100vw, 1196px" /></a><p id="caption-attachment-2351" class="wp-caption-text">Loading data into Google BigQuery using SSIS</p></div></li>
</ol>
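<p>Under the hood, the test click above just issues a <strong>tableDataInsertAll</strong> POST request. As a rough illustration (this is not the SSIS component&#8217;s implementation), the same call can be sketched in Python; the project, dataset, table, and OAuth token below are placeholders you would replace with your own values:</p>

```python
import json
import urllib.request

# Placeholders -- substitute your own project / dataset / table and OAuth token.
PROJECT, DATASET, TABLE = "my-project", "TestDataset", "Table1"
TOKEN = "ya29.your-oauth-access-token"

url = (f"https://www.googleapis.com/bigquery/v2/projects/{PROJECT}"
       f"/datasets/{DATASET}/tables/{TABLE}/insertAll")

# Same body shape as the sample above: one "json" object per row.
body = {
    "kind": "bigquery#tableDataInsertAllRequest",
    "rows": [
        {"json": {"RowId": 1, "CustomerName": "AAA"}},
        {"json": {"RowId": 2, "CustomerName": "BBB"}},
    ],
}

req = urllib.request.Request(
    url,
    data=json.dumps(body).encode("utf-8"),
    headers={"Authorization": f"Bearer {TOKEN}",
             "Content-Type": "application/json"},
    method="POST",
)
# response = urllib.request.urlopen(req)  # returns a tableDataInsertAllResponse
```

<p>The network call itself is left commented out; the point is the URL shape and the request body, which match the sample JSON above.</p>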
<h2>Error Handling for BigQuery Data Load (Bulk Insert API Calls)</h2>
<p>There will be times when some records you insert are rejected by Google BigQuery. In that case you can read the output from the Web API Destination and parse it further with a JSON Parser Transform, checking for certain values in the output. You must place the JSON Parser after the Web API Destination (connect the <strong>Blue Arrow</strong> from the Web API Destination &#8211; since a rejection is a soft error, rows are not redirected down the Red Arrow).</p>
<p>For example, here is sample JSON for the POST body which produces an error due to a bad column name. When a bad row is found in a batch, all records in that batch are rejected. Notice that the error returns the index of the offending record within the batch, so you can identify which row went bad; it also returns the column name in the <strong>location</strong> attribute.</p>
<p><strong>NOTE: Bad column name in 2nd record</strong></p>
<p><strong>Test Body (Bad):</strong></p><pre class="crayon-plain-tag">{
  "kind": "bigquery#tableDataInsertAllRequest",
  "rows": [
     {"json": {"RecordID": 1,"CustomerID": "X1"} },
     {"json": {"Bad_Column": 2,"CustomerID": "X2"} },
     {"json": {"RecordID": 3,"CustomerID": "X3"} }
   ]
}</pre><p>
&nbsp;</p>
<p><strong>Response (For Bad Input):</strong></p><pre class="crayon-plain-tag">{
  "kind": "bigquery#tableDataInsertAllResponse",
  "insertErrors": [
    {
      "index": 1,
      "errors": [
        {
          "reason": "invalid",
          "location": "bad_column",
          "debugInfo": "",
          "message": "no such field."
        }
      ]
    },
    {
      "index": 0,
      "errors": [
        {
          "reason": "stopped",
          "location": "",
          "debugInfo": "",
          "message": ""
        }
      ]
    },
    {
      "index": 2,
      "errors": [
        {
          "reason": "stopped",
          "location": "",
          "debugInfo": "",
          "message": ""
        }
      ]
    }
  ]
}</pre><p>
&nbsp;</p>
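<p>When inspecting this response (for example in a script, or to understand what the JSON Parser Transform sees), note that only rows whose error reason is something other than <strong>stopped</strong> are actually bad; the <strong>stopped</strong> rows were rejected solely because another row in the batch was invalid. A minimal Python sketch of that logic:</p>

```python
import json

# The insertAll response shown above, abbreviated to the fields we need.
response = json.loads("""
{
  "kind": "bigquery#tableDataInsertAllResponse",
  "insertErrors": [
    {"index": 1, "errors": [{"reason": "invalid", "location": "bad_column",
                             "debugInfo": "", "message": "no such field."}]},
    {"index": 0, "errors": [{"reason": "stopped", "location": "",
                             "debugInfo": "", "message": ""}]},
    {"index": 2, "errors": [{"reason": "stopped", "location": "",
                             "debugInfo": "", "message": ""}]}
  ]
}
""")

# Rows that are actually invalid vs. rows merely "stopped" by a bad sibling row.
invalid = [e["index"] for e in response.get("insertErrors", [])
           if any(err["reason"] != "stopped" for err in e["errors"])]
stopped = [e["index"] for e in response.get("insertErrors", [])
           if all(err["reason"] == "stopped" for err in e["errors"])]

print(invalid)  # [1]  -> the record with the bad column name
print(stopped)  # [0, 2] -> rejected only because row 1 was bad
```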
<p>To configure error detection, perform the following steps.</p>
<ol>
<li>Drag and drop a <a href="https://zappysys.com/products/ssis-powerpack/ssis-json-parser-transform/" target="_blank" rel="noopener">ZS JSON Parser Transform</a> after the Web API Destination</li>
<li>Click on the Web API Destination and connect its Blue arrow to the JSON Parser Transform</li>
<li>Configure the JSON Parser Transform as below</li>
<li>Connect the JSON Parser Transform to some destination to save the error information (e.g. a SQL table or a Trash Destination)</li>
</ol>
<div id="attachment_4920" style="width: 1111px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-handle-bigquery-insert-errors-json-parser.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-4920" class="size-full wp-image-4920" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-handle-bigquery-insert-errors-json-parser.png" alt="Handling BigQuery Insert Errors in SSIS " width="1101" height="739" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-handle-bigquery-insert-errors-json-parser.png 1101w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-handle-bigquery-insert-errors-json-parser-300x201.png 300w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-handle-bigquery-insert-errors-json-parser-768x515.png 768w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-handle-bigquery-insert-errors-json-parser-1024x687.png 1024w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-handle-bigquery-insert-errors-json-parser-272x182.png 272w" sizes="(max-width: 1101px) 100vw, 1101px" /></a><p id="caption-attachment-4920" class="wp-caption-text">Handling BigQuery Insert Errors in SSIS</p></div>
<h2>Other Common Errors in BigQuery API calls</h2>
<p>This section covers several common API errors in BigQuery.</p>
<h3>Error: The project XXXXXXX has not enabled BigQuery</h3>
<p>Sometimes you might get the below error. To fix it, go to your project and <a href="https://console.cloud.google.com/apis/library/bigquery.googleapis.com" target="_blank" rel="noopener">enable the BigQuery API</a></p><pre class="crayon-plain-tag">Status Code: BadRequest

Response Body: {
  "error": {
    "code": 400,
    "message": "The project bigquerytest-281915 has not enabled BigQuery.",
    "errors": [
      {
        "message": "The project bigquerytest-281915 has not enabled BigQuery.",
        "domain": "global",
        "reason": "invalid"
      }
    ],
    "status": "INVALID_ARGUMENT"
  }
}</pre><p>
<h3>Error: 404 &#8211; Not Found: Table / Job xxxxxx</h3>
<ul>
<li>Make sure you are sending a GET request and not a POST.</li>
<li>Make sure your jobId is valid; it expires after 24 hours.</li>
<li>Make sure the project-id supplied in the URL is a valid ID (do NOT specify an alias; use the internal ID for the project)</li>
<li>
<div>Make sure you supplied the location parameter if you are outside the EU / US region. Reading data using <a href="https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/getQueryResults" target="_blank" rel="noopener">this API call</a> may fail if you did not supply <strong>location</strong> in the URL</div>
</li>
</ul>
</p><pre class="crayon-plain-tag">{
 "error": {
  "errors": [
   {
    "domain": "global",
    "reason": "notFound",
    "message": "Not Found: Table xxxxxx"
   }
  ],
  "code": 404,
  "message": "Not Found: Table xxxxxxx"
 }
}</pre><p>
&nbsp;</p>
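<p>To illustrate the last bullet above, here is a hypothetical sketch of a getQueryResults URL carrying the <strong>location</strong> parameter (the project ID, job ID, and region below are placeholders, not values from a real project):</p>

```python
from urllib.parse import urlencode

# Hypothetical values -- substitute your internal project ID and jobId.
project_id = "bigquerytest-281915"
job_id = "job_abc123"
location = "asia-northeast1"   # dataset region outside the US / EU multi-regions

base = (f"https://www.googleapis.com/bigquery/v2/projects/{project_id}"
        f"/queries/{job_id}")
url = base + "?" + urlencode({"location": location, "maxResults": 1000})

print(url)
```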
<h3>Error: 403 &#8211; Access Denied: BigQuery BigQuery: Streaming insert is not allowed in the free tier</h3>
<p>If you try to call certain APIs in sandbox mode or on the free tier, you might get the below error. To overcome it, <a href="https://cloud.google.com/billing/docs/how-to/modify-project#enable_billing_for_a_project" target="_blank" rel="noopener">enable billing on Google BigQuery</a>. As long as your API calls fall under the free tier limit you won&#8217;t be charged, but you still need billing enabled if you wish to call the streaming insertAll API.</p><pre class="crayon-plain-tag">Response Url: https://www.googleapis.com/bigquery/v2/projects/bigquerytest-281915/datasets/TestDataset/tables/Table1/insertAll

Status Code: Forbidden

Response Body: {
  "error": {
    "code": 403,
    "message": "Access Denied: BigQuery BigQuery: Streaming insert is not allowed in the free tier",
    "errors": [
      {
        "message": "Access Denied: BigQuery BigQuery: Streaming insert is not allowed in the free tier",
        "domain": "global",
        "reason": "accessDenied"
      }
    ],
    "status": "PERMISSION_DENIED"
  }
}</pre><p>
&nbsp;</p>
<p>&nbsp;</p>
<h2>Debugging Web API Requests</h2>
<p>If you need to debug the actual requests made to the Google server, you can use a tool like <a href="https://zappysys.com/blog/how-to-use-fiddler-to-analyze-http-web-requests/" target="_blank" rel="noopener">Fiddler</a>. It&#8217;s a very handy tool for troubleshooting JSON format issues, and it lets you see exactly how a request is made to the server.</p>
<div id="attachment_2352" style="width: 1076px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-import-debug-api-request-using-fiddler.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2352" class="size-full wp-image-2352" src="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-import-debug-api-request-using-fiddler.png" alt="Using Fiddler to debug Google BigQuery API requests in SSIS" width="1066" height="719" srcset="https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-import-debug-api-request-using-fiddler.png 1066w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-import-debug-api-request-using-fiddler-300x202.png 300w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-import-debug-api-request-using-fiddler-768x518.png 768w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-import-debug-api-request-using-fiddler-1024x691.png 1024w, https://zappysys.com/blog/wp-content/uploads/2017/07/ssis-bigquery-data-import-debug-api-request-using-fiddler-272x182.png 272w" sizes="(max-width: 1066px) 100vw, 1066px" /></a><p id="caption-attachment-2352" class="wp-caption-text">Using Fiddler to debug Google BigQuery API requests in SSIS</p></div>
<h2>Common Errors</h2>
<div class="content_block" id="custom_post_widget-1887"><h3>Truncation related error</h3>
<p style="text-align: justify;">The most common error you may face when you run an SSIS package is a truncation error. At design time only 300 rows are scanned from a source (a file or a REST API call response) to detect datatypes, but at runtime you are likely to retrieve far more records, so it is possible that you will get longer strings than initially expected. For detailed instructions on how to fix common metadata-related errors, read the article "<a href="//zappysys.com/blog/handling-ssis-component-metadata-issues/" target="_blank" rel="noopener">How to handle SSIS errors (truncation, metadata issues)</a>".</p>

<h3>Authentication related error</h3>
Another frequent error you may get is an authentication error, which happens when you deploy/copy a package to another machine and run it there. Check <a href="#Deployment_to_Production">the paragraph below</a> to see why it happens and how to solve this problem.</div>
<h2>Deployment to Production</h2>
<div class="content_block" id="custom_post_widget-1932"><p style="text-align: justify;">In SSIS package <a href="https://docs.microsoft.com/en-us/sql/integration-services/security/access-control-for-sensitive-data-in-packages" target="_blank" rel="noopener">sensitive data such as tokens and passwords are by default encrypted by SSIS</a> with your Windows account which you use to create a package. So SSIS will fail to decrypt tokens/passwords when you run it from another machine using another Windows account. To circumvent this when you are creating an SSIS package which uses authentication components (e.g. an <a href="https://zappysys.com/onlinehelp/ssis-powerpack/scr/ssis-oauth-connection-manager.htm" target="_blank" rel="noopener">OAuth Connection Manager</a> or an <a href="https://zappysys.com/onlinehelp/ssis-powerpack/scr/ssis-http-connection-manager.htm" target="_blank" rel="noopener">HTTP Connection Manager</a> with credentials, etc.), consider using parameters/variables to pass tokens/passwords. In this way, you won’t face authentication related errors when a package is deployed to a production server.</p>
<p style="text-align: justify;">Check our article on <a href="https://zappysys.com/blog/how-to-run-an-ssis-package-with-sensitive-data-on-sql-server/" target="_blank" rel="noopener">how to configure packages with sensitive data on your production or development server</a>.</p></div>
<h2>Download Sample Package</h2>
<p><a href="https://zappysys.com/blog/wp-content/uploads/2017/07/Google_BigQuery_API_Read_Write_Create_Delete_Sample_SSIS2019_2017_2012.zip">Click Here to Download SSIS Sample Package &#8211; Google BigQuery API Read Write Create Delete (SSIS 2019, 2017, 2012)</a></p>
<p>&nbsp;</p>
<h2>Conclusion. What&#8217;s next?</h2>
<p>In this article we have learned how to load data from Google BigQuery into SQL Server using SSIS (drag and drop approach without coding). We used <a href="https://zappysys.com/products/ssis-powerpack/" target="_blank" rel="noopener">SSIS JSON / REST API Connector</a> to extract data from Google BigQuery REST API using OAuth. JSON Source Connector makes it super simple to parse complex/large JSON files or any Web API response into rows and columns so you can load data into a database, e.g. SQL Server database. <a href="//zappysys.com/products/ssis-powerpack/">Download SSIS PowerPack</a> to try many other automation scenarios that were not discussed in this article.</p>
<p><strong>Keywords:</strong></p>
<p>Google BigQuery Integration with SQL Server | How to extract data from google bigquery in SSIS? | How to read data from Google BigQuery API? | Loading BigQuery Data into SQL Server. | BigQuery to SQL Server | SSIS Google Big Query Integration | SSIS Google BigQuery Import  JSON File | SSIS Google BigQuery Export data</p>
<p>&nbsp;</p>
<p>The post <a href="https://zappysys.com/blog/get-data-google-bigquery-using-ssis/">How to read / write data in Google BigQuery using SSIS</a> appeared first on <a href="https://zappysys.com/blog">ZappySys Blog</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>How to setup Amazon Redshift Cluster in few clicks</title>
		<link>https://zappysys.com/blog/how-to-setup-amazon-redshift-cluster-for-outside-data-access/</link>
		
		<dc:creator><![CDATA[ZappySys]]></dc:creator>
		<pubDate>Tue, 17 Nov 2015 14:47:38 +0000</pubDate>
				<category><![CDATA[AWS (Amazon Web Services)]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Redshift]]></category>
		<category><![CDATA[aws]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[cloud computing]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[redshift]]></category>
		<category><![CDATA[Redshift Data Transfer Task]]></category>
		<category><![CDATA[ssis]]></category>
		<category><![CDATA[SSIS PowerPack]]></category>
		<guid isPermaLink="false">http://zappysys.com/blog/?p=169</guid>

					<description><![CDATA[<p>Introduction In this article you will learn how to Setup Amazon Redshift Cluster in a few clicks. You will also learn how to set Inbound and Outbound Firewall Rules so you can access Redshift Cluster from outside of AWS Network (e.g. from your corporate network or your home). By default Redshift Cluster cannot be accessed from outside [&#8230;]</p>
<p>The post <a href="https://zappysys.com/blog/how-to-setup-amazon-redshift-cluster-for-outside-data-access/">How to setup Amazon Redshift Cluster in few clicks</a> appeared first on <a href="https://zappysys.com/blog">ZappySys Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2>Introduction</h2>
<p>In this article you will learn <em>how to setup an Amazon Redshift Cluster</em> in a few clicks. You will also learn how to set inbound and outbound firewall rules so you can access the Redshift cluster from outside the AWS network (e.g. from your corporate network or your home). By default a Redshift cluster cannot be accessed from outside of your AWS virtual network (referred to as a VPC &#8211; Virtual Private Cloud)</p>
<p>Once the Redshift cluster is set up, you can follow <a href="//zappysys.com/posts/sql-server-to-redshift-data-load-using-ssis/">these steps to load data into Redshift</a> (using the <a href="//zappysys.com/products/ssis-powerpack/ssis-amazon-redshift-data-transfer-task/">SSIS Redshift Data Transfer Task</a> or the <a href="//zappysys.com/products/zappyshell/amazon-redshift-command-line-tools/">command line for Redshift</a>)</p>
<h2>What is Amazon Redshift</h2>
<p><em>Amazon Redshift</em> is a fully managed, petabyte-scale <em>data warehouse</em> service in the <em>cloud</em>. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. This enables you to use your data to acquire new insights for your business and customers.</p>
<p>The first step to create a data warehouse is to launch a set of nodes, called an <em>Amazon Redshift cluster</em>. After you provision your cluster, you can upload your data set and then perform data analysis queries. Regardless of the size of the data set, Amazon Redshift offers fast query performance using the same SQL-based tools and business intelligence applications that you use today.</p>
<h2><span id="Setup_your_Amazon_Redshift_Cluster">Setup Amazon Redshift Cluster</span></h2>
<div>
<p><em>NOTE: Skip this step if you have already set up your Redshift Cluster</em></p>
<ol>
<li>Log in to your AWS Console and click the Redshift icon, or <a href="https://console.aws.amazon.com/redshift/home" target="_blank"><span style="color: #248cc8;">click here</span></a> to go directly to Redshift</li>
<li>Click on Launch Cluster</li>
<li>On the Cluster Details page, specify the Cluster Identifier, Database Name, Port, Master User, and Password. Click Continue to go to the next page
<div class="wp-caption alignnone">
<p><a href="//zappysys.com/onlinehelp/ssis-powerpack/scr/images/amazon-redshift-datatransfer-task/amazon-redshift-cluster-setup-1-create-database.png"><img loading="lazy" decoding="async" title="Configure Redshift Cluster Identifier, Database Name, Port , UserID and Password" src="//zappysys.com/onlinehelp/ssis-powerpack/scr/images/amazon-redshift-datatransfer-task/amazon-redshift-cluster-setup-1-create-database.png" alt="Configure Redshift Cluster Identifier, Database Name, Port , UserID and Password" width="381" height="365" /></a></p>
<p class="wp-caption-text">Configure Redshift Cluster Identifier, Database Name, Port , UserID and Password</p>
</div>
</li>
<li>On the Node Configuration page, specify the Node Type (this is the VM type), Cluster Type, and Number of Nodes. If you are trying this under the Free Tier, select the smallest node possible (in this case it was dw2.large). Click Continue to go to the next page
<div class="wp-caption alignnone">
<p><a href="//zappysys.com/onlinehelp/ssis-powerpack/scr/images/amazon-redshift-datatransfer-task/amazon-redshift-cluster-setup-2-specify-node-type.png"><img loading="lazy" decoding="async" title="Configure Redshift Node Type and Cluster Type" src="//zappysys.com/onlinehelp/ssis-powerpack/scr/images/amazon-redshift-datatransfer-task/amazon-redshift-cluster-setup-2-specify-node-type.png" alt="Configure Redshift Node Type and Cluster Type" width="386" height="366" /></a></p>
<p class="wp-caption-text">Configure Redshift Node Type and Cluster Type</p>
</div>
</li>
<li>On the Additional Configuration page you can pick a VPC (Virtual Private Cloud), a security group for the cluster, and other options such as encryption. For demo purposes, select options as in the below screenshot. Click Continue to review your settings, then click Create Cluster
<div class="wp-caption alignnone">
<p><a href="//zappysys.com/onlinehelp/ssis-powerpack/scr/images/amazon-redshift-datatransfer-task/amazon-redshift-cluster-setup-3-configuration.png"><img loading="lazy" decoding="async" title="Configure Redshift Cluster Encryption, VPC and Additional Detail" src="//zappysys.com/onlinehelp/ssis-powerpack/scr/images/amazon-redshift-datatransfer-task/amazon-redshift-cluster-setup-3-configuration.png" alt="Configure Redshift Cluster Encryption, VPC and Additional Detail" width="545" height="452" /></a></p>
<p class="wp-caption-text">Configure Redshift Cluster Encryption, VPC and Additional Detail</p>
</div>
</li>
<li>Give it a few minutes while your cluster is being created. After 5-10 minutes you can go back to the same page and review the cluster Status and other properties as below. Copy the Cluster Endpoint somewhere, because we will need it later.
<div class="wp-caption alignnone">
<p><a href="//zappysys.com/onlinehelp/ssis-powerpack/scr/images/amazon-redshift-datatransfer-task/amazon-redshift-cluster-setup-6-properties.png"><img loading="lazy" decoding="async" title="Check Redshift Cluster Status , Endpoint and Other Properties" src="//zappysys.com/onlinehelp/ssis-powerpack/scr/images/amazon-redshift-datatransfer-task/amazon-redshift-cluster-setup-6-properties.png" alt="Check Redshift Cluster Status , Endpoint and Other Properties" width="526" height="337" /></a></p>
<p class="wp-caption-text">Check Redshift Cluster Status , Endpoint and Other Properties</p>
</div>
</li>
</ol>
</div>
<h2><span id="Add_inbound_rule_for_Redshift_Cluster">Add inbound rule for Redshift Cluster</span></h2>
<div><em>NOTE: Skip this step if you have already added your IP to an inbound rule.</em><br />
By default you cannot connect to an Amazon Redshift cluster from outside the AWS network (e.g. from your on-premises machine). If you wish to connect, you must add an inbound rule that allows your requests to reach the Redshift cluster on the specified port.</div>
<div>
<p>To create a new inbound rule, perform the following steps</p>
<ol>
<li>On the Redshift home page, click the [Security] tab. Depending on your region you may see the following notice. Click the [Go to the EC2 Console] link, or go to EC2 directly by clicking the Services -&gt; EC2 menu at the top
<div class="wp-caption alignnone">
<p><a href="//zappysys.com/onlinehelp/ssis-powerpack/scr/images/amazon-redshift-datatransfer-task/amazon-redshift-cluster-setup-7-security-groups.png"><img loading="lazy" decoding="async" title="Configure Security Group and Inbound Filter Firewall Rule to allow Local Connection" src="//zappysys.com/onlinehelp/ssis-powerpack/scr/images/amazon-redshift-datatransfer-task/amazon-redshift-cluster-setup-7-security-groups.png" alt="Configure Security Group and Inbound Filter Firewall Rule to allow Local Connection" width="502" height="218" /></a></p>
<p class="wp-caption-text">Configure Security Group and Inbound Filter Firewall Rule to allow Local Connection</p>
</div>
</li>
<li>On the EC2 Security Groups page, select the security group attached to your Redshift cluster, then in the bottom pane click the Inbound tab
<div class="wp-caption alignnone">
<p><a href="//zappysys.com/onlinehelp/ssis-powerpack/scr/images/amazon-redshift-datatransfer-task/amazon-redshift-cluster-setup-9-security-group-inbound-rule.png"><img loading="lazy" decoding="async" title="Security Group Screen - Add or Edit Inbound Firewall Rule to allow Local Connection" src="//zappysys.com/onlinehelp/ssis-powerpack/scr/images/amazon-redshift-datatransfer-task/amazon-redshift-cluster-setup-9-security-group-inbound-rule.png" alt="Security Group Screen - Add or Edit Inbound Firewall Rule to allow Local Connection" width="662" height="394" /></a></p>
<p class="wp-caption-text">Security Group Screen – Add or Edit Inbound Firewall Rule to allow Local Connection</p>
</div>
</li>
<li>On the Inbound tab, click Edit to modify the default entry, or add a new rule. Notice how the IP range is specified in CIDR notation: 0.0.0.0/0 means all IPs. If you wish to allow only a range of addresses, use a narrower CIDR prefix; for example, 50.34.234.0/24 covers the range 50.34.234.0 to 50.34.234.255. Make sure your port range covers the port you specified for the Redshift cluster.</li>
<li>Click Add Rule if you wish to add a new entry, or edit the existing one as below, then click Save</li>
</ol>
</div>
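<p>CIDR notation is easy to get wrong, so before saving a rule you can double-check what a given prefix actually covers, for example with Python&#8217;s built-in ipaddress module:</p>

```python
import ipaddress

# /32 = exactly one address; /24 = 256 addresses; /0 = every IPv4 address.
single = ipaddress.ip_network("50.34.234.10/32")
subnet = ipaddress.ip_network("50.34.234.0/24")
world  = ipaddress.ip_network("0.0.0.0/0")

print(single.num_addresses)                            # 1
print(subnet.num_addresses)                            # 256
print(subnet[0], "-", subnet[-1])                      # 50.34.234.0 - 50.34.234.255
print(ipaddress.ip_address("50.34.234.10") in subnet)  # True
print(world.num_addresses)                             # 4294967296
```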
<h2>Automate Redshift Cluster Creation</h2>
<p>If you need to automate Redshift cluster creation or any of the following actions, check the <a href="//zappysys.com/products/ssis-powerpack/ssis-amazon-redshift-cluster-management-task/">Redshift Cluster Management Task</a></p>
<ul>
<li>Automate Amazon Redshift Cluster Create Action in few clicks. You can also add Access Security Rule.</li>
<li>Automate Amazon Redshift Cluster Delete Action</li>
<li>Fetch Amazon Redshift Cluster Property to SSIS Variable (e.g. Fetch Cluster Status)</li>
<li>Fetch all cluster and their properties as DataTable (Use ForEach Loop and iterate through all clusters)</li>
<li>Automate Redshift Cluster Snapshot Creation</li>
<li>Automate Redshift Cluster Snapshot Delete Action</li>
<li>Support for Wait until Cluster operation is done</li>
</ul>
<p>The post <a href="https://zappysys.com/blog/how-to-setup-amazon-redshift-cluster-for-outside-data-access/">How to setup Amazon Redshift Cluster in few clicks</a> appeared first on <a href="https://zappysys.com/blog">ZappySys Blog</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Extract / Unload Redshift data into SQL Server using SSIS</title>
		<link>https://zappysys.com/blog/extract-unload-redshift-data-sql-server-using-ssis/</link>
		
		<dc:creator><![CDATA[ZappySys]]></dc:creator>
		<pubDate>Tue, 27 Oct 2015 19:55:36 +0000</pubDate>
				<category><![CDATA[AWS (Amazon Web Services)]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Redshift]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[aws]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[cloud computing]]></category>
		<category><![CDATA[etl]]></category>
		<category><![CDATA[extract]]></category>
		<category><![CDATA[redshift]]></category>
		<category><![CDATA[s3]]></category>
		<category><![CDATA[ssis]]></category>
		<category><![CDATA[SSIS PowerPack]]></category>
		<guid isPermaLink="false">http://zappysys.com/blog/?p=128</guid>

					<description><![CDATA[<p>Introduction In our previous article we saw how to load data into Redshift using SSIS or load data into Redshift using ZappyShell Redshift Command Line In this article we will walk through various steps to Extract/UNLOAD Redshift Data into SQL Server using Amazon S3 Storage Task and ExecuteSQL Task for Amazon Redshift. Below is the [&#8230;]</p>
<p>The post <a href="https://zappysys.com/blog/extract-unload-redshift-data-sql-server-using-ssis/">Extract / Unload Redshift data into SQL Server using SSIS</a> appeared first on <a href="https://zappysys.com/blog">ZappySys Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2>Introduction</h2>
<p>In our previous article we saw <a href="//zappysys.com/posts/sql-server-to-redshift-data-load-using-ssis/">how to load data into Redshift using SSIS</a> or load data into Redshift using ZappyShell <a href="//zappysys.com/products/zappyshell/amazon-redshift-command-line-tools/">Redshift Command Line</a></p>
<p>In this article we will walk through the various steps to <em>Extract/UNLOAD Redshift Data into SQL Server</em> using the <a href="//zappysys.com/products/ssis-powerpack/ssis-amazon-s3-task/">Amazon S3 Storage Task</a> and the <a href="//zappysys.com/products/ssis-powerpack/ssis-redshift-execute-sql-task/">ExecuteSQL Task for Amazon Redshift</a>. Below is a screenshot of the actual SSIS package to <em>Extract Redshift Data and Load into SQL Server</em></p>
<div id="attachment_164" style="width: 664px" class="wp-caption alignnone"><a href="//zappysys.com/blog/wp-content/uploads/2015/10/extract-unload-redshift-data-load-to-sql-server-ssis.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-164" class="size-full wp-image-164" src="//zappysys.com/blog/wp-content/uploads/2015/10/extract-unload-redshift-data-load-to-sql-server-ssis.png" alt="Extract/Unload Redshift Data using SSIS and Load into SQL Server" width="654" height="565" srcset="https://zappysys.com/blog/wp-content/uploads/2015/10/extract-unload-redshift-data-load-to-sql-server-ssis.png 654w, https://zappysys.com/blog/wp-content/uploads/2015/10/extract-unload-redshift-data-load-to-sql-server-ssis-300x259.png 300w" sizes="(max-width: 654px) 100vw, 654px" /></a><p id="caption-attachment-164" class="wp-caption-text">Extract/Unload Redshift Data using SSIS and Load into SQL Server</p></div>
<h2>Requirements for Extract Redshift Data using SSIS</h2>
<p>Before you <a href="http://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html">UNLOAD</a> data from Redshift, you have to make sure of a few things.</p>
<ol>
<li>Setup your Redshift cluster (Follow these instructions <a href="//zappysys.com/blog/how-to-setup-amazon-redshift-cluster-for-outside-data-access/">to setup redshift cluster</a>)</li>
<li>Load some sample data to Redshift (Read more here: <a href="//zappysys.com/posts/sql-server-to-redshift-data-load-using-ssis/">How to load data to Redshift</a>)</li>
<li>Make sure you have correct connection settings to connect to Redshift cluster (Host name, Port, UserId, Password, DB name etc). You can get host name from AWS Console.</li>
<li>Make sure you have Access to S3 Bucket where files will be dumped from Redshift. You will need AccessKey and SecretKey to fetch files from S3</li>
</ol>
<h2>Step-1: Execute Redshift UNLOAD Command</h2>
<p>The very first step is to unload Redshift data (optionally as GZip files) using the <a href="//zappysys.com/products/ssis-powerpack/ssis-redshift-execute-sql-task/">ExecuteSQL Task for Amazon Redshift</a>.<br />
Below is a SQL command you can use to <em>extract data from Redshift</em>. Notice how we used variable placeholders in the SQL command; these placeholders are replaced at runtime with the actual values stored in the specified variables.</p><pre class="crayon-plain-tag">unload ('select * from (select * from customerdata limit 1000)')
to 's3://bw-rstest/stage/custdata'
credentials 'aws_access_key_id={{User::S3Accesskey}};aws_secret_access_key={{User::S3SecretKey}}'
ALLOWOVERWRITE</pre><p>
<b>Export as GZip files (Compressed files)</b></p>
<p>If you are exporting data as compressed files to save data transfer costs, use the GZIP option as below.</p>
<p><strong>NOTE:</strong> Make sure there are no spaces before or after the AccessKey and SecretKey, otherwise you may get an error.</p><pre class="crayon-plain-tag">unload ('select * from (select * from customerdata limit 1000)')
to 's3://bw-rstest/stage/custdata_file_'
credentials 'aws_access_key_id={{User::S3Accesskey}};aws_secret_access_key={{User::S3SecretKey}}'
ALLOWOVERWRITE
GZIP</pre><p>
<h3>Common Errors / Troubleshooting</h3>
<p><strong>UNLOAD command issue with Region mismatch (S3 bucket vs Redshift Cluster)</strong></p>
<p>If your S3 bucket is in a different region than your Redshift cluster, the above command may fail with a &#8220;<em>301 permanent redirect error</em>&#8221;. In that case you have to change your S3 bucket region. The region can be changed in the AWS console (see the S3 bucket properties and change the location so the bucket region matches the Redshift cluster region; both regions must be the same).</p>
<blockquote><p>ERROR: XX000: S3ServiceException:The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.,Status 301,Error PermanentRedirect</p></blockquote>
<p><strong>UNLOAD command issue with accesskey and secret key</strong></p>
<p>If you specify an invalid access key or secret key, misspell the keywords related to credentials, or have spaces before or after the access key or secret key, you may get the following error.</p>
<blockquote><p>ERROR: XX000: Invalid credentials. Must be of the format: credentials &#8216;aws_iam_role=&#8230;&#8217; or &#8216;aws_access_key_id=&#8230;;aws_secret_access_key=&#8230;[;token=&#8230;].</p></blockquote>
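<p>Because the spacing and spelling mistakes above are easy to miss, you could sanity-check the credentials string before running UNLOAD. Below is a minimal, illustrative Python sketch based on the format shown in the error message (it covers only the access-key form, not the <code>aws_iam_role</code> form):</p>

```python
import re

# Expected format, per the Redshift error message above:
#   aws_access_key_id=...;aws_secret_access_key=...[;token=...]
CRED_PATTERN = re.compile(
    r"^aws_access_key_id=\S+;aws_secret_access_key=\S+(?:;token=\S+)?$"
)

def credentials_are_valid(cred):
    # \S+ rejects stray spaces before/after the keys -- the common mistake
    return CRED_PATTERN.match(cred) is not None

print(credentials_are_valid("aws_access_key_id=AKIA123;aws_secret_access_key=abc"))   # True
print(credentials_are_valid("aws_access_key_id= AKIA123;aws_secret_access_key=abc"))  # False (space after '=')
```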
<h2>Step-2: Download data files from Amazon S3 Bucket to local machine</h2>
<p>Once the files are exported to the S3 bucket, we can download them to the local machine using the <a href="//zappysys.com/products/ssis-powerpack/ssis-amazon-s3-task/">Amazon S3 Storage Task</a>.</p>
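<p>If you prefer scripting this step outside SSIS, the download can be sketched in Python with boto3. This is an illustrative helper, not part of the package; the bucket and prefix below come from the UNLOAD example, and the S3 client is passed in as a parameter:</p>

```python
import os

def download_prefix(s3_client, bucket, prefix, dest_dir):
    """Download every object under `prefix` from `bucket` into `dest_dir`.

    `s3_client` is a boto3 S3 client, e.g. boto3.client("s3").
    """
    os.makedirs(dest_dir, exist_ok=True)
    paginator = s3_client.get_paginator("list_objects_v2")
    downloaded = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            # Save each object under its base file name locally
            local_path = os.path.join(dest_dir, os.path.basename(obj["Key"]))
            s3_client.download_file(bucket, obj["Key"], local_path)
            downloaded.append(local_path)
    return downloaded

# Example (requires AWS credentials):
# import boto3
# download_prefix(boto3.client("s3"), "bw-rstest", "stage/custdata", r"C:\amazon\archive")
```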
<h2>Step-3: Un-compress downloaded files</h2>
<p>If you have exported the Redshift data as compressed files (using the GZIP option), you can use the <a href="https://zappysys.com/products/ssis-powerpack/ssis-zip-file-task/" target="_blank">ZappySys Zip File Task</a> to un-compress multiple files.</p>
<p>Alternatively, you can write a script to un-compress those files (see the code below). You can skip this step if the files are not compressed (i.e. you did not use the GZIP option in the command).</p>
<p>Here is sample <strong>C# code</strong> to un-compress <strong>GZip</strong> files:</p><pre class="crayon-plain-tag">public void Main()
{
	// Folder where the downloaded Redshift UNLOAD files were saved
	System.IO.DirectoryInfo directorySelected = new System.IO.DirectoryInfo(@"C:\amazon\archive");

	// UNLOAD produces file names like custdata_file_0000_part_00.gz
	foreach (System.IO.FileInfo fileToDecompress in directorySelected.GetFiles("custdata*_part_*"))
	{
		Decompress(fileToDecompress);
	}

	Dts.TaskResult = (int)ScriptResults.Success;
}
private static void Decompress(System.IO.FileInfo fileToDecompress)
{
	using (System.IO.FileStream originalFileStream = fileToDecompress.OpenRead())
	{
		string currentFileName = fileToDecompress.FullName;
		// Strip the .gz extension to get the output file name
		string newFileName = currentFileName.Remove(currentFileName.Length - fileToDecompress.Extension.Length);

		using (System.IO.FileStream decompressedFileStream = System.IO.File.Create(newFileName))
		{
			using (System.IO.Compression.GZipStream decompressionStream = new System.IO.Compression.GZipStream(originalFileStream, System.IO.Compression.CompressionMode.Decompress))
			{
				decompressionStream.CopyTo(decompressedFileStream);
				//Console.WriteLine("Decompressed: {0}", fileToDecompress.Name);
			}
		}
	}
}</pre>
<h2>Step-4: Loop through files using ForEachLoop Container</h2>
<p>Once the files are downloaded from the S3 bucket, we can loop through them using an SSIS ForEach Loop Container and load them into SQL Server (one file per iteration).</p>
<div id="attachment_165" style="width: 705px" class="wp-caption alignnone"><a href="//zappysys.com/blog/wp-content/uploads/2015/10/ssis-loop-amazon-s3-files.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-165" class="size-full wp-image-165" src="//zappysys.com/blog/wp-content/uploads/2015/10/ssis-loop-amazon-s3-files.png" alt="Loop through files downloaded from Amazon S3 (Exported using Redshift UNLOAD Command)" width="695" height="384" srcset="https://zappysys.com/blog/wp-content/uploads/2015/10/ssis-loop-amazon-s3-files.png 695w, https://zappysys.com/blog/wp-content/uploads/2015/10/ssis-loop-amazon-s3-files-300x166.png 300w" sizes="(max-width: 695px) 100vw, 695px" /></a><p id="caption-attachment-165" class="wp-caption-text">Loop through files downloaded from Amazon S3 (Exported using Redshift UNLOAD Command)</p></div>
<h2>Step-5: Data Flow &#8211; Load Redshift Data Files to SQL Server</h2>
<p>Inside the data flow you can use a Flat File Source and an OLE DB Destination for SQL Server. Just map the correct file columns to the SQL Server fields and you should be good. If needed, convert Unicode/non-Unicode columns using a Data Conversion Transform (this is not needed if both source and target are DT_STR, or both are DT_WSTR, i.e. Unicode).</p>
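<p>For reference, UNLOAD writes pipe-delimited text files by default, with no header row, which is what the Flat File Source parses. If you ever need to inspect such a file outside SSIS, here is a minimal Python sketch (the column names are illustrative, not from the actual customerdata table):</p>

```python
import csv
import io

def read_unload_file(fileobj, columns):
    """Parse a pipe-delimited Redshift UNLOAD text file into dicts.

    UNLOAD's default delimiter is '|'; pass the column list yourself,
    since UNLOAD does not write a header row.
    """
    reader = csv.reader(fileobj, delimiter="|")
    return [dict(zip(columns, row)) for row in reader]

sample = io.StringIO("1|Alice|US\n2|Bob|UK\n")
rows = read_unload_file(sample, ["id", "name", "country"])
print(rows[0])  # {'id': '1', 'name': 'Alice', 'country': 'US'}
```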
<h2>Downloads</h2>
<p>To download the above SSIS packages, click the links below. In order to test them you first have to <a href="//zappysys.com/products/ssis-powerpack/">download SSIS PowerPack</a>.<br />
<a href="//zappysys.com/blog/wp-content/uploads/2015/11/RedshiftExtractDemo_2008.zip">Download Demo SSIS Package &#8211; SSIS 2008</a><br />
<a href="//zappysys.com/blog/wp-content/uploads/2015/11/RedshiftExtractDemo_2012.zip">Download Demo SSIS Package &#8211; SSIS 2012/2014</a></p>
<h2>Conclusion</h2>
<p>Amazon Redshift is a great way to start your data warehouse projects with minimal investment in a simple pay-as-you-go model, but loading or unloading Redshift data can be a challenging task. Using <a href="//zappysys.com/products/ssis-powerpack/">SSIS PowerPack</a> you can perform a Redshift data load or unload in a few clicks.</p>
<p>The post <a href="https://zappysys.com/blog/extract-unload-redshift-data-sql-server-using-ssis/">Extract / Unload Redshift data into SQL Server using SSIS</a> appeared first on <a href="https://zappysys.com/blog">ZappySys Blog</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
