<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>performance tips Archives | ZappySys Blog</title>
	<atom:link href="https://zappysys.com/blog/tag/performance-tips/feed/" rel="self" type="application/rss+xml" />
	<link>https://zappysys.com/blog/tag/performance-tips/</link>
	<description>SSIS / ODBC Drivers / API Connectors for JSON, XML, Azure, Amazon AWS, Salesforce, MongoDB and more</description>
	<lastBuildDate>Thu, 08 Apr 2021 17:55:02 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.4.4</generator>

<image>
	<url>https://zappysys.com/blog/wp-content/uploads/2023/01/cropped-zappysys-symbol-large-32x32.png</url>
	<title>performance tips Archives | ZappySys Blog</title>
	<link>https://zappysys.com/blog/tag/performance-tips/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>How to read large XML / JSON file in SSIS (3 Million Rows in 3 Mins)</title>
		<link>https://zappysys.com/blog/read-large-xml-json-file-ssis-fast-process-million-rows/</link>
		
		<dc:creator><![CDATA[ZappySys]]></dc:creator>
		<pubDate>Mon, 08 Jan 2018 16:55:09 +0000</pubDate>
				<category><![CDATA[SSIS JSON Source (File/REST)]]></category>
		<category><![CDATA[SSIS XML Source (File / SOAP)]]></category>
		<category><![CDATA[performance tips]]></category>
		<category><![CDATA[ssis json source]]></category>
		<category><![CDATA[ssis xml source]]></category>
		<guid isPermaLink="false">https://zappysys.com/blog/?p=2467</guid>

					<description><![CDATA[<p>Introduction In this post we will learn how to use ZappySys SSIS XML Source or ZappySys SSIS JSON Source  to read large XML or JSON File (Process 3 Million rows in 3 minutes &#8211; 1.2 GB file). If you use default settings to read data then it may result into OutOfMemory Exception so we will outline [&#8230;]</p>
<p>The post <a href="https://zappysys.com/blog/read-large-xml-json-file-ssis-fast-process-million-rows/">How to read large XML / JSON file in SSIS (3 Million Rows in 3 Mins)</a> appeared first on <a href="https://zappysys.com/blog">ZappySys Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2>Introduction</h2>
<p>In this post we will learn how to use <a href="https://zappysys.com/products/ssis-powerpack/ssis-xml-source/" target="_blank" rel="noopener">ZappySys SSIS XML Source</a> or <a href="https://zappysys.com/products/ssis-powerpack/ssis-json-file-source/" target="_blank" rel="noopener">ZappySys SSIS JSON Source</a> to read a large XML or JSON file. Our example processes 3 million rows (a 1.2 GB file) in about 3 minutes.</p>
<p>If you read such a file with the default settings, you may get an OutOfMemory exception, so we will outline a few techniques that enable high-performance streaming mode instead of loading the entire file into memory.</p>
<h2><span id="Prerequisites">Prerequisites</span></h2>
<p>Before we parse very large XML or JSON files using SSIS, make sure the following prerequisites are met.</p>
<ol>
<li>SSIS designer is installed. It is sometimes referred to as BIDS or SSDT (<a href="https://docs.microsoft.com/en-us/sql/ssdt/download-sql-server-data-tools-ssdt" target="_blank" rel="noopener">download it from the Microsoft site</a>).</li>
<li>Basic knowledge of SSIS package development using <em>Microsoft SQL Server Integration Services</em>.</li>
<li>You have at least two sample files. One file must be a small dataset (less than 10 MB if possible). We will use the small file at design time to get metadata and to preview data in the component UI.</li>
<li>The second file is the big XML or JSON file with the full dataset you want to parse at runtime.</li>
<li>Make sure <a href="https://zappysys.com/products/ssis-powerpack/" target="_blank" rel="noopener"><em>SSIS PowerPack</em></a> is installed. <a href="https://zappysys.com/products/ssis-powerpack/download/" target="_blank" rel="noopener">Click here to download</a>.</li>
</ol>
<p>&nbsp;</p>
<h2>Step-By-Step : Reading large XML file (SSIS XML Source)</h2>
<p>Now let&#8217;s look at how to read a large XML file (e.g. 3 million rows or more) using <a href="https://zappysys.com/products/ssis-powerpack/ssis-xml-source/" target="_blank" rel="noopener">ZappySys XML Source</a> in SSIS. The steps below are identical for <a href="https://zappysys.com/products/ssis-powerpack/ssis-json-file-source/" target="_blank" rel="noopener">ZappySys JSON Source</a>, except Step #7, which you skip.</p>
<ol>
<li>Open SSIS Designer and drag a Data Flow Task from the SSIS Toolbox.</li>
<li>Double-click the Data Flow Task to switch to the Data Flow designer.</li>
<li>Now drag ZS XML Source from the SSIS Toolbox onto the design surface.</li>
<li>Double-click ZS XML Source and specify the path of the <pre class="crayon-plain-tag">small dataset file</pre> you want to parse (e.g. c:\data\customer_small.xml).</li>
<li>Click the Select Filter button to find the node that will be treated as an array. Once you close the Filter Browse dialog, append <pre class="crayon-plain-tag">--FAST</pre> to the filter. Your expression may look like below.<br />
<pre class="crayon-plain-tag">$.Root.Row--FAST</pre></li>
<li>Now <strong>uncheck</strong> the <pre class="crayon-plain-tag">Include Parent Columns</pre> option.</li>
<li><strong>Check </strong><pre class="crayon-plain-tag">Enable Performance Option</pre><strong> (for JSON Source, skip this step)</strong><div class="su-note"  style="border-color:#e5de9d;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><div class="su-note-inner su-u-clearfix su-u-trim" style="background-color:#FFF8B7;border-color:#ffffff;color:#333333;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;">NOTE: Enable the performance mode setting only after you have selected the filter (using the smaller dataset file). Once the filter is set, check the performance option and make sure the following two settings are correct (option 2 requires a newer version: SSIS PowerPack v4.1+ / ODBC PowerPack v1.4+).<br />
(1) On the <strong>Array Handling Tab</strong>, set the array node name (there must be <strong>only one entry</strong>). For example, if your filter is $.DATA.ORDER[*], enter <pre class="crayon-plain-tag">ORDER</pre>
(2) On the <strong>Advanced Filter Options Tab</strong>, enter all unwanted tag names you want to skip. For example, if you have other arrays like PRODUCT or CUSTOMER, enter <pre class="crayon-plain-tag">PRODUCT,CUSTOMER</pre>
</div></div></li>
<li>Your settings may look like below.<br />
<div id="attachment_2472" style="width: 1006px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-read-very-large-json-xml-file-parse-stream-mode.png"><img fetchpriority="high" decoding="async" aria-describedby="caption-attachment-2472" class="size-full wp-image-2472" src="https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-read-very-large-json-xml-file-parse-stream-mode.png" alt="Configure XML source or JSON Source for Very Large Data File (Streaming mode for High Performance)" width="996" height="708" srcset="https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-read-very-large-json-xml-file-parse-stream-mode.png 996w, https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-read-very-large-json-xml-file-parse-stream-mode-300x213.png 300w, https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-read-very-large-json-xml-file-parse-stream-mode-768x546.png 768w" sizes="(max-width: 996px) 100vw, 996px" /></a><p id="caption-attachment-2472" class="wp-caption-text">Configure XML source or JSON Source for Very Large Data File (Streaming mode for High Performance)</p></div></li>
<li>Click Preview to verify the data (adjust the filter if needed to extract the correct hierarchy).
<div id="attachment_2473" style="width: 667px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-preview-sample-xml-json-data.png"><img decoding="async" aria-describedby="caption-attachment-2473" class="size-full wp-image-2473" src="https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-preview-sample-xml-json-data.png" alt="Preview XML or JSON File data using SSIS XML Source or JSON Source" width="657" height="385" srcset="https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-preview-sample-xml-json-data.png 657w, https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-preview-sample-xml-json-data-300x176.png 300w" sizes="(max-width: 657px) 100vw, 657px" /></a><p id="caption-attachment-2473" class="wp-caption-text">Preview XML or JSON File data using SSIS XML Source or JSON Source</p></div></li>
<li>Click the Columns tab.</li>
<li>Change Scan Row count to 3000 or more and click <pre class="crayon-plain-tag">Refresh Column</pre> .</li>
<li>Select Guess 4x, check Lock, check Reset and click OK, as shown below. If you get a data type error at runtime, you can adjust these settings later. Make sure the Lock column is checked so that columns you changed manually are not reset. For more information on metadata changes, <a href="https://zappysys.com/blog/handling-ssis-component-metadata-issues/" target="_blank" rel="noopener">check this article</a>.
<div id="attachment_2470" style="width: 987px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-metadata-scan-options-xml-json-parsing.png"><img decoding="async" aria-describedby="caption-attachment-2470" class="size-full wp-image-2470" src="https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-metadata-scan-options-xml-json-parsing.png" alt="SSIS Metadata Options - JSON / XML File Parsing" width="977" height="661" srcset="https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-metadata-scan-options-xml-json-parsing.png 977w, https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-metadata-scan-options-xml-json-parsing-300x203.png 300w, https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-metadata-scan-options-xml-json-parsing-768x520.png 768w" sizes="(max-width: 977px) 100vw, 977px" /></a><p id="caption-attachment-2470" class="wp-caption-text">SSIS Metadata Options &#8211; JSON / XML File Parsing</p></div></li>
<li>Click OK to save settings.</li>
<li>Now right-click the XML component and click Properties. Change the <pre class="crayon-plain-tag">DirectPath</pre> property to the path of the original large file (e.g. <pre class="crayon-plain-tag">c:\data\customers_large.xml</pre> ). Save the package.</li>
<li>Now you can run your SSIS package from the designer or the command line. As the screenshot below shows, there is virtually no memory pressure when stream mode is enabled, thanks to the ZappySys XML / JSON parsing engine. In streaming mode the file is not loaded into memory for parsing. Instead, it is read record by record, which makes it possible to process very large JSON or XML files. In our example we used <strong>Windows 7 Desktop, 16GB RAM, 4 Core i7 64 bit CPU</strong>. It took around <pre class="crayon-plain-tag">3 Minutes to Parse 3 Million Records (1.2 GB big XML file)</pre> . Parsing a JSON file can be even faster because the format is more compact.
<div id="attachment_2471" style="width: 1048px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-read-very-large-xml-json-file-example.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2471" class="size-full wp-image-2471" src="https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-read-very-large-xml-json-file-example.png" alt="Reading Very Large XML or JSON File using SSIS (Stream Mode for High Performance)" width="1038" height="696" srcset="https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-read-very-large-xml-json-file-example.png 1038w, https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-read-very-large-xml-json-file-example-300x201.png 300w, https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-read-very-large-xml-json-file-example-768x515.png 768w, https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-read-very-large-xml-json-file-example-1024x687.png 1024w, https://zappysys.com/blog/wp-content/uploads/2018/01/ssis-read-very-large-xml-json-file-example-272x182.png 272w" sizes="(max-width: 1038px) 100vw, 1038px" /></a><p id="caption-attachment-2471" class="wp-caption-text">Reading Very Large XML or JSON File using SSIS (Stream Mode for High Performance)</p></div></li>
</ol>
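<p>To illustrate the general idea behind streaming mode, here is a minimal Python sketch that reads records one at a time and frees each record once it has been processed. It is not the ZappySys engine. It uses the standard library parser, and the function names <code>build_sample_xml</code> and <code>stream_rows</code>, the <code>Root</code> / <code>Row</code> layout and the <code>Id</code> / <code>Name</code> columns are assumptions made for this example only.</p>

```python
import io
import xml.etree.ElementTree as ET


def build_sample_xml(row_count):
    """Build a small in-memory XML document (Root with Row children) for the demo."""
    root = ET.Element("Root")
    for i in range(row_count):
        row = ET.SubElement(root, "Row")
        ET.SubElement(row, "Id").text = str(i + 1)
        ET.SubElement(row, "Name").text = "Customer " + str(i + 1)
    return ET.tostring(root)  # serialized document as bytes


def stream_rows(source, row_tag):
    """Yield one dict per row_tag element without keeping the whole tree in memory."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == row_tag:
            yield {child.tag: child.text for child in elem}
            root.clear()  # drop rows already handed out so memory stays flat


# Hypothetical usage: an in-memory sample stands in for the large file on disk.
rows = list(stream_rows(io.BytesIO(build_sample_xml(3)), "Row"))
print(rows[0])
```

<p>For a real file you would pass an open file object (or a path) instead of the in-memory sample and consume the generator row by row rather than collecting it into a list.</p>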
<h2>Step-By-Step : Reading very large JSON file (SSIS JSON Source)</h2>
<p>Reading a very large JSON file using <a href="https://zappysys.com/products/ssis-powerpack/ssis-json-file-source/" target="_blank" rel="noopener">ZappySys JSON Source</a> follows exactly the same steps as the section above, with two changes: use ZS JSON Source instead of ZS XML Source, and skip Step #7 (Check Enable Performance Mode), because this option is not available in JSON Source.</p>
<h2>Parsing very large XML File with Multiple Arrays</h2>
<p>Now let&#8217;s discuss a scenario which can result in an OutOfMemory exception unless you tweak some extra options. Assume you have a file structure like below.<br />
NOTE: This only works in version <strong>4.1.0</strong> or later of SSIS PowerPack, or <strong>1.4.0</strong> or later of ODBC PowerPack.</p><pre class="crayon-plain-tag">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;Data&gt;
	&lt;Product&gt;....&lt;/Product&gt;
	&lt;Product&gt;....&lt;/Product&gt;
	.... many more....
	&lt;Product&gt;....&lt;/Product&gt;
	
	&lt;Customer&gt;....&lt;/Customer&gt;
	&lt;Customer&gt;....&lt;/Customer&gt;
	.... many more....
	&lt;Customer&gt;....&lt;/Customer&gt;
	
	&lt;Row&gt;....&lt;/Row&gt;
	&lt;Row&gt;....&lt;/Row&gt;
	.... many more....
	&lt;Row&gt;....&lt;/Row&gt;
&lt;/Data&gt;</pre><p>
Notice that the XML above has three different repeating nodes (Product, Customer and Row). If you try to extract the Row nodes, it might fail with an OutOfMemory exception because the parser needs to scan a large part of the XML before it reaches the first Row node. To solve this issue you can adjust the following two settings.</p>
<ol>
<li>On the Array Handling tab, enter the name of the array node you want to extract.<br />
<pre class="crayon-plain-tag">Row</pre>
</li>
<li>On the Advanced Filter Options tab, enter the following two nodes, which we don&#8217;t want to extract.<br />
<pre class="crayon-plain-tag">Product,Customer</pre>
</li>
</ol>
<p>That&#8217;s it. This avoids excessive memory pressure while the parser looks for the very first Row node before it can start streaming.</p>
<div id="attachment_9286" style="width: 453px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2018/01/xml-parse-array-extract.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-9286" class="size-full wp-image-9286" src="https://zappysys.com/blog/wp-content/uploads/2018/01/xml-parse-array-extract.png" alt="Parse XML Array - Performance Mode Setting" width="443" height="150" srcset="https://zappysys.com/blog/wp-content/uploads/2018/01/xml-parse-array-extract.png 443w, https://zappysys.com/blog/wp-content/uploads/2018/01/xml-parse-array-extract-300x102.png 300w" sizes="(max-width: 443px) 100vw, 443px" /></a><p id="caption-attachment-9286" class="wp-caption-text">Parse XML Array &#8211; Performance Mode Setting</p></div>
<p>&nbsp;</p>
<div id="attachment_9287" style="width: 680px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2018/01/xml-parse-exclude-nodes-by-name.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-9287" class="size-full wp-image-9287" src="https://zappysys.com/blog/wp-content/uploads/2018/01/xml-parse-exclude-nodes-by-name.png" alt="Parse Large XML - Exclude nodes by name" width="670" height="149" srcset="https://zappysys.com/blog/wp-content/uploads/2018/01/xml-parse-exclude-nodes-by-name.png 670w, https://zappysys.com/blog/wp-content/uploads/2018/01/xml-parse-exclude-nodes-by-name-300x67.png 300w" sizes="(max-width: 670px) 100vw, 670px" /></a><p id="caption-attachment-9287" class="wp-caption-text">Parse Large XML &#8211; Exclude nodes by name</p></div>
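<p>The effect of these two settings can be sketched in Python as follows. This is only an illustration of the technique (keep one array, throw away the named unwanted nodes as soon as they close), not the ZappySys implementation. The helper names <code>build_multi_array_xml</code> and <code>stream_array</code> and the <code>Id</code> child element are hypothetical.</p>

```python
import io
import xml.etree.ElementTree as ET


def build_multi_array_xml(per_array):
    """Build a Data document with Product, Customer and Row children, in that order."""
    data = ET.Element("Data")
    for tag in ("Product", "Customer", "Row"):
        for i in range(per_array):
            item = ET.SubElement(data, tag)
            ET.SubElement(item, "Id").text = tag + "-" + str(i + 1)
    return ET.tostring(data)  # serialized document as bytes


def stream_array(source, array_tag, skip_tags):
    """Yield array_tag records; discard skip_tags elements as soon as they close."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event != "end":
            continue
        if elem.tag in skip_tags:
            root.clear()  # unwanted node: free it immediately, emit nothing
        elif elem.tag == array_tag:
            yield {child.tag: child.text for child in elem}
            root.clear()  # wanted node: free it after it has been handed out


# Hypothetical usage mirroring the settings above: keep Row, skip Product and Customer.
xml_bytes = build_multi_array_xml(2)
records = list(stream_array(io.BytesIO(xml_bytes), "Row", {"Product", "Customer"}))
print(records)
```

<p>Because the Product and Customer elements are released as they are read, memory use does not grow while the parser works its way toward the first Row element.</p>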
<h2>Conclusion</h2>
<p>As you saw in this article, ZappySys SSIS PowerPack is designed to handle very large datasets in JSON or XML. We also support very large CSV and Excel files, which are not covered in this article. <a href="https://zappysys.com/products/ssis-powerpack/" target="_blank" rel="noopener">Download SSIS PowerPack</a> to explore 70+ more components yourself and make your ETL simple and fast.</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
