<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>SSIS Regex Parser Task Archives | ZappySys Blog</title>
	<atom:link href="https://zappysys.com/blog/category/ssis/tasks/ssis-regex-parser-task/feed/" rel="self" type="application/rss+xml" />
	<link>https://zappysys.com/blog/category/ssis/tasks/ssis-regex-parser-task/</link>
	<description>SSIS / ODBC Drivers / API Connectors for JSON, XML, Azure, Amazon AWS, Salesforce, MongoDB and more</description>
	<lastBuildDate>Wed, 13 Sep 2023 17:09:53 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.4.4</generator>

<image>
	<url>https://zappysys.com/blog/wp-content/uploads/2023/01/cropped-zappysys-symbol-large-32x32.png</url>
	<title>SSIS Regex Parser Task Archives | ZappySys Blog</title>
	<link>https://zappysys.com/blog/category/ssis/tasks/ssis-regex-parser-task/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>How to get all URLs from emails from Outlook</title>
		<link>https://zappysys.com/blog/how-to-get-all-urls-from-emails-from-outlook/</link>
		
		<dc:creator><![CDATA[ZappySys Team]]></dc:creator>
		<pubDate>Mon, 27 Mar 2023 11:44:37 +0000</pubDate>
				<category><![CDATA[SSIS JSON Source (File/REST)]]></category>
		<category><![CDATA[SSIS Regex Parser Task]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[url]]></category>
		<guid isPermaLink="false">https://zappysys.com/blog/?p=9974</guid>

					<description><![CDATA[<p>Introduction This time we will explain how to get all URLs from emails using MS Outlook. Sometimes we need to get all URLs from emails. In this post, we will show how to do this step by step using SSIS. Microsoft Graph API is a unified way to access many Microsoft service APIs, including the Office 365 API. Prerequisites Before [&#8230;]</p>
<p>The post <a href="https://zappysys.com/blog/how-to-get-all-urls-from-emails-from-outlook/">How to get all URLs from emails from Outlook</a> appeared first on <a href="https://zappysys.com/blog">ZappySys Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2><strong>Introduction</strong></h2>
<p style="text-align: left;">This time we will explain how to get all URLs from emails using MS Outlook. Sometimes we need to get <a href="https://zappysys.com/blog/wp-content/uploads/2023/03/microsoft-office-365-api-integration-150x150.png"><img decoding="async" class="wp-image-10001 alignleft" src="https://zappysys.com/blog/wp-content/uploads/2023/03/microsoft-office-365-api-integration-150x150.png" alt="" width="73" height="73" /></a>all URLs from emails. In this post, we will show how to do this step by step using SSIS.</p>
<p style="text-align: left;"><a href="https://developer.microsoft.com/en-us/graph/docs/concepts/overview" target="_blank" rel="noopener">Microsoft Graph API</a> is a unified way to access many Microsoft service APIs, including the <strong>Office 365 API</strong>.</p>
<h2><strong>Prerequisites</strong></h2>
<p>Before we perform the steps listed in this article, make sure the following prerequisites are met:</p>
<ol>
<li>SSIS designer installed. Sometimes it is referred to as BIDS or SSDT (<a href="https://learn.microsoft.com/en-us/sql/ssdt/download-sql-server-data-tools-ssdt?view=sql-server-ver16">download it from the Microsoft site</a>).</li>
<li>Basic knowledge of SSIS package development using Microsoft SQL Server Integration Services.</li>
<li>Make sure <a href="https://zappysys.com/products/ssis-powerpack/">ZappySys SSIS PowerPack</a> is installed (<a href="https://zappysys.com/products/ssis-powerpack/download/">download it</a>).</li>
<li>Optional (if you want to deploy and schedule) &#8211; <a href="https://zappysys.zendesk.com/hc/en-us/articles/360035974593">Deploy and Schedule SSIS Packages</a></li>
</ol>
<h2><strong>Step-by-step process to get all URLs from emails using SSIS</strong></h2>
<h3><span id="Register_Application_OAuth2_App_for_Graph_API">Register Application (OAuth2 App for Graph API)</span></h3>
<p>First, create the OAuth2 connection as described in <a href="https://zappysys.com/blog/get-office-365-mail-attachments-using-ssis/#Register_Application_OAuth2_App_for_Graph_API">this article</a>.</p>
<h3>Get the information for the body content from the emails</h3>
<p>1. Drag and drop the SSIS <b>Data Flow Task</b> from the SSIS Toolbox.</p>
<div style="width: 470px" class="wp-caption alignnone"><img fetchpriority="high" decoding="async" class="size-full" src="https://zappysys.com/onlinehelp/ssis-powerpack/scr/images/drag-and-drop-data-flow-task.png" width="460" height="155" /><p class="wp-caption-text">Drag and drop Data flow task</p></div>
<p>2. Double-click the Data Flow Task to open the Data Flow designer surface.</p>
<p>3. From the SSIS Toolbox, drag and drop the JSON Source onto the Data Flow designer surface.</p>
<div style="width: 551px" class="wp-caption alignnone"><img loading="lazy" decoding="async" class="size-full" src="https://zappysys.com/onlinehelp/ssis-powerpack/scr/images/json-source/ssis-json-source-adapter-drag.png" width="541" height="144" /><p class="wp-caption-text">Drag and drop a JSON Source</p></div>
<p>4. Select the <strong>OAuth connection</strong> you created, then use this URL to get the body content</p><pre class="crayon-plain-tag">https://graph.microsoft.com/v1.0/me/messages?$select=subject,body,bodyPreview,uniqueBody</pre><p>
&nbsp;</p>
<div id="attachment_9985" style="width: 838px" class="wp-caption aligncenter"><a href="https://zappysys.com/blog/wp-content/uploads/2023/03/JSON-URLS.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-9985" class="wp-image-9985 size-full" src="https://zappysys.com/blog/wp-content/uploads/2023/03/JSON-URLS.png" alt="Filter the body content" width="828" height="735" srcset="https://zappysys.com/blog/wp-content/uploads/2023/03/JSON-URLS.png 828w, https://zappysys.com/blog/wp-content/uploads/2023/03/JSON-URLS-300x266.png 300w, https://zappysys.com/blog/wp-content/uploads/2023/03/JSON-URLS-768x682.png 768w" sizes="(max-width: 828px) 100vw, 828px" /></a><p id="caption-attachment-9985" class="wp-caption-text">Getting the information from the emails</p></div>
<p>5. Drag and drop the <strong>Trash Destination</strong> and save the result to a file using the following configuration. Make sure to check the <strong>Overwrite target file if exists</strong> option.</p>
<div id="attachment_9986" style="width: 648px" class="wp-caption aligncenter"><a href="https://zappysys.com/blog/wp-content/uploads/2023/03/trash-url.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-9986" class="wp-image-9986 size-full" src="https://zappysys.com/blog/wp-content/uploads/2023/03/trash-url.png" alt="Add destination and check the overwrite option" width="638" height="525" srcset="https://zappysys.com/blog/wp-content/uploads/2023/03/trash-url.png 638w, https://zappysys.com/blog/wp-content/uploads/2023/03/trash-url-300x247.png 300w" sizes="(max-width: 638px) 100vw, 638px" /></a><p id="caption-attachment-9986" class="wp-caption-text">Save the body from the emails in a file</p></div>
<p>Now that we have saved the email body to a file, we can extract the links. Go to the Control Flow, drag and drop the Regular Expression Parser Task, and follow the steps in the next section.</p>
<h3>Getting all URLs from emails inside a variable</h3>
<p>6. The next step is to save the URLs in a variable. We will use the Regular Expression Parser Task for this.<br />
You also need a regular expression; one example is shown below. On <a href="https://regex101.com/">Regex101</a> you can check more details about the expression we are using:</p>
<div id="attachment_9987" style="width: 694px" class="wp-caption aligncenter"><a href="https://zappysys.com/blog/wp-content/uploads/2023/03/regex-urls.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-9987" class="wp-image-9987 size-full" src="https://zappysys.com/blog/wp-content/uploads/2023/03/regex-urls.png" alt="URLs from emails - Create the expression" width="684" height="695" srcset="https://zappysys.com/blog/wp-content/uploads/2023/03/regex-urls.png 684w, https://zappysys.com/blog/wp-content/uploads/2023/03/regex-urls-295x300.png 295w" sizes="(max-width: 684px) 100vw, 684px" /></a><p id="caption-attachment-9987" class="wp-caption-text">Getting all URLs from the emails</p></div>
<p>Expression: <pre class="crayon-plain-tag">href="(.*?)"{{*}}</pre>
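<p>If you want to try the pattern outside SSIS first, here is a minimal Python sketch. The sample text and URLs are hypothetical, and the ZappySys-specific {{*}} suffix is left out because Python does not understand it. Note one assumption: <code>re.findall</code> returns only the captured group (the URL itself), while this post does not show whether the task returns the whole match or the group.</p>

```python
import re

# Hypothetical stand-in for the saved email body
# (angle brackets omitted to keep the sample plain text).
email_body = """
a href="https://example.com/first"
img src="https://example.com/logo.png"
a href="https://example.com/second"
"""

# Same pattern as above without the {{*}} suffix;
# re.findall plays the "all matches" role here.
urls = re.findall(r'href="(.*?)"', email_body)

# One URL per line, ready to be written to a file.
print("\n".join(urls))
```

<p>The lazy <code>(.*?)</code> stops at the first closing double quote, so each link is captured separately even when several links share one line.</p>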
<h3>Save the result in a file of the URLs from emails</h3>
<p>7. Finally, use the <strong>Logging Task</strong> to save the URLs to a file with the following configuration:</p>
<div id="attachment_9988" style="width: 548px" class="wp-caption aligncenter"><a href="https://zappysys.com/blog/wp-content/uploads/2023/03/logging-urls.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-9988" class="wp-image-9988 size-full" src="https://zappysys.com/blog/wp-content/uploads/2023/03/logging-urls.png" alt="URLs from emails - Configure the logging Task to save on a file" width="538" height="475" srcset="https://zappysys.com/blog/wp-content/uploads/2023/03/logging-urls.png 538w, https://zappysys.com/blog/wp-content/uploads/2023/03/logging-urls-300x265.png 300w" sizes="(max-width: 538px) 100vw, 538px" /></a><p id="caption-attachment-9988" class="wp-caption-text">Uncheck all options and select message type as none</p></div>
<h2>Conclusion</h2>
<p>If everything is OK, you will be able to extract the URLs from your emails. To do that, we read the body of each email, extracted the URLs from it using a regular expression, and finally stored them in a local file.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Extract audit data from SSIS Execution Log (EventHandler and Regular Expression)</title>
		<link>https://zappysys.com/blog/extract-audit-data-ssis-execution-log-eventhandler-regular-expression/</link>
		
		<dc:creator><![CDATA[ZappySys]]></dc:creator>
		<pubDate>Tue, 26 May 2020 20:11:33 +0000</pubDate>
				<category><![CDATA[SSIS CSV Export Task]]></category>
		<category><![CDATA[SSIS Logging Task]]></category>
		<category><![CDATA[SSIS Regex Parser Task]]></category>
		<category><![CDATA[audit]]></category>
		<category><![CDATA[EventHandler]]></category>
		<category><![CDATA[regex]]></category>
		<category><![CDATA[Regular Expression]]></category>
		<category><![CDATA[ssis]]></category>
		<guid isPermaLink="false">https://zappysys.com/blog/?p=8928</guid>

					<description><![CDATA[<p>Introduction In our last post (Regex Cheat Sheet) we explained use cases of the SSIS Regular Expression Parser Task. Now let&#8217;s look at a real-world use case. In this article we are going to extract data from an audit log using SSIS. For demo purposes we will use the log generated by the SSIS Export CSV File Task. When [&#8230;]</p>
<p>The post <a href="https://zappysys.com/blog/extract-audit-data-ssis-execution-log-eventhandler-regular-expression/">Extract audit data from SSIS Execution Log (EventHandler and Regular Expression)</a> appeared first on <a href="https://zappysys.com/blog">ZappySys Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2>Introduction</h2>
<p>In our <a href="https://zappysys.com/blog/using-regular-expressions-in-ssis/" target="_blank" rel="noopener">last post</a> (Regex Cheat Sheet) we explained use cases of the <a href="https://zappysys.com/products/ssis-powerpack/ssis-regex-parser-task/" target="_blank" rel="noopener">SSIS Regular Expression Parser Task</a>. Now let&#8217;s look at a real-world use case: extracting data from an audit log using SSIS. For demo purposes we will use the log generated by the <a href="https://zappysys.com/products/ssis-powerpack/ssis-export-csv-file-task/" target="_blank" rel="noopener">SSIS Export CSV File Task</a>. When you export many tables dynamically using this task, you may want to know how many rows were exported for each table. This information is written to the SSIS output log, but there is no easy way to capture it in an SSIS variable. So we will use a few tricks to capture that data with a regular expression.</p>
<p>So let&#8217;s get started.</p>
<div class="content_block" id="custom_post_widget-2523"><h2><span id="Prerequisites">Prerequisites</span></h2>
Before we perform the steps listed in this article, you will need to make sure the following prerequisites are met:
<ol style="margin-left: 1.5em;">
 	<li><abbr title="SQL Server Integration Services">SSIS</abbr> designer installed. Sometimes it is referred to as <abbr title="Business Intelligence Development Studio">BIDS</abbr> or <abbr title="SQL Server Data Tools">SSDT</abbr> (<a href="https://docs.microsoft.com/en-us/sql/ssdt/download-sql-server-data-tools-ssdt" target="_blank" rel="noopener">download it from the Microsoft site</a>).</li>
 	<li>Basic knowledge of SSIS package development using <em>Microsoft SQL Server Integration Services</em>.</li>
 	<li>Make sure <span style="text-decoration: underline;"><a href="https://zappysys.com/products/ssis-powerpack/" target="_blank" rel="noopener">ZappySys SSIS PowerPack</a></span> is installed (<a href="https://zappysys.com/products/ssis-powerpack/download/" target="_blank" rel="noopener">download it</a>, if you haven't already).</li>
 	<li>(<em>Optional step</em>)<em>.</em> <a href="https://zappysys.zendesk.com/hc/en-us/articles/360035974593" target="_blank" rel="noopener">Read this article</a>, if you are planning to deploy packages to a server and schedule their execution later.</li>
</ol></div>
<h2>Setup Export CSV Task (Output Multiple Tables to CSV Files Dynamically)</h2>
<p>For this example we will use the Export CSV File Task, but you can apply the same technique to other tasks.</p>
<ol>
<li>Drag and drop the ZS Export CSV Task from the SSIS Toolbox and configure it as below.<br />
<img decoding="async" class="figureimage" title="SSIS Export to CSV File Task - Drag and Drop" src="https://zappysys.com/onlinehelp/ssis-powerpack/scr/images/export-csv-file-task/ssis-export-csv-file-task-drag.png" alt="SSIS Export to CSV File Task - Drag and Drop" /></li>
<li>  Configure Export CSV Task to output multiple tables <a href="https://zappysys.com/onlinehelp/ssis-powerpack/scr/export-csv-file-task.htm" target="_blank" rel="noopener">as explained here</a></li>
<li>Now create 2 SSIS User Variables
<ol>
<li><strong>FilePath</strong> (String Type)</li>
<li><strong>RowCount</strong> (String Type)</li>
</ol>
</li>
</ol>
<p>That&#8217;s it. In the next section we will set up an event handler to capture output from the log.</p>
<h2>Setup EventHandler / Extract Audit Data using Regex</h2>
<ol>
<li>Select the task and click the <strong>Event Handler tab</strong>.</li>
<li>Select the <strong>Executable Name</strong> from the dropdown, then select the event name <strong>OnInformation</strong> from the handler dropdown.</li>
<li>Click the hyperlink to create the event handler, as shown below
<div id="attachment_8929" style="width: 665px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-create-eventhandler.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-8929" class="size-full wp-image-8929" src="https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-create-eventhandler.png" alt="Create Task Level Event Handler in SSIS" width="655" height="260" srcset="https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-create-eventhandler.png 655w, https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-create-eventhandler-300x119.png 300w" sizes="(max-width: 655px) 100vw, 655px" /></a><p id="caption-attachment-8929" class="wp-caption-text">Create Task Level Event Handler in SSIS</p></div></li>
<li>Now drag the 3 tasks below and connect them like this
<ol>
<li>Script Task (used as a dummy start; no configuration needed)</li>
<li>ZS Regular Expression Parser Task</li>
<li>ZS Logging Task (optional; we will use it to output the extracted values)<br />
<a href="https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-eventhandler-extract-audit-information.png"><img loading="lazy" decoding="async" class="alignnone size-full wp-image-8930" src="https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-eventhandler-extract-audit-information.png" alt="" width="828" height="490" srcset="https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-eventhandler-extract-audit-information.png 828w, https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-eventhandler-extract-audit-information-300x178.png 300w, https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-eventhandler-extract-audit-information-768x454.png 768w" sizes="(max-width: 828px) 100vw, 828px" /></a></li>
</ol>
</li>
<li>Connect the dummy Script Task to the ZS Regular Expression Parser Task, then right-click the green connecting arrow and use the expression as below. Click OK to save.</li>
<li>Configure ZS Regular Expression Parser Task as below
<ol>
<li>Enter <pre class="crayon-plain-tag">{{System::ErrorDescription}}</pre>  as Direct Value</li>
<li>Enter Below two mappings
<ol>
<li>For RowCount set<br />
<pre class="crayon-plain-tag">Total (\d+) records written to : (.*){{1,1}}</pre>
</li>
<li>For FilePath set<br />
<pre class="crayon-plain-tag">Total (\d+) records written to : (.*){{1,2}}</pre>
</li>
</ol>
</li>
</ol>
</li>
<li>Here is how it will look
<div id="attachment_8932" style="width: 912px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-regular-expression-extract-audit-log-data.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-8932" class="wp-image-8932 size-full" src="https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-regular-expression-extract-audit-log-data.png" alt="Extract data from audit log using SSIS Regular Expression Parser Task" width="902" height="838" srcset="https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-regular-expression-extract-audit-log-data.png 902w, https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-regular-expression-extract-audit-log-data-300x279.png 300w, https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-regular-expression-extract-audit-log-data-768x714.png 768w" sizes="(max-width: 902px) 100vw, 902px" /></a><p id="caption-attachment-8932" class="wp-caption-text">Extract data from audit log using SSIS Regular Expression Parser Task</p></div></li>
<li>That&#8217;s it. Now connect the Regular Expression Parser Task to the Logging Task. In that task you can set the text as below to log the extracted data. You can also use an Execute SQL Task to log into a database table.<br />
<pre class="crayon-plain-tag">{{User::FilePath}} ===&gt; {{User::RowCount}}</pre>
</li>
<li>Now run the package and you will see the audit data extracted for each file. Once a file is exported, we capture the OnInformation event; if the message contains a substring like &#8220;records written to&#8221;, we extract the row count and file path into variables, which you can then save to the desired place for auditing purposes.<a href="https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-log-extracted-audit-information.png"><img loading="lazy" decoding="async" class="alignnone size-full wp-image-8931" src="https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-log-extracted-audit-information.png" alt="" width="656" height="163" srcset="https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-log-extracted-audit-information.png 656w, https://zappysys.com/blog/wp-content/uploads/2020/05/ssis-log-extracted-audit-information-300x75.png 300w" sizes="(max-width: 656px) 100vw, 656px" /></a></li>
</ol>
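<p>To check the pattern against a log line before wiring up the event handler, a small Python sketch like the one below can help. The log message is a hypothetical example shaped like the text the mappings above expect, and the ZappySys suffix is omitted; capture group 1 stands in for the RowCount mapping and capture group 2 for the FilePath mapping.</p>

```python
import re

# Hypothetical log message (not copied from a real run).
message = r"Total 1250 records written to : C:\Export\Customers.csv"

# Same pattern as in the two mappings, without the ZappySys suffix.
pattern = re.compile(r"Total (\d+) records written to : (.*)")
match = pattern.search(message)

# Messages that do not match leave both values empty.
row_count = match.group(1) if match else ""
file_path = match.group(2) if match else ""

print(file_path, row_count)
```

<p>Other OnInformation messages will not match the pattern, so it is worth deciding how empty values should be handled before logging them.</p>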
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Using Regular Expressions in SSIS</title>
		<link>https://zappysys.com/blog/using-regular-expressions-in-ssis/</link>
		
		<dc:creator><![CDATA[ZappySys]]></dc:creator>
		<pubDate>Wed, 07 Mar 2018 17:38:24 +0000</pubDate>
				<category><![CDATA[SSIS Advanced File System Task]]></category>
		<category><![CDATA[SSIS Amazon Storage Task]]></category>
		<category><![CDATA[SSIS Azure Blob Storage Task]]></category>
		<category><![CDATA[SSIS Regex Parser Task]]></category>
		<category><![CDATA[SSIS SFTP Task]]></category>
		<category><![CDATA[SSIS Tasks]]></category>
		<category><![CDATA[SSIS Tips & How-Tos]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[regex]]></category>
		<category><![CDATA[Regular Expression]]></category>
		<guid isPermaLink="false">https://zappysys.com/blog/?p=2858</guid>

					<description><![CDATA[<p>Introduction In this short article, you will learn how to write Regular expressions in SSIS (i.e. Regex) and what tool to use to test them. You will also find helpful resources on how to write more sophisticated expressions and learn more about them. For demo purposes, we will use FREE SSIS Regex Parser Task to parse and [&#8230;]</p>
<p>The post <a href="https://zappysys.com/blog/using-regular-expressions-in-ssis/">Using Regular Expressions in SSIS</a> appeared first on <a href="https://zappysys.com/blog">ZappySys Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2>Introduction</h2>
<p><a href="https://zappysys.com/blog/wp-content/uploads/2018/03/ssis-regex-parser-task.png"><img loading="lazy" decoding="async" class=" wp-image-2974 alignleft" src="https://zappysys.com/blog/wp-content/uploads/2018/03/ssis-regex-parser-task.png" alt="" width="114" height="114" /></a>In this short article, you will learn how to write Regular expressions in SSIS (i.e. Regex) and what tool to use to test them. You will also find helpful resources on how to write more sophisticated expressions and learn more about them. For demo purposes, we will use <strong>FREE</strong> <a href="https://zappysys.com/products/ssis-powerpack/ssis-regex-parser-task/" target="_blank" rel="noopener">SSIS Regex Parser Task</a> to parse and extract the text using Regex.</p>
<p>&nbsp;</p>
<p>You can use Regular expressions in several SSIS PowerPack connectors:</p>
<ul>
<li><a href="https://zappysys.com/products/ssis-powerpack/ssis-regex-parser-task/" target="_blank" rel="noopener">SSIS Regex Parser Task (FREE)</a>,</li>
<li><a href="https://zappysys.com/products/ssis-powerpack/ssis-azure-blob-storage-task/" target="_blank" rel="noopener">Azure Blob Storage Task</a>,</li>
<li><a href="https://zappysys.com/products/ssis-powerpack/ssis-file-system-task-advanced/" target="_blank" rel="noopener">Advanced File System Task</a>,</li>
<li><a href="https://zappysys.com/products/ssis-powerpack/ssis-amazon-s3-task/" target="_blank" rel="noopener">Amazon S3 Storage Task</a> and others.</li>
</ul>
<h2>Writing Regular Expressions in SSIS</h2>
<p>Depending on the SSIS component you use, the regular expression is applied either to file names (in filtering options) or to the text being matched. The syntax below shows how to write Regex in ZappySys tools. We support an additional construct, {{X,Y}}, at the end of the Regex to control two parameters: which occurrence to return and which group to extract from it. This construct is useful when you extract data from the matched text. If you only want to test whether a pattern matches and do not need data extraction, you can omit the trailing {{<pre class="crayon-plain-tag">Occurance_Index</pre>, <pre class="crayon-plain-tag">Group_Index_Or_Name</pre>}}.</p>
<p><strong>Syntax:</strong><br />
<pre class="crayon-plain-tag">&lt;your Regular Expression&gt;[{{Occurance_Index|*[,Group_Index_Or_Name]}}]</pre>
<p><strong>Where:</strong><br />
<pre class="crayon-plain-tag">Occurance_Index</pre>=Index of the occurrence you want to extract (X=0 means the first match); * means all matches. Use a minus sign to count occurrences from the end (e.g. {{-0}} returns the last match).<br />
<pre class="crayon-plain-tag">Group_Index_Or_Name</pre>=Group index or name within your search pattern. Groups are indicated by parentheses in the regular expression; Y=0 returns the whole match and Y=1 the first group, as the examples below show. If you named a group in the pattern, you can use the group name instead. Group names require the new version (the old version doesn&#8217;t support them).</p>
<p><strong>How to name a group?</strong></p>
<p>E.g. (\w+)@(<strong>?&lt;domain&gt;</strong>\w+\.com)</p>
<p><strong>How to use the group name in the match extract?</strong></p>
<p>E.g. (\w+)@(<strong>?&lt;domain&gt;</strong>\w+\.com){{0,<strong>domain</strong>}}</p>
<p><strong>Example Input:</strong></p>
<p>Let&#8217;s assume we have the following input text. We will test various Expressions.</p><pre class="crayon-plain-tag">Customer =&gt; AAA
Email =&gt; aaa@google.com
Phone =&gt; 101-222-3333
========
Customer =&gt; BBB
Email =&gt; bbb@yahoo.com
Phone =&gt; 102-222-3333
========
Customer =&gt; CCC
Email =&gt; ccc@hotmail.com
Phone =&gt; 103-222-3333
========
Customer =&gt; DDD
Email =&gt; ddd@outlook.com
Phone =&gt; 104-222-3333</pre><p>
<strong>Sample Regex Expressions</strong></p>
<div class="su-table su-table-alternate">
<table style="border-collapse: collapse;width: 100%;height: 332px" border="1">
<tbody>
<tr style="height: 22px">
<td style="width: 50%;height: 22px"><strong>Expression</strong></td>
<td style="width: 50%;height: 22px"><strong>Description</strong></td>
</tr>
<tr style="height: 46px">
<td style="width: 50%;height: 46px">(?s).*</td>
<td style="width: 50%;height: 46px">Match anything including new lines. To match anything without new line just use <pre class="crayon-plain-tag">(.*)</pre></td>
</tr>
<tr style="height: 22px">
<td style="width: 50%;height: 22px">\w+([-+.&#39;]\w+)*@(?&lt;domain&gt;\w+([-.]\w+)*\.\w+([-.]\w+)*)</td>
<td style="width: 50%;height: 22px">Get first email id from text ({{0}} is omitted from the end because {{0}} is the default)</td>
</tr>
<tr style="height: 22px">
<td style="width: 50%;height: 22px">\w+([-+.&#39;]\w+)*@(?&lt;domain&gt;\w+([-.]\w+)*\.\w+([-.]\w+)*){{-0}}</td>
<td style="width: 50%;height: 22px">Get last email id from text</td>
</tr>
<tr style="height: 22px">
<td style="width: 50%;height: 22px">\w+([-+.&#39;]\w+)*@(?&lt;domain&gt;\w+([-.]\w+)*\.\w+([-.]\w+)*){{*}}</td>
<td style="width: 50%;height: 22px">Get all email addresses (separated by new lines). When you suffix a regular expression with {{*}} it returns all matches.</td>
</tr>
<tr style="height: 22px">
<td style="width: 50%;height: 22px">\w+([-+.&#39;]\w+)*@(?&lt;domain&gt;\w+([-.]\w+)*\.\w+([-.]\w+)*){{2}}</td>
<td style="width: 50%;height: 22px">Get third email id from text (i.e. ends with {{X}} where X is the occurrence index starting from 0)</td>
</tr>
<tr style="height: 22px">
<td style="width: 50%;height: 22px">\w+([-+.&#39;]\w+)*@(?&lt;domain&gt;\w+([-.]\w+)*\.\w+([-.]\w+)*){{0,2}}</td>
<td style="width: 50%;height: 22px">Get first email pattern match (i.e. Index=0) and extract domain (i.e. 2nd group). Indexes start from 0 for both occurrence and group</td>
</tr>
<tr style="height: 22px">
<td style="width: 50%;height: 22px">(\d*)-(\d*)-(\d*)</td>
<td style="width: 50%;height: 22px">Get first phone number from text (if you don&#8217;t include {{X,Y}} at the end, it defaults to {{0,0}})</td>
</tr>
<tr style="height: 22px">
<td style="width: 50%;height: 22px">^((?!demo|test).)*$</td>
<td style="width: 50%;height: 22px">Match whole input text if it does not contain words like demo or test. If word found then No Match</td>
</tr>
<tr style="height: 22px">
<td style="width: 50%;height: 22px">&lt;tag&gt;((.|\n)*?)&lt;/tag&gt;{{0,1}}</td>
<td style="width: 50%;height: 22px">Extract anything between &lt;tag&gt;&#8230;&lt;/tag&gt; (Include new line char i.e. \n)</td>
</tr>
<tr style="height: 22px">
<td style="width: 50%;height: 22px">&lt;tag&gt;(.*)&lt;/tag&gt;{{0,1}}</td>
<td style="width: 50%;height: 22px">Extract anything between &lt;tag&gt;&#8230;&lt;/tag&gt; (Exclude new line char i.e. \n)</td>
</tr>
<tr style="height: 22px">
<td style="width: 50%;height: 22px">&lt;!\[CDATA\[((.|\n)*?)\]\]\&gt;{{0,1}}</td>
<td style="width: 50%;height: 22px">Extract content from CData section of XML Data (This can be CSV, JSON or nested XML )</td>
</tr>
<tr style="height: 22px">
<td style="width: 50%;height: 22px">^$</td>
<td style="width: 50%;height: 22px">Match blank string</td>
</tr>
</tbody>
</table>
</div>
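<p>To make the occurrence and group parameters more concrete, here is a rough Python sketch that imitates the suffix behavior described above. It is not the ZappySys implementation: the <code>extract</code> helper, its arguments, and the shortened sample text are hypothetical, and treating group 0 as the whole match is an assumption based on the examples in this article.</p>

```python
import re

def extract(pattern, text, occurrence="0", group=0):
    """Imitate the {{occurrence,group}} suffix (assumed semantics).

    occurrence: "0", "1", ... counts from the first match; "-0", "-1", ...
    counts from the last match; "*" returns all matches joined by new lines.
    group: 0 is the whole match, 1 the first parenthesized group, and so on.
    """
    found = [m.group(group) or "" for m in re.finditer(pattern, text)]
    if occurrence == "*":
        return "\r\n".join(found)
    if occurrence.startswith("-"):
        # A leading minus sign counts from the end of the match list.
        found.reverse()
        occurrence = occurrence[1:]
    index = int(occurrence)
    return found[index] if index in range(len(found)) else ""

# Shortened, hypothetical version of the sample input.
sample = "\n".join([
    "Email: aaa@google.com",
    "Email: bbb@yahoo.com",
    "Email: ccc@hotmail.com",
    "Email: ddd@outlook.com",
])
EMAIL = r"(\w+)@(\w+\.com)"

print(extract(EMAIL, sample))          # first match, whole text
print(extract(EMAIL, sample, "-0"))    # last match
print(extract(EMAIL, sample, "0", 2))  # second group of the first match
```

<p>Python writes named groups as <code>(?P&lt;name&gt;...)</code> rather than the .NET-style syntax used in this article, so the sketch sticks to numbered groups.</p>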
<h2>More Regular Expression Examples</h2>
<div class="su-table su-table-alternate">
<table>
<tbody>
<tr>
<td style="width: 196px"><strong>Input Text</strong></td>
<td style="width: 223px"><strong>Regex</strong></td>
<td style="width: 163px"><strong>Matched text</strong></td>
<td style="width: 752px"><strong>Comment</strong></td>
</tr>
<tr>
<td style="width: 196px">&lt;row id=&quot;123&quot; process=&quot;Y&quot;&gt;</td>
<td style="width: 223px">id=&quot;([^&quot;]*)&quot;{{0,1}}</td>
<td style="width: 163px"><strong>123</strong></td>
<td style="width: 752px">This expression shows how to extract a group value (i.e. {{0,1}} &#8211; first match and 1st group). It extracts the text between double quotes using the <strong>[^&quot;]*</strong> pattern, which matches anything until a double quote is found. The {{0,1}} syntax is ZappySys specific, so it may not work with other Regex engines.</td>
</tr>
<tr>
<td style="width: 196px">&lt;data&gt;123&lt;/data&gt;</td>
<td style="width: 223px">&lt;data&gt;([^&lt;]*)&lt;\/data&gt;{{0,1}}</td>
<td style="width: 163px"><strong>123</strong></td>
<td style="width: 752px">This expression shows how to extract a group value (i.e. {{0,1}} &#8211; first match and 1st group). It extracts the text between the &lt;data&gt; and &lt;/data&gt; tags using the <strong>[^&lt;]*</strong> pattern, which matches anything until <strong>&lt;</strong> is found. The {{0,1}} syntax is ZappySys specific, so it may not work with other Regex engines.</td>
</tr>
<tr>
<td style="width: 196px" valign="top">File_20180930_source.txt</td>
<td style="width: 223px" valign="top">File</td>
<td style="width: 163px" valign="top"><strong>File</strong></td>
<td style="width: 752px" valign="top">Will match text/filename that has &#8220;File&#8221; keyword in it.</td>
</tr>
<tr>
<td style="width: 196px" valign="top">File_20180930_SOURCE.dat<br />
File_20180930_source.dat</td>
<td style="width: 223px" valign="top">source|SOURCE</td>
<td style="width: 163px" valign="top"><strong>SOURCE</strong> and <strong>source</strong></td>
<td style="width: 752px" valign="top">Will match text/filenames that contain either &#8220;source&#8221; <strong>or</strong> &#8220;SOURCE&#8221; keyword.</td>
</tr>
<tr>
<td style="width: 196px" valign="top">File_20180930_source.txt</td>
<td style="width: 223px" valign="top">File.+source</td>
<td style="width: 163px" valign="top"><strong>File_20180930_source</strong></td>
<td style="width: 752px" valign="top">Will match text/filename that contains keyword that starts with &#8220;File&#8221; <strong>and</strong> ends with &#8220;source&#8221;.<br />
Basically, you can use this pattern if you want to match two keywords in the text that appear in particular order.</td>
</tr>
<tr>
<td style="width: 196px" valign="top">File_20180930_source.txt<br />
File_20180830_source.dat</td>
<td style="width: 223px" valign="top">\.txt$|\.dat$</td>
<td style="width: 163px" valign="top"><strong>.txt</strong> and <strong>.dat</strong></td>
<td style="width: 752px" valign="top">Will match text/all filenames that end with &#8220;.txt&#8221; <strong>or</strong> &#8220;.dat&#8221;.</td>
</tr>
<tr>
<td style="width: 196px" valign="top">File_20180930_source.txt<br />
file_20190102_source.txt</td>
<td style="width: 223px" valign="top">^(F|f)ile_\d{8}</td>
<td style="width: 163px" valign="top"><strong>File_20180930</strong><br />
<strong>file_20190102</strong></td>
<td style="width: 752px" valign="top">Will match text/filename that starts with &#8220;File_&#8221; <strong>or</strong> &#8220;file_&#8221;, followed by 8 digits.</td>
</tr>
<tr>
<td style="width: 196px" valign="top">File_20180930_source.txt<br />
File_20190101_none.txt</td>
<td style="width: 223px" valign="top">(.+)_(.+)_(.+){{0,2}}</td>
<td style="width: 163px" valign="top"><strong>20180930</strong></td>
<td style="width: 752px" valign="top">Will match text that has three groups of text strings, separated by &#8220;_&#8221;.<br />
Non-Regex {{0,2}} notation will bring back second group (index &#8220;2&#8221;) of first match (index &#8220;0&#8221;).</td>
</tr>
<tr>
<td style="width: 196px">File_20180930_source.txt<br />
File_20190101_none.txt</td>
<td style="width: 223px">(.+)_(.+)_(.+){{1,2}}</td>
<td style="width: 163px"><strong>20190101</strong></td>
<td style="width: 752px">Will match text that has three groups of text strings, separated by &#8220;_&#8221;.<br />
Non-Regex {{1,2}} notation will bring back second group (index &#8220;2&#8221;) of second match (index &#8220;1&#8221;).</td>
</tr>
<tr>
<td style="width: 196px">File_20180930_source.txt<br />
File_20190101_none.txt</td>
<td style="width: 223px">(.+)_(.+)_(.+){{*,2}}</td>
<td style="width: 163px"><strong>20180930<br />
20190101<br />
</strong></td>
<td style="width: 752px">Will match text that has three groups of text strings, separated by &#8220;_&#8221;.<br />
Non-Regex {{*,2}} notation will bring back second group (index &#8220;2&#8221;) of all matches (index &#8220;*&#8221;). Returned matches are separated by \r\n</td>
</tr>
<tr>
<td style="width: 196px">&lt;html&gt;<br />
&lt;img src=&#8221;/img-1.png&#8221; /&gt;<br />
&lt;img src=&#8221;/img-2.png&#8221; /&gt;<br />
&lt;img src=&#8221;/img-3.png&#8221; /&gt;<br />
<span style="font-family: inherit;font-size: inherit">&lt;/html&gt;</span></td>
<td style="width: 223px">&lt;img[^&gt;]+src=&#8221;([^&#8221;&gt;]+)&#8221;{{*,1}}</td>
<td style="width: 163px"><strong>/img-1.png<br />
/img-2.png<br />
/img-3.png<br />
</strong></td>
<td style="width: 752px">Will return image URLs from HTML content. The {{*,1}} suffix pulls all occurrences and, for each match, extracts the first group (which is just the src attribute value).</td>
</tr>
<tr>
<td style="width: 196px">null</td>
<td style="width: 223px">^((?!null\b).)*$</td>
<td style="width: 163px"><strong>&lt;blank&gt;</strong></td>
<td style="width: 752px">Returns blank if the word null is found (matches any text that does not contain null)</td>
</tr>
<tr>
<td style="width: 196px">black white</td>
<td style="width: 223px">^((?!red|blue|orange).)*$</td>
<td style="width: 163px"><strong>black white</strong></td>
<td style="width: 752px">Returns the full string as is if none of those 3 words (i.e. red, blue, orange) is found anywhere in the string</td>
</tr>
<tr>
<td style="width: 196px">black white red</td>
<td style="width: 223px">^((?!red|blue|orange).)*$</td>
<td style="width: 163px"><strong>&lt;blank&gt;</strong></td>
<td style="width: 752px">Returns blank if any of those 3 words (i.e. red, blue, orange) is found anywhere in the string</td>
</tr>
</tbody>
</table>
</div>
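<p>As a rough cross-check, a few of the patterns from the table can be tried outside SSIS. The sketch below uses Python&#8217;s <code>re</code> module rather than the .NET engine, so it only covers patterns without the ZappySys-specific {{x,y}} suffix. The helper name <code>first_match</code> is our own and not part of any ZappySys API.</p>

```python
import re

# Patterns copied from the table above; inputs are sample values from the same rows.
NO_COLORS = re.compile(r"^((?!red|blue|orange).)*$")
EXTENSION = re.compile(r"\.txt$|\.dat$")
PREFIX = re.compile(r"^(F|f)ile_\d{8}")


def first_match(pattern, text):
    """Return the matched text, or an empty string when nothing matches."""
    found = pattern.search(text)
    return found.group(0) if found else ""


# The negative lookahead rejects the whole string as soon as one listed word appears.
print(repr(first_match(NO_COLORS, "black white")))
print(repr(first_match(NO_COLORS, "black white red")))

# Alternation with end-of-string anchors picks up either extension.
print(repr(first_match(EXTENSION, "File_20180830_source.dat")))

# Start anchor, a one-letter alternative, then exactly eight digits.
print(repr(first_match(PREFIX, "file_20190102_source.txt")))
```

<p>An empty result here corresponds to the &lt;blank&gt; output shown in the table.</p>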
<h2>Regex Examples (Using SSIS Regular Expression Parser Task)</h2>
<p>Here is an example of how the Regex <pre class="crayon-plain-tag">(.+)_(.+)_(.+){{1,2}}</pre> works in the <a href="https://zappysys.com/products/ssis-powerpack/ssis-regex-parser-task/" target="_blank" rel="noopener">Regular Expression Parser Task (FREE)</a>:</p>
<div id="attachment_2978" style="width: 742px" class="wp-caption alignnone"><a href="https://zappysys.com/blog/wp-content/uploads/2018/03/ssis_powerpack_regular_expression_parser_task.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-2978" class="wp-image-2978 size-full" src="https://zappysys.com/blog/wp-content/uploads/2018/03/ssis_powerpack_regular_expression_parser_task-e1520866478970.png" alt="Using Regular Expressions in SSIS Regex Parser Task (Extract Groups)" width="732" height="612" srcset="https://zappysys.com/blog/wp-content/uploads/2018/03/ssis_powerpack_regular_expression_parser_task-e1520866478970.png 732w, https://zappysys.com/blog/wp-content/uploads/2018/03/ssis_powerpack_regular_expression_parser_task-e1520866478970-300x251.png 300w" sizes="(max-width: 732px) 100vw, 732px" /></a><p id="caption-attachment-2978" class="wp-caption-text">Using Regular Expressions in SSIS Regex Parser Task (Extract Groups)</p></div>
<h2>Using Groups / Occurrence Index</h2>
<p>Some tasks, like <a href="https://zappysys.com/products/ssis-powerpack/ssis-regex-parser-task/" target="_blank" rel="noopener">SSIS Regex Parser Task (FREE)</a>, support extracting a value from a specific occurrence and a specific part of the matched pattern using special syntax at the end of your pattern (see below).</p><pre class="crayon-plain-tag">Your Regex Pattern Here{{OccurrenceIndex,GroupIndex}}</pre><p>
<strong>Where:</strong><br />
OccurrenceIndex is 0-based (0 = extract the first occurrence; * = extract all occurrences, separated by \r\n)<br />
GroupIndex is 0-based (0 = the entire matched text, 1 = the first capturing group in the pattern, and so on)</p>
<p>See the screenshot in the previous section for an example.</p>
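<p>To make the occurrence/group indexing concrete, here is a minimal sketch that emulates the suffix in Python. It is not how ZappySys implements the feature. The function <code>extract</code> and the <code>SUFFIX</code> pattern are hypothetical names, and Python&#8217;s <code>re</code> engine stands in for .NET. The sample text uses \n line breaks, because in Python a dot would also match a \r character.</p>

```python
import re

# Hypothetical helper pattern: finds a trailing {{occurrence,group}} suffix.
SUFFIX = re.compile(r"\{\{(\*|\d+),(\d+)\}\}$")


def extract(pattern_with_suffix, text):
    """Emulate the {{OccurrenceIndex,GroupIndex}} suffix (illustration only)."""
    occurrence, group = "0", 0  # assumed default when no suffix is given: {{0,0}}
    pattern = pattern_with_suffix
    suffix = SUFFIX.search(pattern_with_suffix)
    if suffix:
        occurrence, group = suffix.group(1), int(suffix.group(2))
        pattern = pattern_with_suffix[: suffix.start()]
    # Collect the requested group from every match, in order of appearance.
    values = [m.group(group) or "" for m in re.finditer(pattern, text)]
    if occurrence == "*":
        return "\r\n".join(values)  # all occurrences, joined with CR LF
    index = int(occurrence)
    return values[index] if index < len(values) else ""


files = "File_20180930_source.txt\nFile_20190101_none.txt"
print(extract("(.+)_(.+)_(.+){{0,2}}", files))  # group 2 of the first match
print(extract("(.+)_(.+)_(.+){{1,2}}", files))  # group 2 of the second match
print(extract("(.+)_(.+)_(.+){{*,2}}", files).splitlines())  # group 2 of all matches
```

<p>Returning an empty string for a missing occurrence is a choice made for this sketch; the actual task may behave differently.</p>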
<h2>Tools</h2>
<p>The best tool we&#8217;ve found for writing and testing Regex is <a href="http://regexhero.net/tester/" target="_blank" rel="noopener">Regex Hero</a> (the online, in-browser version requires IE with Silverlight).</p>
<p>Another great site for Regex testing is <a href="https://regex101.com" target="_blank" rel="noopener">https://regex101.com</a>, which, unlike Regex Hero, works in any browser.</p>
<p>A few more sites are listed below:</p>
<p><a href="http://www.regexr.com/" target="_blank" rel="noopener">http://www.regexr.com/</a><br />
<a href="http://www.regexlib.com/" target="_blank" rel="noopener">http://www.regexlib.com/</a><br />
<a href="http://www.regular-expressions.info/" target="_blank" rel="noopener">http://www.regular-expressions.info/</a></p>
<p><a href="https://zappysys.com/blog/wp-content/uploads/2018/03/ssis_powerpack_regular_expression_task_using_regex_hero.png"><img loading="lazy" decoding="async" class="alignnone size-full wp-image-2972" src="https://zappysys.com/blog/wp-content/uploads/2018/03/ssis_powerpack_regular_expression_task_using_regex_hero.png" alt="" width="802" height="483" srcset="https://zappysys.com/blog/wp-content/uploads/2018/03/ssis_powerpack_regular_expression_task_using_regex_hero.png 802w, https://zappysys.com/blog/wp-content/uploads/2018/03/ssis_powerpack_regular_expression_task_using_regex_hero-300x181.png 300w, https://zappysys.com/blog/wp-content/uploads/2018/03/ssis_powerpack_regular_expression_task_using_regex_hero-768x463.png 768w" sizes="(max-width: 802px) 100vw, 802px" /></a></p>
<h2>Resources</h2>
<p><a href="https://zappysys.com/blog/wp-content/uploads/2018/03/Regular-expressions-quick-reference.pdf">Regular Expressions cheat-sheet to hang on the wall</a></p>
<p><a href="http://www.rexegg.com/regex-quickstart.html" target="_blank" rel="noopener">Regular Expressions quick reference</a></p>
<p>The post <a href="https://zappysys.com/blog/using-regular-expressions-in-ssis/">Using Regular Expressions in SSIS</a> appeared first on <a href="https://zappysys.com/blog">ZappySys Blog</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Using SSIS Regex Parser Task for Extracting HTML Content</title>
		<link>https://zappysys.com/blog/using-ssis-regex-parser-task-extracting-html-content/</link>
		
		<dc:creator><![CDATA[ZappySys]]></dc:creator>
		<pubDate>Mon, 26 Dec 2016 17:09:57 +0000</pubDate>
				<category><![CDATA[SSIS Regex Parser Task]]></category>
		<category><![CDATA[regex]]></category>
		<category><![CDATA[ssis]]></category>
		<category><![CDATA[SSIS PowerPack]]></category>
		<category><![CDATA[ssis regex parser task]]></category>
		<category><![CDATA[ssis rest api task]]></category>
		<guid isPermaLink="false">http://zappysys.com/blog/?p=919</guid>

					<description><![CDATA[<p>Introduction In this post you will learn how to use FREE SSIS Regex Parser Task along with REST API Task to extract HTML content in few clicks. Scenario Assume that you want to search certain keywords from Bing or google and want to know how many pages found for that keyword. Url for search would [&#8230;]</p>
<p>The post <a href="https://zappysys.com/blog/using-ssis-regex-parser-task-extracting-html-content/">Using SSIS Regex Parser Task for Extracting HTML Content</a> appeared first on <a href="https://zappysys.com/blog">ZappySys Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h2>Introduction</h2>
<p><a href="https://zappysys.com/blog/wp-content/uploads/2018/03/ssis-regex-parser-task.png"><img loading="lazy" decoding="async" class="size-full wp-image-2974 alignleft" src="https://zappysys.com/blog/wp-content/uploads/2018/03/ssis-regex-parser-task.png" alt="" width="100" height="100" /></a>In this post you will learn how to use the FREE <a href="//zappysys.com/products/ssis-powerpack/ssis-regex-parser-task/" target="_blank" rel="noopener">SSIS Regex Parser Task</a> along with the <a href="https://zappysys.com/products/ssis-powerpack/ssis-rest-api-web-service-task/" target="_blank" rel="noopener">REST API Task</a> to extract HTML content in a few clicks.</p>
<p>Scenario</p>
<p>Assume that you want to search for certain keywords on Bing or Google and want to know how many pages were found for each keyword. The search URL would be something like http://www.bing.com/search?q=regex, where regex is our search word.</p>
<p>When the page is returned, view its source code and you will find a tag like the one below.</p><pre class="crayon-plain-tag">&lt;span class="sb_count" data-bm="4"&gt;21,00,000 results&lt;/span&gt;</pre><p>
What we want is the number 21,00,000, extracted with a regular expression pattern search.</p>
<h2>Step-By-Step: Extract HTML Tag Value Using a Regular Expression</h2>
<ol>
<li>Download and install <a href="https://zappysys.com/products/ssis-powerpack/" target="_blank" rel="noopener">SSIS PowerPack</a> (it includes the FREE <a href="//zappysys.com/products/ssis-powerpack/ssis-regex-parser-task/" target="_blank" rel="noopener">SSIS Regex Parser Task</a>).</li>
<li>Create a new SSIS package.</li>
<li>Drag the ZS REST API Task from the SSIS Toolbox onto the Control Flow designer.</li>
<li>Double-click the task to configure it. Enter the URL you would like to fetch, e.g. http://www.bing.com/search?q=regex</li>
<li>Click the Response tab and check the Save response option. Select Save to Variable. If needed, create a new variable.</li>
<li>Click Test (scroll to the bottom to see the HTML content).</li>
<li>Now drag the ZS Regex Parser Task onto the designer and connect it to the REST API Task.</li>
<li>Select the variable that holds the HTML text you would like to parse.</li>
<li>Enter the following expression and map the target to a variable if you would like to save the extracted value. The expression ends with {{0,1}}, which means: take the first match (occurrence index 0) and extract the group at index 1 of that match. Index 0 is the entire match, so index 1 is the first capturing group, which holds the actual search result count. If you omit {{x,y}} at the end, {{0,0}} is used.<br />
<pre class="crayon-plain-tag">\&lt;span\s*\w*\s*class="sb_count"\s*&gt;\s*(?&lt;p2&gt;[0-9,.]*){{0,1}}</pre>
See the screenshot below.</p>
<div id="attachment_920" style="width: 710px" class="wp-caption alignnone"><a href="//zappysys.com/blog/wp-content/uploads/2016/12/ssis-regex-expression-extract-html-tag-value.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-920" class="wp-image-920" src="//zappysys.com/blog/wp-content/uploads/2016/12/ssis-regex-expression-extract-html-tag-value.png" alt="SSIS Regex Parser Task - Extract HTML Tag Value using Regular Expression" width="700" height="461" srcset="https://zappysys.com/blog/wp-content/uploads/2016/12/ssis-regex-expression-extract-html-tag-value.png 881w, https://zappysys.com/blog/wp-content/uploads/2016/12/ssis-regex-expression-extract-html-tag-value-300x198.png 300w" sizes="(max-width: 700px) 100vw, 700px" /></a><p id="caption-attachment-920" class="wp-caption-text">SSIS Regex Parser Task &#8211; Extract HTML Tag Value using Regular Expression</p></div></li>
<li>In the above step you can select a variable as input or use a placeholder in a direct string (e.g. {{User::varHtml}}).</li>
<li>You can also connect a ZS Logging Task to show the extracted value.</li>
</ol>
<p>Here is the final flow.</p>
<div id="attachment_921" style="width: 700px" class="wp-caption alignnone"><a href="//zappysys.com/blog/wp-content/uploads/2016/12/ssis-regex-parse-example-download-page-extract-html-tag-value.png"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-921" class="size-full wp-image-921" src="//zappysys.com/blog/wp-content/uploads/2016/12/ssis-regex-parse-example-download-page-extract-html-tag-value.png" alt="SSIS Regular expression parsing example" width="690" height="406" srcset="https://zappysys.com/blog/wp-content/uploads/2016/12/ssis-regex-parse-example-download-page-extract-html-tag-value.png 690w, https://zappysys.com/blog/wp-content/uploads/2016/12/ssis-regex-parse-example-download-page-extract-html-tag-value-300x177.png 300w" sizes="(max-width: 690px) 100vw, 690px" /></a><p id="caption-attachment-921" class="wp-caption-text">SSIS Regular expression parsing example</p></div>
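<p>For readers who want to try the extraction from step 9 outside SSIS, here is a hedged sketch in Python. It works on the sample tag from the scenario instead of a live Bing response. Two things differ from the expression in step 9, and both are our own assumptions: Python spells named groups as <code>(?P&lt;name&gt;...)</code>, and the pattern uses <code>[^&gt;]*</code> to skip extra attributes, because the sample tag carries a data-bm attribute after class.</p>

```python
import re

# Sample tag copied from the scenario above; a real run would use the page
# body saved by the REST API Task instead of this literal.
html = '<span class="sb_count" data-bm="4">21,00,000 results</span>'

# [^>]* tolerates attributes before and after class="sb_count".
# The named group p2 captures the digits, commas and dots that follow the tag.
COUNT = re.compile(r'<span\b[^>]*\bclass="sb_count"[^>]*>\s*(?P<p2>[0-9,.]*)')

found = COUNT.search(html)
count_text = found.group("p2") if found else ""
print(count_text)

# Strip the separators if a plain integer is needed downstream.
count = int(re.sub(r"[^0-9]", "", count_text)) if count_text else None
print(count)
```

<p>Here <code>found.group("p2")</code> plays the role of group index 1 in the {{0,1}} suffix, and <code>search</code> returns only the first match, which corresponds to occurrence index 0.</p>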
<p>The post <a href="https://zappysys.com/blog/using-ssis-regex-parser-task-extracting-html-content/">Using SSIS Regex Parser Task for Extracting HTML Content</a> appeared first on <a href="https://zappysys.com/blog">ZappySys Blog</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
