How to import XML into SQL Server using SSIS

Introduction

In this post you will learn how to import XML file into target such as SQL Server, CSV, Excel, Oracle. Converting or Importing XML requires flattening XML structure (i.e. hierarchy) sometimes we refer this process as de-normalization.  Parsing XML can be tricky depending on how many levels you have and from which level you need data. If you want to export XML rather than Import then check this article.

This article assumes you have installed SSIS PowerPack and watched introduction videos of SSIS XML Source Connector and SSIS XML Parser Transform

Import XML File using SSIS XML Source

ZappySys SSIS XML Source is a very powerful component. It supports not only reading from file but also reading from Direct XML String or REST API / SOAP Web Service.

How ZappySys XML Source parse/output data

XML Source created by ZappySys is very different than what Microsoft provides out the box because ZappySys component outputs one flat view of nested XML document (de-normalized) compared to many outputs what Microsoft XML Source produces which is hard to manage and join based on relation with parent node.

With ZappySys XML component you don’t have to worry about joins because by default parent columns included along with hierarchy you extracted (e.g. If you extract $.Customer.Orders[*].OrderDetails[*] then you get all columns listed under <Order> and <Customer> nodes)

SSIS XML Source - Variable Mode - Call Web API or Read from File

ZappySys XML Source (Read XML Files, Consume REST API, SOPA Web Service)

Importing single or multiple XML Files

To import single files specify full XML file path

Example:  c:\ssis\File_001.xml

To import multiple files using XML source use wildcard pattern

Example:  c:\ssis\File_*.xml

SSIS XML Source - Read data from XML files (Single or Multiple files) - Use wildcard pattern in path

XML File Source – Specify File Path with wildcard pattern (Loop multiple files)

SSIS XML Source - Read data from XML files (Single or Multiple files) - Use wildcard pattern in path

Importing data from multiple XML files

 

Using variable to make path dynamic

You can also use variable placeholders in your path to make it dynamic

Example:  c:\ssis\{{User::FileName}}.xml

Importing data from XML Web Service or REST API

XML Source supports consuming REST API data or calling XML SOAP Web services. This article doesn’t include this feature in depth. But refer to learn more about this feature.

SSIS XML Source - Direct Mode - Call REST API web service or read from XML File

SSIS XML Source – Direct Mode – Call REST API web service or read from XML File

 

Parse XML string and split into columns using XML Parser Transform

SSIS XML Parser Transform - Parse XMl Document, Convert XML to CSV or save to RDBMS (e.g. SQL Server, MySQL)

Parsing XML Data – XML String into Columns using SSIS XML Parser Transform

 

Other Settings (XML Source, XML Parser Transform)

Preview Data

All access modes support preview data like below. Change your settings such as Filter expression, Source etc and click preview.

Read XML data in SSIS

Preview XML Data

Filtering data using expression

You can filter XML data by specifying simple expressions (e.g. $.store.book[*]) to navigate various parts of XML where you can extract data (See above screenshot).

Things to remember for filter expressions

  • Expression must start with $ (means root from where parsing will start)
  • Use dot (.) to separate Hierarchy and use brackets to indicate array []
  • Use * to indicate all records

Sample XML for examples

Example of filter expression

Filter Output
Get all records
SSIS XML Source Adpater - XMLPath filter
$.Root.Customer[*] Get all customers
SSIS XML Source Adpater - XMLPath filter
$.Root.Customer[*].Order[*] Get all orders for all customers
SSIS XML Source Adpater - XMLPath filter
$.Root.Customer[*].Order[*](Include Parent Columns=True) Get all orders for all customers (Include Parent Columns option checked)
SSIS XML Source Adpater - XMLPath filter
$.Root.Customer[*].@CustId Get only Customer Id for each customer record
$.Root.Customer[*].Order[:1] Get first order for each customer
$.Root.Customer[*].Order[:3] Get top 3 orders for each customer
$.Root.Customer[*].Order[:1] Get last order of each customer
$.Root.Customer[*].Order[0,4] Get orders starting from first row to fifth row (Zero based index). If you want 5th row to 10th row then use [4,9]
$.Root.Customer[*].Order[2] Get 3rd order for each customer

Defining array elements

Any hierarchy you extracting with array indicator can be added to array element name list (On Array handling Tab). This helps when you only have one row for node specified as array.

For example if your filter expression is $.Customer.Order[*].OrderItem[*] in this case you indicated that you can have one or more Order nodes under Customer and you may have one or more OrderItem nodes under Order node. If this is the case then set array handling as below

Handling Rows/Array Elements (Repeated XML Elements)

Handling Rows/Array Elements (Repeated XML Elements)

Editing Column Metadata (Length and DataType)

XML Source and XML Parser both detects Datatype/Length by scanning sample data (Default 300 rows) but sometimes this is not accurate and it may produce invalid metadata. If that’s the case you can easily change it in Columns tab. Once you edit Length or DataType make sure you check Lock column (Last column in grid). Once Lock is applied metadata refresh wont affect your changes. You can also add new column manually if its not detected (Lets say your XML is very large and many columns not found in first 300 rows scanned). To add new column type in the last grid row or click new icon in the grid toolbar.

SSIS XML Source - Configure Columns and DataType

Edit XML File Source Column Metadata

Conclusion

In this post you have seen how to use  SSIS XML Source Connector and SSIS XML Parser Transform to extract data from nested XML documents without any coding.

Related to: SSIS Import XML Files | SSIS Parse XML data | SSIS Extract from XML files | Read from XML Files

Posted in SSIS XML Source (File / SOAP) and tagged , , , , .