Introduction
In our previous post we saw various ways to transform JSON arrays. However there will be a time when your JSON / XML file wont have Array and you need to Pivot JSON Data.
Prerequisites
Before we perform the steps listed in this article, you will need to make sure the following prerequisites are met:- SSIS designer installed. Sometimes it is referred to as BIDS or SSDT (download it from the Microsoft site).
- Basic knowledge of SSIS package development using Microsoft SQL Server Integration Services.
- Make sure ZappySys SSIS PowerPack is installed (download it, if you haven't already).
- (Optional step). Read this article, if you are planning to deploy packages to a server and schedule their execution later.
Sample JSON data file
Here is a sample JSON file which we like to parse into rows and columns. Notice how it is using values inside in property name. This pattern needs Pivot option usage described later in this article.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
{ "version": 1.0, "products": { "P001": { "name": "Product 1", "price": 10 }, "P002": { "name": "Product 2", "price": 11 }, "P003": { "name": "Product 3", "price": 12 } } } |
If you like to play with real data then use below URL. This JSON has 60MB worth of data for all Products offered on AWS. And It has similar structure as above except many more attributes under products node.
1 |
https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/us-east-1/index.json |
Using SSIS PowerPack to Pivot JSON data
ZappySys offers powerful JSON and XML Connectors. Below section will describe how to Parse and Pivot JSON data or Pivot XML data. For demo purpose we will use JSON Source but steps are almost same for XML too.
Step-By-Step JSON Pivot
- First drag data flow from SSIS Toolbox and then double click data flow
- Once Data flow designer is open, drag ZS JSON Source from SSIS Toolbox.
- Now for demo we will use Hard coded JSON using Direct Value option as below. You can also enter URL or File path if you select Direct Path option.
- Go to Pivot Columns to Rows Tab and check Enable Pivoting Option
- Now come back to Filter Options Tab here you can either browse Filter or enter by hand. If file is too large then enter by hand. For example in our case we will enter $.products because we like to to parse structure below that node. If you have nested hierarchy (i.e. products node under orders node and so on) then use dot to separate them e.g. $.customer.orders.products
- For very Large file you need to Un-check Include Parent Columns option else you will get OutOfMemory Exception
- Now click Preview to see our data.
- Now you can connect your JSON / XML source to destination. See below example.
Using ODBC PowerPack to Pivot JSON data
So far we talked how to use SSIS to read your JSON data and Pivot it but what if you want to consume JSON / XML / REST API data in some other App without doing ETL via SSIS? Thats where ZappySys ODBC PowerPack comes in picture. Its a collection of many drivers including JSON Driver and XML Driver. You can also use Data Gateway Approach, which allows T-SQL code to fetch JSON / XML data directly into SQL Server Table without any ETL / Programming.
Here is sample query you can run using JSON Driver or XML Driver
1 2 3 4 5 6 7 |
SELECT * FROM $ WITH( Src='https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/us-east-1/index.json' ,Filter='$.products' ,IncludeParentColumns='False' ,EnablePivot='True' ) |
See below screenshot of JSON Driver Configuration to get idea.
Note: Use above query rather than what is displayed in the Screenshot.
BI / Reporting Tools Integration |
ETL Tools Integration | Programming Languages Integration |
|
|
|
Advanced Pivot with Path / Search and Replace Option
Now let’s look at an advanced scenario where you like to extract Pivot_Path (know where Pivoted Property came from). For that, you can download the latest version which providers IncludePivotPath and EnablePivotPathSearchReplace options as below.
As you can see in the below example let’s say you like to extract table names and view names if defined under “views” node.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
{ "table-1": { "views": { "view-tbl1-1": {}, "view-tbl1-2": {} } }, "table-2": { "views": {} }, "table-3": { "views": { "view-tbl3-1": {} } } } |
We set following
- On the Filter Options Tab
Filter = $..views - on the Pivot Columns To Rows Tab
Check Enable Pivoting
Check Include_PivotPath
Check Enable Pivot_Path search/replace
Set Search for (.*).views.(.*)--regex Set Replace With $1
Conclusion
In this particle we saw how to use some advanced options for XML/ JSON Source and ODBC Drivers. You can download respective product depending your usecase. For SSIS usecase Download SSIS PowerPack and for other ETL / Reporting app integration scenarios (i.e. Power BI, Informatica, SSRS , Excel, MS Access, SQL Server, JAVA, C# ….) Download ODBC PowerPack.