What is JSONPath expression
JSONPath expression is an expression language to filter JSON Data. Its a similar concept to XPath expression in XML but has limited features compared to XPath. If you are extracting data from REST API Services using SSIS JSON / XML Source or ODBC JSON / XML Driver then you will quickly realize that it’s very important to extract nested data by navigating to a certain level.
Sample data for JSONPath expression
All examples of JSONPath expression in this post will use the following sample JSON for reference.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
{ "store": { "books": [ { "category": "reference", "author": "Nigel Rees", "title": "Sayings of the Century", "price": 8.95, "sections": ["s1","s2","s3"] }, { "category": "fiction", "author": "Evelyn Waugh", "title": "Sword of Honour", "price": 12.99, "sections": ["s4","s1","s3"] }, { "category": "fiction", "author": "Herman Melville", "title": "Moby Dick", "isbn": "0-553-21311-3", "price": 8.99, "sections": ["s1"] }, { "category": "fiction", "author": "J. R. R. Tolkien", "title": "The Lord of the Rings", "isbn": "0-395-19395-8", "price": 22.99, "sections": ["s8,s9"] } ], "time": { "starttime": "8:00", "endtime": "18:00" } } } |
JSONPath Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
//get all books for store $.store.books[*] //get all sections from all books $.store.books[*].sections[*] //get all authors of all books for store $.store.books[*].author //get 3rd book record $.store.books[2] //get first 2 books from the top $.store.books[:2] //get last 2 books $.store.books[-2:] //get all books where author attribute equals to 'sam' $.store.books[?(@author=='Nigel Rees')] //get all documents / sub documents (any level) where author attribute equals to 'sam' $..[?(@author=='Nigel Rees')] //get books where price is less than 10 $.store.books[?(@.price<10)] //filter all books with tag $.store.books[?(@.tag)] //find all books which contains section s1 or s2 (Use of Logical operator OR ( || ) $.store.books[?(@.sections[*]=='s1' || @.sections[*]=='s2' )] //find all books where first section is s1 and second section is s2 (Logical operator OR ( && ) $.store.books[?(@.sections[0]=='s1' && @.sections[1]=='s2' )] |
JSON Path – Using Regular Expressions (Regex)
The new version of SSIS PowerPack and ODBC PowerPack Supports Regular expressions in the path. Learn more about Regular expressions check this article
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
//Using Regular Expression: get all books where author attribute contains 'Nigel' or 'Waugh' //Learn more about regular expressions here : https://zappysys.com/blog/using-regular-expressions-in-ssis/ $.store.books[?(@author=~ /Nigel|Waugh/ )] //Using Regular Expression: get all books where category is 'reference' OR author attribute contains 'Nigel' or 'Waugh' $.store.books[?(@category=='reference' || @author=~ /Nigel|Waugh/ )] //Using Regular Expression: get all books where category is 'reference' AND author attribute contains 'Nigel' or 'Waugh' $.store.books[?(@category=='reference' && @author=~ /Nigel|Waugh/ )] //Using Regular Expression: get all books where author name either start with 'Nigel' (see ^ prefix) or ends with 'Waugh' (see $ suffix) //Learn more about regular expressions here : https://zappysys.com/blog/using-regular-expressions-in-ssis/ $.store.books[?(@author=~ /^Nigel|Waugh$/ )] //Using Regular Expression: get all books where author name starts with 'Nigel' //Learn more about regular expressions here : https://zappysys.com/blog/using-regular-expressions-in-ssis/ $.store.books[?(@author=~ /^Nigel/ )] //Using Regular Expression: get all books where author name ends with 'Waugh' //Learn more about regular expressions here : https://zappysys.com/blog/using-regular-expressions-in-ssis/ $.store.books[?(@author=~ /Waugh$/ )] |
JSON Path – Nested Scan for any level (double dots)
Now lets look at example of below JSON. Assume that you like to extract information from all nested “children” nodes (basically extract id and name of every child from any level).
You can use double dots without specifying parent heirarchy like below to scan recusrsivly.
$..children[*]
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
{ children: [{ id: 1,name:"c1", children: [{ id: 2,name:"c2", children: [{ id: 3,name:"c3", children: [ {id: 4,name:"c4"}, {id: 5,name:"c5"} ] }] }] }] } |
OR (below to find all children where name is c1 or c2 or c3)
$..children[?(@name=~ /c1|c2|c3/ )]Here is an example of a JSON Driver in ODBC. The same expression can be used in SSIS too.
JSONPath for Root Objects in Array
Now lets look at another pattern of JSON where your documents are on the first level rather than under some array property. Root filter only works if you have a version released after Dec-29-2021. Previous version didn’t support filtering on Root Array.
Example JSON
1 2 3 4 5 6 |
[ { id:1 , color:'red'}, { id:2 , color:'red'}, { id:3 , color:'blue'}, { id:4 , color:'green'} ] |
Example JSON Path Filters
1 2 |
$.[0] //Extract first document $..[?(@color=='red')] //Recursive scan (any level) - Extract all documents where color=red use 2 dots after $ for filter |
NOTE: using two dots after $ (e.g. $..[zzzz]) invokes nested scan so any level array will return documents, this may not be the acceptable case sometimes but if you want to filter the root array then this is the only way for now in ZappySys.
If you must not allow recursive scan then you can use the following two properties
EnableRawOutputModeSingleRow = true;
RawOutputDataRowTemplate = “{data: [$1] }” –where [$1] is replaced at runtime with Document Text you like to parse (can be static or URL response)
OR
Use functions in RawOutputDataRowTemplate if needed
RawOutputDataRowTemplate = “{data: <<[$1],FUN_BASE64DEC>>}”
The above will wrap around the actual JSON / XML you need to parse so now after this setting you can use JSONPath = $.data[?(@id==1)] to avoid two dots to prevent a recursive scan