Introduction
In our previous blog post we saw how to update / delete mongodb array item. Now let’s look at how to Upsert MongoDB Array Items (i.e. Insert Item if not found in Array else Update existing record). Upsert into nested MongoDB Array requires two step process, unfortunately there is no easy way to do in a single step. You may see in below screenshot that first we will do Update Step for Existing MongoDB Array Items and then we will insert missing Records in second step. So let’s get started.
Prerequisites
Before we perform the steps listed in this article, you will need to make sure the following prerequisites are met:- SSIS designer installed. Sometimes it is referred to as BIDS or SSDT (download it from the Microsoft site).
- Basic knowledge of SSIS package development using Microsoft SQL Server Integration Services.
- Make sure ZappySys SSIS PowerPack is installed (download it, if you haven't already).
- (Optional step). Read this article, if you are planning to deploy packages to a server and schedule their execution later.
MongoDB Array Upsert – Update / Insert using Custom JOIN condition (such as $ne )
By default SSIS Mongodb Destination performs Lookup using $eq condition (Match rows using Equal operator). However there will be a time when you like to lookup and update target rows using custom join criteria (e.g. Use Not Equal condition – $ne ). V2.7.6 and later introduced new property called EnableCustomLookupQuery . When you enable this setting, you can supply a document which contains Data and Condition. You can map this XML formatted document to __DOCUMENT__ (see below)
Here is the description how to use custom Join using this new property.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
Enables use of custom lookup query for Update / Upsert or Delete Operation. By default JOIN condition for target record match is generated automatically based on columns you supply in ColumnsForLookup property. However in some cases you need to supply custom condition for lookup to perform complex operations, in such case enable this option. When you enable this option you must supply map __DOCUMENT__ input column. String you supply to this column should be in this format <command> <query>YOUR_LOOKUP_QUERY</query> <document>YOUR_JSON_DOC</document> </command>. Lookup query in <query> tag can be either Mongo JSON format (e.g. { \"CustomerID\" : \"AAA\", \"Orders.OrderID\" : { \"$ne\" : \"1000\" } }) OR you can use ZappySys SQL query (e.g. select * from mytable where CustomerID='AAA' and [Orders.$.OrderID] != '1000' ) |
Now lets look at step by step. In below example we have a table called CustomerTest. We will load it with 2 records (With No orders). Then Update array using custom lookup condition, and later step we will insert record in array using custom lookup condition.
Update Array using Custom Lookup Condition
- Lets create few sample rows in MongoDB Collection. You can use following command in SSIS MongoDB Execute SQL Task
12345678910{scope: 'database',db: 'Northwind',command: 'eval',args:{code: 'db.CustomerTest.insert( [ {CustomerID:"AAA", Orders:[]}, {CustomerID:"BBB", Orders:[]}, {CustomerID:"CCC", Orders:[{OrderID:"1004","OrderDate" : "2008-02-05","Qty" : "5"} ]} ] )'}} - Now drag Data flow.
- Drag ZS CSV Source or any other source. For example purpose we will configure CSV Source using Direct Value option with following Sample data. In the first run of package it will update only orders for CCC customer because only that customer will have records after table creation. one row. But in 2nd run all rows updated.
123456CustomerID,Company,OrderID,OrderDate,Qty,AddressAAA,Anthony Inc,1000,2008-01-01,90,Po Box 111\4612AAA,Anthony Inc,1001,2010-02-01,91,Po Box 111 / 4612BBB,Bob Inc,1002,2008-02-01,92,Po Box 222 / 4612BBB,Bob Inc,1003,2010-01-01,93,Po Box 222 / 4612CCC,Cindy Inc,1004,2010-01-07,2,Po Box 555 / 2345 - Click OK to save Source UI
- Now drag ZS Template Transform to build XML document (Data + Query). Connect Source to Template transform. You can also use ZS XML Generator Transform but for simplicity we will use Template Transform.
- Double click to open Template Transform and enter text as below. Notice how we have entered MongoDB Query and Data in two separate XML nodes. First node is custom query for lookup. Second node is Operation we like to perform (e.g. $set in our case to update existing data). You can use Insert Variable and then select Columns option to insert placeholders. Also in some fields we used JSONENCODE function to make sure we escape double quote and slash correctly.
1234567891011121314151617181920<command><query><![CDATA[{"CustomerID": "<%CustomerID%>","Orders.OrderID": { "$eq" : "<%OrderID%>"}}]]></query><document><![CDATA[{$set :{ "CompanyName":"<%Company,JSONENCODE%>","Address":"<%Address,JSONENCODE%>","Orders.$.OrderID": "<%OrderID%>","Orders.$.OrderDate": "<%OrderDate%>","Orders.$.Qty": "<%Qty%>"}}}]]></document></command> - Click OK to save Template Transform
- Drag ZS MongoDB Destination. Connect Template Transform to Destination
- Double click MongoDB Destination to edit. Set Connection, Table name, Operation=Update, EnableCustomLookupQuery=True
- Goto Mappings Tab and attach TemplateOutput to __DOCUMENT__
- Click OK to Save
Insert into Array using Custom Lookup Condition
Now let’s look at slightly modified steps to Insert into Array for new records. In this example we will insert Source records if CustomerID found but OrderID is not found. (See Template Transform Step). We will use Not Equal Condition this time.
- Now drag Data flow rename it to something like [Add records to array]
- Drag ZS CSV Source or any other source. For example purpose we will configure CSV Source using Direct Value option with following Sample data
12345CustomerID,Company,OrderID,OrderDate,QtyAAA,Anthony Inc,1000,2008-01-01,30AAA,Anthony Inc,1001,2010-02-01,4BBB,Bob Inc,1002,2008-02-01,30BBB,Bob Inc,1003,2010-01-01,4 - Click OK to save Source UI
- Now drag ZS Template Transform to build XML document (Data + Query). Connect Source to Template transform. You can also use ZS XML Generator Transform but for simplicity we will use Template Transform.
- Double click to open Template Transform and enter text as below. Notice how we have entered MongoDB Query and Data in two separate XML nodes. First node is custom query for lookup. Second node is Operation we like to perform (e.g. $addToSet in our case to insert into array )
12345678910<command><query>{"CustomerID": "<%CustomerID%>", "Orders.OrderID":{ "$ne" : "<%OrderID%>"}}</query><document>{$addToSet :{ Orders : {"OrderID": "<%OrderID%>", "OrderDate": "<%OrderDate%>", "Qty": "<%Qty%>" } }}</document></command> - Click OK to save Template Transform
- Drag ZS MongoDB Destination. Connect Template Transform to Destination
- Double click MongoDB Destination to edit. Set Connection, Table name, Operation=Update, EnableCustomLookupQuery=True
- Goto Mappings Tab and attach TemplateOutput to __DOCUMENT__
- Click OK to Save
Download Sample SSIS Package (2012 format) – MongoDB_Upsert_ArrayItem
Conclusion
MongoDB integration can be challenging if you are new to NoSQL world. If you are using SSIS as your primary ETL tool then not to worry because SSIS PowerPack can give super power needed to complete your project on time with drag and drop high performance connectors.
Keywords: ssis mongodb upsert array item | ssis mongodb update array item | ssis mongodb update array elements | mongodb update array documents | MongoDB $set operator | MongoDB $addToSet operator | MongoDB update sub document items | MongoDB CRUD operations | MongoDB Bulk Updates | MongoDB bulk updates to array items