Introduction
MongoDB is one of the most poplar NoSQL database out there. In this article you will learn how to use ZappySys MongoDB Destination Connector to perform various bulk operations on MongoDB such as Insert, Update, Delete and Upsert. If you wish to learn how to Update MongoDB Array items then check this blog post instread.
MongoDB data read/write Example SSIS Package
Before we talk more about loading data into MongoDB lets look at screenshot and steps involved in our sample package for this tutorial. Below sample SSIS package uses 4 Custom SSIS Tasks
- SSIS MongoDB Source Connector (Used to read data from MongoDB Collection)
- SSIS MongoDB Destination Connector (Used to write data into MongoDB Collection also you can do other bulk operations such as MongoDB update, MongoDB upsert and MongoDB delete)
- SSIS MongoDB ExecuteSQL Task (Used to call any MongoDB Shell commands including MongoDB DDL or DML type operations such as CREATE/DROP, Insert, Update, Delete, Read, you can also call server side JavaScript)
- Dummy Data Source (Used to generate sample JSON Documents which can be loaded in MongoDB)
MongoDB Examples
Here is the list of various MongoDB examples included in attached sample package (see at the end for download links)
- MongoDB example – How to read data from MongoDB collection using SQL query language
- MongoDB example – How to load raw JSON document into MongoDB Collection (Bulk Insert with Reload)
- MongoDB example – How to do MongoDB Upsert (Bulk Update or Insert) for specific fields or entire document
- MongoDB example – How to perform MongoDB Delete ( Bulk Delete based on Matching Rows)
- MongoDB example – How to create new MongoDB collection (Only if not exist – Use safe option)
- MongoDB example – How to fetch MongoDB collection record count and save to SSIS variable
- MongoDB example – How to fetch MongoDB collection size and save to SSIS variable
- MongoDB example – How to get collection names and loop through them using ForEach Loop Task
- MongoDB example – How to save query output into Recordset variable and loop though records using ForEach Loop Task
Video Tutorial – Update/Delete/Write/Bulk Load data into MongoDB from SQL Server/ORACLE
In below video tutorial you will see how easy it is to load data into MongoDB from any Datasource such as SQL Server, Oracle, MySQL, Flatfile etc. You will also learn how to product nested JSON from multiple tables using JSON Generator Transform. You can also use JSON Source with Output as Document Mode to load raw JSON data.
Video Tutorial – Read data from MongoDB
In the following video tutorial you will learn how to consume data from MongoDB Collection using SSIS MongoDB Source Connector. You will notice that it uses SQL query syntax rather than MongoDB specific native query syntax (JSON query). This makes it super easy to query data without any learning curve. Data stored in MongoDB is in JSON document format but Data coming out from MongoDB Source Connector is Flat table structure (It de-normalize nested nodes). You can also Query inner Hierarchy using JSON Path expression (e.g. query Orders from Customer Document using $.Customer.Orders[*] ) .
MongoDB Insert – Bulk Loading data into MongoDB
Using MongoDB Destination Connector you can easily Insert JSON documents in Bulk. There are two input modes for inserting records into MongoDB.
Simple Mode – Loading data in simple mode (array not allowed)
In simple loading mode you can map source columns to target column in MongoDB destination. If you Insert data using this mode then you cant load complex documents with array.
Document Mode – Loading JSON documents into MongoDB
In this mode you can insert/update/delete documents by mapping inbuilt __DOCUMENT__ input column which appears on mapping tab (Target columns). When you perform Insert you have two options (Insert (i.e. Append) and Reload (Truncate + Insert). When you select Operation=Reload in that case Collection is first Truncated and then new records are loaded.
In JSON document load mode if you don’t supply _id column part of your input JSON then new _id is automatically generated by MongoDB for newly inserted document.
Loading JSON files into MongoDB
Another scenario could be loading JSON files into MongoDB. You can load JSON Files into MongoDB two ways.
- Use Flat File Source with just one column with DT_NTEXT datatype (use this method when you each JSON documents in one line and they are separated by new line as below screenshot)
- Use JSON Source Connector with Output as Document option checked. You can map __DOCUMENT__ source column to Target MongoDB collection as below. Advantage of this method is you extract JSON from array too and specify many other advanced options.
Specifying LoadOptions
MongoDB Destination Connector supports following LoadOptions property (see Component Properties Tab on UI) which controls how target data gets modified. LoadOptions property is ignored for Reload and Insert operations.
- op : this option specify update operators (Valid for Update, Replace and Upsert operation (i.e. UpdateOrInsert) ). See this link for more information on various Update operators available in MongoDB. Most common operators listed below. Refer official MongoDB help for more information.
- op=none : Replace entire document
- op=$set : Update only those fields specified in mapping. If Specified mapping field not available in target then its added to target document.
- op=$push : Adds new item to destination array. See following examples (How to insert single or multiple documents into MongoDB Array by supplying JSON into __DOCUMENT__ column
12345678/* Insert single item into array - Construct input JSON as below */{YourArrayField : 111 }/* Insert multiple items into array - Construct document as below */{YourArrayField : { $each: [ 111, 222, 333 ] } }/* Insert multiple documents into array */{YourArrayField : { $each: [ {OrderID:1, Total:20.00}, {OrderID:2, Total:12.00} ] } }
- multi : This option controls if you want to Update or Delete multiple records for matching condition for each upstream input record. For example if you joining by Country and City columns for update and target system finds 5 records then by default they all will be updated if multi=true is set in LoadOptions property. If you don’t want to update multiple records
MongoDB Update
MongoDB Destination Connector supports Batch Update and Batch Upsert Operations. There are few things to remember how Update operation works with MongoDB Destination Connector.
- To perform Update operation you have to specify Join Criteria for record lookup on target (see ColumnsForLookup property).
- Specify JOIN columns in ColumnsForLookup property. Use comma to specify multiple columns (e.g. CustomerID,OrderID). If you dont specify columns in this property then any columns mapped other than __DOCUMENT__ will be considered as JOIN column.
- With MongoDB destination you can only specify simple Update condition by mapping JOIN input columns. Behind the scene it will construct search query for update. For example if you Map 3 input columns (__DOCUMENT__ , Country, State) and you specify LoadOptions : op=$set;multi=true then its similar as below SQL Statement.
123UPDATE MyCollectionSET [properties specified in __DOCUMENT__]WHERE Country=[input value] AND State=[input value] - Join columns can be individually mapped or can be supplied as part of JSON document mapped to __DOCUMENT__ column.
- If __DOCUMENT__ column is not mapped then you must specify at least one column name in ColumnsForLookup property to identify search column.
- If __DOCUMENT__ column is mapped and column name is not specified in ColumnsForLookup property then you must map at least one more input column which can treated as JOIN column
Update specific fields ($set operator)
To update specific fields in target MongoDB collection you can map JOIN column(s) and then map fields you want to update or map __DOCUMENT__ if you suppling JSON document with fields you want to update.
See below screenshot where its updating Customers record with new coordinates based on City and Country information.
Update single or multiple target records for single input row
If you want to update all matching records in target then set LoadOptions property to op=$set,multi=true
If you want to update only one record per matching criteria in target then set LoadOptions property to op=$set,multi=false
Update entire document (i.e. Overwrite/Replace documents)
If you like to replace entire target document then you can do two ways. Few things you have to change for MongoDB Destination.
Approach#1
- Select Operation=Replace (Rather than Update or Upsert)
- Specify Join column in ColumnsForLookup
- On the mapping tab Map __DOCUMENT__ (Assuming you have Raw JSON coming from Upstream)
Approach#2
Another approach is use $set operator in LoadOptions property then it will only update specified fields in mappings. But if you want to overwrite entire document (Replace document) then set LoadOptions property to op=none,multi=true or op=none,multi=false (If you want to update one record for matching condition)
Insert single item into MongoDB Array using $push operator
If you want to insert new items into array then set LoadOptions property to op=$push,multi=true or op=$push,multi=false (If you want to update one record for matching condition). You have to map __DOCUMENT__ column on target mappings tab to supply item(s) or document(s) you want to insert into target array.
1 2 3 4 5 |
/* Insert single number or string into MongoDB array - Construct input JSON as below */ {YourArrayField : 111 } /* Insert single document into MongoDB array - Construct input JSON as below */ {YourArrayField : {OrderID:1, Total:20.00} } |
Insert multiple items into MongoDB Array using $push operator along with $each
Inserting multiple items into array requires use of $each modifier as below. Use above instructions except input document (e.g. data passed to __DOCUMENT__ column) will look like below (see how $each used).
1 2 3 4 5 |
/* Insert multiple items into MongoDB array - Construct document as below */ {YourArrayField : { $each: [ 111, 222, 333 ] } } /* Insert multiple documents into MongoDB array */ {YourArrayField : { $each: [ {OrderID:1, Total:20.00}, {OrderID:2, Total:12.00} ] } } |
Other MongoDB Update Operators
So far we have seen $set and $push update operators. To learn more about other operators see this link
MongoDB Update Array Items ($pull, $push, $addToSet)
If you wish to update items found inside nested array of document then Check this full length article
MongoDB Upsert – Bulk Update or Insert JSON documents
If you wish to perform Bulk Upsert using MongoDB Destination connector then select Upsert Action from dropdown (Action=UpdateOrInsert). In Upset if document is not found for matching criteria then new document is inserted in MongoDB Collection. If document is found then Update operation occurs. To set specific fields use op=$set in LoadOptions and if you wish to overwrite entire document then use $op=none in LoadOptions
MongoDB Delete – Bulk delete MongoDB JSON documents
To delete documents from collection you can set operation=Delete on Component properties page. For delete you can simply map _id column on target or just supply columns based on you want to search records you want to delete.
To delete maximum one record for each input row set LoadOptions property to op=none,multi=false or set op=none,multi=false if you want to delete more than one record for matching condition.
MongoDB Collection Reload – Remove All (Truncate collection)
To reload collection set operation to Reload on Component properties page. Reload operation first clears target collection and perform bulk Insert operation
Generate JSON for MongoDB data load
Now lets talk about different ways to produce simple or complex JSON using ZappySys components. There are 3 ways you can produce JSON which can be loaded into MongoDB Collection. Following section will describe each approach and you can select which one best suited for you.
Generate JSON – Using Export JSON Task
Export JSON Task can export JSON to File or variable which you can consume via JSON Source Connector (Check Output as Document option) and supply to MongoDB Destination Connector (map to __DOCUMEN__).
Check this link for more info : Export Json from sql server using ssis
Generate JSON – Using JSON Generator Transform
JSON generator transform is useful when you want to Produce JSON inside pipeline from single or multiple input.
Check this link for more info : JSON generator Transform (Watch Video on this page)
Generate JSON – Using Dummy Data Source (Fake JSON for Testing)
If you need to produce JSON for testing purpose then use Dummy Data Source (Select JsonDocuments Template from dropdown and click ok)
Query/Read data from MongoDB
Visit following links to learn more about reading data from MongoDB.
SSIS MongoDB Source Homepage
MongoDB SQL Query Reference
MongoDB data extract – from nested sub documents and array
Execute MongoDB Shell commands in SSIS (e.g. Create/Drop collection)
MongoDB SOurce and Destination Connectors are great for read/write data. But what if you want to issue ad-hoc MongoDB shell command such as create collection/drop collection ? No worry.. SSIS PowerPack v2.0 released new Task called SSIS MongoDB ExecuteSQL Task This task comes with many examples listed on the UI itself (Look for Examples dropdown menu on UI). You can use SSIS MongoDB ExecuteSQL Task to perform many operations such as
- Create/Drop MongoDB collections
- Get collection list as ADO.net recordset so you can loop through using ForEachLoop task
- Call serverside JavaScript
- Fetch data as ADO.net recordset and store in SSIS variable so you can loop using ForEachLoop task
- Drop database
- Call many shell commands
Download MongoDB Example Files used in this tutorial
To test this package you first need to download SSIS PowerPack
Once SSIS PowerPack installed you can download following SSIS sample project file(s).
Download sample files – SSIS 2008
Download sample files – SSIS 2012
Download sample files – SSIS 2014
Download sample files – SSIS 2016
Conclusion
MongoDB makes it super easy to store JSON documents and gives you whole new way of manipulating documents with ease of use. ZappySys SSIS Connectors and Tasks for MongoDB makes it possible to implement various data integration scenarios with MongoDB. Click here to Download SSIS PowerPack and try it your self.
Keywords:
Loading data into MongoDB | mongodb upsert array | mongodb upsert example | mongodb insert or update | mongo insert or update | mongo upsert | mongo insert JSON | mongo upsert JSON | update data into MongoDB