SSIS MongoDB Destination Adapter (Bulk Load, Write, Copy NoSQL Data)

MongoDB Destination can be used to bulk insert a large number of records into a MongoDB collection (table) from any data source using SSIS. MongoDB is an open-source document database and one of the most popular NoSQL databases available.

Download SSIS PowerPack


Video Tutorial

Step-By-Step

In this section you will learn how to use the MongoDB Destination Adapter to write data to a MongoDB collection.
  1. You need to download and install SSIS ZappySys PowerPack.
  2. You also need to download and install MongoDB.
  3. For this sample exercise we will need a new MongoDB collection. To create the new collection, perform the following steps.
  4. Launch a command prompt, run c:\{path of mongo installation}\mongo.exe, and then type the following commands to create the Customers collection (i.e. table). An "ok" status means the collection was created (a quick shell check to confirm this is sketched after this list).
    use admin // admin is the database name; you can replace it
    db.createCollection("Customers");
    
  5. Once you have finished the above steps, open Visual Studio and create a new SSIS package project.
  6. Now, drag and drop an SSIS Data Flow Task from the SSIS Toolbox.
    SSIS Data Flow Task - Drag and Drop
  7. Double click the Data Flow Task to see the Data Flow designer surface.
  8. In Visual Studio, drag and drop the ZS DummyData Source and MongoDB Destination onto the design surface and join the components with the blue arrow.
    SSIS MongoDB Destination - Drag and Drop
  9. You can also use an OLE DB Source or a CSV File Source here as the data source.
  10. For this exercise we use the ZS DummyData Source so you can practice without real data. We also need a MongoDB connection for the MongoDB Destination.
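
If you want to confirm that the Customers collection from step 4 exists before building the package, a quick check in the mongo shell looks like the sketch below (assuming the admin database and the Customers collection created above; the test document is just a throw-away example).

    use admin                           // same database as in step 4
    show collections                    // "Customers" should appear in the list
    db.Customers.insert({ test: 1 })    // insert a throw-away document (insertOne on newer shells)
    db.Customers.find({ test: 1 })      // read it back
    db.Customers.remove({ test: 1 })    // clean up (deleteOne on newer shells)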

How to create a MongoDB Connection.

  1. Right click in the Connection Managers panel; when the context menu appears, select New Connection.
    MongoDB execute command - insert JSON document into MongoDB
  2. Select ZS-MongoDB from the connection manager list and click the Add button.
    SSIS MongoDB Create Connection
  3. In the connection manager UI the host name is filled in by default if you have already installed MongoDB locally. Enter your user name, password, and database name, click Test Connection, and then click OK (if the test fails, you can verify the credentials outside SSIS as sketched after this list).
    SSIS MongoDB Create Connection
  4. Click OK to save the connection settings.
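
If Test Connection fails, it can help to verify the host and credentials outside SSIS first with the mongo command line shell. A minimal sketch, assuming MongoDB listens on the default port 27017 and the user is defined in the admin database (myUser / myPassword are placeholders):

    mongo --host localhost --port 27017 -u myUser -p myPassword --authenticationDatabase admin

    // once connected, this reports which user you are authenticated as
    db.runCommand({ connectionStatus: 1 })

If the shell cannot connect, fix the MongoDB user or network setup first; the SSIS connection manager will not work until it can.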

How to load data into MongoDB from a source using MongoDB Destination.

  1. The DummyData Source is a ZappySys Data Flow component that can be used to generate test data.
  2. Double click the DummyData Source to configure it.
  3. From the Template dropdown pick Customers and set the row count to 100.
    Configure SSIS Dummy Data Generator Source
  4. Click OK to save the DummyData Source settings.
  5. Now double click MongoDB Destination to configure it.
  6. In the Connection Manager tab, select the MongoDB connection manager.
    Configure SSIS MongoDB Destination Adapter - Connection Tab
  7. Click the Component Properties tab, select AccessMode=Table, and pick the destination table from the Tables dropdown list.
    Configure SSIS MongoDB Destination Adapter - Component Tab
  8. Now click the Mappings tab and verify the mapping. If the target table is new and empty, all input columns will be mapped automatically (except PartitionKey and RowKey). Drop the CustomerID and CompanyName mappings and reconnect them as shown below; this is just to make sure the primary key columns are mapped. If the table already exists and has data, you can manually pick mapping columns by dragging them from the source list to the target list.
    Configure SSIS MongoDB Destination Adapter - Mapping Tab
  9. Click OK to save settings.
  10. Execute the package and verify the target data (the free Robomongo client is used here to view data; you can also use the mongo shell commands sketched after this list).
    Execute SSIS MongoDB Destination Adapter - Load Data
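
If you prefer the shell over a GUI client, a minimal verification sketch (assuming the admin database and the Customers collection used in this walkthrough) looks like this:

    use admin
    db.Customers.count()                    // should return 100 after the load (countDocuments() on newer shells)
    db.Customers.find().limit(5).pretty()   // inspect a few of the loaded documents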

Component Properties

Property Name Description
EnableCustomLookupQuery Enables use of a custom lookup query for Update / Upsert or Delete operations. By default the JOIN condition for matching target records is generated automatically based on the columns you supply in the ColumnsForLookup property. However, in some cases you need to supply a custom lookup condition to perform complex operations; in such cases enable this option. When you enable this option you must map the __DOCUMENT__ input column. The string you supply in this column should be in this format: <command><query>YOUR_LOOKUP_QUERY</query><document>YOUR_JSON_DOC</document></command>. The lookup query in the <query> tag can be either Mongo JSON format (e.g. { "CustomerID" : "AAA", "Orders.OrderID" : { "$ne" : "1000" } }) or a ZappySys SQL query (e.g. select * from mytable where CustomerID='AAA' and [Orders.$.OrderID] != '1000')
LoggingMode LoggingMode determines how much information is logged during package execution. Set the logging mode to Debugging for maximum logging.

Available Options (use the numeric value listed in brackets if you have to define an expression on this property for dynamic behavior):

Option Description
Normal [0] Normal
Medium [1] Medium
Detailed [2] Detailed
Debugging [3] Debugging
PrefixTimestamp When you enable this property a timestamp is prefixed to each log message.
SwitchToRowByRowOnBatchFail When this option is true, each batch is processed in bulk mode on the first attempt, and if it fails that particular batch is reprocessed in row-by-row mode to detect the bad record. By default, when an error occurs all records from that batch are rejected; with this option enabled, all good records are processed and only the records causing the error are rejected.
TableName Name of the table (collection) into which you want to write data
BatchSize Indicates how many records are processed in a single request. Setting this value to 0 will send all records in a single request.
AccessMode Indicates the access mode of the data (e.g. Table or SQL query)

Available Options (use the numeric value listed in brackets if you have to define an expression on this property for dynamic behavior):

Option Description
Table [0] Table
Query [1] Query
SqlCommand Specifies the SQL query used to access data. This setting is ignored when the access mode is set to Table.
Operation Specifies the operation for the destination data load (e.g. Insert, Reload etc.). The default operation is Insert (data is always appended to the target).

Available Options (use the numeric value listed in brackets if you have to define an expression on this property for dynamic behavior):

Option Description
Insert [0] Insert
Reload [1] Reload
Update [2] Update
UpdateOrInsert [3] UpdateOrInsert
Delete [4] Delete
Replace [5] Replace
ColumnsForLookup For Update/Upsert/Delete operations you will need JOIN columns. If you have not specified a JOIN column in the Mapping Editor then supply it here. Specify a comma separated list of target column names which can be used to look up documents on the target for the selected operation. Column names are case-sensitive, so make sure the JOIN column name is spelled correctly. A JOIN column can be part of the supplied input DOCUMENT or it can be mapped separately as an input column in the Mapping Editor. When a JOIN column is found inside a nested sub document (i.e. not at the first level) you have to use the hierarchy separator (>) on the input side and the server side path (dot notation) after AS. Example: if you are joining by two columns OrderID and ZipCode, where ZipCode is found in a nested sub document, use JOIN criteria like this: OrderID,LocationInfo > ZipInfo > ZipCode AS LocationInfo.ZipInfo.ZipCode. If your JOIN condition contains a column from an array then your condition may look like this: Orders.$.OrderID AS Orders.OrderID (assuming you have named the property in your input document Orders.$.OrderID). For more information about how to use $ inside an update document see https://docs.mongodb.com/manual/reference/operator/update/positional/#op._S_
LoadOptions Load options can be used to provide additional instructions for Update/Upsert/Delete operations (a mongo shell sketch of how these options combine is shown after this table).
Syntax: <option1>=<value1>;<option2>=<value2>.....<optionN>=<valueN>
The following options are currently supported for MongoDB update/upsert/delete operations:
• op : [Default Value = $set] Use this option to specify a different update operator (e.g. none, $set, $push, $pull, $inc, $mul, $rename etc.). If you set op=none then you have to supply the update operator along with your input document mapped to the __DOCUMENT__ input column, e.g. {$set : {..your document..} }. For more info visit https://docs.mongodb.org/manual/reference/operator/update/
• multi : [Default Value = true] Set this to true if you want to affect more than one record for matching input lookup keys. If you want to update only one record when multiple records match, set this to false.

For Update/Upsert/Delete you will need JOIN columns (set as a separate property). Make sure the JOIN column name is spelled correctly, that it is found inside the supplied DOCUMENT or mapped separately as an input column, and that it is named with the correct case (it is case-sensitive). If you are using a column alias then check the syntax (a JOIN with an alias must look like this: yourColumnNameFromInputDoc AS yourColumnNameFromServerDoc). If you are joining a nested sub document column then specify the JOIN condition like this: LocationInfo > ZipInfo > ZipCode AS LocationInfo.ZipInfo.ZipCode
MetaDataScanMode Metadata scan mode controls how data type and length are determined. By default a few records are scanned to determine the datatype/length. Changing the scan mode affects length/datatype accuracy.

Available Options (use the numeric value listed in brackets if you have to define an expression on this property for dynamic behavior):

Option Description
Auto [0] Auto
Strict [1] Strict - Exact length
Guess2x [2] Guess2x - 2 times bigger
Guess3x [3] Guess3x - 3 times bigger
Guess4x [4] Guess4x - 4 times bigger
TreatAsUnicodeString [5] Set all columns as string
Guess10x [6] Guess10x - 10 times bigger
TreatStringAsMaxLength [7] Set string columns with MAX Length - i.e. DT_WSTR(4000)
TreatStringAsBlob [8] Set string columns as BLOB - i.e. DT_NTEXT
MetaDataCustomLength Length for all string columns. This option is only valid for MetaDataScanMode=Custom.
MetaDataTreatStringAsAscii When this option is true, all string values are detected as DT_STR (ASCII) rather than DT_WSTR (Unicode).
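
To make the Operation, ColumnsForLookup and LoadOptions properties more concrete, the sketch below shows roughly what an Update operation with ColumnsForLookup=CustomerID and the default LoadOptions (op=$set, multi=true) corresponds to on the MongoDB side. This is only an illustration in mongo shell terms, not the adapter's actual implementation, and the field values are made up.

    // lookup filter generated from ColumnsForLookup; update document built from the mapped input columns
    db.Customers.update(
        { CustomerID : "AAA" },                       // match by the JOIN column(s)
        { $set : { CompanyName : "New Name Inc" } },  // default op=$set
        { multi : true }                              // LoadOptions multi=true updates all matches
    )

    // with LoadOptions op=$push the same lookup appends to an array instead, e.g.
    // db.Customers.update({ CustomerID : "AAA" }, { $push : { Orders : { OrderID : 1001 } } })

The UpdateOrInsert operation behaves like the same call with an upsert flag (insert the document when nothing matches the lookup), and Delete removes the matching documents instead of updating them.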

Output Column Properties

Error handling

  • This component supports error output.
  • Any bad value will cause an error or truncation (e.g. if your value is 5 characters long but the output column length is set to 3 characters). If you have an error output attached with the RedirectRow option for Error and Truncation, you can capture bad rows along with the exact reason for the error.
  • Use the error handling tab to set error handling options. By default the component fails on any data error or truncation issue.

Loading Raw JSON into MongoDB

  1. In the previous section [Step-By-Step] we saw how to map input columns to MongoDB columns. That was a simple mapping, but in the real world you may have nested JSON with a complex layout. In that case it is easier to load the full JSON document into MongoDB rather than using the mapping approach shown earlier, because the Mapping tab does not support parent-child mapping for nested JSON.

    To produce the desired JSON document using SSIS you may use the
    JSON Generator Transform. The screenshots below show how to generate and load JSON documents into MongoDB (map ZS_JSON_OUTPUT to the __DOCUMENT__ input column of the MongoDB Destination). When anything is mapped to the __DOCUMENT__ column on the MongoDB Destination Mapping tab, the input is automatically treated as a JSON document and all other mappings are ignored (an illustrative nested document is sketched after this list).

    Loading JSON document into MongoDB using SSIS MongoDB Destination Adapter - Mapping Tab for loading JSON document
    Configure SSIS MongoDB Destination Adapter - Mapping Tab for loading JSON document
  2. That's all. Execute the package and verify the target data (the free Robomongo client is used here to view data).
    Loading JSON document into MongoDB using SSIS
  3. View MongoDB JSON Documents using client tool.
    View MongoDB JSON Documents using client tool
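
For reference, the kind of nested document this approach produces per input row might look like the sketch below (an illustrative example only; the field names are made up). The JSON Generator Transform builds the JSON string, and because it is mapped to __DOCUMENT__ the MongoDB Destination stores it as-is, so the end result on the server is equivalent to this shell insert:

    db.Customers.insert({
        CustomerID : "ALFKI",
        CompanyName : "Alfreds Futterkiste",
        LocationInfo : { ZipInfo : { ZipCode : "12209", City : "Berlin" } },
        Orders : [
            { OrderID : 10643, Total : 814.50 },
            { OrderID : 10692, Total : 878.00 }
        ]
    })

Nested paths such as LocationInfo.ZipInfo.ZipCode and Orders.OrderID in a document of this shape are what the ColumnsForLookup examples in the property table above refer to.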

Settings UI

SSIS MongoDB Destination - Setting UI
SSIS MongoDB Destination - Setting UI
SSIS MongoDB Destination - Setting UI
SSIS MongoDB Destination - Setting UI

MongoDB Destination FAQs


Articles / Tutorials

Click here to see all articles for [SSIS MongoDB Destination] category
Update or Insert – Upsert MongoDB Array Items using SSIS

Introduction In our previous blog post we saw how to update / delete MongoDB array items. Now let's look at how to upsert MongoDB array items (i.e. insert the item if it is not found in the array, else update the existing record). Upsert into a nested MongoDB array requires a two step process; unfortunately there is no easy way to do […]


Update MongoDB Array Items using SSIS

Introduction In our previous blog post we saw how to perform Read and Write operations in MongoDB using SSIS (i.e. Bulk Update, Delete, Upsert, Insert). In this post we specifically focus on how to update MongoDB Array items / elements using SSIS. To make things simple to follow we have used JSON Source to produce […]


SSIS – Loading data into MongoDB (Upsert, Delete, Update)

Introduction MongoDB is one of the most popular NoSQL databases out there. In this article you will learn how to use the ZappySys MongoDB Destination Connector to perform various bulk operations on MongoDB such as Insert, Update, Delete and Upsert. If you wish to learn how to update MongoDB array items then check this blog post instead. […]



© ZappySys LLC. All rights reserved.