How to Read / Write Amazon DynamoDB in SSIS

Introduction

In this article we will look at how to Read / Write Amazon DynamoDB in SSIS. ZappySys developed many AWS related components but in this article we will look at 3 Tasks/Components for DynamoDB Integration Scenarios (Read, Write, Update, Bulk Insert, Create / Drop Table etc.). We will discuss how to use SSIS DynamoDB Source Connector , SSIS DynamoDB Destination Connector and SSIS DynamoDB ExecuteSQL Task . DynamoDB Source connector supports familiar SQL Like query language so any DBA or SQL user can learn it in no time. ZappySys developed highly sophisticated engine to turn your SQL Query into native DynamoDB query. For more information check this help file

SSIS Amazon DynamoDB Source Connector - Extract data from DynamoDB

Extract data from Amazon DynamoDB in SSIS

Prerequisites

Before we perform the steps listed in this article, you will need to make sure the following prerequisites are met:

  1. SSIS designer installed. Sometimes it is referred to as BIDS or SSDT (download it from the Microsoft site).
  2. Basic knowledge of SSIS package development using Microsoft SQL Server Integration Services.
  3. Make sure ZappySys SSIS PowerPack is installed (download it, if you haven’t already).
  4. (Optional step). Read this article, if you are planning to deploy packages to a server and schedule their execution later.

Video Tutorial – Read/Write/Update data in Amazon DynamoDB in SSIS

Below video tutorial will show you from start to end how to use ZappySys Components to perform various AWS DynamoDB Integration Tasks such as Reading Bulk data, Writing bulk data, Executing ad-hock Command for DynamoDB (such as Create/Drop Table). Get Row Count etc.

Using Amazon DynamoDB Emulator for Local Testing

If you don’t have AWS Cloud account and you still want to try DynamoDB related options in SSIS then you can use FREE Local Emulator for DynamoDB.

  1.  Download it from here.
  2. Extract to local folder (e.g. c:\tools\dynamodb_local
  3. Create a batch file there call it RunDynamo.bat and you can put command like below.
    Or (Java Home Path not set use below)
  4. Double click Batch file to launch DynamoDB Local instance

Connect to DynamoDB in SSIS

Now lets look at step by step how to Connect to DynamoDB in SSIS. For Simple demo we will use  ZS Amazon DynamoDB ExecuteSQL Task but in next section we will show you How to Read / Write DynamoDB Data using other SSIS Components.

Create SSIS DynamoDB Connection / List Tables Example

  1. Installed SSIS PowerPack if you have not done it.
  2. Open SSIS Package
  3. Drag and drop ZS Amazon DynamoDB ExecuteSQL Task  from SSIS Toolbox
  4. Double click to edit the task
  5. Click New button next to Connection Dropdown
  6. Select DynamoDB from Service Type. If you are connecting to Local DynamoDB then check DynamoDB Local option else enter your AWS Account Access Key and Secret Key to connect to Cloud Instance.
    Create new Amazon DynamoDB Connection in SSIS  - Use Amazon DynamoDB Execute SQL Task - List Tables Example

    Create new Amazon DynamoDB Connection in SSIS – Use Amazon DynamoDB Execute SQL Task – List Tables Example

  7. Click Test Connection to confirm and click OK to save Connection.
  8. Back to ExecuteSQL Task UI, Click on Examples dropdown and select command like List Tables.
  9. Click Test / Preview button.

Create DynamoDB Table Example

Now lets look at how to create DynamoDB table in SSIS Amazon DynamoDB ExecuteSQL Task.

Change your Command Type Option to CreateTableSafe (Safe Suffix means, do not throw error if table already exists).

Enter the following Command and Click Test / Preview Result button to create new table called Customers. Primary Key for table is CustomerID.

 

Write data to Amazon DynamoDB in SSIS

Loading data into Amazon DynamoDB can be easily done using SSIS DynamoDB Destination Connector. It supports loading data two different ways. You can insert Raw JSON documents or Masp individual columns too. Let’s look at both ways.

Create Sample DynamoDB Table (via AWS Console)

In the previous section, we mentioned how to create a DynamoDB Table using DynamoDB ExecuteSQL Task. If you want to learn how to create DynamoDB Table via Console then follow below steps. You can skip reading this section if you already created sample Customers table or know how this process works.
  1. In order to connect to Amazon DynamoDB from SSIS you will need Access Key and Secret Key. Ask your SysAdmin or responsible person to provide that to you. Your keys will look something like this: (this is just example key which may differ in your case)
    Click here to read more how to get your AccessKey and Secret Key
  2. For this sample exercise we will need new DynamoDB table (with CustomerID column as HASH key, see below screenshots). To create new dynamoDB table perform following steps
    • Click on create table under DynamoDB console
    • On Primary Key screen give table name (e.g. Customer), Pick Key Type=HASH, enter keyname=CustomerID and click next
      Create Amazon DynamoDB Table for SSIS Testing - Primary Key HASH or RANGE
    • On index screen no need to enter anything for this exercise so just click next
    • On Read/Write throughput just enter 1 in both fields and click next
      Create Amazon DynamoDB Table for SSIS Testing - Read WRITE Throughput
    • Review and finalize table creation by clicking Create. It may take few minutes to provision DynamoDB table (Check for AVAILABLE status)
      Create Amazon DynamoDB Table for SSIS Testing - Verify Status
  3. Once you have created DynamoDB table, got Account Key (its like UserID) and Secret Key (its like password) you may proceed to next section.

Load data into DynamoDB (Column Mappings Mode)

First lets look at how to load sample Customers data into DynamoDB Table using Simple column mapping mode.

In this section you will learn how to use Amazon DynamoDB Destination Adapter to write data to Amazon DynamoDB Table.
  1. From the SSIS toolbox drag and drop Data Flow Task on the control flow designer surface.
  2. Double click on the Data Flow task to see Data Flow designer surface.
  3. Right Click in Connection Manager Panel and select [New Connection…] menu item.
    Create new SSIS connection

    Create new SSIS connection

  4. Pick ZS-AWS-STORAGE connection type. On New Connection Dialogbox Select ServiceType from Top Dropdown, Enter AccessKey and SecrentKey and Click Test connection to verify credentials.
    Create SSIS Amazon DynamoDB Connection - Choose Type
    Configure SSIS Amazon DynamoDB Destination Connection
  5. Click OK to save connection information.
  6. From the SSIS toolbox drag and drop ZS Dummy Data Source on the dataflow designer surface.
  7. Double click Dummy Data Source to configure it.
  8. From Template pick Customer and enter row count=50 (For sample dataset has less than 100 unique values so lets pick small number).
    Configure SSIS Dummy Data Generator Source
  9. From the SSIS toolbox drag and drop Amazon DynamoDB Destination on the dataflow designer surface.
  10. Now single click on the Dummy Data Source, once you see blue arrow from source … connect it to DynamoDB Destination.
  11. Now double click Amazon DynamoDB Destination to configure it.
  12. On [Connection Manager] tab select DynamoDB connection manager.
    Configure SSIS Amazon DynamoDB Destination Adapter - Connection Tab
  13. Click on [Component Properties] tab and select AccessMode=Table and Pick DynamoDB table from Tables dropdown list.
    Configure SSIS Amazon DynamoDB Destination Adapter - Component Tab

    Configure SSIS Amazon DynamoDB Destination Adapter – Component Tab

  14. Click on [Mappings] tab and verify mapping. If dynamoDB table is new table and its empty then all input columns will be automatically mapped. If DynamoDB table is existing table then you can manually pick mapping columns by dragging it from source list to target list.
    Configure SSIS Amazon DynamoDB Destination Adapter - Mapping Tab

    Configure SSIS Amazon DynamoDB Destination Adapter – Mapping Tab

  15. Click OK to save settings.
  16. Execute the package and verify target data by going to your AWS Console under DynamoDB section.
    Execute SSIS Amazon DynamoDB Destination Adapter - Load Data

    Execute SSIS Amazon DynamoDB Destination Adapter – Load Data

  17. That’s it. Now you can go back to your AWS COnsole and verify inserted data in DynamoDB Table.
    SSIS Load Test - View DynamoDB Data in AWS Console - Record Count Check

    SSIS Load Test – View DynamoDB Data in AWS Console – Record Count Check

Insert JSON documents into DynamoDB (RAW Mode)

Previous simple approach can be used if you have simple JSON structure in DynamoDB (i.e. no nested array). Now let’s look at how to create nested JSON documents and load into DynamoDB using RAW loading approach. With this method you can create nested JSON from multiple SQL Server tables or other sources (e.g. File, Oracle, MySQL, DB2).

You can use JSON Generator Transform or Template Transform to generate JSON documents in Data flow and then load into DynamoDB. Check these article to learn more on JSON Generator use cases.

Amazon DynamoDB Data Load - Inserting Raw JSON Document
Video Tutorial – Creating Nested JSON for data loading
Below video tutorial was created for MongoDB but concepts are almost identical for DynamoDB data loading.

Read from Amazon DynamoDB in SSIS

Now lets move onto the next part – How to read data from DynamoDB in SSIS. DynamoDB is NoSQL database and it doesn’t have support for SQL Query language like any other RDBMS. But no worry.. If you using SSIS DynamoDB Source Connector then it supports custom SQL query language so it turns SQL into DynamoDB native API commands. This makes it super easy for any SQL Developer to use connector in drag and drop manner saving huge amount of time.

Let’s see how to read data from DyanmoDB Table

  1. Drag data flow

Specifying Options in WITH Clause of DynamoDB Query:

  • Specify SCAN mode using SCAN option
  • Specify SCAN mode with LIMIT of 1000 rows
  • Specify QUERY mode using QUERY option
  • Specify QUERY mode using QUERY option, and use of INDEX

Query Amazon DynamoDB with Date/Time column

There is no Datetime datatype in DynamoDB but if you are storing your date values in ISO 8601 DateTime Format as string then your query will understand automatically if you specify >, >=, <, <= or BETWEEN Filter Condition

Using SSIS variable to query DynamoDB Date/Time column

If you want to make your query dynamic then you can use SSIS variable placeholder (use quote around it). If your variable DataType is datetime then you can format it using following way so it has ISO date format. If your variable DataType is String then make sure its correctly formatted.

Click here to see various ISO date formats

Amazon DynamoDB Query Examples (DynamoDB Filter) :

 

Amazon DynamoDB Source Connector Screenshots

Amazon DynamoDB Source – Configure Connection

SSIS Amazon DynamoDB Source Connector General Settings

Amazon DynamoDB Source – Query DynamoDB using SQL language

SSIS Amazon DynamoDB Source Connector - Query DynamoDB using SQL language

Query Amazon DynamoDB – Read from Table Mode

SSIS Amazon DynamoDB Source - Read from Table

Query Amazon DynamoDB – Read from Table Mode

Amazon DynamoDB Source – Select Columns

SSIS Amazon DynamoDB Source Connector - Select Columns

Amazon DynamoDB Source – Select Columns

Query Amazon DynamoDB – Read from nested array using JSON Path

SSIS Amazon DynamoDB Source Connector - Read from nested array using JSON Path

Query Amazon DynamoDB – Read from nested array using JSON Path

Query Amazon DynamoDB – Advanced JSON Filter Options

SSIS Amazon DynamoDB Source Connector - Advanced JSON Filter Options

Query Amazon DynamoDB – Extract RAW JSON documents

SSIS Amazon DynamoDB Source Connector - Amazon DynamoDB Source - Extract RAW JSON documents

Amazon DynamoDB Source – Extract data from DynamoDB

SSIS Amazon DynamoDB Source Connector - Extract data from DynamoDB

Update Documents in DynamoDB

Now let’s look at how to Update Documents in DynamoDB. Amazon DynamoDB does not allow Bulk Update so you have to do row by row processing. You have to submit UpdateItem Command as per this document. We will update Document for matching condition and update specific columns in this example. By default Document is matched By Key unless you supply additional Condition in the command (e.g. “ConditionExpression”: “LastPostedBy = :val2” ) .

  1. Drag ZS Dummy Data Source and configure like previous section with Customers Template. You can also use OLEDB Source for read DB data.
  2. Now drag ZS Template Transform. You can enter command  like below to match record by Key and Update CompanyName and Phone fields (for demo we added -MY-NEW suffix in the data )
    If you have multi Key then use like this (e.g. CustomerID + ProductID )
    Also you can add additional condition on the top of Key JOIN criteria like below (e.g. Match Country too)
     
    Update Amazon DynamoDB Table - Use SSIS Template Transform

    Update Amazon DynamoDB Table – Use SSIS Template Transform

  3. Connect Template Transform to ZS DynamoDB Destination
  4. Now drag ZS Amazon DynamoDB Destination
    1. On first tab Select Connection
    2. On second tab Change Action to Update, Select Table as Customers
    3. On Mappings tab Map TemplateOutput column to __DOCUMENT__ column. Remove any other mappings. If you dont see columns in Target just click Refresh.
  5. Thats it now run Data flow to Update records. AFter execution you can review updated records in DynamoDB.

Call/Execute Ad-Hoc DynamoDB commands

Sometimes you have need to call ad-hoc Amazon DynamoDB commands which are different than read/write. For that you can use Amazon DynamoDB ExecuteSQL Task.

Call Amazon DynamoDB API Commands (Create/Drop Table, Get, Put)

Call Amazon DynamoDB API Commands (Create/Drop Table, Get, Put)

Conclusion

Amazon DynamoDB is one of the most popular Managed NoSQL database in cloud at fraction of cost of traditional RDBMS. Extracting/Writing/Updating data into DynamoDB can be challenge without right toolset. ZappySys AWS Connectors/Tasks makes it super easy for various AWS Integration scenarios (S3, Redshift, DynamoDB, JSON, Redshift etc). Try SSIS PowerPack for FREE to uncover many more things you can do.

 

Posted in SSIS Amazon DynamoDB Destination, SSIS Amazon DynamoDB ExecuteSQL Task, SSIS Amazon DynamoDB Src, SSIS Dummy Data Source, SSIS JSON Generator Transform, SSIS Template Transform and tagged , , , , , , , .