Introduction
In our previous post you learned how to load data into Redshift using SSIS. Now in this post you will learn how to load data into Redshift using Informatica PowerCenter. For PowerCenter we will use ZappyShell Command line for Redshift Data Load. This small powerful command line utility can handle load of several millions or billions records in few minutes. It uses parallel execution engine which takes care most tedious steps of loading data into Amazon Redshift.
Using informatica command task to load data into redshift
Informatica PowerCenter has simple task to execute command lines. you can use this task to execute ZappyShell command line. Perform following steps to load data.
- Download and install ZappyShell for Amazon Redshift from here
- Now you ready to data into Amazon Redshift
- Drag new command task into your Informatica workflow
- Enter below command to load data from SQL Server to Redshift. If you have ODBC connectivity then use ODBC DSN to read your data.
- You can use script file approach too where each command line parameter can be nicely wrapped in new line.
Here is sample command you can try to load data from SQL Server to Redshift. if you have ODBC connectivity then use ODBC DSN to read your data.
1234567891011121314151617c:\zappyshell\aws.exe import Db--source-driver ADONET_MSSQL--source-query "select ROW_NUMBER()Over(order by a.CustomerID) Id, a.*,b.*,c.OrderID,c.OrderDate,c.Freight from customers a,products b,orders c"--source-archivemethod None--source-stage-archivemethod Delete--target-stage-archivemethod Delete--source-stagepath "c:\redshift\stage"--target-table "customerdata"--target-stagepath "bw-rstest/cmdstage"--target-truncate--logfile "c:\redshift\log.txt"--maxrows-perfile 100000--region us-east-1--accesskey "AKIA*****************"--secretkey "lPi+XQ************************"--source-connstr "Data Source=localhost;Initial Catalog=Northwind;Integrated Security=SSPI;"--target-connstr "Host=mytestcluster-1.csu********.us-east-1.redshift.amazonaws.com;Port=5439;Database=dev;UserName=masteruser;Password=*********;EnableSsl=true;Timeout=30;CommandTimeout=3600;" - For complete help on import command check help file //zappysys.com/onlinehelp/zappyshell/scr/aws/aws-redshift-import.htm
ZappyShell Command Line for Redshift
Below are few feature for Command line for Amazon Redshift
◾Import data to AWS Redshift database from files or relational source (e.g. MySQL, Oracle, SQL Server)
◾Import huge amount of data (millions of rows) in few minutes with parallel load techniques
◾Load local flat files to Redshift in just single line command (option to compress data files to *.gz to speed up transfer)
◾Support for Client side encryption using AES 256 Key
◾Load data from any data source (ODBC,ADO.net or OLEDB) using sql query
◾Import compressed data files (*.gz) to redshift
◾Archive files, error reporting, file splitting, many other features