import {Db|File} [--enc-key ENCRYPTION_KEY_PATH] [--load-options INPUT] [--maxthreads INPUT] [--source-compressed] [--source-driver {ADONET_MSSQL|ODBC|OLEDB|ZS_REDSHIFT}] [--source-path INPUT] [--source-recursive] [--source-stagepath INPUT] [--source-query INPUT] [--archivemethod {Delete|Move}] [--source-archivemethod {Delete|Move}] [--source-stage-archivemethod {Delete|Move}] [--source-archivefolder INPUT] [--source-stage-archivefolder INPUT] [--escapechar] [--target-stage-archivemethod {Delete|Move}] [--target-stage-archivefolder INPUT] [--target-stagepath INPUT] [--target-table INPUT] [--no-split] [--maxrows-perfile INPUT] [--maxrows-all INPUT] [--logfile INPUT] [--firstrow-has-colnames] [--target-truncate] [--target-truncate-fast] [--nullas INPUT] [--treat-empty-asnull] [--target-columns INPUT] [--cleanup-onerror] [--treat-blank-asnull] [--coldelimiter INPUT] [--quotedvalue] [--dateformat INPUT] [--allow-invalidchar] [--invalidchar-replacement INPUT] [--maxerrors INPUT] [--testload] [--rows-to-skip INPUT] [--ignore-invalid-date] [--truncate-to-fit] [--roundec] [--identinsert] [--ignore-blank] [--trimend] [--no-compress] [--updatestats] [--allow-missing-cols] [--source-connstr INPUT] [--target-connstr INPUT] [--accesskey YOUR_ACCESS_KEY] [--secretkey YOUR_SECRET_KEY] [--serviceurl INPUT] [--region INPUT] [--consoleout FILEPATH] [-v|--verbose] [-w|--wait] [-i|--interactive]
Parameter | Description |
---|---|
<SOURCE_TYPE> | Type of source from which you want to read data. Valid values are Db or File. If input files are already compressed in gz format, also specify the --source-compressed option. |
--enc-key ENCRYPTION_KEY_PATH | Specifies the AES 256 encryption key file path. This encryption key is used to encrypt data before uploading to S3. Data is decrypted automatically by Redshift while loading S3 files into the target table. To create a new encryption key, use the mkkey command. |
--load-options | Additional load options for the Redshift COPY command |
--maxthreads | Maximum number of parallel threads |
--source-compressed | Specifies whether the source file(s) are already compressed |
--source-driver | Driver type (e.g. ADONET_MSSQL, ODBC, OLEDB, ZS_REDSHIFT) |
--source-path | Source file(s) path (e.g. c:\cust*.txt or c:\cust001.txt) |
--source-recursive | Include source files from subdirectories |
--source-stagepath | Source stage path |
--source-query | Source data query |
--archivemethod | Default archive method for source and target stage files |
--source-archivemethod | Archive method for source file(s) |
--source-stage-archivemethod | Archive method for source stage file(s) |
--source-archivefolder | Archive folder path for source file(s) |
--source-stage-archivefolder | Archive folder path for source stage file(s) |
--escapechar | Allow escape sequence characters |
--target-stage-archivemethod | Archive method for target stage file(s) |
--target-stage-archivefolder | Archive folder path for target stage file(s) |
--target-stagepath | Target staging folder path |
--target-table | Target table where you want to load data |
--no-split | Do not split data into multiple files. If you don't specify this option, data is exported to the stage as multiple files (rows per file can be set using the --maxrows-perfile option; the default is 1000000 rows per file). |
--maxrows-perfile | Maximum rows per file when file splitting is enabled |
--maxrows-all | Maximum rows to load from the input dataset |
--logfile | Log file path |
--firstrow-has-colnames | First row of the data file contains column names |
--target-truncate | Truncate the target table before data load, using a transaction with rollback support |
--target-truncate-fast | Truncate the target table before data load without using a transaction (not recommended; the delete can't be rolled back if the data load fails) |
--nullas | NULL indicator string in the data |
--treat-empty-asnull | Treat empty strings as NULL |
--target-columns | List of columns to load in the target table (input data must be in the same order) |
--cleanup-onerror | Always make sure stage files are cleaned up when the archive method is set to Delete. When this option is not set, files are left in the stage area if an error occurs. |
--treat-blank-asnull | Treat blank strings (empty or spaces only) as NULL |
--coldelimiter | Field separator character for data files (NOTE: use \t for tab) |
--quotedvalue | Wrap quotes around values in data files |
--dateformat | Date format in the data file (e.g. YYYY-MM-DD HH24:MI:SS.MI) |
--allow-invalidchar | Allow invalid UTF-8 characters in input data files |
--invalidchar-replacement | Replacement character for invalid characters |
--maxerrors | Maximum allowed errors before failing the data load |
--testload | Load data in test mode. This option validates the full load without actually loading the data |
--rows-to-skip | Number of initial rows to skip from the data file |
--ignore-invalid-date | Ignore invalid dates |
--truncate-to-fit | Truncate extra characters to fit the target column |
--roundec | Round decimal values rather than truncating them |
--identinsert | Enable identity insert |
--ignore-blank | Ignore blank lines in the input data |
--trimend | Trim trailing blanks |
--no-compress | Do not compress files before transfer. If this option is not specified, files are always compressed |
--updatestats | Update statistics after load |
--allow-missing-cols | Allow missing columns at the end of rows in the source file |
--source-connstr | Source DB connection string |
--target-connstr | Target DB connection string |
--accesskey YOUR_ACCESS_KEY | Cloud account access key |
--secretkey YOUR_SECRET_KEY | Cloud account secret key |
--serviceurl | Service URL for the Amazon S3 service. Leave blank to route requests to the default endpoint. |
--region | Region of the S3 bucket used as the staging area for Redshift. Specify this option if the S3 bucket is in a different region than your Redshift cluster. |
--consoleout FILEPATH | Captures console output and saves it to a file |
-v, --verbose | Log verbosity level. -v means detailed logging; if you don't specify this option, minimal logging is done. |
-w, --wait | Wait for user input before exit |
-i, --interactive | Continue in interactive mode after this command completes. This switch is not applicable if you are already in interactive mode. |
Examples
---
Import SQL Server records to Redshift Cluster using minimum default options

import Db --source-driver ADONET_MSSQL --source-query "select ROW_NUMBER()Over(order by a.CustomerID) Id, a.*,b.*,c.OrderID,c.OrderDate,c.Freight from customers a,products b,orders c" --source-stagepath "c:\redshift\stage" --target-table "customerdata" --target-stagepath "bw-rstest/cmdstage" --logfile "c:\redshift\log.txt" --region us-east-1 --accesskey "AKIA*****************" --secretkey "lPi+XQ************************" --source-connstr "Data Source=localhost;Initial Catalog=Northwind;Integrated Security=SSPI;" --target-connstr "Host=mytestcluster-1.csu********.us-east-1.redshift.amazonaws.com;Port=5439;Database=dev;UserName=masteruser;Password=*********;EnableSsl=true;Timeout=30;CommandTimeout=3600;"
Import SQL Server records to Redshift Cluster - With Encryption turned on

#--------------------------------------
#Step-1 : Create your encryption key (this step is only required the first time; once the encryption key is generated you can reuse it)
#--------------------------------------
mkkey "c:\private.key" --aes
#--------------------------------------
#Step-2 : Load your data to Redshift
#--------------------------------------
import Db --enc-key "c:\private.key" --source-driver ADONET_MSSQL --source-query "select ROW_NUMBER()Over(order by a.CustomerID) Id, a.*,b.*,c.OrderID,c.OrderDate,c.Freight from customers a,products b,orders c" --source-stagepath "c:\redshift\stage" --target-table "customerdata" --target-stagepath "bw-rstest/cmdstage" --logfile "c:\redshift\log.txt" --region us-east-1 --accesskey "AKIA*****************" --secretkey "lPi+XQ************************" --source-connstr "Data Source=localhost;Initial Catalog=Northwind;Integrated Security=SSPI;" --target-connstr "Host=mytestcluster-1.csu********.us-east-1.redshift.amazonaws.com;Port=5439;Database=dev;UserName=masteruser;Password=*********;EnableSsl=true;Timeout=30;CommandTimeout=3600;"

Import SQL Server records to Redshift Cluster

import Db --source-driver ADONET_MSSQL --source-query "select ROW_NUMBER()Over(order by a.CustomerID) Id, a.*,b.*,c.OrderID,c.OrderDate,c.Freight from customers a,products b,orders c" --source-archivemethod None --source-stage-archivemethod Delete --target-stage-archivemethod Delete --source-stagepath "c:\redshift\stage" --target-table "customerdata" --target-stagepath "bw-rstest/cmdstage" --target-truncate --logfile "c:\redshift\log.txt" --maxrows-perfile 100000 --region us-east-1 --accesskey "AKIA*****************" --secretkey "lPi+XQ************************" --source-connstr "Data Source=localhost;Initial Catalog=Northwind;Integrated Security=SSPI;" --target-connstr "Host=mytestcluster-1.csu********.us-east-1.redshift.amazonaws.com;Port=5439;Database=dev;UserName=masteruser;Password=*********;EnableSsl=true;Timeout=30;CommandTimeout=3600;"

Import local flat files to Redshift cluster

import File --source-path "c:\data\files\*.part" --source-archivemethod None --source-stage-archivemethod Delete --target-stage-archivemethod Delete --source-stagepath "c:\redshift\stage" --target-table "customerdata" --target-stagepath "bw-rstest/cmdstage" --target-truncate --logfile "c:\redshift\log.txt" --maxrows-perfile 100000 --region us-east-1 --accesskey "AKIA*****************" --secretkey "lPi+XQ************************" --target-connstr "Host=mytestcluster-1.csu********.us-east-1.redshift.amazonaws.com;Port=5439;Database=dev;UserName=masteruser;Password=*********;EnableSsl=true;Timeout=30;CommandTimeout=3600;"

Import local compressed gzip files to Redshift cluster

import File --source-compressed --source-path "c:\data\gzip\*.gz" --source-archivemethod None --target-stage-archivemethod Delete --target-table "customerdata" --target-stagepath "bw-rstest/cmdstage" --target-truncate --coldelimiter "|" --logfile "c:\redshift\log.txt" --maxrows-perfile 100000 --region us-east-1 --accesskey "AKIA*****************" --secretkey "lPi+XQ************************" --target-connstr "Host=mytestcluster-1.csu********.us-east-1.redshift.amazonaws.com;Port=5439;Database=dev;UserName=masteruser;Password=*********;EnableSsl=true;Timeout=30;CommandTimeout=3600;"
Import data from any ODBC data source to Redshift cluster. Use tab for column separator

import Db --source-driver ODBC --source-query "select ROW_NUMBER()Over(order by a.CustomerID) Id, a.*,b.*,c.OrderID,c.OrderDate,c.Freight from customers a,products b,orders c" --source-archivemethod None --source-stage-archivemethod Delete --target-stage-archivemethod Delete --source-stagepath "c:\redshift\stage" --target-table "customerdata" --target-stagepath "bw-rstest/cmdstage" --target-truncate --coldelimiter "\t" --logfile "c:\redshift\log.txt" --maxrows-perfile 100000 --region us-east-1 --accesskey "AKIA*****************" --secretkey "lPi+XQ************************"
## Approach-1: Below syntax is for ODBC connectivity using the DSN approach
## --source-connstr "Dsn=MyDsn;Uid=MyUser1;Pwd=MyP@ssword;"
## Approach-2: Below syntax is for ODBC connectivity using the DSN-less approach. Refer to the vendor's connection string documentation for valid driver names/attributes
--source-connstr "Driver={SQL Server Native Client 10.0};Server=localhost;Database=Northwind;Trusted_Connection=yes;"
## Adjust CommandTimeout for long running loads. Default is 1 hour for command timeout and 30 seconds for connect timeout
--target-connstr "Host=mytestcluster-1.csu********.us-east-1.redshift.amazonaws.com;Port=5439;Database=dev;UserName=masteruser;Password=*********;EnableSsl=true;Timeout=30;CommandTimeout=3600;"
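
The following sketch is not from the original example set; it simply combines flags documented in the parameter table (--firstrow-has-colnames, --nullas, --treat-blank-asnull, --dateformat, --maxerrors, --testload) into a dry-run load. The file path, bucket name, keys and connection string are placeholders to replace with your own values.

Validate a local CSV file against the target table in test mode (no rows are actually loaded)

import File --source-path "c:\data\csv\*.csv" --firstrow-has-colnames --coldelimiter "," --nullas "NULL" --treat-blank-asnull --dateformat "YYYY-MM-DD HH24:MI:SS" --maxerrors 10 --testload --source-stagepath "c:\redshift\stage" --target-table "customerdata" --target-stagepath "my-bucket/cmdstage" --logfile "c:\redshift\log.txt" --region us-east-1 --accesskey "YOUR_ACCESS_KEY" --secretkey "YOUR_SECRET_KEY" --target-connstr "Host=your-cluster.xxxxxxxxxx.us-east-1.redshift.amazonaws.com;Port=5439;Database=dev;UserName=masteruser;Password=*********;EnableSsl=true;Timeout=30;CommandTimeout=3600;"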
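
Likewise, the sketch below (again with placeholder paths, bucket and credentials) illustrates the archive flags from the table: processed source files are moved to an archive folder with --source-archivemethod Move and --source-archivefolder, while local and S3 stage files are deleted after a successful load.

Move processed source files to an archive folder after loading

import File --source-path "c:\data\files\*.txt" --source-archivemethod Move --source-archivefolder "c:\data\archive" --source-stage-archivemethod Delete --target-stage-archivemethod Delete --source-stagepath "c:\redshift\stage" --target-table "customerdata" --target-stagepath "my-bucket/cmdstage" --logfile "c:\redshift\log.txt" --region us-east-1 --accesskey "YOUR_ACCESS_KEY" --secretkey "YOUR_SECRET_KEY" --target-connstr "Host=your-cluster.xxxxxxxxxx.us-east-1.redshift.amazonaws.com;Port=5439;Database=dev;UserName=masteruser;Password=*********;EnableSsl=true;Timeout=30;CommandTimeout=3600;"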