The Amazon Redshift Data Transfer Task can be used to bulk import data into Amazon Redshift. Source data can come from a flat file, ODBC, OLEDB, or ADO.NET source. The task uses multiple threads to upload data in parallel and can optionally compress data files to speed up the process.
Example source query: SELECT TOP 500 ROW_NUMBER() OVER (ORDER BY cu.CustomerID) AS Id, cu.*, ord.OrderID, ord.OrderDate, ord.OrderAmount FROM Customers cu, Orders ord
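The query above is only a sample; the task loads its output through the Redshift COPY command, so the table named in TargetTable must already exist in Redshift. Purely as a minimal sketch (the table name, column names, and data types below are assumptions chosen to roughly match the sample query, not something the task creates for you):

```sql
-- Hypothetical target table for the sample source query above.
-- Name, columns, and types are assumptions; adjust them to your own source data.
CREATE TABLE public.customerdata
(
    Id           BIGINT,        -- ROW_NUMBER() value generated by the source query
    CustomerID   INTEGER,
    CustomerName VARCHAR(100),  -- stand-in for the columns expanded by cu.*
    OrderID      INTEGER,
    OrderDate    TIMESTAMP,
    OrderAmount  DECIMAL(18,2)
);
```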
Note: If you want to operate on multiple files, use a wildcard pattern as shown below (when you use a wildcard pattern in the source path, the system treats the target path as a folder regardless of whether it ends with a slash). Examples: c:\data\file123.txt (single file), c:\data\file*.txt (all files starting with "file"), c:\data\subfolder\*.txt (all files with the .txt extension located under the folder "subfolder")
Property Name | Description
---|---
Action | Specifies the action you want to perform: ImportRelationalDataToRedshift (extract data from a relational source using SourceConnection and SourceQuery) or ImportLocalFilesToRedshift (upload existing local data files specified by SourcePath)
LoadOptions | Additional load options for the Redshift COPY command. Most options are available through the UI, but if something is missing and you want to set an additional option on the COPY command that runs after files are uploaded to the Amazon S3 stage area, use this property. For example, to set UTF8 encoding and turn off statistics updates, specify LoadOptions=ENCODING UTF8 STATUPDATE OFF. Check the COPY command documentation for more available options (see the example COPY command after this table).
MaxParallelThreads | Maximum number of parallel operations. By default the total thread count is CPU cores * 8.
IsSourceFilesCompressed | If you specify Action=ImportLocalFilesToRedshift and your source data files are already compressed in GZIP format, check this option. When this option is checked, the system skips the compression step before uploading. If this option is not checked, the system compresses data before uploading by default, unless you specify AlwaysCompressFilesBeforeTransfer=False.
SourceConnection | Connection manager name for source data when Action=ImportRelationalDataToRedshift is selected. Connection type can be OLEDB or ADO.NET.
SourcePath | File path or pattern for source files when Action=ImportLocalFilesToRedshift. Files have to be in CSV format using any single-character column delimiter (e.g. {\|}, {tab}, {comma}). You can use wildcard characters (* or ?) to specify multiple files (e.g. c:\data\customer*.dat.???) [* = any character(s), ? = any one character].
IncludeFilesFromSubFolders | If your SourcePath contains files in nested subfolders, check this option to scan files recursively.
SourceStagePath | Staging folder path where intermediate files are generated after extracting data from the source system. The staging folder is also used to generate compressed files before the actual upload process.
SourceQuery | Source SQL query to extract data from the relational data source. Records are streamed and then dumped to staging files. It may generate multiple data files depending on the MaxRowsPerFile and SplitFiles settings.
SourceArchiveMethod | Specifies the archive action for source files after they are processed (e.g. keep them, move them to the SourceArchiveFolder, or delete them).
SourceStageArchiveMethod | Specifies the archive action for source staging files (*.gz files created from data files are considered source stage files), e.g. keep, move to the SourceStageArchiveFolder, or delete.
SourceArchiveFolder | Archive folder path for source files. This option is only valid when SourceArchiveMethod=Move. The path can contain timestamp format specifiers such as {{YYYY}}, {{MM}} and {{DD}}. Example: c:\data\archive_data\{{YYYY}}-{{MM}}-{{DD}}
SourceStageArchiveFolder | Archive folder path for source stage files. This option is only valid when SourceStageArchiveMethod=Move. The path can contain timestamp format specifiers such as {{YYYY}}, {{MM}} and {{DD}}. Example: c:\data\archive_gzip\{{YYYY}}-{{MM}}-{{DD}}
EscapeSystemChars | When this option is specified, the backslash character (\) in input data is treated as an escape character: the character that immediately follows the backslash is loaded into the table as part of the current column value, even if it is a character that normally serves a special purpose (for example the column delimiter, a quotation mark, an embedded newline, or the escape character itself). If you specify the ESCAPE option in combination with the REMOVEQUOTES option, you can escape and retain quotation marks (' or ") that otherwise might be removed. Note: the control character 0x00 (NUL) cannot be escaped and should be removed from the input data or converted. You cannot use the ESCAPE option for FIXEDWIDTH loads, and you cannot specify the escape character itself; the escape character is always the backslash character. Also, you must ensure that the input data contains the escape character in the appropriate places. Here are some examples of input data and the resulting loaded data when the ESCAPE option is specified (the result for row 4 assumes that the REMOVEQUOTES option is also specified). The input data consists of two pipe-delimited fields: 1\|The quick brown fox\[newline] jumped over the lazy dog. 2\|A\\B\\C 3\|A \\| B \\| C 4\|'A Midsummer Night\'s Dream'. The data loaded into column 2 looks like this: The quick brown fox [newline] jumped over the lazy dog.; A\B\C; A\|B\|C; A Midsummer Night's Dream. Note: applying the escape character to the input data for a load is the responsibility of the user. One exception to this requirement is when you reload data that was previously unloaded with the ESCAPE option; in this case the data will already contain the necessary escape characters. The ESCAPE option does not interpret octal, hex, unicode, or other escape sequence notation. For example, if your source data contains the octal linefeed value (\012) and you try to load this data with the ESCAPE option, 012 is loaded into the table as a regular character rather than being interpreted as a linefeed. In order to escape newlines in data that originates from Windows platforms, you might need to use two escape characters: one for the carriage return and one for the linefeed. Alternatively, you can remove the carriage returns before loading the file (for example, by using the dos2unix utility).
TargetStageArchiveMethod | Specifies the archive action for target staging files (e.g. keep, move to the TargetStageArchiveFolder, or delete).
TargetStageArchiveFolder | Archive folder path for target stage files. This option is only valid when TargetStageArchiveMethod=Move. The path can contain timestamp format specifiers such as {{YYYY}}, {{MM}} and {{DD}}. Example: c:\data\archive_gzip\{{YYYY}}-{{MM}}-{{DD}}
TargetConnection | Target Redshift database connection.
TargetStageConnection | Amazon Storage connection for the S3 staging area. Make sure you select the correct region on the Amazon Storage connection, otherwise the final load may fail.
TargetStagePath | The Amazon S3 path where local files are uploaded before the final Redshift COPY (the COPY command reads data files from S3 storage). Example: mybucket/stage
TargetTable | Redshift table name where you want to load data. If the table name doesn't contain a schema name, the table inside the default schema (e.g. public) is used. Examples of fully qualified table names: public.customers or "public"."customerdata". Use of double quotes in the name is optional but recommended, because when object/schema names contain spaces, omitting double quotes can cause issues. If you have lots of tables, use the Schema Filter in the UI to restrict the table list to the specified schema only.
MaxRowsPerFile | Maximum number of rows in a single file before splitting data into the next file. The default is 1 million rows per file, but change this setting depending on your need; if your records are very wide, reduce this setting to make your files smaller. You always have to try out some performance settings (e.g. MaxRowsPerFile, MaxParallelThreads) to get optimal speed.
MaxRowsAllFiles | Limit on the total number of records you want to load into the Redshift database from a relational source. This option is similar to the TOP clause in SQL Server or the LIMIT clause in MySQL/PostgreSQL. This option is not valid when Action=ImportLocalFilesToRedshift.
FirstRowHasColumnNames | Indicates that the flat file contains column names in the first row.
TruncateTargetBeforeLoad | Indicates that the entire target table is emptied before the load. Also check the DoNotUseTransactionForTruncate setting.
DoNotUseTransactionForTruncate | Use this setting to truncate the target table without using a transaction, to improve truncate speed (NOTE: when you use this setting you won't be able to roll back changes if anything goes wrong during the load process).
SplitFiles | Specifies whether the source SQL query produces one big file or splits records into multiple files based on the MaxRowsPerFile setting. This option is only valid when Action=ImportRelationalDataToRedshift.
NullStringAs | Loads fields that match null_string as NULL, where null_string can be any string. This option cannot be used with numeric columns; to load NULL into numeric columns, such as INT, use an empty field. If your data includes a null terminator, also referred to as NUL (UTF-8 0000) or binary zero (0x000), COPY treats it as an end of record (EOR) and terminates the record. If a field contains only NUL, you can use NULL AS to replace the null terminator with NULL by specifying '\0' or '\000' (for example, NULL AS '\0' or NULL AS '\000'). If a field contains a string that ends with NUL and NULL AS is specified, the string is inserted with NUL at the end. Do not use '\n' (newline) for the null_string value; Amazon Redshift reserves '\n' for use as a line delimiter. The default null_string is '\N'. Note: if you attempt to load nulls into a column defined as NOT NULL, the COPY command will fail.
TreatEmptyStringAsNull | Indicates that Amazon Redshift should load empty CHAR and VARCHAR fields as NULL. Empty fields for other data types, such as INT, are always loaded with NULL. Empty fields occur when data contains two delimiters in succession with no characters between the delimiters. EMPTYASNULL and NULL AS '' (empty string) produce the same behavior.
TargetColumns | Comma-separated target column list (e.g. column1 [, column2, ...]). Specifies an optional column list to load data fields into specific columns. The columns can be in any order in the COPY statement, but when loading from flat files, such as in an Amazon S3 bucket, their order must match the order of the source data. Order does not matter when loading from an Amazon DynamoDB table. Any columns omitted from the column list are assigned either the defined DEFAULT expression or NULL if the omitted column is nullable and has no defined DEFAULT expression. If an omitted column is NOT NULL but has no defined DEFAULT expression, the COPY command fails. If an IDENTITY column is included in the column list, then EnableIdentityColumnInsert must also be specified; if an IDENTITY column is omitted, then EnableIdentityColumnInsert cannot be specified. If no column list is specified, the command behaves as if a complete, in-order column list was specified (with IDENTITY columns omitted if EnableIdentityColumnInsert was also not specified).
DoNotDeleteStageFilesOnCopyFailure | Specifies whether to keep stage files even though the archive method is set to Delete. This option is useful if you want to manually issue the COPY command after a failure and save the processing time of extract, compress, and upload.
TreatBlankStringAsNull | Loads blank fields, which consist of only white space characters, as NULL. This option applies only to CHAR and VARCHAR columns. Blank fields for other data types, such as INT, are always loaded with NULL. For example, a string that contains three space characters in succession (and no other characters) is loaded as a NULL. The default behavior, without this option, is to load the space characters as is.
DelimiterForColumn | Single ASCII character that is used to separate fields in the input file, such as a pipe character (\|), a comma (,), or a tab (\t). Non-printing ASCII characters are supported. ASCII characters can also be represented in octal, using the format '\ddd', where 'd' is an octal digit (0-7). The default delimiter is a pipe character (\|), unless the CSV option is used, in which case the default delimiter is a comma (,).
QuotesAroundValue | Surrounding quotation marks are removed from strings in the incoming data. All characters within the quotation marks, including delimiters, are retained. If a string has a beginning single or double quotation mark but no corresponding ending mark, the COPY command fails to load that row and returns an error.
DateFormat | If no date format is specified, the default format is 'YYYY-MM-DD'. If you want Amazon Redshift to automatically recognize and convert the date format in your source data, specify 'auto'. The date format can include time information (hour, minutes, seconds), but this information is ignored. The AS keyword is optional. For more information, see DATEFORMAT and TIMEFORMAT Strings.
AllowInvalidCharacters | Enables loading of data into VARCHAR columns even if the data contains invalid UTF-8 characters. When AllowInvalidCharacters is specified, COPY replaces each invalid UTF-8 character with a string of equal length consisting of the character specified by the replacement character (see ReplaceInvalidCharacterWith). For example, if the replacement character is '^', an invalid three-byte character will be replaced with '^^^'. The replacement character can be any ASCII character except NULL. The default is a question mark (?). For information about invalid UTF-8 characters, see Multibyte Character Load Errors. COPY returns the number of rows that contained invalid UTF-8 characters, and it adds an entry to the STL_REPLACEMENTS system table for each affected row, up to a maximum of 100 rows per node slice. Additional invalid UTF-8 characters are also replaced, but those replacement events are not recorded. If ACCEPTINVCHARS is not specified, COPY returns an error whenever it encounters an invalid UTF-8 character. ACCEPTINVCHARS is valid only for VARCHAR columns.
ReplaceInvalidCharacterWith | When AllowInvalidCharacters is checked, you can specify which character to use as the replacement character for any invalid characters found.
MaxAllowedErrorsBeforeFail | If the load returns this number of errors or greater, the load fails. If the load returns fewer errors, it continues and returns an INFO message that states the number of rows that could not be loaded. Use this option to allow loads to continue when certain rows fail to load into the table because of formatting errors or other inconsistencies in the data. Set this value to 0 or 1 if you want the load to fail as soon as the first error occurs. The actual number of errors reported might be greater than the specified MaxAllowedErrorsBeforeFail because of the parallel nature of Amazon Redshift: if any node in the Amazon Redshift cluster detects that MaxAllowedErrorsBeforeFail has been exceeded, each node reports all of the errors it has encountered.
TestLoad | Checks the validity of the data file without actually loading the data. Use the TestLoad option to make sure that your data file will load without any errors before running the actual data load. Running COPY with the NOLOAD option is much faster than loading the data since it only parses the files.
NumberOfInitialRowsToSkip | Treats the specified number of rows as a file header and does not load them. Use this option to skip file headers in all files in a parallel load.
IgnoreInvalidDateFormat | Allows any date format, including invalid formats such as 00/00/00 00:00:00, to be loaded without generating an error. Applies only to TIMESTAMP and DATE columns. Always use this option with the DateFormat option. If the date format for the data does not match the DateFormat specification, Amazon Redshift inserts a NULL value into that field.
TruncateStringToFitTarget | Truncates data in columns to the appropriate number of characters so that it fits the column specification. Applies only to columns with a VARCHAR or CHAR data type, and rows 4 MB or less in size.
RoundDecimalRatherThanTruncate | Rounds up numeric values when the scale of the input value is greater than the scale of the column. By default, COPY truncates values when necessary to fit the scale of the column. For example, if a value of 20.259 is loaded into a DECIMAL(8,2) column, COPY truncates the value to 20.25 by default. If RoundDecimalRatherThanTruncate is specified, COPY rounds the value to 20.26. The INSERT command always rounds values when necessary to match the column's scale, so a COPY command with the RoundDecimalRatherThanTruncate option behaves the same as an INSERT command.
EnableIdentityColumnInsert | Use EnableIdentityColumnInsert with tables that have IDENTITY columns if you want to override the auto-generated values with explicit values from the source data files. If the command includes a column list, that list must include the IDENTITY columns to use this option. The data format for EnableIdentityColumnInsert values must match the IDENTITY format specified by the CREATE TABLE definition.
IgnoreBlankLines | Ignores blank lines that only contain a line feed in a data file and does not try to load them.
TrimTrailingBlank | Removes trailing whitespace characters from a VARCHAR string. Only applicable to columns with a VARCHAR data type.
AlwaysCompressFilesBeforeTransfer | When this option is specified, files are compressed to *.gz (GZIP) format. Compressing files saves network bandwidth, speeds up uploads significantly, and saves data transfer cost (Default: True).
UpdateStatisticsAfterLoad | Governs automatic computation and refresh of optimizer statistics at the end of a successful COPY command. By default, if the STATUPDATE option is not used, statistics are updated automatically if the table is initially empty (see also Analyzing Tables). Whenever ingesting data into a nonempty table significantly changes the size of the table, we recommend updating statistics either by running an ANALYZE command or by using the STATUPDATE ON option. With STATUPDATE ON (or TRUE), statistics are updated automatically regardless of whether the table is initially empty. If STATUPDATE is used, the current user must be either the table owner or a superuser. If STATUPDATE is not specified, only INSERT permission is required. With STATUPDATE OFF (or FALSE), statistics are never updated.
AllowMissingTrailingColumns | Allows data files to be loaded when contiguous columns are missing at the end of some of the records. The missing columns are filled with either zero-length strings or NULLs, as appropriate for the data types of the columns in question. If the EMPTYASNULL option is present in the COPY command and the missing column is a VARCHAR column, NULLs are loaded; if EMPTYASNULL is not present and the column is a VARCHAR, zero-length strings are loaded. NULL substitution only works if the column definition allows NULLs.
LoggingMode | Determines how much information is logged during package execution. Set LoggingMode to Debugging for maximum logging.
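Most of the properties above are simply surfaced options of the Redshift COPY command that the task builds and executes once the files reach the S3 stage area. Purely as an illustration (the bucket, manifest name, table, and credential placeholders below are made up, and the exact command the task generates depends on your settings), a COPY command combining several of the options described above might look like this:

```sql
-- Illustrative COPY command only; placeholders below are assumptions.
-- Comments note which task property each option roughly corresponds to.
COPY "public"."customerdata"
FROM 's3://mybucket/stage/customerdata.manifest'
CREDENTIALS 'aws_access_key_id=<key>;aws_secret_access_key=<secret>'
MANIFEST                       -- load the files listed in the manifest
GZIP                           -- AlwaysCompressFilesBeforeTransfer
DELIMITER '|'                  -- DelimiterForColumn
NULL AS '\N'                   -- NullStringAs
DATEFORMAT 'auto'              -- DateFormat (automatic recognition)
IGNOREHEADER 1                 -- FirstRowHasColumnNames / NumberOfInitialRowsToSkip
ACCEPTINVCHARS '?'             -- AllowInvalidCharacters / ReplaceInvalidCharacterWith
MAXERROR 10                    -- MaxAllowedErrorsBeforeFail
STATUPDATE ON                  -- UpdateStatisticsAfterLoad
REGION 'us-west-2';            -- region of the S3 staging bucket
```

Anything not exposed as a dedicated property can be appended through LoadOptions. The sample execution log below shows the actual command the task generated for one run.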
SSIS package "TestAll.dtsx" starting.
LICENSE Found: You are running Standard Edition. License for[Company=BW,User=ZappySys].
Exporting data from sql query to ...[C:\datafiles]
Executing query - select top 500 ROW_NUMBER()Over(order by a.CustomerID) Id , a.*,b.*,c.OrderID,c.OrderDate,c.Freight from customers a,products b,orders c
Writing data to : C:\datafiles\public_customerdatabQCW_xcx0E6KdOIts4YZdQ.txt.001.part
Total 500 records written to : C:\datafiles\public_customerdatabQCW_xcx0E6KdOIts4YZdQ.txt.001.part
Export data completed. Took 0.115 seconds for 500 records, 1 files
Compressing data files...[Stage path: C:\datafiles]
Compress data files completed. Took 0.017 seconds
Uploading data files to S3 using total [32] threads...
Creating manifest file
Uploading Stream ...[119 bytes] public_customerdatabQCW_xcx0E6KdOIts4YZdQ.manifest
Transferred 100% (119 bytes)
New file created at bwtest-west2/public_customerdatabQCW_xcx0E6KdOIts4YZdQ.manifest, Size= 119 bytes
Uploading C:\datafiles\public_customerdatabQCW_xcx0E6KdOIts4YZdQ.txt.001.part.gz ...[8.63 KB]
C:\datafiles\public_customerdatabQCW_xcx0E6KdOIts4YZdQ.txt.001.part.gz Transferred 100% (8837 bytes)
Deleted local file C:\datafiles\public_customerdatabQCW_xcx0E6KdOIts4YZdQ.txt.001.part.gz because DeleteWhenDone option is enabled
Transfer Summary:
  Total items : 1
  Successful : 1
  Failed : 0
  Canceled : 0
  Skipped : 0
  Time taken : 00:00:00.497
  Max threads : 4
  Transferred : 8.63 KB
  Speed : 17.4 Kbps
Upload data files completed. Took 1.228 seconds
Truncating target table "public"."customerdata" because TruncateTargetBeforeLoad option is turned on
Starting redshift bulk insert...
Executing following command: COPY "public"."customerdata" FROM 's3://bwtest-west2/public_customerdatabQCW_xcx0E6KdOIts4YZdQ.manifest' credentials 'aws_access_key_id=xxxxxxx;aws_secret_access_key=yyyyyyyy' GZIP DELIMITER '|' DATEFORMAT 'YYYY-MM-DD HH24:MI:SS' TIMEFORMAT 'YYYY-MM-DD HH24:MI:SS' IGNOREHEADER 1 MANIFEST REGION 'us-west-2'
Completed redshift bulk insert. Took 2.823 seconds
Loaded total 500 rows, 1 files, total 501 lines scanned
Archiving cloud storage stage files...
Successfully deleted total 2 item(s)
Deleted cloud storage files because Archive method is set to Delete files
Cloud storage stage files archived. Took 0.672 seconds
RedShift Table loaded: UploadData data took total 5.1182928 sec
==========================================================
Summary: Data Transfer took total .....5.12 sec.
  Source file creation .........0.12 sec. 2.3% of total time
  compress file ................0.02 sec. 0.4% of total time
  Upload file ..................1.23 sec. 24.0% of total time
  truncate target table ........0.14 sec. 2.7% of total time
  Load data to target table ....2.94 sec. 57.4% of total time
  Archive target stage files ...0.67 sec. 13.1% of total time
==========================================================
  Total files uploaded .........1 (0.01 MB)
  Actual files loaded ..........1
  Total records loaded .........500
SSIS package "TestAll.dtsx" finished: Success.
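After a run, the load can also be verified on the Redshift side. As a small sketch (these are standard Redshift system tables, not something specific to this task), the following queries list recent COPY errors and, when AllowInvalidCharacters/ACCEPTINVCHARS was used, the replacement events recorded in STL_REPLACEMENTS:

```sql
-- Recent load errors reported by COPY (most recent first).
SELECT starttime, filename, line_number, colname, err_reason
FROM stl_load_errors
ORDER BY starttime DESC
LIMIT 20;

-- Rows where invalid UTF-8 characters were replaced (see AllowInvalidCharacters).
SELECT *
FROM stl_replacements
LIMIT 20;
```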