SSIS Azure Storage Task (Blob Storage)
PreviousNext

Azure Blob Storage Task can be used to perform various operations with Azure Storage objects (blobs and containers) (e.g. Download, Upload, Copy, Move, Rename, Delete etc). You can also get single property of any Azure blob or get list of blobs as ADO.net Table and which can be easily loop through using ForEachLoop Task. Most of operations in this task support recursive search using wild card pattern (e.g. Myfile*.txt). Azure Cloud Storage service is one of the most popular Cloud Storage Services available right now.

Download SSIS PowerPack

Content

Video Tutorial

Step-By-Step

In this section you will learn how to download blob from Azure Storage using Azure Storage Task.
  1. In order to connect to Azure Storage Service from SSIS you will need Storage Account Name and Access Key. Ask your SysAdmin or responsible person to provide that information to you. Here is sample Account Name and Access Key(this is just example key which may differ in your case)
    Account Name: mystorageaccount
    Access Key: Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==

    Microsoft also provides offline simulator to test Azure Storage functionality. This is great way to try out Azure Storage functionality without worrying about Cost but we strongly recommend you to test everything with Azure Cloud Account before going to Production.
    Click here to read more how to get your Storage Account Name and Access Key


    If you don't have Azure Storage account then its easy to get Trial Azure Storage account or use your MSDN to get credit each month which will be more than enough for real testing. If you prefer to avoid all this hassle then Microsoft provides another great way to test Azure Storage functionality in totally offline mode on your local machine. You can just download Azure Storage Emulator and start testing. Click here to learn more how to configure Azure Storage Emulator for Testing and Development purpose.

  2. Once you have Account Name (its like UserID) and Access Key (its like password) you may proceed to create new SSIS package. In BIDS/SSDT create new SSIS package
  3. From toolbox drag Azure Storage Task on the designer surface.
    configure-ssis-azure-storae-task
  4. Double click on the task to configure it.
  5. From the Storage Action dropdown select DownloadFileFromAzure (This is default option).
  6. Under Source Connection click [New] to create S3 Connection Manager.
    configure-ssis-azure-storae-task
  7. On Connection Manager UI enter your Azure S3 account Access Key and Secret Key and leave all other options as default. Click Test Connection and if test successful then hit OK to close the connection manager dialogbox.
    configure-ssis-azure-storae-task
  8. Now on Task UI browse source file you want to download. To browse Azure S3 file click on browse button this will launch S3 File Picker dialogbox. After file is selected click OK.
    Note: If you want to download multiple files then use wild card pattern as below (when you use wild card pattern in source path then system will treat target path as folder regardless you end with slash )
    Examples:
    mybucket/myfile123.txt (single file)
    mybucket/myfile*.txt (all files starting with myfile)
    mybucket/mysubfolder1/myfile*.txt (all files starting with myfile and located under folder mysubfolder1)

    configure-ssis-azure-storae-task
  9. Once source file is selected browse target path where you want to place downloaded file. You may type folder path or exact file name you want to assign to downloaded file.
    configure-ssis-azure-storae-task
  10. Once Source and Target path configured you may run the package to test. That's it you downloaded S3 file to local machine without learning any Complex APIs.
    configure-ssis-azure-storae-task configure-ssis-azure-storae-task

Properties

Property Name Description
StorageAction This property determines what action needs to be performed on Azure Storage object(s).

Available Options

  • DownloadFileFromAzure : Download Azure Storage blob(s) to local machine (Wild card pattern is supported). File Transfer is done using multiple threads which can be configured using MaxParallelThreads property of Azure Storage Connection (From Connection Manager UI)

  • UploadFileToAzure : Upload local file(s) to Azure Storage (Wild card pattern is supported). File Transfer is done using multiple threads which can be configured using MaxParallelThreads property of Azure Storage Connection (From Connection Manager UI)

  • DeleteAzureFile : Delete Azure Storage blob(s) (Wild card pattern is supported)

  • DeleteAzureContainer : Delete single Azure Storage Container

  • CopyAzureFileToAzure : Copy Azure Storage blob(s) to different Azure Storage Container or different subfolder (Wild card pattern is supported).
    Note: In current version source and target Azure Storage Account must be same for Copy/Move operation. However in the future we may add support to copy/move between different Storage Accounts.

  • MoveAzureFileToAzure : Move Azure Storage file(s) to different Storage Container or different subfolder (Wild card pattern is supported).
    Note: In current version source and target Storage Account must be same for Copy/Move operation. However in the future we may add support to copy/move between different Storage Accounts.

  • RenameAzureFile : Rename single Azure Storage file

  • CreateNewAzureFile : Create new Azure Storage file and write content from direct value or user defined SSIS variable at runtime.

  • CreateAzureContainer : Create new Azure Storage Container. Throw error if Container already exists.

  • CreateAzureContainerIfNotExists : Create new Azure Storage Container only if its not found otherwise Task completes without error.

  • GetAzureFileSize : Returns file(s) size as Int64 and stores in ResultSet Variable. Use wild card pattern to get total file size of multiple files using recursive option.

  • GetAzureFileExistsFlag : Returns True if specified file found else return false.

  • GetAzureFileCount : Returns count of total file(s) matching with wild card pattern.

  • GetAzureFileLastModifiedDate : Returns date when file was last modified (Date is in Utc so to convert to local time add timezone difference).

  • GetAzureFileContentType : Returns content type of specified file

  • GetAzureFileMetadata : returns metadata of specified file in the following format
    <MetaData><User>SamW</User><Host>BW001-DEV</Host></MetaData>

  • GetAzureFileDataAsString : Returns file content as string and stores to result variable

  • GetAzureFileDataAsBinary : Returns file content as binary array and stores to result variable (Result variable must be set to Object datatype)

  • GetAzureContainerExistsFlag : Returns True if specified Container exists else returns False and result is stored into result variable

  • GetAzureFileListAsADONETDataTable : Returns DataTable with the following information about one or more file(s). You can loop through this record set using Native ForEachLoop Task (Make sure you select [ADO.net dataset only] option as below)

    ssis-foreachloop-azure-storage-container

    ssis-foreachloop-azure-storage-container

    ssis-foreachloop-azure-storage-blobs-files

    ssis-foreachloop-azure-storage-blobs-files

  • GetAzureContainerListAsADONETDataTable : Returns DataTable with following information about one or more Container(s). You can loop through this record set using Native ForEachLoop Task (Make sure you select [ADO.net dataset only] option as below)

    ssis-foreachloop-azure-storage-container

    ssis-foreachloop-azure-storage-container

Permission This property determines what permission is applied on selected container path. This option is only applicable for SetContainerPermission action.

Available Options

  • PrivateContainer (default): Container is only accessible by owner.

  • PublicContainerWithReadAccess : public read access for any blob using Full Uri in this container but you cant list all blobs.

  • PublicContainerWithReadAndListAccess : public read access for any blob and list all blobs in this container.

AutoSetContentType When this option is True (default is false) it will automatically determines content type of the file during Upload, Create, Copy and Move operation.Default content type of any new file is application/octet-stream
UserDefinedContentTypes If content type is not correctly set for certain file extension then you may provide your own list of content types. It has to be in the following format. Each content type in new line
*.csv=text/plain
*.dtsx=text/plain
*.dtsconfig=text/plain
.....
.....
DefaultContentType By default every new file on Azure Storage is set to content type application/octet-stream. If this is not what you want as default content type then you may provide UserDefinedContentTypes or just set this property to your own content type. System set content type in the following order

1. Check if UserDefinedContentTypes is set (if extension of file being processed match with user defined content type list then set file content Type)
2. Check if AutoSetContentType is set (if True then Lookup Predefined list and set file content Type)
3. Check if Check DefaultContentType is set (if DefaultContentType is set then use it for file being processed)
ThrowErrorIfSourceNotFound If this option is set to True (default is true) then when source path not found on Azure Storage exception is thrown. If this option is False then no error is thrown when file not found.
DeleteSourceWhenDone If this option is set to True (default is false) then after successful upload/download operation source file is deleted. This option is applicable only for Upload and Download actions.
MetadataKeyValue For any new file you may specify metadata in Key/Value pair from Metadata Tab of task UI. If you want to enter from expression then use the following syntax for multiple Key/value pairs of metadata. This option is ignored if EnableMetadataUpdate=False (see next option).
<MetaData><User>SamW</User><Host>BW001-DEV</Host></MetaData>
EnableMetadataUpdate If this option is True (default is False) then MetadataKeyValue property is processed (see previous option)
SourceStorageConnection Connection Name for Source S3 Connection (Not applicable for UploadFileToAzure and CreateNewAzureFile operations)
SourcePathAccessMode Specifies how source path will be provided

Available Options

  • Direct (default) : Direct value is provided.

  • Variable : Value is read from specified variable at runtime.

  • Connection : Value is read from path specified by file connection. This option is only applicable when Source is local File system.

SourcePathValue Direct static value (only applicable when SourcePathAccessMode=Direct)
SourcePathVariable Variable Name (only applicable when SourcePathAccessMode=Variable)
SourcePathConnection Connection Name or ID (only applicable when SourcePathAccessMode=Connection)
Recursive Specifies whether source path pattern should scan sub folder (default false)
ResultVariable Variable name which will hold result from S3 Action (default false)
TargetStorageConnection Connection Name for target S3 Connection (Only applicable for UploadFileToAzure, CopyAzureFileToAzure, MoveAzureFileToAzure and CreateNewAzureFile operations)
TargetPathAccessMode Specifies how target path will be provided

Available Options

  • Direct (default) : Direct value is provided.

  • Variable : Value is read from specified variable at runtime.

  • Connection : Value is read from path specified by file connection. This option is only applicable when target is local File system.

TargetPathValue Direct static value (only applicable when TargetPathAccessMode=Direct)
TargetPathConnection Connection Name or ID which points to target file path (only applicable when TargetPathAccessMode=Connection)
TargetPathVariable Variable Name (only applicable when TargetPathAccessMode=Variable)
OverWriteOption Specifies whether overwrite action when target file exists

Available Options

  • Direct (default) : Direct value is provided.

  • Variable : Value is read from specified variable at runtime.

  • Connection : Value is read from path specified by file connection. This option is only applicable when target is local File system.

CreateMissingTargetFolder Create target folder if missing (default True). This option is applicable for DownloadAzureFileToLocal
ExcludeRegXPattern Regular expression pattern to exclude items from selection (if you apply MyFile*.* filter for source path and then use ExcludeRegXPattern=(.msi|.exe) then it will include all files with matching name but exclude *.msi and *.exe
IncludeRegXPattern Regular expression pattern to include items from selection (if you apply MyFile*.* filter for source path and then use IncludeRegXPattern=(.txt|.csv) then it will include only txt and csv files with matching name pattern
SortBy Property by which you want to sort result

Available Options

Option Description
Size File Size in Bytes
LastModifiedDate File Last Modified DateTime
CreationDate File Creation DateTime
AgeInDays File Age in Days
LastEditInDays Last Edit in Days
Content File Content
Exists File Exists Flag
FileCount File Count
FolderPath File Directory Path
FilePath File Path
FileName File Name
FileExtension File Extension
FileEncoding File Encoding
SortDirection Sort order (e.g. Ascending or Descending)

Available Options

Option Description
Asc Asc
Desc Desc
WhereClause Where clause expression to filter items. (e.g. Size>100 and Extension IN ('.txt','.csv') )
EnableSort Sort items by specified attribute
MaxItems Maximum items to return (e.g. TOP)
ContinueOnError This option prevents task failure on error such as transfer of file(s) failed for some reason, delete operation failed for few files.
LoggingMode Determines logging detail level. Available options are Normal, Medium, Detailed, Debugging

Remarks

Settings UI

ui-azure-storage-task-settings

Metadata UI

ui-azure-storage-task-metadata ui-azure-storage-task-content-type

FAQ

References

See Also


Copyrights reserved. ZappySys LLC.