Read Azure Blob Storage Files in SSIS (CSV, JSON, XML) – Gen2 / Gen1

Introduction

In our previous blog, we saw how to load data into Azure Blob Storage. In this post, we will see how to read Azure Blob Storage files in SSIS (CSV, JSON, and XML format files). To illustrate, we will use ZappySys SSIS PowerPack, which includes several tasks to import and export data between multiple sources and destinations such as flat files, Azure, AWS, databases, Office files, and more. It is a coding-free, drag-and-drop, high-performance suite of custom SSIS components and SSIS tasks. If you would like to perform other operations on Azure Blob Storage files (e.g. download, upload, create, delete), check our other articles on those tasks.

In a nutshell, this post focuses on how to read Azure Blob Storage files in SSIS using the following sources: Azure Blob Source for CSV File, Azure Blob Source for JSON File, and Azure Blob Source for XML File.

 

Prerequisite

  1. First, you will need to have SSIS installed.
  2. Second, make sure you have SSDT (SQL Server Data Tools) installed.
  3. Download and install Microsoft Azure Storage Emulator.
  4. Download and install Microsoft Azure Storage Explorer.
  5. Finally, do not forget to install ZappySys SSIS PowerPack.

NOTE: If you want to use a live Azure Blob Storage account, you can skip step 3.

What is Azure Blob Storage

Azure Blob storage is Microsoft’s object storage solution for the cloud. You can use it to store large amounts of unstructured data, such as text or binary data. Blob storage exposes three types of resources:

  • The storage account. You access data objects in Azure Storage through a storage account.
  • The containers in the account. A container organizes a set of blobs, similar to a folder in a file system. All blobs reside within a container. Note: a container name must be lowercase.
  • The blobs in a container. Azure Storage offers three types of blobs: block blobs, append blobs, and page blobs.

The relationship between these resources is hierarchical: a storage account holds containers, and each container holds blobs.

You can also use Azure Storage Explorer on your local machine. Azure Storage Explorer is a standalone app that enables you to easily work with Azure Storage data on Windows, macOS, and Linux. You can use Blob storage to expose data publicly to the world or to store application data privately.
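
To make the account, container, and blob hierarchy concrete, here is a minimal Python sketch (not part of SSIS or SSIS PowerPack) that checks a container name against the documented Azure naming rules and builds the public-cloud URL of a blob. The account, container, and blob names are hypothetical placeholders chosen for illustration.

```python
import re
from urllib.parse import quote

# Documented container naming rules: 3-63 characters; lowercase letters,
# digits, and hyphens only; must start and end with a letter or digit;
# consecutive hyphens are not allowed.
_CONTAINER_RE = re.compile(r"[a-z0-9](?:-?[a-z0-9])*")


def is_valid_container_name(name):
    """Return True if the name satisfies the container naming rules above."""
    return 3 <= len(name) <= 63 and _CONTAINER_RE.fullmatch(name) is not None


def blob_url(account, container, blob_name):
    """Build the blob URL: account -> container -> blob."""
    if not is_valid_container_name(container):
        raise ValueError("invalid container name: %r" % container)
    # quote() keeps "/" intact, so virtual directory separators survive.
    return "https://%s.blob.core.windows.net/%s/%s" % (
        account, container, quote(blob_name))


# Hypothetical names used only for this example.
example_url = blob_url("myaccount", "zappysys-demo", "csv/dbo.tblNames1.csv")
print(example_url)
```

A name such as MyContainer is rejected by is_valid_container_name because of the uppercase letters, which matches the lowercase note in the list above.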

 

Getting Started

To get started, we will walk through several examples. ZappySys includes SSIS Azure Blob Source components for CSV, JSON, and XML files that help you read such files from Azure Blob Storage. SSIS PowerPack also supports uploading file(s) to Azure Blob Storage, along with delete, rename, list, get property, copy, move, create, set permission, and many more operations. Here we show how to read files from Azure Blob Storage.

You can connect to your Azure Storage account by entering your storage account credentials. In this example, we use the local Azure Storage Emulator.
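
For readers who want to see what "connecting to the emulator" versus "connecting to a live account" means outside the SSIS connection dialog, the following Python sketch parses a standard Azure Storage connection string and resolves the Blob endpoint. It illustrates the general connection string format, not the ZappySys connection manager itself. The live account name and key below are made-up placeholders. The emulator values (the UseDevelopmentStorage=true shortcut, the devstoreaccount1 account, and port 10000) are the emulator's documented defaults.

```python
# Default Blob endpoint exposed by the local Azure Storage Emulator.
EMULATOR_BLOB_ENDPOINT = "http://127.0.0.1:10000/devstoreaccount1"


def parse_connection_string(conn):
    """Split 'Key=Value;Key=Value' pairs into a dict.

    Values are split on the first '=' only, because account keys are
    base64 strings that usually end with '=' padding.
    """
    pairs = (part.split("=", 1) for part in conn.split(";") if part.strip())
    return {key.strip(): value.strip() for key, value in pairs}


def blob_endpoint(settings):
    """Resolve the Blob service endpoint from parsed settings."""
    if settings.get("UseDevelopmentStorage", "").lower() == "true":
        return EMULATOR_BLOB_ENDPOINT
    if "BlobEndpoint" in settings:
        return settings["BlobEndpoint"]
    protocol = settings.get("DefaultEndpointsProtocol", "https")
    suffix = settings.get("EndpointSuffix", "core.windows.net")
    return "%s://%s.blob.%s" % (protocol, settings["AccountName"], suffix)


# Emulator shortcut versus a (hypothetical) live account.
emulator = parse_connection_string("UseDevelopmentStorage=true")
live = parse_connection_string(
    "DefaultEndpointsProtocol=https;AccountName=myaccount;"
    "AccountKey=abc123==;EndpointSuffix=core.windows.net"
)
print(blob_endpoint(emulator))
print(blob_endpoint(live))
```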

Setup Azure Storage client tools

  1. Once you have downloaded and installed the storage emulator, you can launch Microsoft Azure Storage Emulator from its physical location or from the desktop or Start menu shortcut.
    Microsoft Azure Storage Emulator Physical Location

  2. If you see the Command Prompt screen shown below after the emulator starts, the Azure Storage Emulator has started successfully and you can proceed to start Microsoft Azure Storage Explorer.
    Command Prompt Screen after Microsoft Azure Storage Emulator Started

  3. Now download and install Microsoft Azure Storage Explorer. You can then launch it from its physical location or from the desktop or Start menu shortcut.
    Microsoft Azure Storage Explorer Physical Location

Create an Azure Blob Storage Container

To create a blob container, first go to the Microsoft Azure Storage Explorer window. Then navigate to Storage Accounts -> (Development) -> Blob Containers.

Microsoft Azure Storage Explorer: Create a Blob Container

You can also create a virtual directory under the container. A virtual directory does not actually exist in Azure until you paste, drag, or upload blobs into it.

Creating the new Virtual Directory under Blob Container
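
The reason a virtual directory "does not exist" until it holds a blob is that it is only a prefix shared by blob names. The short Python sketch below illustrates that idea with plain strings; it does not call any Azure API, and the blob names are hypothetical.

```python
def virtual_directories(blob_names, prefix="", delimiter="/"):
    """Return the immediate virtual directories under a prefix.

    Directories are derived purely from blob names: a directory shows up
    only if at least one blob name contains it as a prefix.
    """
    found = set()
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        head, sep, _ = name[len(prefix):].partition(delimiter)
        if sep:  # the name continues past a delimiter, so head is a directory
            found.add(prefix + head + delimiter)
    return sorted(found)


# Hypothetical container contents.
blobs = [
    "csv/dbo.tblNames1.csv",
    "csv/dbo.tblNames2.csv",
    "json/dbo.tblNames.json",
    "readme.txt",
]
print(virtual_directories(blobs))  # ['csv/', 'json/']
print(virtual_directories([]))     # [] - no blobs, so no directories
```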

Read Azure Blob Storage Files in SSIS (CSV, JSON, XML)

Let's start with an example. Here we use the SSIS Azure Blob Source for CSV/JSON/XML File to read CSV, JSON, or XML files from Azure Blob Storage and load them into a SQL Server database.

  1. First, drag and drop a Data Flow Task from the SSIS Toolbox and double-click it to edit.
    Drag and Drop SSIS Data Flow Task from SSIS Toolbox

  2. Drag and drop the relevant Azure Blob Source (for CSV, JSON, or XML File) from the SSIS Toolbox.
    Add Azure Blob Source Tasks

  3. Create a connection for the Azure Blob Storage account.
    Create Azure Storage Connection

  4. You can also connect to the Microsoft Azure Storage Emulator in the same way.
    Connection Form of Azure Blob Storage Account

  5. Select the single file to read from Azure Blob Storage in the relevant CSV, JSON, or XML source.
    Select File From Azure Blob Storage

  6. You can also read multiple files stored in Azure Blob Storage by using a wildcard pattern, e.g. dbo.tblNames*.csv / dbo.tblNames*.json / dbo.tblNames*.xml, in the relevant source.
    Use wildcard pattern .* to read multiple files data

  7. You can also read zip and gzip compressed files without extracting them first in the Azure Blob Source for CSV/JSON/XML File.
    Reading zip and gzip compressed files (stream mode)

  8. That’s it. We are ready to load the data from the file(s) into SQL Server.
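
The wildcard and compressed-file options in steps 6 and 7 can be illustrated outside SSIS with a small Python sketch. It uses the standard fnmatch module as a stand-in for wildcard matching (the exact matching rules of the ZappySys source may differ) and reads a gzip stream in memory without extracting it to disk. The in-memory dictionary below is a hypothetical substitute for a blob container.

```python
import csv
import fnmatch
import gzip
import io


def matching_blobs(blob_names, pattern):
    """Return the blob names that match a wildcard pattern such as *.csv."""
    return [name for name in blob_names if fnmatch.fnmatchcase(name, pattern)]


def read_csv_rows(data, compressed=False):
    """Parse CSV bytes into dict rows, decompressing gzip on the fly."""
    raw = io.BytesIO(data)
    if compressed:
        # gzip.open accepts an existing file object, so nothing hits disk.
        stream = gzip.open(raw, mode="rt", encoding="utf-8", newline="")
    else:
        stream = io.TextIOWrapper(raw, encoding="utf-8", newline="")
    with stream:
        return list(csv.DictReader(stream))


# Hypothetical container contents: blob name -> blob bytes.
container = {
    "dbo.tblNames1.csv": b"id,name\n1,Ann\n",
    "dbo.tblNames2.csv": b"id,name\n2,Bob\n",
    "dbo.tblOrders.csv": b"id,total\n9,10.5\n",
    "archive/dbo.tblNames3.csv.gz": gzip.compress(b"id,name\n3,Cy\n"),
}

names = matching_blobs(container, "dbo.tblNames*.csv")
rows = [row for name in names for row in read_csv_rows(container[name])]
gz_rows = read_csv_rows(container["archive/dbo.tblNames3.csv.gz"], compressed=True)
print(names)    # ['dbo.tblNames1.csv', 'dbo.tblNames2.csv']
print(rows)
print(gz_rows)
```

Only the two dbo.tblNames files match the pattern, and their rows are combined into one result, which is the same effect the wildcard option has in the source component.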

Load Azure Blob Storage Files data into SQL Server

Now let's look at how to load data into a target such as SQL Server, Oracle, or a flat file. In the example below we load data into a SQL Server database, but the steps should remain the same for other targets that can be accessed using OLEDB drivers (e.g. Oracle).
  1. Inside the Data Flow, drag and drop the Upsert Destination component from the SSIS Toolbox.
  2. Connect the source component to the Upsert Destination.
  3. Double-click the Upsert Destination to configure it.
  4. Select the Target Connection, or click NEW to create a new connection.
    Configure SSIS Upsert Destination Connection - Loading data (REST / SOAP / JSON / XML /CSV) into SQL Server or other target using SSIS
  5. Select the Target Table, or click NEW to create a new table based on the source columns.
  6. Click the Mappings tab to auto-map columns by name. You can change the mappings as needed.
    SSIS Upsert Destination - Columns Mappings
  7. Click OK to save the Upsert Destination settings.
  8. That's it. You are now ready to run the data flow. NOTE: If you wish to debug the data flow and see records as it runs, add a data viewer by right-clicking the blue arrow and clicking Enable Data Viewer.
  9. To execute the data flow, right-click anywhere on the Data Flow surface and click Execute Task.
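
If the term "upsert" is new, the following Python sketch shows the general idea: update a row when its key already exists, otherwise insert it. It uses an in-memory SQLite database as a stand-in for SQL Server and is only a conceptual illustration, not a description of how the Upsert Destination component works internally. The table name tblNames and its columns are assumptions made for this example.

```python
import sqlite3


def upsert_rows(conn, rows):
    """Update rows whose id already exists; insert the rest.

    Returns a tuple (inserted_count, updated_count).
    """
    inserted = updated = 0
    for row in rows:
        cur = conn.execute(
            "UPDATE tblNames SET name = :name WHERE id = :id", row)
        if cur.rowcount == 0:
            # No existing row had this key, so insert a new one.
            conn.execute(
                "INSERT INTO tblNames (id, name) VALUES (:id, :name)", row)
            inserted += 1
        else:
            updated += 1
    conn.commit()
    return inserted, updated


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblNames (id INTEGER PRIMARY KEY, name TEXT)")

# First load: both rows are new. Second load: id 2 changes, id 3 is new.
first = upsert_rows(conn, [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}])
second = upsert_rows(conn, [{"id": 2, "name": "Robert"}, {"id": 3, "name": "Cy"}])
final_rows = conn.execute("SELECT id, name FROM tblNames ORDER BY id").fetchall()
print(first, second)  # (2, 0) (1, 1)
print(final_rows)     # [(1, 'Ann'), (2, 'Robert'), (3, 'Cy')]
```

Running the same load twice therefore does not duplicate rows, which is why an upsert-style destination is convenient when the same blob files may be read more than once.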
 

Read / Write data to Azure Data Lake Storage Gen 2 / Gen 1 (CSV / XML / JSON)

Check the articles below if you would like to know more about how to write to Azure Blob Storage.

Article#1

https://community.zappysys.com/t/how-to-read-write-from-azure-data-lake-storage-gen2-in-ssis/125

Article#2

SSIS Data Load – SQL Server to Azure Blob (Split Files, GZip)

 

Conclusion

In this blog, we learned how to read Azure Blob Storage files in SSIS. We used Azure Blob Source for CSV File, Azure Blob Source for JSON File, and Azure Blob Source for XML File to read file(s) from Microsoft Azure Blob Storage and load the data into SQL Server. You can download SSIS PowerPack to try many other scenarios not discussed in this blog, along with 70+ other components.

