JSON Connector for Azure Data Factory (Pipeline)

JSON Connector can be used to extract and output JSON data coming from REST API web service calls (Web URL), direct JSON strings (variables or DB columns), or local JSON files. JSON Connector also supports JSONPath to filter data from nested arrays and sub-documents. The Connector is optimized to work with very large JSON strings.

In this article you will learn how to quickly and efficiently integrate JSON data in Azure Data Factory (Pipeline) without coding. We will use the high-performance JSON Connector to connect to JSON and then access the data inside Azure Data Factory (Pipeline).

Let's follow the steps below to see how we can accomplish that!

JSON Connector for Azure Data Factory (Pipeline) is based on ZappySys JSON Driver, which is part of ODBC PowerPack. ODBC PowerPack is a collection of high-performance ODBC drivers that enable you to integrate data in SQL Server, SSIS, a programming language, or any other ODBC-compatible application. It supports various file formats, sources and destinations, including REST/SOAP API, SFTP/FTP, storage services, and plain files, to mention a few.

Create ODBC Data Source (DSN) based on ZappySys JSON Driver

Step-by-step instructions

To get data from JSON using Azure Data Factory (Pipeline), we first need to create a DSN (Data Source Name) that accesses the JSON data. We will later read data through it in Azure Data Factory (Pipeline). Perform these steps:

Download and install ODBC PowerPack.
Open ODBC Data Sources (x64):
Create a User data source (User DSN) based on ZappySys JSON Driver

ZappySys JSON Driver
- Create and use a User DSN if the client application runs under a User Account. This is an ideal option at design time, when developing a solution, e.g. in Visual Studio 2019. Use it for both types of applications, 64-bit and 32-bit.
- Create and use a System DSN if the client application is launched under a System Account, e.g. as a Windows Service. Usually, this is the ideal option in a production environment. Use ODBC Data Source Administrator (32-bit) instead of the 64-bit version if the Windows Service is a 32-bit application.
Azure Data Factory (Pipeline) uses a Service Account when a solution is deployed to a production environment; therefore, for production you have to create and use a System DSN.
Select Url or File and paste the following URL for this example, or load an existing connection string as per this article.

NOTE: For this demo we are using an OData API. Refer to your own API documentation, use your own API URL, and configure the connection based on that API's authentication type.

https://services.odata.org/V3/Northwind/Northwind.svc/Customers?$format=json
Now enter a JSONPath expression in the Array Filter textbox to extract only a specific part of the JSON document. Here, $.value[*] gets the content of the value attribute. The value attribute is an array of JSON documents, so we use [*] to indicate that we want all records of that array:

NOTE: Here we are using the filter for this example; select your own filter based on your requirements.

$.value[*]
Click the Test Connection button to check whether the connection is successful.
Once you have configured the data source, you can preview the data. Open the Preview tab and use similar settings:
Click OK to finish creating the data source.
That's it; we are done. In a few clicks we configured the call to JSON API using ZappySys JSON Connector.
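
To make the effect of the $.value[*] filter concrete, here is a minimal Python sketch. It does not use the ZappySys driver; it only mimics what that one expression selects. The sample document is an abbreviated, assumed shape of an OData response, and the function name select_value_items is hypothetical.

```python
import json

# Abbreviated sample shaped like an OData response (assumed for illustration;
# the real service returns more fields and more records).
SAMPLE = """
{
  "odata.metadata": "(omitted)",
  "value": [
    {"CustomerID": "ALFKI", "CompanyName": "Alfreds Futterkiste"},
    {"CustomerID": "ANATR", "CompanyName": "Ana Trujillo Emparedados y helados"}
  ]
}
"""


def select_value_items(document):
    """Mimic the JSONPath expression $.value[*] for this one case:
    start at the root ($), read its "value" attribute, and return
    every element ([*]) of that array as a separate record."""
    items = document.get("value", [])
    if not isinstance(items, list):
        raise ValueError('"value" is not an array, so [*] does not apply')
    return list(items)


rows = select_value_items(json.loads(SAMPLE))
for row in rows:
    # Each array element becomes one output row.
    print(row["CustomerID"], row["CompanyName"])
```

Without the filter, the whole document would be treated as a single record; with it, each element of the value array becomes its own row.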

Read data in Azure Data Factory (ADF) from ODBC datasource (JSON)

To start, press the New button:
Select the "Azure, Self-Hosted" option:
Select the "Self-Hosted" option:
Set a name; we will use "OnPremisesRuntime":
Download and install Microsoft Integration Runtime.
Launch Integration Runtime and paste the Authentication Key copied from the Integration Runtime configuration in Azure Portal:
After finishing registering the Integration Runtime node, you should see a similar view:
Go back to Azure Portal and finish adding new Integration Runtime. You should see it was successfully added:
Go to Linked services section and create a new Linked service based on ODBC:
Select "ODBC" service:
Configure the new ODBC service. Use the same DSN name we used in the previous step and enter it in the Connection string box:

JsonDSN

DSN=JsonDSN
Create an ODBC-based dataset for the ODBC service you created:
Go to your pipeline and add a Copy data activity to the flow. In the Source section, use the OdbcDataset we created as the source dataset:
Then go to the Sink section and select a destination/sink dataset. In this example we use a precreated AzureBlobStorageDataset, which saves data into an Azure Blob:
Finally, run the pipeline and see data being transferred from OdbcDataset to your destination dataset:
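
The linked service configured in the steps above can also be expressed as a JSON definition. The following Python sketch builds one, assuming the commonly documented shape of an ODBC linked service in Azure Data Factory. The linked service name JsonOdbcLinkedService and the helper build_odbc_linked_service are hypothetical, and the Anonymous authentication type is an assumption; compare the result with the JSON that the ADF UI generates for your own service.

```python
import json


def build_odbc_linked_service(name, dsn, runtime_name):
    """Return a dict describing an ODBC linked service that connects
    through a self-hosted Integration Runtime (shape assumed)."""
    return {
        "name": name,
        "properties": {
            "type": "Odbc",
            "typeProperties": {
                # Same value as entered in the Connection string box.
                "connectionString": f"DSN={dsn}",
                # Assumption: the DSN itself holds any API credentials.
                "authenticationType": "Anonymous",
            },
            "connectVia": {
                "referenceName": runtime_name,
                "type": "IntegrationRuntimeReference",
            },
        },
    }


definition = build_odbc_linked_service(
    name="JsonOdbcLinkedService",   # hypothetical name
    dsn="JsonDSN",                  # DSN created earlier
    runtime_name="OnPremisesRuntime",
)
print(json.dumps(definition, indent=2))
```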

Configuring pagination in the JSON Driver

ZappySys JSON Driver supports several pagination methods for extracting data from REST APIs that return their results across multiple requests. The options below cover the pagination structures commonly used in APIs:

Page-based Pagination: This method works by retrieving data in fixed-size pages from the REST API. It allows you to specify the page size and navigate through the results by requesting different page numbers, ensuring that you can access all the data in a structured manner.
Offset-based Pagination: With this approach, you can extract data by specifying the starting point or offset from which to begin retrieving data. It allows you to define the number of records to skip and fetch subsequent data accordingly, providing precise control over the data extraction process.
Cursor-based Pagination: This technique involves using a cursor or a marker that points to a specific position in the dataset. It enables you to retrieve data starting from the position indicated by the cursor and proceed to subsequent segments, ensuring that you capture all the relevant information without missing any records.
Token-based Pagination: In this method, a token serves as a unique identifier for a specific data segment. It allows you to access the next set of data by using the token provided in the response from the previous request. This ensures that you can systematically retrieve all the data segments without duplication or omission.
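
The request patterns behind these options can be sketched in a few lines of Python. This is not the driver's implementation and uses no ZappySys API; both endpoints are hypothetical in-memory stand-ins (fake_page_api, fake_token_api) so that the loops can run without a network. Offset-based pagination follows the same loop as the page-based one, with a record offset in place of a page number.

```python
RECORDS = [f"record-{i}" for i in range(1, 8)]  # 7 fake records


def fake_page_api(page, page_size):
    """Hypothetical page-based endpoint: returns one fixed-size page (1-based)."""
    start = (page - 1) * page_size
    return RECORDS[start:start + page_size]


def fake_token_api(token=None, page_size=3):
    """Hypothetical token-based endpoint: returns rows plus the token for
    the next segment, or None when nothing is left."""
    start = int(token) if token is not None else 0
    end = start + page_size
    next_token = str(end) if end < len(RECORDS) else None
    return {"rows": RECORDS[start:end], "next": next_token}


def read_all_by_page(page_size=3):
    rows, page = [], 1
    while True:
        batch = fake_page_api(page, page_size)
        if not batch:  # an empty page signals the end of the data
            break
        rows.extend(batch)
        page += 1
    return rows


def read_all_by_token():
    rows, token = [], None
    while True:
        response = fake_token_api(token)
        rows.extend(response["rows"])
        token = response["next"]
        if token is None:  # no token in the response: last segment reached
            break
    return rows


by_page = read_all_by_page()
by_token = read_all_by_token()
print(len(by_page), len(by_token))
```

The two loops differ only in how the next request is derived: the page-based loop computes it on the client side, while the token-based loop depends on a value returned by the previous response.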

Configuring the matching pagination option lets the ZappySys JSON Driver retrieve complete datasets from REST APIs that split their results across multiple responses.

For more detailed steps, please refer to this link: How to do REST API Pagination in SSIS / ODBC Drivers

Authentication

ZappySys offers several authentication methods for securely accessing data from a wide range of sources, each suited to the requirements of different data platforms and services. These methods include:

OAuth: ZappySys supports OAuth for authentication, which allows users to grant limited access to their data without revealing their credentials. It's commonly used for applications that require access to user account information.
Basic Authentication: This method involves sending a username and password with every request. ZappySys allows users to securely access data using this traditional authentication approach.
Token-based Authentication: ZappySys enables users to authenticate with tokens. This method sends a token with each request to prove the user's identity without sending the username and password.
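
As a rough illustration of what Basic and token-based authentication put on the wire, the Python sketch below builds the HTTP Authorization header for each. It is not ZappySys code; the helper names and the credentials are made up, and the Bearer scheme is an assumption, since some APIs expect tokens in a different header or format.

```python
import base64


def basic_auth_header(username, password):
    """Basic Authentication: username and password are joined with a colon
    and Base64-encoded. This is an encoding, not encryption, so it should
    only be sent over HTTPS."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def bearer_token_header(token):
    """Token-based authentication (assuming the common Bearer scheme):
    the token replaces the username and password in each request."""
    return {"Authorization": f"Bearer {token}"}


# Made-up credentials for illustration only.
basic = basic_auth_header("demo-user", "demo-pass")
bearer = bearer_token_header("example-token")
print(basic)
print(bearer)
```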

Choose the method that matches the authentication type of the API you are connecting to. For more comprehensive details on the authentication process, refer to the official ZappySys documentation or contact the ZappySys support team.

For more details, please refer to this link: ZappySys Connections

Conclusion

In this article we showed you how to connect to JSON in Azure Data Factory (Pipeline) and integrate data without any coding, saving you time and effort.

We encourage you to download JSON Connector for Azure Data Factory (Pipeline) and see how easy it is to use it for yourself or your team.

If you have any questions, feel free to contact the ZappySys support team. You can also open a live chat immediately by clicking on the chat icon below.
