Cosmos DB Connector for Azure Data Factory (Pipeline)

Connect to your Azure Cosmos DB databases to read, query, create, update, and delete documents and more! In this article you will learn how to quickly and efficiently integrate Cosmos DB data in Azure Data Factory (Pipeline) without coding. We will use the high-performance Cosmos DB Connector to easily connect to Cosmos DB and then access the data inside Azure Data Factory (Pipeline). Let's follow the steps below to see how we can accomplish that!

Cosmos DB Connector for Azure Data Factory (Pipeline) is based on ZappySys API Driver, which is part of ODBC PowerPack, a collection of high-performance ODBC drivers that enable you to integrate data in SQL Server, SSIS, a programming language, or any other ODBC-compatible application. ODBC PowerPack supports various file formats, sources, and destinations, including REST/SOAP APIs, SFTP/FTP, storage services, and plain files, to name a few.
Create ODBC Data Source (DSN) based on ZappySys API Driver
Step-by-step instructions
To get data from Cosmos DB using Azure Data Factory (Pipeline), we first need to create a DSN (Data Source Name) that will access the data in Cosmos DB. We will later be able to read this data in Azure Data Factory (Pipeline). Perform these steps:
- Download and install ODBC PowerPack.
- Open ODBC Data Sources (x64).
- Create a User data source (User DSN) based on ZappySys API Driver:
  - Create and use a User DSN if the client application runs under a User Account. This is an ideal option at design time, when developing a solution, e.g. in Visual Studio 2019. Use it for both types of applications: 64-bit and 32-bit.
  - Create and use a System DSN if the client application is launched under a System Account, e.g. as a Windows Service. Usually, this is the ideal option in a production environment. Use ODBC Data Source Administrator (32-bit) instead of the 64-bit version if the Windows Service is a 32-bit application.
  - Note: Azure Data Factory (Pipeline) uses a Service Account when a solution is deployed to a production environment, so for production you have to create and use a System DSN.
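Before moving on: if you ever need to check which DSNs a given process can actually see (your own User DSNs plus machine-wide System DSNs), you can list them from Python with pyodbc. This is a minimal sketch, not part of the setup; it assumes pyodbc is installed (pip install pyodbc):

```python
# Minimal sketch: list the ODBC DSNs visible to the current process.
import pyodbc

# dataSources() returns {dsn_name: driver_name} for every User and
# System DSN the current account and bitness can see.
for name, driver in pyodbc.dataSources().items():
    print(f"{name}: {driver}")

# A User DSN created under a different account (e.g. the one a Windows
# Service runs as) will not appear here -- the same reason production
# deployments need a System DSN.
```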
- When the Configuration window appears, give your data source a name if you haven't done so already, then select "Cosmos DB" from the list of Popular Connectors. If "Cosmos DB" is not present in the list, click "Search Online" and download it, then set the path to the location where you downloaded it. Finally, click Continue >> to proceed with configuring the DSN:
- Now it's time to configure the Connection Manager. Select the Authentication Type, e.g. Token Authentication. Then select the API Base URL (in most cases the default one is the right one). More info is available in the Authentication section.
How to get and use Cosmos DB credentials: API Key [Http]
Connecting to your Azure Cosmos DB data requires you to authenticate your REST API access. Follow the instructions below:
- Go to your Azure portal homepage: https://portal.azure.com/.
- In the search bar at the top of the homepage, enter Azure Cosmos DB. In the dropdown that appears, select Azure Cosmos DB.
- Click on the name of the database account you want to connect to (also copy and paste the name of the database account for later use).
- On the next page, where you can see all of the database account information, look along the left side and select Keys:
- On the Keys page, you will have two tabs: Read-write Keys and Read-only Keys. If you are going to write data to your database, you need to remain on the Read-write Keys tab. If you are only going to read data from your database, you should select the Read-only Keys tab.
- On the Keys page, copy the PRIMARY KEY value and paste it somewhere for later use (the SECONDARY KEY value may also be copied and used).
- Now go to your SSIS package or ODBC data source and use this PRIMARY KEY in the API Key authentication configuration.
- Enter the primary or secondary key you recorded in step 6 into the Primary or Secondary Key field.
- Then enter the database account you recorded in step 3 into the Database Account field.
- Next, enter or select the default database you want to connect to using the Default Database field.
- Continue by entering or selecting the default table (i.e. container/collection) you want to connect to using the Default Table (Container/Collection) field.
- Select the Test Connection button at the bottom of the window to verify proper connectivity with your Azure Cosmos DB account.
- If the connection test succeeds, select OK.
- Done! Now you are ready to use the Cosmos DB Connector!
Fill in all required parameters and set optional parameters if needed:
Authentication: API Key [Http]
API Base URL: https://[$Account$].documents.azure.com
Required Parameters:
  - Primary or Secondary Key
  - Account Name (Case-Sensitive)
  - Database Name (Case-Sensitive; keep blank to use default)
  - API Version
Optional Parameters:
  - Default Table (needed to invoke #DirectSQL)
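For context on how the Primary or Secondary Key you entered above is used: the driver signs every Cosmos DB REST request with an HMAC-SHA256 signature derived from that key, following Cosmos DB's documented master-key scheme. Here is a minimal sketch of that scheme (the key and resource link are placeholders; you never need to do this yourself when using the connector):

```python
# Minimal sketch of Cosmos DB's documented master-key request signing.
# The ZappySys driver does this for you; shown only for context.
import base64
import hashlib
import hmac
import urllib.parse
from email.utils import formatdate

master_key = "<PRIMARY KEY from the Keys page>"          # placeholder
verb, resource_type = "get", "docs"
resource_link = "dbs/MyDatabase/colls/MyContainer"       # placeholder

# RFC 1123 date; the same value must be sent in the x-ms-date header.
x_ms_date = formatdate(usegmt=True)

# Payload format mandated by the API: lower-cased verb, resource type
# and date, each followed by a newline, plus one trailing newline.
payload = f"{verb.lower()}\n{resource_type.lower()}\n{resource_link}\n{x_ms_date.lower()}\n\n"
digest = hmac.new(base64.b64decode(master_key),
                  payload.encode("utf-8"), hashlib.sha256).digest()
signature = base64.b64encode(digest).decode()

# URL-encoded value for the authorization header.
auth_header = urllib.parse.quote(f"type=master&ver=1.0&sig={signature}")
print(auth_header)
```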
- Once the data source connection has been configured, it's time to configure the SQL query. Select the Preview tab and then click the Query Builder button to configure the SQL query:
- Start by selecting the Table or Endpoint you are interested in and then configure its parameters. This will generate a query that we will use in Azure Data Factory (Pipeline) to retrieve data from Cosmos DB. Hit the OK button to use this query in the next step.
#DirectSQL SELECT * FROM root where root.id !=null order by root._ts desc
Some parameters configured in this window will be passed to the Cosmos DB API, e.g. filtering parameters. This means that the filtering is done on the server side (instead of the client side), enabling you to get only the meaningful data, much faster.
- Now hit the Preview Data button to preview the data using the generated SQL query. If you are satisfied with the result, use this query in Azure Data Factory (Pipeline):
You can also access data quickly from the tables dropdown by selecting <Select table>. Note, however, that a WHERE clause or LIMIT keyword in such a query is performed on the client side, meaning that the whole result set is retrieved from the Cosmos DB API first, and only then is the filtering applied to the data. If possible, it is recommended to use parameters in Query Builder to filter the data on the server side (on the Cosmos DB servers), as in the sketch after these steps.
- Click OK to finish creating the data source.
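Once the DSN exists, you can also sanity-check it outside the UI. Here is a minimal sketch using pyodbc and the DSN created above; because the statement carries the #DirectSQL prefix, the WHERE filter and ORDER BY are evaluated on the Cosmos DB side:

```python
# Minimal sketch: run the generated query through the new DSN.
import pyodbc

conn = pyodbc.connect("DSN=CosmosDbDSN", autocommit=True)
cursor = conn.cursor()

# #DirectSQL passes the statement straight to Cosmos DB, so filtering
# and ordering happen server-side (see the note above).
cursor.execute(
    "#DirectSQL SELECT * FROM root WHERE root.id != null ORDER BY root._ts DESC"
)

for row in cursor.fetchmany(10):  # preview the first 10 documents
    print(row)

cursor.close()
conn.close()
```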
Video Tutorial
Read data in Azure Data Factory (ADF) from ODBC datasource (Cosmos DB)
- To start, press the New button:
- Select the "Azure, Self-Hosted" option:
- Select the "Self-Hosted" option:
- Set a name; we will use "OnPremisesRuntime":
- Download and install Microsoft Integration Runtime.
- Launch Integration Runtime and copy/paste the Authentication Key from the Integration Runtime configuration in the Azure Portal:
- After finishing registering the Integration Runtime node, you should see a view similar to this:
- Go back to the Azure Portal and finish adding the new Integration Runtime. You should see that it was successfully added:
- Go to the Linked services section and create a new Linked service based on ODBC:
- Select the "ODBC" service:
- Configure the new ODBC service. Use the same DSN name we used in the previous step and copy it into the Connection string box: DSN=CosmosDbDSN
- For the created ODBC service, create an ODBC-based dataset:
- Go to your pipeline and add the Copy data activity into the flow. In the Source section, use the OdbcDataset we created as the source dataset:
- Then go to the Sink section and select a destination/sink dataset. In this example we use a precreated AzureBlobStorageDataset, which saves data into an Azure Blob:
- Finally, run the pipeline and watch the data being transferred from OdbcDataset to your destination dataset (a sketch of what this copy does under the hood follows these steps):
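For reference, what the Copy data activity does here can be approximated outside ADF: read rows over the ODBC DSN, serialize them, and upload the result to Blob Storage. This is a hedged sketch using pyodbc and the azure-storage-blob package; the storage connection string, container, and blob names are placeholders:

```python
# Minimal sketch approximating the Copy activity: ODBC source -> CSV -> Azure Blob.
import csv
import io

import pyodbc
from azure.storage.blob import BlobServiceClient

# 1. Read from the Cosmos DB DSN (the same source OdbcDataset points at).
conn = pyodbc.connect("DSN=CosmosDbDSN", autocommit=True)
cursor = conn.cursor()
cursor.execute("#DirectSQL SELECT * FROM root WHERE root.id != null")

# 2. Serialize the result set to CSV in memory.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow([column[0] for column in cursor.description])  # header row
writer.writerows(cursor.fetchall())

# 3. Upload to Blob Storage (what the AzureBlobStorageDataset sink targets).
blob_service = BlobServiceClient.from_connection_string("<storage connection string>")
blob = blob_service.get_blob_client(container="output", blob="cosmosdb_export.csv")
blob.upload_blob(buffer.getvalue(), overwrite=True)

cursor.close()
conn.close()
```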
Actions supported by Cosmos DB Connector
Learn how to perform common Cosmos DB actions directly in Azure Data Factory (Pipeline) with these how-to guides (a sketch of the REST call behind the query action follows the list):
- Create a document in the container
- Create Permission Token for a User (One Table)
- Create User for Database
- Delete a Document by Id
- Get All Documents for a Table
- Get All Users for a Database
- Get Database Information by Id or Name
- Get Document by Id
- Get List of Databases
- Get List of Tables
- Get table information by Id or Name
- Get table partition key ranges
- Get User by Id or Name
- Query documents using Cosmos DB SQL query language
- Update Document in the Container
- Upsert a document in the container
- Generic Request
- Generic Request (Bulk Write)
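As an illustration of what the query action maps to underneath, here is a hedged sketch of the raw REST call it wraps, reusing the master-key signing shown earlier (account, database, container, and key are placeholders; the requests package is assumed):

```python
# Minimal sketch: query documents via the Cosmos DB REST API directly.
import base64
import hashlib
import hmac
import urllib.parse
from email.utils import formatdate

import requests

ACCOUNT, DB, COLL = "<account>", "<database>", "<container>"  # placeholders
MASTER_KEY = "<PRIMARY KEY>"                                  # placeholder

def sign(verb: str, resource_type: str, resource_link: str, date: str) -> str:
    """Build the master-key authorization header value (documented scheme)."""
    payload = f"{verb.lower()}\n{resource_type.lower()}\n{resource_link}\n{date.lower()}\n\n"
    digest = hmac.new(base64.b64decode(MASTER_KEY),
                      payload.encode("utf-8"), hashlib.sha256).digest()
    sig = base64.b64encode(digest).decode()
    return urllib.parse.quote(f"type=master&ver=1.0&sig={sig}")

date = formatdate(usegmt=True)
link = f"dbs/{DB}/colls/{COLL}"
resp = requests.post(
    f"https://{ACCOUNT}.documents.azure.com/{link}/docs",
    headers={
        "authorization": sign("post", "docs", link, date),
        "x-ms-date": date,
        "x-ms-version": "2018-12-31",
        "Content-Type": "application/query+json",
        "x-ms-documentdb-isquery": "true",
        "x-ms-documentdb-query-enablecrosspartition": "true",
    },
    json={"query": "SELECT * FROM root WHERE root.id != null", "parameters": []},
)
resp.raise_for_status()
print(resp.json().get("Documents", []))
```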
Conclusion
In this article we showed you how to connect to Cosmos DB in Azure Data Factory (Pipeline) and integrate data without any coding, saving you time and effort. It's worth noting that ZappySys API Driver allows you to connect not only to Cosmos DB, but to many other data sources from any ODBC-compatible application (just use a different connector and configure it accordingly).
We encourage you to download Cosmos DB Connector for Azure Data Factory (Pipeline) and see how easy it is to use it for yourself or your team.
If you have any questions, feel free to contact the ZappySys support team. You can also open a live chat immediately by clicking the chat icon below.
Download Cosmos DB Connector for Azure Data Factory (Pipeline) | Documentation
More integrations
Other connectors for Azure Data Factory (Pipeline)
Other application integration scenarios for Cosmos DB