Cosmos DB Connector for Azure Data Factory (Pipeline)

Connect to your Azure Cosmos DB databases to read, query, create, update, and delete documents and more!

In this article you will learn how to integrate Cosmos DB data in Azure Data Factory (Pipeline) without coding. We will use the Cosmos DB Connector to connect to Cosmos DB and then access the data inside Azure Data Factory (Pipeline).

Let's follow the steps below to see how we can accomplish that!


Create ODBC Data Source (DSN) based on ZappySys API Driver

Step-by-step instructions

To get data from Cosmos DB using Azure Data Factory (Pipeline), we first need to create a DSN (Data Source Name) that accesses data from Cosmos DB. We will later read data through this DSN in Azure Data Factory (Pipeline). Perform these steps:

  1. Download and install ODBC PowerPack.

  2. Open ODBC Data Sources (x64):

    Open ODBC Data Source
  3. Create a User data source (User DSN) based on ZappySys API Driver

    ZappySys API Driver
    Create new User DSN for ZappySys API Driver
    • Create and use a User DSN if the client application runs under a user account. This is ideal at design time, when developing a solution, e.g. in Visual Studio 2019. Use it for both types of applications, 64-bit and 32-bit.
    • Create and use a System DSN if the client application is launched under a system account, e.g. as a Windows Service. This is usually the right option for a production environment. If the Windows Service is a 32-bit application, use ODBC Data Source Administrator (32-bit) instead of the 64-bit version.
    Azure Data Factory (Pipeline) uses a service account when a solution is deployed to production, so for a production environment you have to create and use a System DSN.
  4. When the Configuration window appears give your data source a name if you haven't done that already, then select "Cosmos DB" from the list of Popular Connectors. If "Cosmos DB" is not present in the list, then click "Search Online" and download it. Then set the path to the location where you downloaded it. Finally, click Continue >> to proceed with configuring the DSN:

    CosmosDbDSN
    Cosmos DB
    ODBC DSN Template Selection
  5. Now it's time to configure the Connection Manager. Select the Authentication Type, e.g. API Key [Http]. Then select the API Base URL (in most cases, the default one is the right one). More info is available in the Authentication section.

    Steps to get and use Cosmos DB credentials: API Key [Http]
    Connecting to your Azure Cosmos DB data requires you to authenticate your REST API access. Follow the instructions below:
    1. Go to your Azure portal homepage: https://portal.azure.com/.
    2. In the search bar at the top of the homepage, enter Azure Cosmos DB. In the dropdown that appears, select Azure Cosmos DB.
    3. Click on the name of the database account you want to connect to (also copy and paste the name of the database account for later use).
    4. On the next page, where you can see all of the database account information, look along the left side and select Keys.
    5. On the Keys page, you will have two tabs: Read-write Keys and Read-only Keys. If you are going to write data to your database, you need to remain on the Read-write Keys tab. If you are only going to read data from your database, you should select the Read-only Keys tab.
    6. On the Keys page, copy the PRIMARY KEY value and paste it somewhere for later use (the SECONDARY KEY value may also be copied and used).
    7. Now go back to the ODBC data source and use this PRIMARY KEY in the API Key authentication configuration.
    8. Enter the primary or secondary key you recorded in step 6 into the Primary or Secondary Key field.
    9. Then enter the database account you recorded in step 3 into the Database Account field.
    10. Next, enter or select the default database you want to connect to using the Default Database field.
    11. Continue by entering or selecting the default table (i.e. container/collection) you want to connect to using the Default Table (Container/Collection) field.
    12. Select the Test Connection button at the bottom of the window to verify proper connectivity with your Azure Cosmos DB account.
    13. If the connection test succeeds, select OK.
    14. Done! Now you are ready to use the Cosmos DB Connector!

    Fill in all required parameters and set optional parameters if needed:

    Data source name: CosmosDbDSN
    Connector: Cosmos DB
    Authentication type: API Key [Http]
    API Base URL: https://[$Account$].documents.azure.com
    Required Parameters
    • Primary or Secondary Key
    • Account Name (Case-Sensitive)
    • Database Name (keep blank to use default) Case-Sensitive
    • API Version
    Optional Parameters
    • Default Table (needed to invoke #DirectSQL)
    ODBC DSN HTTP Connection Configuration

  6. Once the data source connection has been configured, it's time to configure the SQL query. Select the Preview tab and then click the Query Builder button:

    ZappySys API Driver - Cosmos DB
    CosmosDbDSN
    Open Query Builder in API ODBC Driver to read and write data to REST API
  7. Start by selecting the Table or Endpoint you are interested in and then configure the parameters. This will generate a query that we will use in Azure Data Factory (Pipeline) to retrieve data from Cosmos DB. Click the OK button to use this query in the next step.

    #DirectSQL SELECT * FROM root where root.id !=null order by root._ts desc
    Configure table/endpoint parameters in ODBC data source based on API Driver
    Some parameters configured in this window will be passed to the Cosmos DB API, e.g. filtering parameters. It means that filtering will be done on the server side (instead of the client side), enabling you to get only the meaningful data much faster.
  8. Now click the Preview Data button to preview the data using the generated SQL query. If you are satisfied with the result, use this query in Azure Data Factory (Pipeline):

    ZappySys API Driver - Cosmos DB
    CosmosDbDSN
    #DirectSQL SELECT * FROM root where root.id !=null order by root._ts desc
    API ODBC Driver-based data source data preview
    You can also access data quickly from the tables dropdown by selecting <Select table>.
    A WHERE clause or LIMIT keyword is applied on the client side, meaning that the whole result set is retrieved from the Cosmos DB API first and only then filtered. Where possible, use parameters in Query Builder to filter the data on the server side (on the Cosmos DB servers).
  9. Click OK to finish creating the data source.
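
As background for the Primary or Secondary Key entered in step 5: the Cosmos DB REST API expects every request to carry an Authorization header signed with that key. The driver does this for you, so the Python sketch below is only an illustration of the master-key signing scheme, not ZappySys code. The function names, the sample database and container names, and the `2018-12-31` API version are assumptions chosen for the example.

```python
import base64
import hashlib
import hmac
from email.utils import formatdate
from urllib.parse import quote


def cosmos_auth_token(verb, resource_type, resource_link, date_rfc1123, master_key):
    """Return the URL-encoded Authorization value for a master-key signed request."""
    key = base64.b64decode(master_key)  # keys shown on the Keys page are base64 text
    # String to sign: verb, resource type, resource link, date, empty line.
    payload = (
        f"{verb.lower()}\n{resource_type.lower()}\n{resource_link}\n"
        f"{date_rfc1123.lower()}\n\n"
    )
    digest = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return quote(f"type=master&ver=1.0&sig={signature}", safe="")


def build_headers(master_key, database, container, api_version="2018-12-31"):
    """Headers for listing documents in one container (hypothetical helper)."""
    date = formatdate(usegmt=True)  # RFC 1123 date, e.g. "Tue, 01 Jan 2030 00:00:00 GMT"
    resource_link = f"dbs/{database}/colls/{container}"
    return {
        "Authorization": cosmos_auth_token("GET", "docs", resource_link, date, master_key),
        "x-ms-date": date,
        "x-ms-version": api_version,  # assumed value; use the API Version set in the DSN
    }
```

Such headers would accompany a GET request to https://[$Account$].documents.azure.com/dbs/<database>/colls/<container>/docs. A read-only key is enough for a request like this one, which matches the advice in the key-selection step above.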

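Before moving on to Azure Data Factory, it can help to confirm that the DSN answers queries outside the driver's own preview window. The following is a minimal Python sketch, assuming the third-party `pyodbc` package is installed on the same machine and that the DSN is named CosmosDbDSN as in the steps above. The helper names are hypothetical, and the query is the one generated in Query Builder.

```python
# Query generated in Query Builder (copied from the steps above).
QUERY = "#DirectSQL SELECT * FROM root where root.id !=null order by root._ts desc"


def build_connection_string(dsn_name):
    """Return the same DSN-based connection string later used in Azure Data Factory."""
    return f"DSN={dsn_name}"


def preview_rows(dsn_name, query, limit=5):
    """Run the query through the ODBC DSN and return column names plus a few rows."""
    import pyodbc  # third-party package; imported here so the helpers above work without it

    connection = pyodbc.connect(build_connection_string(dsn_name), autocommit=True)
    try:
        cursor = connection.cursor()
        cursor.execute(query)
        columns = [column[0] for column in cursor.description]
        rows = [tuple(row) for row in cursor.fetchmany(limit)]
        return columns, rows
    finally:
        connection.close()


# Example usage (requires the DSN to exist on this machine):
# columns, rows = preview_rows("CosmosDbDSN", QUERY)
# print(columns)
# for row in rows:
#     print(row)
```

Note that a 64-bit Python process sees only 64-bit DSNs, and a User DSN is visible only to the account that created it.
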

Read data in Azure Data Factory (ADF) from ODBC data source (Cosmos DB)

  1. To start, press the New button:

    Create new Self-Hosted integration runtime
  2. Select "Azure, Self-Hosted" option:

    Create new Self-Hosted integration runtime
  3. Select "Self-Hosted" option:

    Create new Self-Hosted integration runtime
  4. Set a name. We will use "OnPremisesRuntime":

    Set a name for IR
  5. Download and install Microsoft Integration Runtime.

  6. Launch Integration Runtime, then copy the Authentication Key from the Integration Runtime configuration in Azure Portal and paste it in:

    Copy/paste Authentication Key
  7. After finishing registering the Integration Runtime node, you should see a similar view:

    Check Integration Runtime node status
  8. Go back to Azure Portal and finish adding new Integration Runtime. You should see it was successfully added:

    Integration Runtime status
  9. Go to Linked services section and create a new Linked service based on ODBC:

    Add new Linked service
  10. Select "ODBC" service:

    Add new ODBC service
  11. Configure the new ODBC service. Take the name of the DSN we created earlier and enter it in the Connection string box:

    CosmosDbDSN
    DSN=CosmosDbDSN
    Configure new ODBC service
  12. Create an ODBC-based dataset for the ODBC service you just created:

    Add new ODBC dataset
  13. Go to your pipeline and add a Copy data activity to the flow. In the Source section, use the OdbcDataset we created as the source dataset:

    Set source in Copy data
  14. Then go to Sink section and select a destination/sink dataset. In this example we use precreated AzureBlobStorageDataset which saves data into an Azure Blob:

    Set sink in Copy data
  15. Finally, run the pipeline and see data being transferred from OdbcDataset to your destination dataset:

    Run the flow
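
The same setup can also be expressed as the JSON definitions that Azure Data Factory stores behind the UI. The Python sketch below builds an ODBC linked service and a Copy activity as plain dictionaries and prints them as JSON. The names OnPremisesRuntime, OdbcDataset, and AzureBlobStorageDataset come from the steps above, while OdbcLinkedService, CopyCosmosDbData, and the helper functions are hypothetical. The sink type assumes a delimited-text dataset, and a real definition may need additional store and format settings.

```python
import json


def odbc_linked_service(name, dsn_name, runtime_name):
    """ODBC linked service that reaches the DSN through a self-hosted integration runtime."""
    return {
        "name": name,
        "properties": {
            "type": "Odbc",
            "typeProperties": {
                "connectionString": f"DSN={dsn_name}",
                # Credentials live inside the DSN, so no user name is passed here.
                "authenticationType": "Anonymous",
            },
            "connectVia": {
                "referenceName": runtime_name,
                "type": "IntegrationRuntimeReference",
            },
        },
    }


def copy_activity(name, source_dataset, sink_dataset, query):
    """Copy activity that reads from the ODBC dataset and writes to the sink dataset."""
    return {
        "name": name,
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            "source": {"type": "OdbcSource", "query": query},
            # Assumption: the blob dataset is delimited text; other formats use other sink types.
            "sink": {"type": "DelimitedTextSink"},
        },
    }


linked_service = odbc_linked_service("OdbcLinkedService", "CosmosDbDSN", "OnPremisesRuntime")
activity = copy_activity(
    "CopyCosmosDbData",
    "OdbcDataset",
    "AzureBlobStorageDataset",
    "#DirectSQL SELECT * FROM root where root.id !=null order by root._ts desc",
)
print(json.dumps(linked_service, indent=2))
print(json.dumps(activity, indent=2))
```

Keeping the query in the Copy activity source, rather than in the dataset, makes it easier to reuse one ODBC dataset for several queries.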


Conclusion

In this article we showed you how to connect to Cosmos DB in Azure Data Factory (Pipeline) and integrate data without any coding. It's worth noting that the ZappySys API Driver is not limited to Cosmos DB: it can also connect to other REST APIs (just select a different connector and configure it appropriately).

We encourage you to download Cosmos DB Connector for Azure Data Factory (Pipeline) and see how easy it is to use it for yourself or your team.

If you have any questions, feel free to contact the ZappySys support team.
