ElasticSearch Connector for Azure Data Factory (Pipeline)
Read and write ElasticSearch data inside your app and perform many ElasticSearch operations without coding, using the easy-to-use, high-performance API Connector for ElasticSearch.
In this article you will learn how to integrate ElasticSearch data in Azure Data Factory (Pipeline) without coding. We will use the high-performance ElasticSearch Connector to connect to ElasticSearch and then access the data inside Azure Data Factory (Pipeline). Follow the steps below to see how to accomplish that.
ElasticSearch Connector for Azure Data Factory (Pipeline) is based on ZappySys API Driver, which is part of ODBC PowerPack. ODBC PowerPack is a collection of high-performance ODBC drivers that enable you to integrate data in SQL Server, SSIS, a programming language, or any other ODBC-compatible application. It supports various file formats, sources and destinations, including REST/SOAP API, SFTP/FTP, storage services, and plain files, to mention a few.
Create ODBC Data Source (DSN) based on ZappySys API Driver
Step-by-step instructions
To get data from ElasticSearch using Azure Data Factory (Pipeline) we first need to create a DSN (Data Source) which will access data from ElasticSearch. We will later be able to read data using Azure Data Factory (Pipeline). Perform these steps:
-
Download and install ODBC PowerPack.
-
Open ODBC Data Sources (x64):
-
Create a User data source (User DSN) based on ZappySys API Driver:
- Create and use a User DSN if the client application runs under a User Account. This is an ideal option at design time, when developing a solution, e.g. in Visual Studio 2019. Use it for both types of applications, 64-bit and 32-bit.
- Create and use a System DSN if the client application is launched under a System Account, e.g. as a Windows Service. Usually, this is an ideal option in a production environment. Use ODBC Data Source Administrator (32-bit) instead of the 64-bit version if the Windows Service is a 32-bit application.
Azure Data Factory (Pipeline) uses a Service Account when a solution is deployed to a production environment; therefore, for production you have to create and use a System DSN.
-
When the Configuration window appears give your data source a name if you haven't done that already, then select "ElasticSearch" from the list of Popular Connectors. If "ElasticSearch" is not present in the list, then click "Search Online" and download it. Then set the path to the location where you downloaded it. Finally, click Continue >> to proceed with configuring the DSN:
-
Now it's time to configure the Connection Manager. Select Authentication Type, e.g. Token Authentication. Then select API Base URL (in most cases, the default one is the right one). More info is available in the Authentication section.
How to get and use ElasticSearch credentials
For a local instance or an instance hosted by you
- Get your user ID and password and enter them in the connection UI.
For a managed instance (hosted by Bonsai)
If your instance is hosted by Bonsai, perform these steps to get your credentials for API calls:
- Go to https://app.bonsai.io/clusters/{your-instance-id}/tokens
- Copy the Access Key and Access Secret and enter them in the connection UI. Click Test connection.
- If your cluster has no data, you can generate sample data by visiting this URL and clicking Add Sample Data: https://{your-cluster-id}.apps.bonsaisearch.net/app/home#/tutorial_directory
Fill in all required parameters and set optional parameters if needed.
For Basic Authentication (UserId/Password) [Http]:
- API Base URL: http://localhost:9200
- Optional parameters: User Name (or Access Key), Password (or Access Secret), Ignore certificate related errors
For Windows Authentication (No Password) [Http]:
- API Base URL: http://localhost:9200
- Optional parameters: Ignore certificate related errors
-
Once the data source connection has been configured, it's time to configure the SQL query. Select the Preview tab and then click the Query Builder button to configure the SQL query:
-
Start by selecting the Table or Endpoint you are interested in and then configure the parameters. This will generate a query that we will use in Azure Data Factory (Pipeline) to retrieve data from ElasticSearch. Click the OK button to use this query in the next step.
SELECT * FROM Indexes
Some parameters configured in this window will be passed to the ElasticSearch API, e.g. filtering parameters. This means that filtering will be done on the server side (instead of the client side), so you get only the meaningful data, and much faster.
-
Now click the Preview Data button to preview the data using the generated SQL query. If you are satisfied with the result, use this query in Azure Data Factory (Pipeline):
SELECT * FROM Indexes
You can also access data quickly from the tables dropdown by selecting <Select table>.
A WHERE clause or LIMIT keyword will be performed on the client side, meaning that the whole result set will be retrieved from the ElasticSearch API first, and only then will the filtering be applied to the data. If possible, it is recommended to use parameters in Query Builder to filter the data on the server side (in ElasticSearch servers).
-
Click OK to finish creating the data source.
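If Test connection fails, it can help to check the same credentials against the cluster directly, outside the driver. The following is a minimal sketch, not part of the ZappySys tooling: it assumes Basic Authentication and calls ElasticSearch's own `_cat/indices` REST endpoint, which returns roughly the same list that `SELECT * FROM Indexes` is meant to show. The function names `build_indices_request` and `list_indices` are hypothetical helpers introduced only for this illustration.

```python
import base64
import json
import urllib.request


def build_indices_request(base_url, user, password):
    """Build a GET request for _cat/indices with a Basic auth header.

    For a Bonsai-hosted cluster, pass the Access Key as `user` and the
    Access Secret as `password` (assumption based on the connection UI labels).
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    url = base_url.rstrip("/") + "/_cat/indices?format=json"
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def list_indices(base_url, user, password, timeout=10):
    """Return the index names reported by the cluster (requires network access)."""
    request = build_indices_request(base_url, user, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        rows = json.load(response)
    return [row["index"] for row in rows]
```

Calling `list_indices("http://localhost:9200", user, password)` against a running cluster should return the index names. An HTTP 401 error at this point suggests the problem lies with the credentials rather than with the DSN configuration.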
Read data in Azure Data Factory (ADF) from ODBC datasource (ElasticSearch)
-
To start press New button:
-
Select "Azure, Self-Hosted" option:
-
Select "Self-Hosted" option:
-
Set a name, we will use "OnPremisesRuntime":
-
Download and install Microsoft Integration Runtime.
-
Launch Integration Runtime and copy/paste Authentication Key from Integration Runtime configuration in Azure Portal:
-
After finishing registering the Integration Runtime node, you should see a similar view:
-
Go back to Azure Portal and finish adding new Integration Runtime. You should see it was successfully added:
-
Go to Linked services section and create a new Linked service based on ODBC:
-
Select "ODBC" service:
-
Configure the new ODBC service. Use the same DSN name we used in the previous step and enter it in the Connection string box:
DSN=ElasticsearchDSN
-
Create an ODBC-based dataset for the ODBC service you created:
-
Go to your pipeline and add a Copy data activity to the flow. In the Source section, use the OdbcDataset we created as the source dataset:
-
Then go to the Sink section and select a destination (sink) dataset. In this example we use a pre-created AzureBlobStorageDataset, which saves data into Azure Blob storage:
-
Finally, run the pipeline and see data being transferred from OdbcDataset to your destination dataset:
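The steps above are performed in the Azure Portal UI, but the same objects can also be expressed as JSON definitions. The sketch below builds approximate definitions for the ODBC linked service, the ODBC dataset, and the Copy data activity as Python dictionaries. It is an illustration under assumptions, not the exact JSON the portal generates: the linked service name `OdbcLinkedService` and the activity name are hypothetical, `Anonymous` authentication is assumed because the credentials are stored in the DSN, and the sink type `DelimitedTextSink` assumes that AzureBlobStorageDataset is a delimited text dataset. Compare the output with the JSON view in your own factory before relying on it.

```python
import json


def odbc_linked_service(name, dsn, runtime_name):
    """Linked service that points at the DSN through the self-hosted runtime."""
    return {
        "name": name,
        "properties": {
            "type": "Odbc",
            "typeProperties": {
                "connectionString": f"DSN={dsn}",
                # Assumption: credentials live in the DSN, so no ADF-level auth.
                "authenticationType": "Anonymous",
            },
            "connectVia": {
                "referenceName": runtime_name,
                "type": "IntegrationRuntimeReference",
            },
        },
    }


def odbc_dataset(name, linked_service_name):
    """ODBC dataset bound to the linked service; the query is set on the activity."""
    return {
        "name": name,
        "properties": {
            "type": "OdbcTable",
            "linkedServiceName": {
                "referenceName": linked_service_name,
                "type": "LinkedServiceReference",
            },
            "typeProperties": {},
        },
    }


def copy_activity(name, source_dataset, sink_dataset, query):
    """Copy data activity that reads with a SQL query and writes to the sink."""
    return {
        "name": name,
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            "source": {"type": "OdbcSource", "query": query},
            # Assumption: the sink dataset is a delimited text file in Blob storage.
            "sink": {"type": "DelimitedTextSink"},
        },
    }


linked_service = odbc_linked_service(
    "OdbcLinkedService", "ElasticsearchDSN", "OnPremisesRuntime"
)
dataset = odbc_dataset("OdbcDataset", "OdbcLinkedService")
activity = copy_activity(
    "CopyElasticSearchToBlob",  # hypothetical activity name
    "OdbcDataset",
    "AzureBlobStorageDataset",
    "SELECT * FROM Indexes",
)

print(json.dumps(activity, indent=2))
```

Keeping the query on the activity rather than on the dataset lets one OdbcDataset be reused with different queries generated in Query Builder.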
Actions supported by ElasticSearch Connector
Learn how to perform common ElasticSearch actions directly in Azure Data Factory (Pipeline) with these how-to guides:
- Count documents
- Create Index
- Delete documents
- Delete Index
- Get document by ID from Index or Alias
- Get documents from Index or Alias
- Get Index or Alias metadata
- Insert documents
- List aliases
- List indexes
- Search / Query documents
- Update documents
- Upsert documents
- Generic Request
- Generic Request (Bulk Write)
Conclusion
In this article we showed you how to connect to ElasticSearch in Azure Data Factory (Pipeline) and integrate data without any coding, saving you time and effort. It's worth noting that ZappySys API Driver allows you to connect not only to ElasticSearch but also to other REST/SOAP APIs (just use a different connector and configure it appropriately).
We encourage you to download ElasticSearch Connector for Azure Data Factory (Pipeline) and see how easy it is to use it for yourself or your team.
If you have any questions, feel free to contact the ZappySys support team.