How to integrate HTML Table using Talend Studio
Learn how to quickly and efficiently connect HTML Table with Talend Studio for smooth data access.
Read HTML table data effortlessly. Extract, sync, and manage web table content for analytics, reporting, and data pipelines — almost no coding required. You can do it all using the high-performance HTML Table ODBC Driver for Talend Studio (often referred to as the HTML Table Connector). We'll walk you through the entire setup.
Ready to dive in? Download the product to jump right in, or follow the step-by-step guide below to see how it works.
Create data source in ZappySys Data Gateway
In this section we will create a data source for HTML Table in the Data Gateway. Let's follow these steps to accomplish that:
-
Download and install ODBC PowerPack (if you haven't already).
-
Search for gateway in the Windows Start Menu and open ZappySys Data Gateway Configuration:
-
Go to the Users tab and follow these steps to add a Data Gateway user:
- Click the Add button
-
In the Login field enter a username, e.g., john
- Then enter a Password
- Check the Is Administrator checkbox
- Click OK to save
-
Now we are ready to add a data source:
- Click the Add button
- Give the Data source a name, e.g., HtmlTableDSN (have it handy for later)
- Then select Native -
- Finally, click OK
-
When the Configuration window appears give your data source a name if you haven't done that already, then select "HTML Table" from the list of Popular Connectors. If "HTML Table" is not present in the list, then click "Search Online" and download it. Then set the path to the location where you downloaded it. Finally, click Continue >> to proceed with configuring the DSN:
-
Now it's time to configure the Connection Manager. Select Authentication Type, e.g. Token Authentication. Then select API Base URL (in most cases, the default one is the right one). More info is available in the Authentication section.
-
Once the data source connection has been configured, it's time to configure the SQL query. Select the Preview tab and then click the Query Builder button:
-
Start by selecting the Table or Endpoint you are interested in and then configure the parameters. This will generate a query that we will use in Talend Studio to retrieve data from HTML Table. Click the OK button to use this query in the next step.
SELECT * FROM Orders
Some parameters configured in this window will be passed to the HTML Table API, e.g. filtering parameters. This means that filtering will be done on the server side (instead of the client side), so only the rows you need are returned, and they arrive much faster.
-
Now click the Preview Data button to preview the data using the generated SQL query. If you are satisfied with the result, use this query in Talend Studio:
You can also access data quickly from the tables dropdown by selecting <Select table>.
A WHERE clause or LIMIT keyword will be performed on the client side, meaning that the whole result set will be retrieved from the HTML Table API first, and only then will the filtering be applied to the data. If possible, it is recommended to use parameters in Query Builder to filter the data on the server side (in HTML Table servers).
-
Click OK to finish creating the data source.
-
Once done, go to the Network Settings tab and Add a firewall rule for inbound traffic:
- This will initially allow all inbound traffic.
- Click Edit IP filters to restrict access to specific IP addresses or ranges.
-
Crucial Step: After creating or modifying the data source, you must:
- Click the Save button to persist your changes.
- Hit Yes when prompted to restart the Data Gateway service.
This ensures all changes are properly applied:
Skipping this step may cause the new settings to fail, preventing you from connecting to the data source.
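The notes in the steps above distinguish between filtering at the source (Query Builder parameters) and filtering on the client (a WHERE clause or LIMIT in the SQL text). The following is a minimal sketch of that difference, not the driver's actual implementation: the rows, the column names, and the `fetch` and `is_usa` helpers are all hypothetical and only simulate how many rows would travel over the network in each case.

```python
# Hypothetical rows standing in for the remote HTML table; not real data.
ROWS = [
    {"OrderID": 1, "Country": "USA"},
    {"OrderID": 2, "Country": "Germany"},
    {"OrderID": 3, "Country": "USA"},
    {"OrderID": 4, "Country": "France"},
]


def fetch(source_filter=None):
    """Simulate the remote source: apply an optional filter before 'sending' rows."""
    rows = [r for r in ROWS if source_filter is None or source_filter(r)]
    return rows, len(rows)  # (rows, number of rows transferred)


def is_usa(row):
    """Example filter condition (hypothetical column and value)."""
    return row["Country"] == "USA"


# Server-side: the filter travels with the request, so only matches are transferred.
server_rows, server_transferred = fetch(source_filter=is_usa)

# Client-side: everything is transferred first, then filtered locally.
all_rows, client_transferred = fetch()
client_rows = [r for r in all_rows if is_usa(r)]

print(server_transferred, client_transferred)
```

Both approaches end with the same rows; the difference is how much data has to be retrieved before the filter is applied, which matters most for large tables.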
Read HTML Table data in Talend Studio
To read HTML Table data in Talend Studio, we'll need to complete several steps. Let's get through them all right away!
Create connection for input
- First of all, open Talend Studio
-
Create a new connection:
-
Select Microsoft SQL Server connection:
-
Name your connection:
-
Fill in the connection parameters, using the data source name created earlier (e.g., HtmlTableDSN), and then click Test connection:
-
If the List of modules not installed for this operation window shows up, then download and install all of them:
Review and accept all additional module license agreements during the process.
-
Finally, you should see a successful connection test result at the end:
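To double-check the same connection details outside Talend Studio, something like the following could be used. It is a minimal sketch under several assumptions: that the Data Gateway accepts SQL Server ODBC clients (the guide itself only shows Talend's Microsoft SQL Server connection type), that it listens on localhost port 5000, and that the `pyodbc` package and "ODBC Driver 17 for SQL Server" are installed. The host, port, password, and the `build_connection_string` and `preview` helper names are illustrative and do not come from this guide.

```python
def build_connection_string(host, port, database, user, password,
                            driver="ODBC Driver 17 for SQL Server"):
    """Build a SQL Server style ODBC connection string (assumed to suit the gateway)."""
    return (
        f"DRIVER={{{driver}}};"
        f"SERVER={host},{port};"
        f"DATABASE={database};"
        f"UID={user};PWD={password}"
    )


def preview(conn_str, query="SELECT * FROM Orders", max_rows=5):
    """Run the query and return the first few rows.

    Requires pyodbc and a running, reachable Data Gateway.
    """
    import pyodbc  # imported lazily so the helper above works without it

    conn = pyodbc.connect(conn_str, timeout=10)
    try:
        cursor = conn.cursor()
        cursor.execute(query)
        return cursor.fetchmany(max_rows)
    finally:
        conn.close()


# Example usage (values are assumptions; replace them with your own):
# conn_str = build_connection_string("localhost", 5000, "HtmlTableDSN", "john", "your-password")
# for row in preview(conn_str):
#     print(row)
```

If the preview fails here as well, the problem is more likely in the Data Gateway setup (user, firewall rule, unsaved changes) than in Talend Studio.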
Add input
-
Once we have a connection to ZappySys Data Gateway created, we can proceed by creating a job:
-
Simply drag and drop the ZappySys Data Gateway connection onto the job:
-
Then create an input based on the ZappySys Data Gateway connection:
-
Continue by configuring a SQL query and click the Guess schema button:
-
Finish by configuring the schema, for example:
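For comparison with the schema configured above, column names and types can also be inspected from a script. The sketch below is not how Talend's Guess schema works internally; it only assumes a Python DB-API cursor (such as one from `pyodbc`) whose `description` attribute holds 7-item tuples per column. The `describe_columns` helper and the sample metadata for the Orders query are hypothetical.

```python
def describe_columns(description):
    """Turn a DB-API cursor.description into (name, type name, nullable) tuples."""
    columns = []
    for name, type_code, _display, _internal, _precision, _scale, null_ok in description:
        # Some drivers report Python types as type codes; fall back to str() otherwise.
        type_name = getattr(type_code, "__name__", str(type_code))
        columns.append((name, type_name, bool(null_ok)))
    return columns


# Hypothetical metadata shaped like a cursor.description for "SELECT * FROM Orders".
sample_description = [
    ("OrderID", int, None, 10, 10, 0, False),
    ("Country", str, None, 255, 255, 0, True),
]

for column in describe_columns(sample_description):
    print(column)
```

With a live connection, `cursor.description` would be passed in after `cursor.execute(...)` instead of the sample list.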
Add output
We are ready to add an output. From the Palette, drag and drop a tFileOutputDelimited output and connect it to the input:
Run the job
Finally, run the job and integrate your HTML Table data:
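Once the job has finished, the delimited file it wrote can be checked with a few lines of code. This is a minimal sketch, assuming the output component was left with a semicolon field separator and UTF-8 encoding; both, along with whether a header row is written, depend on how tFileOutputDelimited is configured in your job. The `read_delimited` helper and the file path in the usage comment are hypothetical.

```python
import csv


def read_delimited(path, delimiter=";", has_header=False, encoding="utf-8"):
    """Read a delimited text file and return (header, rows).

    header is None unless has_header is True and the file is not empty.
    """
    with open(path, newline="", encoding=encoding) as handle:
        rows = list(csv.reader(handle, delimiter=delimiter))
    if has_header and rows:
        return rows[0], rows[1:]
    return None, rows


# Example usage (the path is an assumption; use the file name set in your job):
# header, rows = read_delimited("C:/temp/out.csv", has_header=True)
# print(header, len(rows))
```

Comparing the row count with the Preview Data result from the Data Gateway is a quick way to confirm that the job transferred everything.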
Conclusion
In this article we showed you how to connect to HTML Table in Talend Studio and integrate data without writing complex code — all of this was powered by HTML Table ODBC Driver.
Download ODBC PowerPack now or ping us via chat if you have any questions or are looking for a specific feature (you can also reach out to us by submitting a ticket):