How to get data from HubSpot API with SSIS

Introduction

UPDATE: ZappySys has released a brand new API Connector for Hubspot Connector which makes it much simpler to Read/Write Hubspot Data in SSIS compared to the steps listed in this article. You can still use steps from this article but if you are new to API or want to avoid learning curve with API then use newer approach.

Please visit this page to see all Pre-Configured ready to use API connectors which you can use in SSIS API Source / SSIS API Destination OR API ODBC Driver (for non-SSIS Apps such as Excel, Power BI, Informatica).

HubSpot is a popular platform where CRM, Marketing and Sales meet in one place. It’s a good piece of service, yet sometimes you need to integrate it with your other system. In this article you will learn how to pull contacts from HubSpot API and load them into a SQL Server Database using SSIS and ZappySys SSIS PowerPack.

 

Prerequisites

  1. HubSpot Account registered (an account you will want to get data from).
  2. SSIS designer installed. Sometimes it is referred as BIDS or SSDT (download it from Microsoft site).
  3. Basic knowledge of SSIS package development using Microsoft SQL Server Integration Services.
  4. An existing SQL Server Database and table created (we will use it to load data into).
  5. ZappySys SSIS PowerPack installed.

Various methods to call HubSpot API

HubSpot API supports two kinds of authentication:

  1. OAuth 2.0.
  2. API Key in URL.

API Key in URL is simple, with minimal configuration, but it ought to be used only for development and testing purposes.

In this tutorial we will show how to authenticate using OAuth 2.0, which you should use in your production environment.

Step-by-Step – Get data from HubSpot API into SQL Server database

  1. Create HubSpot Developer Account (https://app.hubspot.com/signup/developers). It’s an account you will need to use to create HubSpot Apps.
  2. Login with your HubSpot Developer Account (https://app.hubspot.com/login).
  3. Select your newly created HubSpot Developer Account:
    Logging into HubSpot - select HubSpot developer account

    Logging into HubSpot – Select HubSpot developer account

    NOTE: Account name should contain “hubspot-developers” text.
  4. Register HubSpot OAuth App for REST API Call for use with OAuth 2.0 by simply pressing “Create application” button:
    Register HubSpot OAuth App for REST API Call for use with OAuth 2.0

    Register HubSpot OAuth App for REST API Call for use with OAuth 2.0

  5. Name the application (e.g. “ContactsApp”) and choose “Private” to make the application hidden from the outer and dangerous world 🙂
    Name HubSpot REST API App which will be used from within SSIS

    Name HubSpot REST API App which will be used from within SSIS

  6. Press “Edit” button on the app you just created.
  7. Write down Client ID and Client secret. We will need them later for authentication:
    Write down Client ID and Client Secret which will be used in OAuth Connection SSIS component

    Write down Client ID and Client Secret which will be used in OAuth Connection SSIS component

  8. Open SSIS Designer and create a new package.
  9. Drag a new Data Flow task from SSIS Toolbox into the Control Flow:

    Dragging and dropping Data Flow Task into Control Flow

  10. Edit Data Flow task by double clicking. Then drag ZS JSON Source and OLE DB Destination components from SSIS Toolbox into the Data Flow, but don’t connect them yet:
    Drag ZappySys JSON Source SSIS component into the Data Flow

    Drag ZappySys JSON Source SSIS component into the Data Flow

  11. Start editing JSON Source component:
    1. Set Access Mode to File path or web URL.
    2. Set Web URL to  https://api.hubapi.com/contacts/v1/lists/all/contacts/all?count=100.
      NOTE: We are adding URL parameter with value ?count=100, because by default HubSpot API returns 20 contacts and we want to maximize result set and minimize the number of web requests.
    3. Press Select Filter and select contacts node or simply enter $.contacts[*]  into the text box.
    4. Check Use Credentials checkbox.
    5. Select <New ZS-OAUTH Connection> menu item to create a new connection:
      JSON Source SSIS component configuration – Get data from HubSpot REST API using OAuth authentication

      JSON Source SSIS component configuration – Get data from HubSpot API using OAuth authentication

  12. Configure OAuth 2.0 connection for HubSpot API:
    1. Select HubSpot as OAuth Provider.
    2. Select “Use Custom OAuth (App created by you)” option.
      NOTE: You can also use Default OAuth App option for a quick test. Select it and press “Generate Token” button after you enter the scope.
    3. Paste Client Id and Client Secret you also saved in Step #7.
    4. Input contacts as our scope.
    5. Finally, press “Generate Token” button:
      OAuth Connection SSIS component configuration - Connect to HubSpot API using OAuth 2.0

      OAuth Connection SSIS component configuration – Connect to HubSpot API using OAuth 2.0

  13. Select your ordinary HubSpot account you want to get contacts from:
    OAuth Connection SSIS component configuration - Select a HubSpot account to get data from

    OAuth Connection SSIS component configuration – Select a HubSpot account to get data from

    NOTE: Account name should NOT contain “hubspot-developers” text.
  14. Grant access to your HubSpot App:
    OAuth Connection SSIS component configuration - Grant access to HubSpot REST API App

    OAuth Connection SSIS component configuration – Grant access to HubSpot API App

  15. You will get a prompt to save Access and Refresh tokens to a backup file. Press “Yes” to save them.
    NOTE: You will need to pass Refresh Token as the parameter, when you run the package in another machine. Check Deployment to Production section for more info.
  16. Configure HubSpot API pagination
    1. Set Next Link/Cursor Expression to $.vid-offset  by pressing “…” button or simply pasting the value into the text box.
    2. In the same way set Stop Indicator/Attribute to  $.has-more.
    3. Input  false as Stop indicator value/Regular expression.
    4. Set Suffix for Next Url to &vidOffset=<%nextlink%>.
      JSON Source SSIS component pagination configuration for HubSpot REST API call

      JSON Source SSIS component pagination configuration for HubSpot API call

  17. Press “Preview” button to… well, obviously to preview the results 🙂
    JSON Source SSIS component data preview of HubSpot API call result

    JSON Source SSIS component data preview of HubSpot API call result

  18. Close “Data Preview” dialog and click “OK” button to save and close JSON Source component.
  19. A dialog Metadata Scan Options should appear, to which you should respond “Ok” 🙂
  20. Now you can connect JSON Source component with OLE DB Destination:
    Connect JSON Source SSIS component with OLE DB Destination for HubSpot data.

    Connect JSON Source SSIS component with OLE DB Destination for HubSpot data.

  21. Now modify OLE DB Destination to have a connection to a SQL Server and point to a table, where you want contacts to be stored.
  22. It’s time to map the columns of JSON Source with destination table columns. With your mouse simply connect corresponding columns or simply map Input Column with Destination Column. Press “OK” to close the window.
    Map JSON Source SSIS component columns with destination table columns you will store HubSpot data in

    Map JSON Source SSIS component columns with destination table columns you will store HubSpot data in

  23. Finally, funtime! Press “F5” to execute the package. You should see a similar view as below. The number you see is the number of contacts were loaded from HubSpot API to your table:
    The result of getting HubSpot REST API data with SSIS

    The result of getting HubSpot API data with SSIS

API Rate Limit (Throttling)

HubSpot places limits on using its API. Basically, it doesn’t allow you to make endless number of calls per second. Also it has a ceiling for requests made per day. So if you have paging enabled in JSON Source component, it will make as many requests as it has to retrieve all data. And in this step you may get an error, telling that you exceeded calling rate limit of the API. To avoid that, simply go to the Throttling tab and set the amount of milliseconds component has to sleep after making each call:

HubSpot API calls throttling configuration

HubSpot API calls throttling configuration

Making things Dynamic (e.g. URL)

The URL we used (https://api.hubapi.com/contacts/v1/lists/all/contacts/all) in addition to count parameter, has others as well. Let’s say in some scenarios you want to fetch list memberships  for contacts, so you must include &showListMemberships=true  in the URL as well. To make things more generic, instead of using hard-coded true value, you can use SSIS variables, then the URL will look like this:

Common Errors

For detailed instruction to fix common metadata related errors read this article.

Truncation related error

Most common error you may face when you run SSIS Package is truncation error. During design time only 300 rows scanned to detect datatype but at runtime you may extract far more records than 300. So its possible that you will get longer strings than initially expected. If you get error just copy that to notepad and read error carefully. It will explain the steps needed to fix the problem. Basically you can edit Columns on JSON Source UI > Change Data Length or Datatype (if you need more than 4000 chars change to the type to DT_NTEXT and click on Column Name cell to save change). After you edit default settings then make sure to check Lock option (found all on the right side in same row).

Authentication related error (happens when you run package from SQL Job Agent)

So many times when you run from Visual Studio it works fine but when you deploy to Production and run from SQL Job it fails. If this is your issue then check you passing Token (i.e. Password Property of HTTP connection) using some sort of parameter approach (e.g. from command line or from config file). By default password is encrypted using your account so when you run it from SQL Job it will fail to decrypt it. See more on Package Protection Level.

Deployment to Production

When you are ready to deploy the package to Production consider passing Refresh Tokens and other sensitive data as passwords via SSIS parameters. In this way you don’t face authentication related error described in above section. See more on Package Protection Level.

Download Sample Package

Be sure to download a sample SQL Server 2012 SSIS package, in case you encounter problems that you can’t go through in the step-by-step part.

NOTE: Once you open the package, you won’t be able to run it immediately, but don’t panic. Firstly, it’s because you have a different SQL destination table. Secondly, you won’t see any sensitive data in the package, like tokens. Therefore you will have to perform OAuth authentication process to get those. So fix errors one by one and you are good to go.

Conclusion

In this article we learned how to call HubSpot API RESTful method and retrieve all contacts. Actually, HubSpot API consists of many small APIs and we used Contacts API method particularly. If you need anything else than simply retrieving the contacts of your account, delve into the vastness of HubSpot API methods!

Also it’s worth mentioning that JSON Source and OAuth Connection SSIS components saved us some valuable time, because we avoided using default SSIS Script Component and doing all the hard work ourselves.

Keywords

How to fetch data from HubSpot API | Pull data from HubSpot API into SQL Server | Call HubSpot API with SSIS | Sync HubSpot data to SQL Server | HubSpot data integration with SQL Server database | Load data from HubSpot REST API using SSIS

Posted in REST API Integration, SSIS Components, SSIS JSON Source (File/REST), SSIS PowerPack and tagged , , , , , , , , .