Introduction
In this blog post you will learn how to call the Amazon AWS API (or virtually any API) from SSIS without writing a single line of code (no more Java, C#, Ruby or Python). Yes, you heard it right 🙂. Whether you are an SSIS / ETL developer or a coder, everyone loves a drag & drop interface. SSIS has several advantages over other approaches such as programming SDKs and command-line tools: mainly ease of use, security and long-term maintenance, without having to learn a costly coding approach.
In this post we will use the following components to show various ways to implement Amazon AWS API integration scenarios inside your ETL workflows.
- JSON / REST API Source Connector (REST API, JSON File or OData Service): Use this dataflow component when you have to fetch data from a REST API web service like a table. This component allows you to extract JSON data from a web service and de-normalize the nested structure so you can save it to a relational database such as SQL Server or any other target (Oracle, FlatFile, Excel, MySQL). This component also supports reading local JSON files or a direct JSON string (wildcard patterns are supported too, e.g. c:\data\file*.json).
- Web API Destination Connector (POST data to API URL): Use this dataflow component when you have to call an API inside a Dataflow (see POST to URL). Possible use case: call a Lambda function with various parameters for each input record found in a database.
- REST API Task: Use this task when you don't want to pull REST API data in tabular format but want to call a REST API to POST data to a server, DELETE data from a server, or do things like download an HTML page or extract authentication tokens, where you are not necessarily dealing with tabular data. This task also offers many other options such as saving the raw response into a variable or file.
- XML Source Connector (SOAP, File, REST): Use this dataflow component when you have to fetch data from an XML or SOAP web service and consume the data like a table. This component allows you to extract data from a web service and save it to SQL Server or any other target (Oracle, FlatFile, Excel, MySQL). This component also supports reading local XML files or a direct XML string.
Prerequisites
Before we do a hello-world demo of calling the Amazon AWS API, make sure the following prerequisites are met.
- SSIS designer installed. It is sometimes referred to as BIDS or SSDT (download it from the Microsoft site).
- Basic knowledge of SSIS package development using Microsoft SQL Server Integration Services.
- Access to valid AWS credentials (Access Key and Secret Key for your IAM User). Click here to learn more about IAM users and Access Key/Secret Key.
- Make sure SSIS PowerPack is installed. Click here to download.
Step-By-Step Example-1 (Call AWS API)
Now let's make a simple GET API call using the SSIS REST API Task.
- Install SSIS PowerPack (skip this step if you have already installed SSIS PowerPack).
- Open Visual Studio and create a new Integration Services Project.
- Open the SSIS Package and check your SSIS Toolbox. You will see many tasks/components starting with ZS.
- From the Control Flow SSIS Toolbox, drag & drop the ZS REST API Task.
- Double click the REST API Task to configure it.
- On the REST API Task, change the URL Access Mode drop-down to URL from Connection.
- Now in the connection drop-down click New ZS-OAUTH connection and configure the connection as below.
- Once you see the OAuth connection dialog box, change Provider Type from Custom to Amazon AWS API (v4).
- In ClientId enter your AWS Access Key.
- In ClientSecret enter your AWS Secret Key.
- Click OK to save the connection.
- On the REST API Task change a few more settings as below.
- Enter the API URL you would like to call. In our case we will use the S3 (Simple Storage Service) API. We assume you have the ListBucket permission to make this call. If you don't have that permission, try the full path of a file instead (choose a small file). You have to tweak the API URL to adjust the service type, region, bucket and path.
    https://s3.us-east-1.amazonaws.com
    -- OR -- use the URL below (list files) if you have permission on a single bucket only
    https://s3.us-east-1.amazonaws.com/YOUR-BUCKET
- Click Test Request. If you have valid permissions and the setup looks OK, you will see the Response window. Using this technique you can call any API to execute AWS operations (e.g. start an EC2 VM, create an SQS queue, call a Lambda function, drop or update a resource).
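For readers who want to see how the URL in this example is put together, here is a minimal Python sketch. It only assembles a path-style S3 URL from region, bucket and path. `build_s3_url` is a hypothetical helper written for illustration, not part of SSIS PowerPack, and it does not sign or send the request.

```python
from urllib.parse import quote


def build_s3_url(region, bucket=None, key=None):
    """Return a path-style S3 URL (hypothetical helper, for illustration only)."""
    url = f"https://s3.{region}.amazonaws.com"
    if bucket:
        # Bucket name becomes the first path segment.
        url += "/" + quote(bucket, safe="")
        if key:
            # Keep "/" so folder-like prefixes stay readable; escape the rest.
            url += "/" + quote(key, safe="/")
    return url


print(build_s3_url("us-east-1"))
print(build_s3_url("us-east-1", "YOUR-BUCKET"))
print(build_s3_url("us-east-1", "YOUR-BUCKET", "some folder/file.csv"))
```

The first two calls reproduce the two URLs used in this example. The third shows that characters such as spaces in the path need percent-encoding.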
Step-By-Step Example-2 (Loading data from AWS API to SQL Server)
Now let's look at a more interesting scenario. We will call the AWS S3 API to get the list of files in a bucket, and then save that list to a SQL Server table. Since the Amazon S3 API is XML based, we will use the ZappySys SSIS XML Source. For a JSON based API use the JSON Source instead. Both XML Source and JSON Source can parse the API response into rows and columns so you can easily store it in SQL Server. Now let's see how to do this.
- From Control flow SSIS Toolbox Drag & drop Data Flow Task
- Double click Data Flow Task. From SSIS Toolbox Drag & drop ZS SSIS XML Source.
- Double click the XML Source to configure it.
- In the URL text box enter an API URL like the one below to list the S3 files in the specified bucket (change YOUR-BUCKET to your own bucket name)
    https://s3.us-east-1.amazonaws.com/YOUR-BUCKET
- Now check Use Credentials and select the same Amazon API connection we created in the previous example.
- Click Select Filter. This step allows us to flatten the XML hierarchy. Select the node shown with the Array icon. If prompted to treat the selected node as an array, click Yes.
- Click Preview to see your data. Click OK to save.
- Now attach your XML Source to a target such as OLEDB Destination to load data into SQL Server or another target (e.g. Oracle, MySQL).
- Execute SSIS Package to load data from Amazon AWS API to SQL Server.
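To make the "flatten the XML hierarchy" step more concrete, here is a small Python sketch of the same idea outside SSIS. It turns the repeating Contents node of an S3 list response into flat rows. The sample response is made up and shortened for illustration, and `list_bucket_rows` is a hypothetical helper. The XML Source does this through the filter you select in the UI.

```python
import xml.etree.ElementTree as ET

# Namespace used by S3 list responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}

# Made-up, shortened sample of a list-bucket response (illustration only).
SAMPLE_RESPONSE = """<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>YOUR-BUCKET</Name>
  <Contents>
    <Key>data/file1.csv</Key>
    <Size>1024</Size>
  </Contents>
  <Contents>
    <Key>data/file2.csv</Key>
    <Size>2048</Size>
  </Contents>
</ListBucketResult>"""


def list_bucket_rows(xml_text):
    """Flatten each <Contents> element (the array node) into one row."""
    root = ET.fromstring(xml_text)
    bucket = root.findtext("s3:Name", namespaces=S3_NS)
    rows = []
    for item in root.findall("s3:Contents", S3_NS):
        rows.append({
            "Bucket": bucket,  # parent value repeated on every row
            "Key": item.findtext("s3:Key", namespaces=S3_NS),
            "Size": int(item.findtext("s3:Size", namespaces=S3_NS)),
        })
    return rows


for row in list_bucket_rows(SAMPLE_RESPONSE):
    print(row)
```

Each row carries the parent bucket name plus the fields of one file, which is the shape you would load into a SQL Server table.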
File Upload Example – Low level API – Call PUT request
There will be times when you want total control over your AWS API calls. One example is uploading / writing data to S3: components like the ZappySys Amazon S3 CSV Destination or Amazon Storage Task might need additional permissions such as HeadObject. If you only have write permission on the bucket, the task will fail. In that case you can use the REST API Task as shown below.
Basically, we exported data from a relational database to a CSV file using the Export CSV File Task and then uploaded the file content using the REST API Task.
Upload Text File to S3 (i.e. JSON, CSV, XML …)
Your URL format can look like this. Region codes can be found here
    https://YOUR-BUCKET.s3.YOUR-REGION.amazonaws.com/YOUR-FILE
    https://YOUR-BUCKET.s3.YOUR-REGION.amazonaws.com/SOME-FOLDER/SOME-SUB-FOLDER/YOUR-FILE
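As a rough sketch of what such an upload looks like at the HTTP level, the Python snippet below prepares (but does not send) a PUT request using the URL format above. `build_put_request` is a hypothetical helper and the Content-Type value is an assumption. The request is unsigned, so in practice the AWS v4 signature that the OAuth connection adds would still be needed before S3 accepts it.

```python
from urllib.parse import quote
from urllib.request import Request


def build_put_request(bucket, region, key, text, content_type="text/csv"):
    """Prepare an unsigned PUT request for a text file (illustration only)."""
    url = f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"
    body = text.encode("utf-8")  # file content goes in the request body
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": content_type})


req = build_put_request("YOUR-BUCKET", "us-east-1",
                        "SOME-FOLDER/customers.csv", "id,name\n1,Bob\n")
print(req.get_method(), req.full_url)
print(len(req.data), "bytes in body")
```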
Upload Binary File to S3 (i.e. Zip, mp3, gzip, png, jpeg…)
The above method only works for text files. If you have binary files (e.g. Zip, mp3, png, jpeg) then you can use the workaround below.
- Check the Is Multi Part / File Upload option next to the Body editor (read this post for more info)
- In the Body enter the file path like below
    @c:\folder\some-file.xyz
- That's it. Your binary file can now be uploaded to S3 the same way as the text file we uploaded in the earlier section.
NOTE: The binary file option only works in builds uploaded after 5/21/2020.
Debugging AWS API Command Line Requests using Fiddler (Web Proxy)
Before we see more examples of calling the AWS API, let's first learn how to capture the request data sent by the aws command line (CLI). We will use Fiddler to capture AWS API requests.
- Download and Install Fiddler (Free Tool)
- Install aws command line
- Open a command prompt and type the aws configure command to set credentials. For more info see configure aws credentials. See the example below.
    c:\> aws configure
    AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
    AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
    Default region name [None]: us-west-2
    Default output format [None]: json
- Once you set the credentials, launch Fiddler
- In Fiddler, go to Tools > Fiddler Options > HTTPS and check Decrypt HTTPS Traffic
- Once you do that you may be asked to trust the Fiddler certificate. Click OK
- Close and reopen Fiddler to apply the setting
- Now any command you type in the aws command line will show up in Fiddler. We can reuse the captured request in the SSIS REST API Task or XML Source to call virtually any API that AWS supports.
- For example, to call a Lambda function (supplying input JSON data from a file) use the command below and watch the Fiddler trace. Notice that we added the --no-verify-ssl option so we can see requests in a custom web proxy like Fiddler (this option trusts the Fiddler certificate).
    c:\>aws lambda invoke --function-name HelloWorld c:\temp\outputfile.txt --no-verify-ssl --payload file://c://test/customer.json
- Once you have this information you can use it inside ZappySys components that support API calls (e.g. REST API Task, JSON Source, XML Source). For this example we will use the REST API Task to call the same Lambda function. The things to change to call any API are the URL, Method, ContentType and Body. If it is a GET call then there is no Body.
- That's it. You can now take this same concept and call virtually any AWS API right inside SSIS without any SDK or command-line tools.
Call API Gateway Endpoint (Default URL)
If you wish to call an API hosted on the Amazon API Gateway service, enter its URL directly. You have to use the OAuth Connection (AWS v4 Provider) as in the previous section.
A typical URL may look like the one below if you are calling an API Gateway endpoint.
    https://c5hhigf5mh.execute-api.us-east-1.amazonaws.com/prod/pets
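Note that a default AWS URL like the one above carries both the service name and the region in its host name. The sketch below shows one simple way to read them out. It is a heuristic written for illustration, it is not necessarily how the ZappySys connection detects them, and not every AWS endpoint follows this pattern. `infer_service_and_region` is a hypothetical helper.

```python
from urllib.parse import urlparse


def infer_service_and_region(url):
    """Guess (service, region) from a host like <...>.<service>.<region>.amazonaws.com."""
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    # Need at least service + region + "amazonaws" + "com".
    if len(labels) < 4 or labels[-2:] != ["amazonaws", "com"]:
        return None, None
    return labels[-4], labels[-3]


print(infer_service_and_region(
    "https://c5hhigf5mh.execute-api.us-east-1.amazonaws.com/prod/pets"))
print(infer_service_and_region("https://s3.us-east-1.amazonaws.com/YOUR-BUCKET"))
print(infer_service_and_region("https://api.example.com/pets"))
```

The last call uses a made-up custom domain and returns no service or region. In that situation both values have to be supplied by hand.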
Call AWS API with Region and Custom Service Name
There will be times when you have to call an AWS API URL that does not indicate the region / service, yet both are needed as part of the signature used for authentication. In this case just enter the AWS Service and Region along with the ClientID and Secret fields. The Custom Service and Region attributes were introduced in v3.0, so if you do not see them on the OAuth Connection UI you are probably running an older version.
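To see why the region and service matter, here is a minimal Python sketch of the signing-key derivation step of AWS Signature Version 4. It covers only this one step (not the canonical request or the final Authorization header), and the secret, date, region and service values are placeholders chosen for illustration.

```python
import hashlib
import hmac


def _hmac_sha256(key, msg):
    """HMAC-SHA256 of a text message with a bytes key."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key, date_stamp, region, service):
    """Derive the SigV4 signing key: secret -> date -> region -> service -> aws4_request."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


# Placeholder secret (the documentation-style example key), not a real credential.
secret = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
key_a = derive_signing_key(secret, "20200521", "us-east-1", "execute-api")
key_b = derive_signing_key(secret, "20200521", "us-west-2", "execute-api")
print(key_a.hex())
print(key_a != key_b)  # a different region yields a different key
```

Because region and service are mixed into the key, a request signed with the wrong values is rejected even when the Access Key and Secret Key are correct.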
Call Amazon Athena API
Check this article for detailed instructions
Call AWS EC2 API
Here is an example of calling the EC2 API. This example lists EC2 instances.
Request:
    POST https://ec2.us-east-1.amazonaws.com/
    Content-Type: application/x-www-form-urlencoded; charset=utf-8

    Action=DescribeInstances&Version=2016-11-15
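The body of this request is ordinary form encoding. As a small sketch, the Python below rebuilds the same request (unsigned and not sent) so you can see how the Action and Version parameters end up in the body. `build_ec2_request` is a hypothetical helper. Any extra parameters you pass are your own responsibility to check against the EC2 API reference.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_ec2_request(region, action, version="2016-11-15", extra=None):
    """Prepare an unsigned EC2 Query API request (illustration only)."""
    params = {"Action": action, "Version": version}
    params.update(extra or {})
    body = urlencode(params).encode("ascii")  # e.g. Action=...&Version=...
    return Request(
        f"https://ec2.{region}.amazonaws.com/",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=utf-8"},
    )


req = build_ec2_request("us-east-1", "DescribeInstances")
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```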
Call AWS Lambda API
Here is an example of calling a Lambda function.
Request:
    POST https://lambda.YOUR-REGION.amazonaws.com/2015-03-31/functions/YOUR-FUNCTION/invocations

    {you-json-input-goes-in-body}
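For comparison with the command-line call captured earlier, this short Python sketch builds the same kind of invocation request, with the URL taken from the region and function name and the JSON input placed in the body. The payload fields are made up, `build_lambda_invoke_request` is a hypothetical helper, the Content-Type header is an assumption, and the request is neither signed nor sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_lambda_invoke_request(region, function_name, payload):
    """Prepare an unsigned Lambda Invoke request (illustration only)."""
    url = (f"https://lambda.{region}.amazonaws.com/2015-03-31/functions/"
           f"{quote(function_name, safe='')}/invocations")
    body = json.dumps(payload).encode("utf-8")  # JSON input goes in the body
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


req = build_lambda_invoke_request("us-east-1", "HelloWorld",
                                  {"customerId": 1, "name": "Bob"})
print(req.full_url)
print(req.data.decode("utf-8"))
```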
Call AWS ElasticSearch
If you are using AWS-hosted managed ElasticSearch, you can also use the OAuth AWS v4 Provider as below. This assumes you have configured the correct policy to allow your IAM User account / IP address.
Call AWS API using Native Task/Components
ZappySys provides many High quality Tasks / Components for AWS integration. See below list.
Amazon AWS Cloud Integration | ||
---|---|---|
Amazon S3 Task | Amazon Redshift Data Transfer Task | Amazon Redshift ExecuteSql Task |
Amazon Redshift Cluster Management Task | Amazon DynamoDB Source | Amazon DynamoDB Destination |
Amazon Redshift Source | Amazon SQS Queue Source | Amazon SQS Queue Destination |
Conclusion
Amazon AWS Cloud integration from SSIS packages is becoming a more and more common scenario. ZappySys provides easy-to-use, no-coding connectors for many otherwise time-consuming scenarios. A clean drag and drop approach is not only faster but also more secure, because it builds on the SSIS framework. Try SSIS PowerPack to explore many other scenarios not discussed in this article.