Remarks
Amazon DynamoDB API doesn't support SQL Queries like traditional RDBMS so we have implemented light weight parser which takes your SQL Queries and turn into Native DynamoDB query. Here is simple example
SELECT TOP 2 * FROM Customer WHERE (CustomerID='ALFKI' OR CustomerID='BOLID')
SELECT Age, CustomerID FROM Customer WHERE CustomerID IN ('ALFKI','BOLID','C3')
DynamoDB SQL Query Language
If you work with traditional RDBMS and you recently come to NoSQL world you will quickly realize that most of NoSQL database including
DynamoDB is missing familiar SQL query language.
But no worry we got you covered. We have implemented custom query parser which will do hard work for you so you don't have to spend time learning new query language for DynamoDB. Our query engine
will
convert SQL query to Native DynamoDB query language.
SQL Grammar for DynamoDB:
<Select Statement> ::= SELECT [{TOP Clause}] {Column List} {From Clause} [{Where Clause}] [{With Clause}]
<TOP Clause> ::= TOP IntegerLiteral
<Column List> ::= * | IDENTIFIER [ [,IDENTIFIER] [,IDENTIFIER] ... ]
<From Clause> ::= FROM IDENTIFIER
<Where Clause> ::= WHERE <Expression List>
<With Clause> ::= WITH ( [SCAN|QUERY] [,INDEX=your_index_name] [,LIMIT=some_integer] [,SCAN_INDEX_FORWARD=true|false] [,CONSISTENT_READ] )
! =============================================================================
! Expressions
! =============================================================================
! Mixing AND, OR not allowed. Just one type of Logical Operator can be used across multiple conditions in your filter
<Expression List> ::=
| <Expression> [[AND <Expression>] [AND <Expression>] ...]
| <Expression> [[OR <Expression>] [OR <Expression>] ...]
<Expression> ::= IDENTIFIER = <Value>
<Expression> ::= IDENTIFIER != <Value>
<Expression> ::= IDENTIFIER <> <Value>
<Expression> ::= IDENTIFIER > <Value>
<Expression> ::= IDENTIFIER >= <Value>
<Expression> ::= IDENTIFIER < <Value>
<Expression> ::= IDENTIFIER <= <Value>
<Expression> ::= IDENTIFIER IN (<Value> [[, <Value>] [, <Value>] ...])
<Expression> ::= IDENTIFIER BETWEEN <Value> AND <Value>
<Expression> ::= IDENTIFIER LIKE <Value>
<Expression> ::= IDENTIFIER NOT LIKE <Value>
<Expression> ::= IDENTIFIER CONTAINS <Value>
<Expression> ::= IDENTIFIER IS NULL
<Expression> ::= IDENTIFIER IS NOT NULL
<Value> ::= IntegerLiteral | StringLiteral
Supported Operators:
- AND
- OR
- LIKE
- NOT LIKE
- IN
- BETWEEN
- = (Equal)
- !=, <> (Not Equal)
- < (Less than)
- > (Greater than)
- <= (Less than or equal to)
- >= (Greater than or equal to)
Supported Options for WITH clause:
Option |
Description |
SCAN |
A Scan operation reads every item in a table or a secondary index. By default, a Scan operation returns all of the data attributes for every item in the table or index. You can specify column list so that Scan only returns some of the attributes, rather than all of them.
Scan always returns a result set. If no matching items are found, the result set will be empty.
A single Scan request can retrieve a maximum of 1 MB of data; DynamoDB can optionally apply a filter expression to this data, narrowing the results before they are returned to the user.
|
QUERY (This is default) |
A Query operation finds items in a table or a secondary index using only primary key attribute values.
You must provide a hash key attribute name and a distinct value to search for. You can optionally provide a range key attribute name and value,
and use a comparison operator to refine the search results. By default, a Query operation returns all of the data
attributes for items with the specified primary key(s); however, you can use the select column list rather than * so that the Query
operation only returns some of the attributes, rather than all of them.
These are the possible ways to Query DynamoDB table:
- By Hash Column
- By Hash Column + Range Column
- By Hash Column + Local Index Column
- By Global Index Column
- By Global Index Column + Range Index
Read here to learn more about DynamoDB Indexing consideration
In a Query operation, you use one or more filter conditions to determine the items to be read from the table or index. You must specify the hash key attribute name and value as an equality condition. You can optionally provide a second condition for the range key attribute (if present). The range key condition must use one of the following comparison operators:
-
a = b - true if the attribute a is equal to the value b
-
a < b - true if a is less than b
-
a <= b - true if a is less than or equal to b
-
a > b - true if a is greater than b
-
a >= b - true if a is greater than or equal to b
-
a BETWEEN b AND c - true if a is greater than or equal to b , and less than or equal to c .
The following function is also supported:
|
LIMIT |
The maximum number of items to evaluate (not necessarily the number of matching items). If DynamoDB processes the number of items up to the limit while processing the results, it stops the operation and returns the matching values up to that point, and a key in LastEvaluatedKey to apply in a subsequent operation, so that you can pick up where you left off. Also, if the processed data set size exceeds 1 MB before DynamoDB reaches this limit, it stops the operation and returns the matching values up to the limit, and a key in LastEvaluatedKey to apply in a subsequent operation to continue the operation. For more information, see Query and Scan in the Amazon DynamoDB Developer Guide. |
INDEX |
The name of an index to query. This index can be any local secondary index or global secondary index on the table. Note that if you use the IndexName parameter, you must also provide TableName. |
SCAN_INDEX_FORWARD [=true|false] |
Applicable when QUERY Mode is used. Determines the read consistency model: If set to true, then the operation uses strongly consistent reads; otherwise, the operation uses eventually consistent reads.Strongly consistent reads are not supported on global secondary indexes. If you query a global secondary index with ConsistentRead set to true, you will receive an error message. |
CONSISTENT_READ |
Applicable when QUERY Mode is used. Determines the read consistency model: If set to true, then the operation uses strongly consistent reads; otherwise, the operation uses eventually consistent reads.Strongly consistent reads are not supported on global secondary indexes. If you query a global secondary index with ConsistentRead set to true, you will receive an error message. |
Understanding Scan vs Query operation:
By default every request will be treated as QUERY operation to reduce the cost associated with READ operation. Amazon provided some guidelines here for
Difference between DynamoDB QUERY vs Scan and Cost calculation
Some points to remember for
DynamoDB QUERY vs SCAN operation
- QUERY operation is faster operation and incurs lower cost compared to SCAN
- QUERY operation must include HASH KeyColumn in one or more conditions in your filter, you can optionally also include RANGE KeyColumns or Index Columns
- SCAN is slower operation and incurs higher cost compared to QUERY
- SCAN operation scans your table row by row to match your condition from filter, so be careful if you have very large table. Minimum data scan in each request is 1MB (unless you have table less than 1MB data)
- SCAN operation can include any column in Filter condition unlike QUERY operation where HASH KeyColumn must present
- Use query HINTS in WITH clause to indicate SCAN or QUERY operation. If you don't specify it then QUERY operation will be assumed in which you must include HASH KeyColumn in Condition along with Range or Index KeyColumns
Using Date/Time in your DynamoDB Query:
There is no Datetime datatype in DynamoDB but if you are storing your date values in
ISO 8601 DateTime Format as string then your query will understand automatically if you specify
>, >=, <, <= or BETWEEN Filter Condition
select * from Orders where OrderDate > '2015-12-31T23:59:59.000Z' WITH(SCAN)
Using variable in Date/Time query
If you want to make your query dynamic then you can use SSIS variable placeholder (use quote around it). If your variable DataType is datetime then you can format it using following way so it has ISO date format. If your variable DataType is String then make sure its correctly formatted.
select * from Orders where OrderDate > '{{User::vMyDate,yyyy-MM-ddTHH:mm:ssZ}}' WITH(SCAN)
Click here to see various ISO date formats
Specifying Options in WITH Clause of DynamoDB Query:
-
Specify SCAN mode using SCAN option
select * from mytable where SomeColumn > 55 WITH(SCAN)
-
Specify SCAN mode with LIMIT of 1000 rows
select * from mytable where SomeColumn > 55 WITH(SCAN, LIMIT=1000)
-
Specify QUERY mode using QUERY option
select * from mytable where MyHashKeyColumn = 'XYZ' And MyRangeKey > '1223' WITH(QUERY)
-
Specify QUERY mode using QUERY option, and use of INDEX
select * from mytable where MyHashKeyColumn = 'XYZ' And MyRangeKey > '1223' and Phone LIKE '111-%' WITH(QUERY, INDEX=[index-phone-number])
Names with space or other special characters:
Use brackets around identifier if it has space or dash or dot.
select * from mytable where MyHashKeyColumn = 'XYZ' And MyRangeKey > '1223' and Phone LIKE '111-%' WITH(QUERY, INDEX=[index-phone-number])
Amazon DynamoDB Query Examples:
-- Selecting all columns
select * from CustomerTest
-- Selecting multiple columns
select CustomerID,CompanyName from CustomerTest
-- Selecting multiple columns, where some attribute name has space in it
select CustomerID,[Contact Title] from CustomerTest
-- Find records where City attribute is missing (Missing is different than actual NULL value {City: null, Id:1} -vs- {Id:1} )
select * from CustomerTest Where City IS NULL
-- Find records where City attribute is NULL (i.e. attribute is found but value is NULL e.g. {City: null, Id:1} )
select * from CustomerTest Where City = NULL
-- Using Where clause with simple condition, select all columns
select * from CustomerTest Where CustomerID='ALC3R'
-- Limiting records returned from query using TOP clause (Similar as LIMIT found in some RDBMS query engine)
select top 7 * from CustomerTest
-- Using OR
select * from CustomerTest Where CustomerID = 'ALFKI' OR CustomerID = 'BOLID'
-- Using AND
select
* from CustomerTest Where CustomerID = 'ALFKI' AND Age >
3
-- Using comparison operators
select * from CustomerTest Where CustomerID <> 'ALFKI'
select * from CustomerTest Where CustomerID != 'ALFKI'
select * from CustomerTest Where Age > 5
select * from CustomerTest Where Age >= 5
select * from CustomerTest Where Age < 5
select * from CustomerTest Where Age = 5 AND CustomerID = 'C5'