Data Modelling with AWS DynamoDB

What's a NoSQL Database?

It's very different from a relational database. It's not a set of tables that we join data across, and it's not a table where each row is restricted by a schema.

I'm going to show how we can use a single DynamoDB table to index an object collection. We're going to put all data items into one table. One table, because a NoSQL database is schema-less, meaning that it doesn't enforce a schema for the items. The only constraint DynamoDB enforces is that each item has a unique primary key.

Primary key (Item Identifier)

There are two types of primary keys in DynamoDB:

Partition key: This is a simple primary key where no two items can have the same partition key value.

Composite primary key: A combination of a partition key and a sort key. Two items may have the same partition key value, but those items must then have different sort key values.

When we use a partition key alone as the primary key, it identifies a single item. When we use a composite key, the partition key defines the container and the sort key defines the uniqueness of the item within that partition (similar to folders and files in a file system).
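The folder-and-file analogy can be sketched with a toy in-memory model. This is not DynamoDB code: the class name, the `folder-`/`file-` values and the `note` attribute are made up purely for illustration.

```python
from collections import defaultdict


class CompositeKeyTable:
    """Toy in-memory model of a table with a composite primary key."""

    def __init__(self):
        # partition key -> {sort key -> item}
        self._partitions = defaultdict(dict)

    def put_item(self, item):
        # Writing the same (PK, SK) pair again replaces the earlier item.
        self._partitions[item["PK"]][item["SK"]] = item

    def query(self, pk):
        # Items of one partition come back ordered by sort key.
        partition = self._partitions.get(pk, {})
        return [partition[sk] for sk in sorted(partition)]


table = CompositeKeyTable()
table.put_item({"PK": "folder-a", "SK": "file-2", "note": "first version"})
table.put_item({"PK": "folder-a", "SK": "file-1"})
table.put_item({"PK": "folder-b", "SK": "file-1"})  # same SK, other partition
table.put_item({"PK": "folder-a", "SK": "file-2", "note": "replaced"})

folder_a = table.query("folder-a")
print([item["SK"] for item in folder_a])  # ['file-1', 'file-2']
```

The same sort key can appear in two partitions, but a second write to an existing (PK, SK) pair overwrites the first.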

In this case, the combination of the partition key and the sort key needs to be unique. Later I will show how to model "one-to-many" and "many-to-many" relationships using these keys. The sort key also gives you the ability to query the data within a given partition based on a condition that the sort key must meet. The supported sort key conditions are:

  1. =, <, >, >=, <=
  2. "begins with"
  3. "between"

Note that "in" is not available as a sort key condition; it can only be used in a filter expression. Because the items in a partition are stored in sort key order, a query can also give you:

  1. sorted results
  2. counts
  3. top/bottom N values
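As a rough sketch, these options map onto parameters of DynamoDB's Query operation. The helper below only builds the request dictionaries in the low-level format. The helper name `build_query` is made up, `Data`, `PK` and `SK` follow the table modelled later in this article, and the `PRODUCT#` sort key values are assumptions for illustration.

```python
def build_query(pk, sk_condition=None, sk_values=None, **options):
    """Return keyword arguments shaped for the low-level Query operation."""
    expression = "PK = :pk"
    values = {":pk": {"S": pk}}
    if sk_condition:
        expression += " AND " + sk_condition
        values.update(sk_values or {})
    return {
        "TableName": "Data",
        "KeyConditionExpression": expression,
        "ExpressionAttributeValues": values,
        **options,
    }


# Prefix match on the sort key.
by_prefix = build_query(
    "ORDER#789", "begins_with(SK, :p)", {":p": {"S": "PRODUCT#"}}
)

# Inclusive range on the sort key.
by_range = build_query(
    "ORDER#789",
    "SK BETWEEN :lo AND :hi",
    {":lo": {"S": "PRODUCT#100"}, ":hi": {"S": "PRODUCT#199"}},
)

# "Top N": read the partition in descending sort key order, stop after 3.
top_three = build_query("ORDER#789", ScanIndexForward=False, Limit=3)

# Count only: return the number of matching items instead of the items.
count_only = build_query("ORDER#789", Select="COUNT")

print(by_prefix["KeyConditionExpression"])
```

Assuming boto3 is installed and credentials are configured, each dictionary could be passed on as `boto3.client("dynamodb").query(**by_prefix)`.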

At this point, let's look at an example of how we can model the following ERD in DynamoDB.

When modelling our data with DynamoDB, we need to know the relationships between the entities and how we are going to access the data. It is helpful to know this upfront so that we can model and organise our data for efficiency. Our application is a simple order system, where a customer orders one or more products. The most common access patterns for this type of application are:

  • Get all customers
  • Get customer by id
  • Get all orders for a customer
  • Get order by id, including order lines
  • Get all products
  • Get product by id

OK, now that we have an idea of all the relationships and all the access patterns, let's model the way we store the data using NoSQL Workbench.

We typically model our standalone entities first, so let's model our customers and products.

Within NoSQL Workbench, create a table named "Data". We also define a composite primary key, with PK as the partition key and SK as the sort key. We use generic key names (PK, SK) because we will be storing data of different types.
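For readers who prefer code to the Workbench UI, the same table could be described with CreateTable parameters along these lines. This is a sketch: the string key type and the on-demand billing mode are assumptions, since the article does not specify them.

```python
# Parameters shaped for the low-level CreateTable operation.
create_table_params = {
    "TableName": "Data",
    "AttributeDefinitions": [
        {"AttributeName": "PK", "AttributeType": "S"},  # assumed string keys
        {"AttributeName": "SK", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "PK", "KeyType": "HASH"},   # partition key
        {"AttributeName": "SK", "KeyType": "RANGE"},  # sort key
    ],
    "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
}

key_types = {
    key["AttributeName"]: key["KeyType"]
    for key in create_table_params["KeySchema"]
}
print(key_types)  # {'PK': 'HASH', 'SK': 'RANGE'}
```

With boto3 available, this could be passed as `boto3.client("dynamodb").create_table(**create_table_params)`.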

Let's insert some data into the model to illustrate how the items will live together in the table.
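As a stand-in for the table view, here is a sketch of what such items might look like. Only the keys `CUSTOMER#456` and `PRODUCT#124` and the `entity_type` attribute come from this article; the other ids, `name` and `price` are hypothetical. The helper converts plain dictionaries into the attribute-value format that the low-level PutItem operation expects.

```python
# Hypothetical sample data: customers and products side by side in one table.
items = [
    {"PK": "CUSTOMER#456", "SK": "CUSTOMER#456", "entity_type": "CUSTOMER",
     "name": "Example Customer"},
    {"PK": "CUSTOMER#457", "SK": "CUSTOMER#457", "entity_type": "CUSTOMER",
     "name": "Another Customer"},
    {"PK": "PRODUCT#124", "SK": "PRODUCT#124", "entity_type": "PRODUCT",
     "name": "Example Product", "price": 25},
]


def to_attribute_values(item):
    """Convert a plain dict into the low-level attribute-value format."""
    converted = {}
    for name, value in item.items():
        if isinstance(value, str):
            converted[name] = {"S": value}
        elif isinstance(value, int) and not isinstance(value, bool):
            converted[name] = {"N": str(value)}  # numbers are sent as strings
        else:
            raise TypeError(f"unsupported type for {name!r}")
    return converted


put_requests = [
    {"TableName": "Data", "Item": to_attribute_values(item)} for item in items
]

# Every (PK, SK) pair must be unique within the table.
assert len({(i["PK"], i["SK"]) for i in items}) == len(items)
print(put_requests[2]["Item"]["price"])  # {'N': '25'}
```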

Let's now add some order information to our table.

By using the same partition key, ORDER#789, all the related items are stored in one partition, and a single query retrieves every item related to that order. I like to use an id as the sort key to show the reference back to the related item.

Get order by id

Table: Data
PK=ORDER#789
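To make "including order lines" concrete, the sketch below assembles one order from the items such a query might return. The item layout is an assumption, since the original table view is not reproduced here: one item whose SK references the customer, plus one item per order line whose SK references a product, with a hypothetical `quantity` attribute.

```python
# Assumed result of querying PK = ORDER#789 (plain dicts for readability).
order_partition = [
    {"PK": "ORDER#789", "SK": "CUSTOMER#456", "entity_type": "ORDER"},
    {"PK": "ORDER#789", "SK": "PRODUCT#124", "quantity": 2},
    {"PK": "ORDER#789", "SK": "PRODUCT#125", "quantity": 1},
]


def assemble_order(partition_items):
    """Split the items of one order partition into a header and its lines."""
    order = {"order_id": None, "customer_id": None, "lines": []}
    for item in partition_items:
        if item["SK"].startswith("CUSTOMER#"):
            # The sort key points back to the customer who placed the order.
            order["order_id"] = item["PK"]
            order["customer_id"] = item["SK"]
        elif item["SK"].startswith("PRODUCT#"):
            # Each remaining item is one order line referencing a product.
            order["lines"].append(
                {"product_id": item["SK"], "quantity": item["quantity"]}
            )
    return order


order = assemble_order(order_partition)
print(order["customer_id"], len(order["lines"]))  # CUSTOMER#456 2
```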

Querying and Filtering Data

Querying is a very powerful operation in DynamoDB. It allows you to select multiple items that have the same partition key (PK) but different sort keys (SK). Prefer key conditions over filters wherever you can: a filter is applied only after the query has read the items, so read capacity units (RCUs) are consumed even for the items that the filter then discards.
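The difference can be illustrated with a small simulation. It is not DynamoDB itself: the function and the sample items are hypothetical, but `Count` and `ScannedCount` mirror the fields of the same names in a Query response, where `ScannedCount` is the number of items read before any filter was applied.

```python
# Hypothetical contents of one partition.
partition_items = [
    {"PK": "ORDER#789", "SK": "CUSTOMER#456"},
    {"PK": "ORDER#789", "SK": "PRODUCT#124"},
    {"PK": "ORDER#789", "SK": "PRODUCT#125"},
]


def simulate_query(items, pk, sk_prefix="", filter_fn=None):
    # Key condition: decides which items are read, and therefore billed.
    read = [
        i for i in items if i["PK"] == pk and i["SK"].startswith(sk_prefix)
    ]
    # Filter: applied afterwards, it only trims what is returned.
    returned = [i for i in read if filter_fn is None or filter_fn(i)]
    return {"Items": returned, "Count": len(returned), "ScannedCount": len(read)}


with_filter = simulate_query(
    partition_items, "ORDER#789",
    filter_fn=lambda i: i["SK"].startswith("PRODUCT#"),
)
with_key_condition = simulate_query(
    partition_items, "ORDER#789", sk_prefix="PRODUCT#"
)

print(with_filter["ScannedCount"], with_filter["Count"])                # 3 2
print(with_key_condition["ScannedCount"], with_key_condition["Count"])  # 2 2
```

Both calls return the same two items, but the filtered version reads the whole partition to get them.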

One read capacity unit represents one strongly consistent read per second, or two eventually consistent reads per second, for items up to 4 KB in size. If you need to read an item that is larger than 4 KB, DynamoDB will need to consume additional read capacity units. The total number of read capacity units required depends on the item size, and whether you want an eventually consistent or strongly consistent read.

One write capacity unit represents one write per second for items up to 1 KB in size. If you need to write an item that is larger than 1 KB, DynamoDB will need to consume additional write capacity units. The total number of write capacity units required depends on the item size.
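The rounding rules above can be written down as a short calculation. This is a sketch for a single item read or written once per second; the function names are made up, and charging at least one block even for a tiny item is an assumption.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one read unit covers up to 4 KB
WRITE_UNIT_BYTES = 1 * 1024  # one write unit covers up to 1 KB


def read_capacity_units(item_size_bytes, strongly_consistent=True):
    """RCUs needed to read one item of the given size once per second."""
    blocks = max(1, math.ceil(item_size_bytes / READ_UNIT_BYTES))
    # An eventually consistent read costs half as much.
    return blocks if strongly_consistent else blocks / 2


def write_capacity_units(item_size_bytes):
    """WCUs needed to write one item of the given size once per second."""
    return max(1, math.ceil(item_size_bytes / WRITE_UNIT_BYTES))


print(read_capacity_units(3 * 1024))          # 1
print(read_capacity_units(10 * 1024))         # 3
print(read_capacity_units(10 * 1024, False))  # 1.5
print(write_capacity_units(2560))             # 3
```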

Let's illustrate the queries for some of the access patterns modelled by the table:

Get customer by id

Table: Data
PK=CUSTOMER#456, SK=CUSTOMER#456

Get product by id

Table: Data
PK=PRODUCT#124, SK=PRODUCT#124
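Because both halves of the key are known for these two lookups, one option is the GetItem operation, which takes the full primary key and returns at most one item. The sketch below only builds the request dictionaries; the helper name `entity_key` is made up.

```python
def entity_key(entity, entity_id):
    """Build the full primary key of a standalone entity (PK equals SK)."""
    value = f"{entity}#{entity_id}"
    return {"PK": {"S": value}, "SK": {"S": value}}


get_customer = {"TableName": "Data", "Key": entity_key("CUSTOMER", 456)}
get_product = {"TableName": "Data", "Key": entity_key("PRODUCT", 124)}

print(get_customer["Key"]["PK"])  # {'S': 'CUSTOMER#456'}
```

With boto3 available, `boto3.client("dynamodb").get_item(**get_customer)` would send the request.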

So how do we satisfy the other access patterns? We use global secondary indexes (GSIs). A GSI allows us to choose another combination of attributes as the primary key of the index. This time we'll choose entity_type as our partition key and PK as our sort key. Let's name this index GSI1. If we take a quick look at our model, we see the items partitioned by entity_type.

Now we can satisfy our get all access patterns:

Get all products

Index: GSI1
entity_type=PRODUCT

Get all customers

Index: GSI1
entity_type=CUSTOMER
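In API terms, GSI1 and the two "get all" queries might look roughly like this. The index definition uses the shape accepted by CreateTable; projecting all attributes is an assumption, and the helper name `get_all` is made up.

```python
# Index definition in the shape used by CreateTable's GlobalSecondaryIndexes.
# entity_type would also need an entry in the table's AttributeDefinitions.
gsi1_definition = {
    "IndexName": "GSI1",
    "KeySchema": [
        {"AttributeName": "entity_type", "KeyType": "HASH"},  # partition key
        {"AttributeName": "PK", "KeyType": "RANGE"},          # sort key
    ],
    "Projection": {"ProjectionType": "ALL"},  # assumption: copy all attributes
}


def get_all(entity_type):
    """Build Query parameters that list every item of one entity type."""
    return {
        "TableName": "Data",
        "IndexName": "GSI1",
        "KeyConditionExpression": "entity_type = :t",
        "ExpressionAttributeValues": {":t": {"S": entity_type}},
    }


all_products = get_all("PRODUCT")
all_customers = get_all("CUSTOMER")
print(all_products["ExpressionAttributeValues"])  # {':t': {'S': 'PRODUCT'}}
```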

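For our last access pattern, we need to create another GSI. This time we'll swap the SK and PK, so that SK becomes the partition key of the index and PK becomes its sort key. Let's name this index GSI2.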

Get all orders for a customer

Index: GSI2
SK=CUSTOMER#456, PK begins_with ORDER
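The effect of the swapped index can be sketched in memory. The item layout is an assumption: each order is taken to have an item whose SK holds the customer id, and the ids other than `CUSTOMER#456`, `ORDER#789` and `PRODUCT#124` are hypothetical. The request dictionary at the end shows how the same condition might be expressed for the Query operation.

```python
# Hypothetical table contents, reduced to their keys.
table_items = [
    {"PK": "CUSTOMER#456", "SK": "CUSTOMER#456"},
    {"PK": "ORDER#790", "SK": "CUSTOMER#456"},
    {"PK": "ORDER#789", "SK": "CUSTOMER#456"},
    {"PK": "ORDER#791", "SK": "CUSTOMER#457"},
    {"PK": "ORDER#789", "SK": "PRODUCT#124"},
]


def query_inverted_index(items, sk, pk_prefix):
    """Simulate the index: SK is the partition key, PK is the sort key."""
    matches = [
        i for i in items if i["SK"] == sk and i["PK"].startswith(pk_prefix)
    ]
    return sorted(matches, key=lambda i: i["PK"])


orders = query_inverted_index(table_items, "CUSTOMER#456", "ORDER")
print([o["PK"] for o in orders])  # ['ORDER#789', 'ORDER#790']

# The same access pattern as Query parameters in the low-level format.
orders_for_customer = {
    "TableName": "Data",
    "IndexName": "GSI2",
    "KeyConditionExpression": "SK = :sk AND begins_with(PK, :p)",
    "ExpressionAttributeValues": {
        ":sk": {"S": "CUSTOMER#456"},
        ":p": {"S": "ORDER"},
    },
}
```

The `begins_with` condition keeps the customer's own item (PK=CUSTOMER#456) out of the result.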

And that is it. All our access patterns are now satisfied, and we've illustrated how to model relational data in DynamoDB.
