Local Project Development With DynamoDB

08 Mar 2019, 00:00

DynamoDB Core Concepts

First, a little theory behind DynamoDB and NoSQL. DynamoDB is not built around a SQL-like query language, and how it stores data is fundamentally different from a relational database. Data is stored in the form of JSON-like documents, and “fields” can be arbitrary (more on this later). DynamoDB has 3 core components:

Tables - A table is a collection of our JSON documents. The data in a table will share common attributes (but we are not limited to pre-defined attributes).
Items - Each table will contain zero or more items. If you’re coming from a SQL background, an item is roughly equivalent to a row. It is a single group of attributes that is uniquely identifiable among all other items.
Attributes - An attribute is a fundamental element of data. An item in a cars table will have attributes like year, make, and model. An attribute can be thought of as a field (column) in SQL.
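To make the table/item/attribute relationship concrete, here is a minimal Python sketch that models a table as a list of items, where each item maps attribute names to typed values in the same {"S": ...} style used by the put-item command later in this post. The Cars data and the helper names (to_attribute_value, to_item) are hypothetical and exist only for illustration; only three of DynamoDB's attribute types are covered.

```python
import json


def to_attribute_value(value):
    """Wrap a plain Python value in a DynamoDB-style type descriptor."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return {"BOOL": value}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        # Numbers are carried as strings inside the "N" descriptor.
        return {"N": str(value)}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def to_item(attributes):
    """Convert a plain dict into an item (attribute name -> typed value)."""
    return {name: to_attribute_value(value) for name, value in attributes.items()}


# A hypothetical "Cars" table: items share common attributes, but the second
# item carries an extra attribute that the first one does not have.
cars_table = [
    to_item({"Make": "Toyota", "Model": "Corolla", "Year": 2015}),
    to_item({"Make": "Ford", "Model": "F-150", "Year": 2018, "Electric": False}),
]

print(json.dumps(cars_table, indent=2))
```

Note that nothing forces the two items to have the same set of attributes, which is the flexibility described above.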

Installing DynamoDB

Installing DynamoDB locally is simple (a Java runtime is required). First, download the DynamoDB Local archive, which contains the .jar file, from the following location:

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBLocal.DownloadingAndRunning.html

To start DynamoDB, run the following command from the directory that contains DynamoDBLocal.jar:

java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar -sharedDb

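If you do not already have your aws-cli set up with an existing AWS account, we will have to add some dummy info before we can access our new DynamoDB instance locally.

See the following command and output for an offline dummy config. The access key ID and secret access key can be any non-empty placeholder values, since DynamoDB Local does not validate them: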

$ aws configure
AWS Access Key ID [None]: 
AWS Secret Access Key [None]: 
Default region name [None]: us-east-2
Default output format [None]: json

By default, our DynamoDB instance is bound to port 8000. We can list our tables with the following aws-cli command (this will return an empty TableNames list, since we have not created any tables yet!).

aws dynamodb list-tables --endpoint-url http://localhost:8000

Adding Data

So we installed DynamoDB and confirmed it's working, but how do we use it? This is where the aws-cli comes in. It will be our interface to DynamoDB. Since DynamoDB is a NoSQL database, it does not use traditional SQL statements. Instead, we will create a table with the following command-line syntax.

aws dynamodb create-table \
    --table-name Users \
    --attribute-definitions \
        AttributeName=Name,AttributeType=S \
        AttributeName=Title,AttributeType=S \
    --key-schema AttributeName=Name,KeyType=HASH AttributeName=Title,KeyType=RANGE \
    --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1 \
    --endpoint-url http://localhost:8000

You can see that the previous command created a table named “Users” with two attributes, Name and Title, both with the type set to string (S). Together, these will act as our primary key.

Primary Keys

The purpose of a DynamoDB primary key is to uniquely identify each item in a table. Here we created what’s called a composite primary key. This type of key is composed of two attributes. The first attribute is the partition key (Name, declared with KeyType=HASH), and the second attribute is the sort key (Title, declared with KeyType=RANGE). DynamoDB uses the partition key value as input to an internal hash function. The output from the hash function determines the partition (physical storage internal to DynamoDB) in which the item will be stored. All items with the same partition key value are stored together, in sorted order by sort key value.
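The partition/sort behaviour described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how DynamoDB is actually implemented: its real hash function and partition layout are internal, so MD5, the fixed NUM_PARTITIONS value, and the ToyTable class are all stand-ins invented for illustration.

```python
import hashlib

NUM_PARTITIONS = 4  # hypothetical; DynamoDB manages partitions internally


def partition_for(partition_key):
    """Map a partition key to a partition index (MD5 is only a stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


class ToyTable:
    """Toy table keyed on Name (partition key) and Title (sort key)."""

    def __init__(self):
        # One dict per partition: (Name, Title) -> item
        self.partitions = [dict() for _ in range(NUM_PARTITIONS)]

    def put_item(self, item):
        # The composite key is unique, so writing the same key again
        # replaces the existing item instead of adding a second one.
        key = (item["Name"], item["Title"])
        self.partitions[partition_for(item["Name"])][key] = item

    def query(self, name):
        # Only the single partition chosen by the hash needs to be searched,
        # and matching items come back ordered by sort key.
        partition = self.partitions[partition_for(name)]
        matches = [item for (pk, _), item in partition.items() if pk == name]
        return sorted(matches, key=lambda item: item["Title"])


table = ToyTable()
table.put_item({"Name": "Jane Doe", "Title": "President"})
table.put_item({"Name": "Jane Doe", "Title": "Advisor"})
table.put_item({"Name": "John Smith", "Title": "Engineer"})

print([item["Title"] for item in table.query("Jane Doe")])
```

The point of the sketch is that a lookup by partition key touches one partition only, which is why choosing a partition key that spreads items evenly matters.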

Now let's add a user!

aws dynamodb put-item \
    --table-name Users \
    --item \
        '{"Name": {"S": "Jane Doe"}, "Title": {"S": "President"}, "Email": {"S": "[email protected]"}}' \
    --return-consumed-capacity TOTAL \
    --endpoint-url http://localhost:8000

If you compare the command we used to create our table with the one we used to add our user, you will see that we never defined an attribute for the user's email. This is because DynamoDB will accept any number of additional attributes as input, as long as the item includes our primary key attributes. Our item “schema” can change on the fly, pretty neat right?
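The rule above (key attributes are mandatory, everything else is optional) can be expressed as a small check. This is an illustrative sketch only: KEY_TYPES mirrors the Name/Title string keys from the create-table command, while the key_problems helper and the sample items are hypothetical and do not reproduce DynamoDB's actual validation logic or error messages.

```python
# Key attributes and their declared types, mirroring the create-table command.
KEY_TYPES = {"Name": "S", "Title": "S"}


def key_problems(item):
    """Return a list of problems with an item's primary key attributes."""
    problems = []
    for name, expected_type in KEY_TYPES.items():
        if name not in item:
            problems.append(f"missing key attribute {name}")
        elif expected_type not in item[name]:
            problems.append(f"key attribute {name} must have type {expected_type}")
    return problems


# Extra attributes such as Email are fine because only the keys are checked.
ok_item = {
    "Name": {"S": "Jane Doe"},
    "Title": {"S": "President"},
    "Email": {"S": "jane@example.com"},  # placeholder address
}

# This item has no Title, so it lacks part of the composite primary key.
bad_item = {"Name": {"S": "John Smith"}, "Phone": {"S": "555-0100"}}

print(key_problems(ok_item))
print(key_problems(bad_item))
```

An item like bad_item would be rejected by a real put-item call as well, since every item must carry the full primary key.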

Let's run a scan to view our data and make sure it was written correctly.

# Scan the table to view the sample data
aws dynamodb scan \
    --table-name Users \
    --endpoint-url http://localhost:8000

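This should give us output similar to the following (the CLI wraps the returned items in an "Items" list, alongside fields such as Count and ScannedCount):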

{  
   "Name":{  
      "S":"Jane Doe"
   },
   "Title":{  
      "S":"President"
   },
   "Email":{  
      "S":"[email protected]"
   }
}
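The type descriptors ("S" and friends) make the raw output a little noisy. As a rough sketch, the following Python converts a scan response into plain dictionaries. It assumes the response wraps the items in an "Items" list, handles only the S, N, and BOOL types, and uses a hard-coded sample_response with a placeholder email instead of calling DynamoDB; the helper names are hypothetical.

```python
import json
from decimal import Decimal


def from_attribute_value(attribute_value):
    """Unwrap a typed value such as {"S": "Jane Doe"} into a plain value."""
    type_code, raw = next(iter(attribute_value.items()))
    if type_code == "S":
        return raw
    if type_code == "N":
        # Numbers arrive as strings; Decimal avoids float rounding surprises.
        return Decimal(raw)
    if type_code == "BOOL":
        return raw
    raise ValueError(f"unsupported attribute type: {type_code}")


def flatten_scan(response_text):
    """Turn the JSON text of a scan response into a list of plain dicts."""
    response = json.loads(response_text)
    return [
        {name: from_attribute_value(value) for name, value in item.items()}
        for item in response.get("Items", [])
    ]


# Sample response text for illustration (the email is a placeholder).
sample_response = """
{
  "Items": [
    {
      "Name": {"S": "Jane Doe"},
      "Title": {"S": "President"},
      "Email": {"S": "jane@example.com"}
    }
  ],
  "Count": 1,
  "ScannedCount": 1
}
"""

for user in flatten_scan(sample_response):
    print(user["Name"], "-", user["Title"])
```

Keep in mind that a scan reads every item in the table, so it is fine for checking a small local table but is usually not what you want for lookups in a large one.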

Making the transition from relational databases to NoSQL can be confusing. DynamoDB can be great due to its flexibility, scalability, and performance. As great as DynamoDB/NoSQL is, it is not a solution for all applications. To determine which best suits your use case, visit:

https://aws.amazon.com/nosql/#SQL_.28relational.29_vs._NoSQL_.28nonrelational.29_databases.