Amazon DynamoDB is a managed NoSQL database in the AWS cloud that delivers a key piece of infrastructure for use cases ranging from mobile application back-ends to ad tech. DynamoDB is optimized for transactional applications that need to read and write individual keys but do not need joins or other RDBMS features. For this subset of requirements, DynamoDB offers a virtually infinitely scalable datastore that requires minimal maintenance.
While DynamoDB is quite popular, one complaint we often hear from developers is that it is expensive. In particular, costs can climb surprisingly sharply as usage grows. In this post, we will examine three reasons why DynamoDB is perceived as expensive at scale, and outline steps that you can take to make DynamoDB costs more reasonable.
DynamoDB partition keys
Because DynamoDB is so simple to use, a developer can get pretty far in a short time. But there are some latent pitfalls that come from not thinking through the data distribution before starting to use it. To manage your data in DynamoDB effectively, it is important to understand some DynamoDB internals, in particular how data is stored under the hood.
As we mentioned before, DynamoDB is a NoSQL datastore, which means the operations it supports efficiently are GET (by primary key or index) and PUT. Every record you store in DynamoDB is called an item, and these items are stored within partitions. These partitions are all managed automatically and not exposed to the user. Every item has a partition key that is used as input to an internal hash function to determine which partition the item will live within. The partitions themselves are stored on SSD and replicated across multiple Availability Zones in a region.
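To make the partition-key idea concrete, here is a minimal sketch of how hashing a key can map items to partitions. DynamoDB's actual hash function and partition count are internal and not exposed, so the MD5 hash, the fixed NUM_PARTITIONS value, and the function names partition_for and distribution are all assumptions made purely for illustration.

```python
import hashlib
from collections import Counter

# Hypothetical partition count; DynamoDB manages the real number internally.
NUM_PARTITIONS = 4


def partition_for(partition_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partition key to a partition index.

    MD5 is only a stand-in here; DynamoDB's real hash function is internal.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and fold it into the partition range.
    return int.from_bytes(digest[:8], "big") % num_partitions


def distribution(keys, num_partitions: int = NUM_PARTITIONS) -> Counter:
    """Count how many items land on each partition for the given keys."""
    return Counter(partition_for(key, num_partitions) for key in keys)


if __name__ == "__main__":
    # Many distinct keys (for example, one per user) spread across partitions.
    spread = distribution(f"user-{i}" for i in range(1000))
    # One key reused for every item sends all traffic to a single partition.
    hot = distribution("tenant-42" for _ in range(1000))
    print("distinct keys:", dict(spread))
    print("single key:   ", dict(hot))
```

Under these assumptions, the same key always hashes to the same partition, so a table whose items share only a few partition key values concentrates its reads and writes on a few partitions, while a high-cardinality key spreads the load.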