Polyglot Persistence

April 25, 2023 by Sangeeth G V

Why Polyglot Persistence?

Until recently, most large enterprise applications relied on a single database to store all relevant data. Typically, this was a relational database such as Microsoft SQL Server or Oracle.

Suppose the application is an ecommerce platform with thousands of products, each with associated images stored as binary data in a relational database. Is this really the best storage mechanism?

Consider another scenario in which a user visits the same application to search for products. Is a relational database necessary, or could we use a different database that is optimized for search?

To answer these questions, we must first understand polyglot persistence.

Polyglot Persistence

Polyglot persistence is the practice of using different data storage technologies within a single application to handle varying data storage needs. Many monolithic applications are now being migrated to microservices, and the real difficulty lies in designing for data persistence. You can organize the databases for a microservice application in two ways:

Shared Database
Database per Service

Shared Database

Database per Service

The benefits of a shared database include ACID transactions across all of the data, no added network latency for queries that touch several services' data, and ease of use for developers.
However, this model has some drawbacks:

Single point of failure
Runtime coupling
Design-time coupling

So, a database per service is the preferred option, but implementing it is not simple because we must answer the following questions:

How do we handle transactions that span multiple services?
How do we join data that is owned by different services?

Several established patterns address the issues listed above, such as the Saga pattern and API Composition. So, let’s keep going and see how we can implement polyglot persistence for each service.
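Before moving on, here is a minimal sketch of both patterns, assuming in-memory dictionaries stand in for two services that each own their data. The names `ORDERS`, `CUSTOMERS`, `compose_order_view`, `SagaStep`, and `run_saga` are hypothetical and chosen only for illustration; a real system would call the services over the network and would usually rely on a messaging or orchestration layer.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-ins for two services that each own their data.
ORDERS = {1: {"order_id": 1, "customer_id": 7, "total": 40.0}}
CUSTOMERS = {7: {"customer_id": 7, "name": "Asha"}}


def compose_order_view(order_id: int) -> dict:
    """API composition: query each service, then join the results in memory."""
    order = ORDERS[order_id]                     # would be a call to the order service
    customer = CUSTOMERS[order["customer_id"]]   # would be a call to the customer service
    return {**order, "customer_name": customer["name"]}


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]       # local transaction inside one service
    compensate: Callable[[], None]   # undoes that local transaction


def run_saga(steps: list[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    done: list[SagaStep] = []
    for step in steps:
        try:
            step.action()
            done.append(step)
        except Exception:
            for finished in reversed(done):
                finished.compensate()
            return False
    return True


# Usage: the payment step fails, so the stock reservation is compensated.
log: list[str] = []


def fail_payment() -> None:
    raise RuntimeError("payment declined")


ok = run_saga([
    SagaStep("reserve_stock",
             lambda: log.append("stock reserved"),
             lambda: log.append("stock released")),
    SagaStep("charge_payment",
             fail_payment,
             lambda: log.append("payment refunded")),
])
print(ok, log)  # False ['stock reserved', 'stock released']
```

The saga gives up a single ACID transaction in exchange for a sequence of local transactions, each with a compensating action, which is why a failure midway leaves the data consistent only after the compensations have run.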

This is best explained using an ecommerce application. Let’s consider some of its microservices, such as the Recommendation Service, Search Service, and Order Service.

The following diagram depicts the best-suited storage model for each of these services.

A recommendation service is best served by a graph database, which is optimized for storing relationships and traversing them quickly.
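To make the idea of relationship traversal concrete, here is a minimal sketch that uses a plain Python dictionary as a stand-in for a graph database. The `PURCHASES` data and the `recommend` function are hypothetical; a real graph database would express the same two-hop traversal in its own query language.

```python
from collections import Counter

# Hypothetical purchase graph: customer -> set of products bought.
PURCHASES = {
    "alice": {"laptop", "mouse"},
    "bob": {"laptop", "keyboard"},
    "carol": {"mouse", "keyboard", "monitor"},
}


def recommend(customer: str, limit: int = 3) -> list[str]:
    """Traverse customer -> products -> other customers -> their products."""
    owned = PURCHASES[customer]
    scores: Counter = Counter()
    for other, products in PURCHASES.items():
        if other == customer or not (owned & products):
            continue  # no shared product, so no path to this customer
        for product in products - owned:
            scores[product] += 1
    # Highest score first, then alphabetical for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [product for product, _ in ranked[:limit]]


print(recommend("alice"))  # ['keyboard', 'monitor']
```

In a relational model, the same question would need a self-join on the purchases table for every hop, which is the cost a graph database is designed to avoid.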

Since the search service is mainly used for querying, the preferred solution is a search database, which is optimized for word-based search.
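Word-based search is typically built on an inverted index, which maps each word to the documents that contain it. The sketch below is a simplified, in-memory illustration of that idea; the `PRODUCTS` catalogue and the `tokenize`, `build_index`, and `search` functions are hypothetical, and a real search database adds ranking, stemming, and much more.

```python
import re
from collections import defaultdict

# Hypothetical product catalogue: id -> description.
PRODUCTS = {
    1: "Wireless optical mouse",
    2: "Mechanical keyboard with wireless receiver",
    3: "USB-C laptop charger",
}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map each word to the set of document ids that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index: dict[str, set[int]], query: str) -> list[int]:
    """Return ids of documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


index = build_index(PRODUCTS)
print(search(index, "wireless"))           # [1, 2]
print(search(index, "wireless keyboard"))  # [2]
```

Each lookup touches only the entries for the query words instead of scanning every product description, which is what makes this structure a better fit for search than a pattern match over a relational table.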

The appropriate database for an order service is an RDBMS, because orders involve business transactions and most relational databases are ACID compliant.
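The value of ACID compliance for orders can be shown with a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for the order service's relational database. The `stock` and `orders` tables and the `create_schema` and `place_order` functions are hypothetical and exist only for this illustration.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    conn.executescript(
        """
        CREATE TABLE stock (
            product TEXT PRIMARY KEY,
            quantity INTEGER NOT NULL CHECK (quantity >= 0)
        );
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            product TEXT NOT NULL,
            quantity INTEGER NOT NULL
        );
        """
    )


def place_order(conn: sqlite3.Connection, product: str, quantity: int) -> bool:
    """Record the order and decrement stock atomically: both happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO orders (product, quantity) VALUES (?, ?)",
                (product, quantity),
            )
            cursor = conn.execute(
                "UPDATE stock SET quantity = quantity - ? WHERE product = ?",
                (quantity, product),
            )
            if cursor.rowcount == 0:
                raise sqlite3.IntegrityError("unknown product")
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint (or the unknown product) aborted the transaction,
        # so the order row inserted above was rolled back as well.
        return False


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO stock VALUES ('mouse', 5)")
conn.commit()

first = place_order(conn, "mouse", 3)   # succeeds, 2 left in stock
second = place_order(conn, "mouse", 3)  # fails, nothing is written
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
remaining = conn.execute(
    "SELECT quantity FROM stock WHERE product = 'mouse'"
).fetchone()[0]
print(first, second, order_count, remaining)  # True False 1 2
```

The second order fails the stock check, and the database discards the order row that was inserted in the same transaction, so the two tables never disagree.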

Data Storage Types

Polyglot persistence employs a diverse set of data storage types. Here are a few examples:

Relational (RDBMS)
Document
Graph
Key Value
Time Series
Search
Blob
Column Store
NewSQL

The following table lists some use cases and example databases for each data storage type.

Data Storage	Use Case(s)	Example Databases
Relational (RDBMS)	Transactional updates. Tabular, structured data.	SQL Server / Oracle / PostgreSQL
Document	Lots of reads. Infrequent writes.	MongoDB / Cassandra
Graph	Social networks (relationships)	Neo4j
Key Value	Caching, configuration	Redis / Memcached
Time Series	Sensor events (time based)	kdb+ / InfluxDB
Search	Searching	Elasticsearch / Splunk
Blob	Big files (storage)	Amazon S3 / Azure Blob Storage
Column Store	Analytics	Amazon Redshift / Azure SQL Data Warehouse / Apache Kudu
NewSQL	Globally distributed	CockroachDB / Google Spanner

Conclusion

We must first understand the nature of the requirement before selecting an appropriate data model. That way, we avoid making endless compromises later.
