September 06, 2016
Databases, Primary Keys & Microservices
We have been talking a lot about microservices on this blog and how to work inside the code of the service itself. In the last article, we talked about containers communicating with each other. In this article, I am going to talk about the database side of the microservice architecture.
A popular approach in traditional applications is to store all application data in a single database. This allows for queries that involve tables from multiple parts of your application at once. The referential integrity between these different tables is enforced by the database server using foreign key constraints.
Once you extract those domains into individual microservices, the data behind each microservice should only be accessed by that service. No single database query in your application should involve tables from more than one microservice. In this way you are effectively bringing the business logic that was being enforced by the constraints into the application layer.
This pattern of moving the business logic for referential integrity into the code has some advantages. The location of the business logic no longer depends on how we structure our database server infrastructure. Following logic that jumps from the code to the database and back to the code can also become a bit of a maze. When we move the logic to the code and leave only the data in the database, we get a clean separation. This separation helps avoid joins and allows for simpler queries, and simple queries are easier to cache, easier to debug, and faster to process.
Another complexity that arises from having separate databases is that services often require information owned by other services. For example, if you have a service that processes payments, you may want to record whether the payment for an item was successful, but if payments live in one service and items in another, this is difficult. One solution is to keep the messages we receive from the other service in a local cache table.
When we receive a message that a new user is registered in the Authentication API, we can save the fact that we received that message. We can also store information in that cache table that is specific to this service, or data from the message itself. Then, when we are inside our service and need to know whether a user exists, we can query the UserCacheTable for that information. If we receive an AuthApi.UserDeleted message, we can either remove that row from our cache table or soft delete it with a flag. (NOTE: a major anti-pattern would be to store too much information about the foreign object.)
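As a rough illustration of the cache table idea, here is a minimal sketch in Python using SQLite. The table layout, the message shape (a dict with "type" and "user_id"), and the AuthApi.UserRegistered message name are assumptions made for this example; only AuthApi.UserDeleted is named above. It uses the soft-delete variant.

```python
import sqlite3

# Hypothetical schema: one row per user this service has heard about.
# Only the ID and a soft-delete flag are kept, nothing else about the user.
SCHEMA = """
CREATE TABLE IF NOT EXISTS user_cache (
    user_id TEXT PRIMARY KEY,
    deleted INTEGER NOT NULL DEFAULT 0
)
"""


def open_db(path=":memory:"):
    """Open this service's own database and make sure the cache table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def handle_message(conn, message):
    """Apply one message from the Authentication API to the local cache table."""
    kind = message["type"]        # assumed message field
    user_id = message["user_id"]  # assumed message field
    with conn:  # commits on success, rolls back on error
        if kind == "AuthApi.UserRegistered":  # assumed message name
            # INSERT OR REPLACE keeps the handler safe if a message is redelivered.
            conn.execute(
                "INSERT OR REPLACE INTO user_cache (user_id, deleted) VALUES (?, 0)",
                (user_id,),
            )
        elif kind == "AuthApi.UserDeleted":
            # Soft delete: keep the row, flip the flag.
            conn.execute(
                "UPDATE user_cache SET deleted = 1 WHERE user_id = ?",
                (user_id,),
            )


def user_exists(conn, user_id):
    """Answer 'does this user exist?' without calling the other service."""
    row = conn.execute(
        "SELECT 1 FROM user_cache WHERE user_id = ? AND deleted = 0",
        (user_id,),
    ).fetchone()
    return row is not None
```

A service would call handle_message from its message consumer and user_exists wherever a foreign key constraint would previously have done the check.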
Because you have many containers running the same service, all of which access the same database, the database handles race conditions between them using row locking. We get safe systems working together through messaging and round-robin message distribution.
One last consideration is the type of primary key we use to identify unique rows across the many databases. By using UUIDs as our primary keys, we keep identifiers consistent across services and make generating the primary key extremely simple, whether in the database or in the code. Many modern databases offer a native UUID or GUID type that is optimized for use as a primary key. Using UUID primary keys also makes adding fixture data to your system easy and consistent, makes working with data from different running environments simple, and increases data consistency and quality across the board.
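To make the UUID point concrete, here is a small sketch in Python that generates the key in code before the row is inserted. The payments table, its columns, and the fixture ID are hypothetical. SQLite has no native UUID type, so the key is stored as text here; a database with a native UUID column type would use that instead.

```python
import sqlite3
import uuid

# Fixture rows can use fixed, well-known UUIDs so that they are identical
# in every environment (hypothetical value, chosen for this example).
FIXTURE_USER_ID = "00000000-0000-4000-8000-000000000001"


def new_id():
    """Generate a random (version 4) UUID in the application code."""
    return str(uuid.uuid4())


def open_db(path=":memory:"):
    """Open the database and create a hypothetical payments table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments ("
        " id TEXT PRIMARY KEY,"           # UUID stored as text in SQLite
        " user_id TEXT NOT NULL,"         # ID owned by another service, no foreign key
        " amount_cents INTEGER NOT NULL"
        ")"
    )
    return conn


def create_payment(conn, user_id, amount_cents):
    """Insert a payment; the key is known before the database is touched."""
    payment_id = new_id()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO payments (id, user_id, amount_cents) VALUES (?, ?, ?)",
            (payment_id, user_id, amount_cents),
        )
    return payment_id
```

Because the ID exists before the insert, the same value can be put into an outgoing message or log line without a round trip to read back an auto-increment key.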
After building out some of this code for our internal product, I have seen that these patterns work. I find the UUID primary key an easy win, and the messaging cache tables a great way to keep the boundaries of our microservice architecture simple and concise.