By Pham Manh (tifam@edu.hse.ru)
Microservices architecture has greatly influenced how applications are designed, developed and maintained, owing to its emphasis on smaller, self-sufficient and independent services rather than larger, self-contained units. However, coordinating state across services raises considerable problems of consistency, conflict resolution and fault tolerance, particularly when several services operate on shared data concurrently.
To handle the problem of keeping system state consistent, that is, ensuring that multiple system components observe a coherent view of shared data at a given moment, a number of approaches have been developed, including CRDTs. CRDTs (Conflict-free Replicated Data Types) are data structures that can be replicated across multiple asynchronous nodes with no centralized control. They provide eventual consistency and accept concurrent writes from multiple locations, resolving conflicts automatically. In this essay, we discuss possible applications of CRDTs in the design of APIs for microservices applications to ensure consistency, avoid conflicts and make data management more efficient.
To compete in today's fast-paced and highly uncertain business environment, enterprise applications are required to evolve on a regular, stable and measurable basis, as tracked by DORA metrics. To do this, engineering units are organized into small, autonomous, multi-disciplinary squads according to the Team Topologies model. These squads use DevOps methodology and its tools, including, but not limited to, deploying code to production multiple times a day, with that code built and tested through automation.
The application is broken out into subdomains, each of which corresponds to a chunk of business capability. A subdomain embeds some business logic as well as the communication with the external world. These subdomains specify the business operations of the application as actions taking place in response to predefined synchronous or asynchronous requests, events, or timers.
The term "stakeholder" was originally used to describe the people, groups and organizations with an interest in an enterprise. Hence, stakeholders are “those groups who are vital to the survival and success of the corporation”. Software development projects cannot live without stakeholders. Stakeholders have various views, desires, and needs that affect the course of the project. During requirements engineering, learning about and addressing these views helps ensure that the end product meets the objectives of the stakeholders, thereby increasing the satisfaction of the intended users as well as the probability that the project succeeds. [2]
Fig. 1: Example of Microservice
Advantages of Microservices:
More flexible deployment options, increased efficiency, and better alignment with evolving business needs make Microservices a strong candidate architecture for enterprise applications in today’s world.
Cross-Microservice State Synchronization
In a microservices architecture, keeping the same state across different nodes is a hard problem. CRDTs allow an API to return a current state to the requesting process without losing updates made concurrently at different nodes. For instance, an inventory system deployed across different warehouses can make use of a PN-Counter. The API accepts increases or decreases of stock, while the CRDT ensures that all warehouse services converge on the same stock level.
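The warehouse example can be sketched as a state-based PN-Counter. This is a minimal illustration, not a production design: the class name PNCounter, its method names and the replica identifiers are assumptions made for this sketch, and replicas are assumed to exchange their full state through some transport that is not shown.

```python
from typing import Dict


class PNCounter:
    """State-based PN-Counter: per replica, one grow-only tally of
    increments (p) and one grow-only tally of decrements (n)."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id  # hypothetical warehouse identifier
        self.p: Dict[str, int] = {}
        self.n: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("amount must be non-negative")
        # A replica only ever writes to its own slot.
        self.p[self.replica_id] = self.p.get(self.replica_id, 0) + amount

    def decrement(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("amount must be non-negative")
        self.n[self.replica_id] = self.n.get(self.replica_id, 0) + amount

    def value(self) -> int:
        return sum(self.p.values()) - sum(self.n.values())

    def merge(self, other: "PNCounter") -> None:
        # Element-wise maximum is commutative, associative and idempotent,
        # so merges can arrive in any order and any number of times.
        for rid, count in other.p.items():
            self.p[rid] = max(self.p.get(rid, 0), count)
        for rid, count in other.n.items():
            self.n[rid] = max(self.n.get(rid, 0), count)


north = PNCounter("warehouse-north")
south = PNCounter("warehouse-south")
north.increment(100)   # stock received in the north warehouse
south.decrement(30)    # stock shipped from the south warehouse
north.merge(south)
south.merge(north)
south.merge(north)     # a repeated merge changes nothing
print(north.value(), south.value())  # 70 70
```

Note that a plain PN-Counter cannot by itself enforce an invariant such as "stock never drops below zero"; an API that needs that guarantee would have to add a separate check or reservation step.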
Distributed Systems
APIs for collaborative applications, such as document editing or shared workspaces, need a mechanism like CRDTs. For example, in a collaborative text editor many users can edit at the same time, so a list-based CRDT such as RGA (Replicated Growable Array) can be used to merge concurrent changes. When a user performs an operation (e.g. adding or editing some text), the API forwards it to the relevant CRDT, which integrates the operation while keeping the document consistent for all users.
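A heavily simplified, RGA-style sequence can illustrate how such an API might integrate concurrent inserts. The sketch below assumes that operations are delivered in causal order (an insert arrives only after the element it refers to), and all names (RGA, InsertOp, local_insert, apply_insert) are hypothetical choices for this illustration rather than any library's API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

ElemId = Tuple[int, str]  # (Lamport clock, replica id), unique per element


@dataclass
class Element:
    id: ElemId
    value: str
    deleted: bool = False  # tombstone: deleted elements stay in the list


@dataclass
class InsertOp:
    id: ElemId
    value: str
    after: Optional[ElemId]  # None means "insert at the start"


class RGA:
    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.clock = 0
        self.elements: List[Element] = []

    def _index_of(self, elem_id: ElemId) -> int:
        for i, element in enumerate(self.elements):
            if element.id == elem_id:
                return i
        raise KeyError(elem_id)

    def local_insert(self, after: Optional[ElemId], value: str) -> InsertOp:
        self.clock += 1
        op = InsertOp((self.clock, self.replica_id), value, after)
        self.apply_insert(op)
        return op  # the API would broadcast this op to other replicas

    def apply_insert(self, op: InsertOp) -> None:
        if any(e.id == op.id for e in self.elements):
            return  # duplicate delivery: already integrated
        self.clock = max(self.clock, op.id[0])
        pos = 0 if op.after is None else self._index_of(op.after) + 1
        # Skip elements with a larger id so that every replica orders
        # concurrent inserts at the same position identically.
        while pos < len(self.elements) and self.elements[pos].id > op.id:
            pos += 1
        self.elements.insert(pos, Element(op.id, op.value))

    def apply_delete(self, elem_id: ElemId) -> None:
        self.elements[self._index_of(elem_id)].deleted = True

    def text(self) -> str:
        return "".join(e.value for e in self.elements if not e.deleted)


alice, bob = RGA("a"), RGA("b")
h = alice.local_insert(None, "H")
i = alice.local_insert(h.id, "i")
bob.apply_insert(h)
bob.apply_insert(i)
# Both users now type concurrently after the same character.
from_alice = alice.local_insert(i.id, "!")
from_bob = bob.local_insert(i.id, "?")
alice.apply_insert(from_bob)
bob.apply_insert(from_alice)
print(alice.text(), bob.text())  # Hi?! Hi?!
```

The tie between the two concurrent inserts is broken by comparing element ids, so both replicas reach the same text without coordinating. Keeping tombstones instead of physically removing elements is what lets later operations still find the element they refer to.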
Idempotency in Event-Driven Environments
In event-driven architectures, state changes are propagated as events recorded in a log. CRDTs help simplify API design here because their merge logic is idempotent: the active state does not change when an operation is retried or when a crash and recovery causes the same event to be delivered more than once. For example, a notification system that keeps its notifications in an OR-Set CRDT can deliver notifications and process acknowledgments without worrying about duplicates in the system.
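One way to picture this is a state-based, add-wins OR-Set holding a user's pending notifications, where an acknowledgment removes the notification. The sketch is illustrative only: the ORSet class, its method names and the notification ids are assumptions, and tombstones are never garbage-collected here.

```python
import uuid
from typing import Dict, Hashable, Set


class ORSet:
    """State-based observed-remove set (add-wins)."""

    def __init__(self) -> None:
        self.adds: Dict[Hashable, Set[str]] = {}  # element -> unique add tags
        self.tombstones: Set[str] = set()         # tags that were removed

    def add(self, element: Hashable) -> None:
        # Every add gets a fresh tag, so a later add survives an
        # earlier (or concurrent) remove of the same element.
        self.adds.setdefault(element, set()).add(uuid.uuid4().hex)

    def remove(self, element: Hashable) -> None:
        # Only the tags this replica has observed are removed.
        self.tombstones |= self.adds.get(element, set())

    def elements(self) -> Set[Hashable]:
        return {e for e, tags in self.adds.items() if tags - self.tombstones}

    def merge(self, other: "ORSet") -> None:
        # Set union is idempotent: replaying the same state is harmless.
        for element, tags in other.adds.items():
            self.adds.setdefault(element, set()).update(tags)
        self.tombstones |= other.tombstones


service = ORSet()   # hypothetical notification service replica
device = ORSet()    # hypothetical user device replica
service.add("notif-1")
device.merge(service)
device.merge(service)          # duplicate delivery of the same state
device.remove("notif-1")       # user acknowledges the notification
service.merge(device)
service.merge(device)          # duplicate acknowledgment
print(service.elements())      # set()
```

Because merging is a set union, a retried or duplicated message leaves the state unchanged, which is the property the API relies on.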
Integration Difficulty
Integrating CRDTs into APIs is not straightforward. Developers must choose merge strategies appropriate for the application and select the CRDT types that deliver them. At the same time, because of the way CRDTs are structured, troubleshooting their behavior can be tricky.
Proper Trade-offs and Consistency Models
CRDTs are useful for applications, such as those involving data sharing, that do not require strict concurrency control, because they provide relaxed (eventual) consistency. Where strict consistency is required, however, the situation is different and a CRDT may not be the right choice.
Increasing Number of Replicas
State-based CRDTs exchange their full state between replicas, which adds to the computational and storage load, and that load grows with the number of replicas.
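A rough back-of-the-envelope sketch of this growth follows. It assumes a counter-style state with one entry per replica, JSON encoding, and a naive all-to-all exchange in every synchronization round; the function names and the replica naming scheme are hypothetical, and real deployments may exchange state far less aggressively.

```python
import json


def full_state_payload(replica_count: int) -> str:
    """Serialized state of a counter-style CRDT with one slot per replica."""
    state = {f"replica-{i}": 1 for i in range(replica_count)}
    return json.dumps(state)


def total_bytes_per_round(replica_count: int) -> int:
    """Bytes sent if every replica ships its full state to every other one."""
    payload = len(full_state_payload(replica_count).encode("utf-8"))
    return payload * replica_count * (replica_count - 1)


for count in (2, 8, 32):
    print(count, len(full_state_payload(count)), total_bytes_per_round(count))
```

Under these assumptions the payload grows linearly with the number of replicas, and the total traffic per round grows much faster because the number of sender and receiver pairs also increases.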
Traceability Issues and Debugging
The long-distance communication in distributed systems makes tracing and debugging CRDTs complicated. Developers are still working on better tools that can monitor and visualize CRDT state.
As the technological landscape changes, CRDTs are likely to become an integral part of future systems, especially in domains built on emerging technologies. One potential area is blockchain. Traditionally, such systems have struggled to maintain consistency of data across several distributed ledgers. For certain sets of updates, CRDTs allow developers to relax the agreement requirements across nodes, thereby reducing response time and increasing system throughput. For example, in DeFi protocols, CRDT-based token counters could make token issuance processes more efficient.
CRDTs also have a potential application in edge computing. As data processing moves closer to the source, nearby edge nodes need to stay consistent with one another. For applications such as autonomous vehicles, where low latency and reliability are fundamental requirements, real-time state synchronization is a necessity, and CRDTs can help support it. TSP could be an example, in which fleet owners manage their fleets based on the decisions made by their vehicles, with assurance of consistent global information.
Within AI-focused APIs, CRDTs could, for instance, assist in training neural networks across several systems. Machine learning models are commonly built not on one machine but on several pre-provisioned nodes that work in parallel, scraping and processing data, in most cases to create the same model. CRDTs can help maintain these datasets to allow for easy merging of models. For instance, using a CRDT-enabled data fusion API, distributed AI systems would be able to normalize the training datasets used for model definition.
Further research opportunities could also include:
Case studies investigating these use cases might be useful, making it easier to justify the widespread use of CRDTs in future developments.
The combination of CRDTs with new technologies such as blockchain, edge computing and AI-enabled APIs opens up new opportunities. Future work can consider the use of CRDTs in decentralized systems for resilience, or the combination of CRDTs with machine learning algorithms for efficient data synchronization.
CRDTs stand out as a distinctive approach to the problem of managing distributed state in microservices-based systems. Using conflict-free replicated data types, APIs can achieve a highly available and fault-tolerant system that guarantees eventual consistency. Even with some barriers to adoption in mind, the case for using CRDTs is strong: doing so opens the door to building resilient and highly available APIs.