Domain-driven CRDT patterns, are we CALM yet?

By Sigal Lev (lzsigal@edu.hse.ru)

Introduction

Conflict-Free Replicated Data Types (CRDTs) are an approach to building distributed systems in which achieving data consistency is both critical and difficult.[2] Distributed systems are inherently complex because they run across multiple nodes, often located in different geographical regions. Keeping replicas consistent has traditionally required consensus algorithms such as Paxos or Raft, which are effective but add significant complexity, latency, and operational overhead.[1]

CRDTs offer a way to keep replicas convergent without coordinating on every update. The guarantee they provide is strong eventual consistency rather than strong consistency: replicas that have received the same set of updates are in the same state, but a read may observe a state that does not yet include updates made elsewhere. By relying on merge functions that are commutative, associative, and idempotent, CRDTs let each node process updates independently and merge them later. Because conflicts are resolved deterministically and automatically, CRDTs suit distributed applications that require high availability and low latency.[3]

In recent years, the importance of CRDTs has grown significantly in both academia and industry. With the rise of cloud computing, edge computing, and global-scale applications, developers are increasingly faced with the challenge of designing systems that can handle concurrent updates, network partitions, and eventual consistency. CRDTs address these challenges by providing a solid theoretical foundation and practical tools for building resilient systems.

This article delves into the intersection of CRDTs and Domain-Driven Design (DDD), a software development methodology that emphasizes the alignment of software architecture with business domains. By combining CRDTs with DDD principles, developers can create systems that not only maintain data consistency but also reflect the complexities of real-world business processes. Additionally, we will explore the concept of CALM (Consistency as Logical Monotonicity), which offers a theoretical framework for understanding how distributed systems can achieve consistency without relying on consensus protocols. Together, these ideas provide a powerful toolkit for building the next generation of distributed applications.

CRDT: Overview and Key Concepts

CRDTs are a class of replicated data structures that give distributed systems strong eventual consistency. The core idea is that each node can modify its replica independently of other nodes, and these changes are merged automatically so that all replicas converge to the same state without conflicts. CRDTs are divided into two main categories: state-based and operation-based.[4]

State-based CRDTs

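In state-based CRDTs, each node maintains the full state of the data and periodically sends this state to other nodes. A node merges a received state into its own using a merge function that is commutative, associative, and idempotent, so states can arrive in any order and more than once. An example is the G-Counter (Grow-only Counter): each replica increments only its own slot, merging takes the per-replica maximum, and the counter’s value is the sum of all slots. Simply adding the received counts together would double-count whenever the same state is delivered twice.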
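
The G-Counter described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the class name, the `replica_id` field, and the method names are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """State-based grow-only counter: one non-decreasing slot per replica."""
    replica_id: str
    counts: dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # A replica only ever advances its own slot.
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        # The counter value is the sum over all replica slots.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise maximum: commutative, associative, and idempotent.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
a.merge(b)  # re-delivering the same state changes nothing
```

After the exchange both replicas hold the same slots and report the same value, even though one state was merged twice.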

Operation-based CRDTs

In operation-based CRDTs, nodes exchange only the operations rather than the entire state. Concurrent operations must commute so that the order in which replicas apply them does not affect the final state, and the delivery layer is expected to deliver every operation to every replica exactly once, typically in causal order. An example is the OR-Set (Observed-Remove Set), which allows elements to be added to and removed from a set without conflicts.
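
A rough sketch of an operation-based OR-Set follows. It assumes the usual design in which every add carries a unique tag and a remove deletes only the tags its origin replica has observed, so an add that is concurrent with a remove survives. The `prepare_*`/`apply` split and all names are illustrative choices, and the sketch relies on the caller to deliver each operation once and after the operations it depends on.

```python
import uuid


class ORSet:
    """Operation-based observed-remove set (add wins over a concurrent remove)."""

    def __init__(self) -> None:
        self.entries: dict[str, set[str]] = {}  # element -> live unique tags

    # prepare: runs at the origin replica and returns an operation to broadcast
    def prepare_add(self, element: str) -> tuple:
        return ("add", element, uuid.uuid4().hex)

    def prepare_remove(self, element: str) -> tuple:
        # Only the tags this replica has observed so far are removed.
        return ("remove", element, frozenset(self.entries.get(element, set())))

    # effect: runs at every replica, including the origin
    def apply(self, op: tuple) -> None:
        kind, element, payload = op
        if kind == "add":
            self.entries.setdefault(element, set()).add(payload)
        else:
            remaining = self.entries.get(element, set()) - payload
            if remaining:
                self.entries[element] = remaining
            else:
                self.entries.pop(element, None)

    def contains(self, element: str) -> bool:
        return element in self.entries


r1, r2 = ORSet(), ORSet()
first_add = r1.prepare_add("milk")
for replica in (r1, r2):
    replica.apply(first_add)

# Concurrent operations: r1 removes "milk" while r2 adds it again.
remove_op = r1.prepare_remove("milk")
second_add = r2.prepare_add("milk")

# The two replicas apply the concurrent operations in opposite orders.
r1.apply(remove_op)
r1.apply(second_add)
r2.apply(second_add)
r2.apply(remove_op)
```

Both replicas end up containing "milk" with only the tag of the second add, because the remove never observed that tag.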

Domain-Driven Design (DDD) and CRDTs

Domain-Driven Design (DDD) is a software development approach that focuses on modeling the domain and business logic. In the context of CRDTs, DDD helps structure data and logic in a way that allows them to be effectively replicated and merged in distributed systems.

Aggregates and CRDTs

In DDD, an aggregate is a group of related objects that are treated as a single unit. Aggregates ensure data integrity within their boundaries. When applying CRDTs to aggregates, we can ensure that each change to an aggregate will be properly replicated and merged. For instance, if an aggregate represents a shopping cart, CRDTs can be used to ensure the consistency of the cart’s state across multiple user devices.
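
As an illustration of the shopping cart case, the sketch below models a cart aggregate whose state is a last-writer-wins map keyed by SKU. The `Cart` class, its fields, and the choice of last-writer-wins are assumptions for the example; other CRDTs (for instance an OR-Set of cart lines) would be equally valid.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """Hypothetical cart aggregate whose state is a last-writer-wins map."""
    device_id: str
    clock: int = 0
    # sku -> (logical timestamp, device id, quantity); quantity 0 means removed
    lines: dict[str, tuple[int, str, int]] = field(default_factory=dict)

    def set_quantity(self, sku: str, quantity: int) -> None:
        # Domain rule enforced inside the aggregate boundary.
        if quantity < 0:
            raise ValueError("quantity cannot be negative")
        self.clock += 1
        self.lines[sku] = (self.clock, self.device_id, quantity)

    def merge(self, other: "Cart") -> None:
        # Keep the entry with the larger (timestamp, device id) pair per SKU.
        for sku, entry in other.lines.items():
            if sku not in self.lines or entry[:2] > self.lines[sku][:2]:
                self.lines[sku] = entry
        # Lamport-style clock: never fall behind a timestamp already seen.
        self.clock = max(self.clock, other.clock)

    def items(self) -> dict[str, int]:
        return {sku: e[2] for sku, e in self.lines.items() if e[2] > 0}


phone, laptop = Cart("phone"), Cart("laptop")
phone.set_quantity("book", 1)
laptop.set_quantity("pen", 3)
laptop.set_quantity("book", 2)
phone.merge(laptop)
laptop.merge(phone)
```

Both devices converge on the same cart. The trade-off of last-writer-wins is that one of two concurrent edits to the same SKU is silently discarded, which may or may not be acceptable for a given domain.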

Event Sourcing and CRDTs

Event Sourcing is a pattern where all changes to the state of a system are stored as a sequence of events. Instead of storing the current state of an object, the system stores the history of all events that led to this state. In the context of CRDTs, Event Sourcing can be used to replicate events between nodes, so that each node can reconstruct the current state of an object from its events. The result is independent of arrival order only if the events commute or are replayed in a deterministic order that every node agrees on.[5]
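
One way to picture this is an event log that behaves as a grow-only set, combined with events whose effects commute. The sketch below assumes each event has a globally unique id and carries a signed delta; `Event`, `EventLog`, and the field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str   # globally unique, used to drop duplicates
    key: str
    delta: int      # commutative effect: a signed amount


class EventLog:
    """Grow-only set of events; merging two logs is a set union."""

    def __init__(self) -> None:
        self.events: dict[str, Event] = {}

    def append(self, event: Event) -> None:
        # Idempotent: a re-delivered event is stored once.
        self.events.setdefault(event.event_id, event)

    def merge(self, other: "EventLog") -> None:
        for event in other.events.values():
            self.append(event)

    def totals(self) -> dict[str, int]:
        # Addition is commutative, so the fold ignores arrival order.
        result: dict[str, int] = {}
        for event in self.events.values():
            result[event.key] = result.get(event.key, 0) + event.delta
        return result


e1 = Event("e1", "likes", 1)
e2 = Event("e2", "likes", 1)
e3 = Event("e3", "likes", -1)

node_a, node_b = EventLog(), EventLog()
for ev in (e1, e2, e3):
    node_a.append(ev)
for ev in (e3, e1, e1, e2):   # different order, one duplicate
    node_b.append(ev)
```

The two nodes compute the same totals although they saw the events in different orders and one event arrived twice.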

The CALM Concept

Consistency as Logical Monotonicity (CALM) characterizes the conditions under which a distributed program can produce consistent results without coordination. At its core, CALM states that a program has a consistent, coordination-free distributed implementation if and only if it is monotonic; non-monotonic logic is where coordination such as consensus becomes necessary.

Logical Monotonicity

Logical monotonicity means that a conclusion drawn from partial information is never retracted when more information arrives. In other words, if a certain logical formula was true in the system before new data was added, it remains true after the addition. In the context of CRDTs, this means that updates and merges only move the state forward in its merge order and never undo information that a replica has already incorporated. [6]
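
The difference between a monotonic and a non-monotonic question can be shown with a small graph example. The two functions below are illustrative and not taken from the cited sources: reachability only gains answers as edges arrive, while its complement can lose them.

```python
def reachable(edges: set[tuple[str, str]], start: str) -> set[str]:
    """Monotonic query: more edges can only add nodes to the answer."""
    seen, frontier = {start}, [start]
    while frontier:
        node = frontier.pop()
        for src, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return seen


def unreachable(nodes: set[str], edges: set[tuple[str, str]], start: str) -> set[str]:
    """Non-monotonic query: a new edge can retract part of the answer."""
    return nodes - reachable(edges, start)


nodes = {"a", "b", "c"}
partial = {("a", "b")}               # what a node knows so far
full = partial | {("b", "c")}        # after one more edge arrives

early = reachable(partial, "a")
late = reachable(full, "a")
early_missing = unreachable(nodes, partial, "a")
late_missing = unreachable(nodes, full, "a")
```

Every node in `early` is still in `late`, so a replica could safely report reachability from partial data. The answer "c is unreachable" was correct for the partial input and wrong once the extra edge arrived, which is why such a query needs to know that no more input is coming.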

CALM and CRDTs

CRDTs fit naturally into the CALM framework because their state changes are monotonic by design: each update or merge moves a replica’s state upward in the merge order, regardless of the order in which updates arrive. This lets replicas converge, giving strong eventual consistency, without consensus algorithms. Reads are a separate matter: a query over CRDT state is safe without coordination only if the query itself is monotonic.
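
The order-independence claimed here rests on three algebraic laws of the merge function. A small checker such as the following, written for this article as an illustration, can test a candidate merge against those laws on sample states; passing on samples is evidence, not a proof.

```python
from itertools import product


def is_semilattice_merge(merge, samples) -> bool:
    """Check commutativity, associativity, and idempotence on sample states."""
    for x, y, z in product(samples, repeat=3):
        if merge(x, y) != merge(y, x):
            return False
        if merge(merge(x, y), z) != merge(x, merge(y, z)):
            return False
        if merge(x, x) != x:
            return False
    return True


def union(x: frozenset, y: frozenset) -> frozenset:
    return x | y          # grow-only set merge


def concat(x: tuple, y: tuple) -> tuple:
    return x + y          # list append: order-sensitive, not a CRDT merge


sets = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]
lists = [(), (1,), (2,), (1, 2)]

union_ok = is_semilattice_merge(union, sets)
concat_ok = is_semilattice_merge(concat, lists)
```

Set union passes all three checks, while list concatenation fails because the result depends on argument order.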

Domain-Driven CRDT Patterns

Let’s consider several patterns that can be used when developing distributed systems with DDD and CRDTs.

Aggregate Reconciliation Pattern

This pattern involves using CRDTs to reconcile the state of aggregates across nodes. Each change to an aggregate is represented as a CRDT operation, which is then replicated to other nodes. Nodes merge the received operations with their current aggregate states, allowing the system to achieve a consistent state without conflicts.[7]

An example could be an order management system, where each order is represented as an aggregate. By using CRDTs, the system can ensure the consistency of order states across different nodes, even if changes occur concurrently.
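
A possible shape for such an order aggregate is sketched below. It assumes a hypothetical lifecycle in which an order’s status only moves forward, so the merge can take the later status, plus a grow-only set of tracking codes. The `Status` values and field names are invented for the example.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Status(IntEnum):
    # Hypothetical lifecycle; the ordering makes "later" states win a merge.
    CREATED = 1
    PAID = 2
    SHIPPED = 3
    DELIVERED = 4


@dataclass
class Order:
    order_id: str
    status: Status = Status.CREATED
    tracking_codes: set[str] = field(default_factory=set)

    def advance(self, new_status: Status) -> None:
        # Status may only move forward, so every update is monotonic.
        if new_status < self.status:
            raise ValueError("an order cannot move backwards")
        self.status = new_status

    def merge(self, other: "Order") -> None:
        if other.order_id != self.order_id:
            raise ValueError("cannot merge different orders")
        self.status = max(self.status, other.status)   # join of a total order
        self.tracking_codes |= other.tracking_codes    # join of a grow-only set


billing = Order("o-1")
shipping = Order("o-1")
billing.advance(Status.PAID)
shipping.advance(Status.SHIPPED)
shipping.tracking_codes.add("TRK-1")
billing.merge(shipping)
shipping.merge(billing)
```

Both nodes converge on the shipped order with its tracking code. A transition that moves backwards, such as a cancellation that must undo a shipment, does not fit this scheme and would need a different model or coordination.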

Event Sourcing with CRDTs

In this pattern, Event Sourcing is used to store and replicate events that represent changes to the system’s state. Each event is represented as a CRDT operation, which is replicated across nodes. Nodes can reconstruct the current state of objects based on the sequence of events, ensuring data consistency.

For instance, in a warehouse management system, each change to an item’s state (addition, removal, movement) can be represented as a CRDT event. This allows the system to ensure consistency of the warehouse state across different nodes, even if changes occur concurrently.
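
For events that do not commute, such as two moves of the same item, one option is to keep the events in a set and replay them in an order every node can compute on its own. The sketch below assumes each event carries a logical timestamp and the emitting node’s id as a tie-breaker; `StockEvent` and `replay` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class StockEvent:
    # Field order defines the replay order: timestamp first, then node.
    timestamp: int      # logical clock value assigned by the emitting node
    node: str           # tie-breaker for equal timestamps
    kind: str           # "add", "remove", or "move"
    sku: str
    quantity: int = 0
    location: str = ""


def replay(events: set[StockEvent]) -> dict[str, dict]:
    """Fold the event set in a deterministic order into per-SKU state."""
    state: dict[str, dict] = {}
    for ev in sorted(events):
        item = state.setdefault(ev.sku, {"quantity": 0, "location": ""})
        if ev.kind == "add":
            item["quantity"] += ev.quantity
        elif ev.kind == "remove":
            item["quantity"] -= ev.quantity
        elif ev.kind == "move":
            item["location"] = ev.location
    return state


e1 = StockEvent(1, "n1", "add", "bolt", quantity=10)
e2 = StockEvent(2, "n1", "move", "bolt", location="A1")
e3 = StockEvent(2, "n2", "move", "bolt", location="B7")
e4 = StockEvent(3, "n2", "remove", "bolt", quantity=4)

node_a = {e1, e2}          # events known to the first node
node_b = {e3, e4, e1}      # events known to the second node
state_a = replay(node_a | node_b)
state_b = replay(node_b | node_a)
```

Once both nodes hold the union of the events they derive the same warehouse state. The cost of this design is that a late event with an old timestamp changes the replay, so derived state must be recomputed rather than updated in place.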

Additional Considerations When Using CRDTs

Limitations of CRDTs

Despite their advantages, CRDTs have limitations. They are not suitable for all tasks, as their use requires designing operations and data in a way that ensures commutativity, associativity, and idempotence. Additionally, implementing CRDTs can be challenging in systems with a high degree of data interdependence.
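
A typical case of data interdependence is an invariant that spans replicas, for example "stock never goes below zero". The sketch below uses a PN-Counter (a counter CRDT that also supports decrements, introduced here only for illustration) to show that replicas can converge and still break such a rule, because each replica checks the invariant against its local view.

```python
class PNCounter:
    """Minimal PN-Counter: per-replica increments and decrements."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.inc: dict[str, int] = {}
        self.dec: dict[str, int] = {}

    def value(self) -> int:
        return sum(self.inc.values()) - sum(self.dec.values())

    def add(self, n: int) -> None:
        self.inc[self.replica_id] = self.inc.get(self.replica_id, 0) + n

    def take(self, n: int) -> bool:
        # Local invariant check: only sees what this replica knows so far.
        if self.value() < n:
            return False
        self.dec[self.replica_id] = self.dec.get(self.replica_id, 0) + n
        return True

    def merge(self, other: "PNCounter") -> None:
        for mine, theirs in ((self.inc, other.inc), (self.dec, other.dec)):
            for rid, n in theirs.items():
                mine[rid] = max(mine.get(rid, 0), n)


a, b = PNCounter("a"), PNCounter("b")
a.add(1)                 # one unit in stock
b.merge(a)               # both replicas now see 1

ok_a = a.take(1)         # allowed locally
ok_b = b.take(1)         # also allowed locally, concurrently

a.merge(b)
b.merge(a)
```

Both `take` calls succeed and both replicas agree on the final value, but that value is negative. Convergence is preserved while the business rule is not, which is the kind of case where coordination or a different domain model is needed.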

Performance

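CRDTs can impose significant network and storage overhead. State-based CRDTs transmit the full state on every synchronization, and many designs carry metadata, such as per-replica counters or unique tags, that grows with the number of replicas and updates. This can become a serious issue in systems with limited network resources.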
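
To give a feel for the difference in payload size, the following sketch serializes a hypothetical counter state with one slot per replica and compares it with a single operation message. The replica count and the message layout are arbitrary assumptions.

```python
import json

# Hypothetical G-Counter state with one slot per replica.
state = {f"replica-{i}": i for i in range(1000)}

# State-based replication ships the whole map on every sync.
full_payload = json.dumps(state).encode()

# Operation-based replication ships only the change that just happened.
op_payload = json.dumps(
    {"op": "increment", "replica": "replica-7", "amount": 1}
).encode()

print(len(full_payload), len(op_payload))
```

The full state grows with the number of replicas, while the operation message stays roughly constant in size. The operation-based approach pays for this with stricter delivery requirements.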

Integration with Existing Systems

Integrating CRDTs into existing systems can be a complex task, as they require changes to how data is managed and replicated. However, existing tools provide built-in support for CRDTs, for example Redis Enterprise (Active-Active databases) and Akka (the Distributed Data module), which simplifies their adoption.

The Future of CRDTs and CALM

The development of CRDTs and the CALM concept opens up new possibilities for creating distributed systems that are resilient, scalable, and easy to manage. Current research in this field focuses on extending the applicability of CRDTs, improving their performance, and integrating them with other approaches such as machine learning and microservice architectures. [8]

Conclusion

CRDTs are a powerful tool for building distributed systems with strong eventual consistency. In distributed computing, ensuring consistency across nodes is a major challenge, particularly in scenarios where high availability and low latency are critical. CRDTs address this challenge by allowing data to be modified independently on multiple nodes while still converging to the same state. This makes them well suited to applications such as collaborative editing, real-time messaging, and distributed caching systems.[9]

By integrating Domain-Driven Design (DDD) principles with CRDTs, developers can design systems that are both resilient and aligned with the complexities of real-world business processes.[10] DDD patterns help to structure the domain model in a way that complements the inherent strengths of CRDTs, such as their ability to resolve conflicts deterministically and automatically. This combination enables the creation of systems that not only handle data changes effectively but also reflect the specific needs of the business domain.

Furthermore, the CALM (Consistency as Logical Monotonicity) concept provides a theoretical foundation for understanding how monotonic operations can ensure consistency in distributed systems. By focusing on monotonic operations, developers can avoid the overhead and complexity associated with traditional consensus algorithms like Paxos or Raft. This approach paves the way for building distributed systems that are scalable, efficient, and easier to manage in the face of concurrent updates and network partitions.

References

[1] Naveen Negi (2017). “CRDTs: Strong Eventual Consistency without concurrency control” [https://naveennegi.medium.com/rendezvous-with-riak-crdts-part-1-e94cfc8fe091]
[2] “Conflict-free replicated data type” (2024). [https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type]
[3] Sam Taylor (2024). “Understanding the Role of Conflict-Free Replicated Data Types (CRDTs) in Digital Collaboration and Distributed Systems” [https://www.tildee.com/understanding-the-role-of-conflict-free-replicated-data-types-crdts-in-digital-collaboration-and-distributed-systems/]
[4] Nuno Preguiça (2022). “Conflict-Free Replicated Data Types (CRDTs)” [https://www.researchgate.net/publication/367503614_Conflict-Free_Replicated_Data_Types_CRDTs]
[5] “Event sourcing vs CRUD” (2024). [https://blog.risingstack.com/event-sourcing-vs-crud/]
[6] ChatGPT 4o. “Logical monotony, what is it in the context of CRDTs?” [https://chatgpt.com]
[7] Adi Polak (2024). “9 Best Practices For Handling Late-Arriving Data” [https://lakefs.io/blog/best-practices-late-arriving-data/]
[8] Shadaj Laddad (2022). “Keep CALM and CRDT On” [https://arxiv.org/abs/2210.12605]
[9] Hector Sanjuan (2019). “Merkle-CRDTs (DRAFT)” [https://docs.ipfs.tech/concepts/merkle-dag/]
[10] “Domain-driven design” (2024). [https://en.wikipedia.org/wiki/Domain-driven_design]
