CRDTs (Conflict-free Replicated Data Types)

Collaborative applications, like Notion or Google Docs, are applications where several users can simultaneously update state that is shared amongst them.

When edits are made by two or more users at the same time, these applications need to handle convergence, or the way in which changes from multiple users are merged and reconciled.

To date, there have been two main algorithms to handle convergence: Operational Transformation (OT), and Conflict-free Replicated Data Types (CRDTs).

Operational Transformation

If we imagine a document as an array of characters in positions 0..n, edits can be defined as inserts or deletions of indices in this array.

Edits applied by one user can be passed to other users that are also viewing the document. For example, if one user inserts the letter "A" at index 3, that edit can be passed along and applied locally by all other users.

If two users are editing simultaneously, we may need to account for the local edits of user A before applying the edits of user B. For example, if we begin with the text Helo, and user A inserts an l at index 3 while user B inserts an ! at index 4.

User A's copy now looks like Hello, while user B's copy looks like Helo!. When applying user B's change to user A, we need to translate the insert position to account for user A's changes. The ! needs to be inserted at index 5 to correctly form Hello!, even though it was inserted at index 4 locally for user B.

From Martin Kleppmann's talk,

Operational transformation require that users making changes are continuously connected to the server, so it can provide resolution. The server is required to sequence the operations.

Conflict-free Replicated Data Types

CRDTs are a different solution to change resolution that don't require users to be continuously connected to a central server to resolve edits.

🚩

The fundamental requirement for convergence with CRDTs is that the same set of edits made in any order will results in the same state.

In mathematical terms, operations with CRDTs are commutative, which means the order of operations is not relevant to the end result. This is a property of addition and multiplication (but not division and subtraction), e.g. and .

CRDTs are easy to implement, but challenging to do in a way that results in the "correct" resolution (e.g. Hell!o vs Hello!).

In the example of collaborative text editing, CRDTs avoid the index transformation issue by creating a unique identifier for every character in the document. While this guarantees that edits will not be overwritten, it can be difficult to know how to apply the edits in the correct order.

image

Various CRDT algorithms have implemented solutions to this interleaving problem, but often trade off efficiency for correctness.

References