CRDT: Fractional Indexing

Published on November 12th, 2022

Collaborative peer-to-peer applications sometimes need to operate on sequences of objects with a consistent order across all peers. For example, a peer-to-peer photo album application might need to sync the order in which photos appear in an album.

The algorithm presented here is one way to do this. It comes from a family of algorithms called CRDTs, which I will not describe here. Unlike my original article about this technique, the algorithm presented here uses random offsets to avoid requiring a central server, and works in true peer-to-peer scenarios. Compared to tree-based indexing, fractional indexing is simpler but doesn't prevent interleaving of concurrently-inserted runs, which makes it inappropriate for textual data.

The algorithm:

Each object is given a fractional position between 0 and 1 (exclusive). The object order is determined by sorting the objects by their positions (using object id as a tie-breaker).
To insert an object between two other objects, set its position to a fraction in between the two other objects' positions. To insert before any other object, substitute 0 for the first object's position. To insert after before any other object, substitute 1 for the second object's position.
To prevent two peers from generating the same fraction (with high probability), add a random offset to the end of the fraction during each insert operation.

You can use any conflict-resolution strategy for syncing the fractional positions to other peers. The demo below assigns each write a timestamp and uses last-writer-wins to resolve conflicts, with the identifier of the peer that originally did the write as a tie-breaker.

Fractional positions should be represented using arbitrary-precision decimals so that they don't run out of precision. Floating-point numbers are insufficient. For example, averaging a number with another number can only be done around 50 times with 64-bit floating-point numbers before the numbers become equal.

Below is a demo of what this looks like in practice. Each quadrant represents a peer, and peers send messages to each other with a simulated network delay. Click + to insert new objects, × to delete existing objects, and drag objects to reorder them. Click Undo or Redo for a given peer to rewind or playback that peer's local edits. Temporarily disable the simulated network with the pause button to construct simultaneous editing scenarios. You can use your browser's "view source" feature to view the source code for this demo:

This technique has the following benefits and drawbacks:

Benefits:

It's easy to understand and implement.
It doesn't require "tombstones" (i.e. preserving old object positions). Objects can be reordered any number of times without leaving any additional information behind. Deleted objects are allowed to be forgotten without disrupting the order of the remaining objects.

Drawbacks:

Index length can become long in pathological scenarios. This is typically harmless in practice because it happens rarely with human inputs.
The edge case of averaging between two identical fractional positions doesn't work (since they are equal). This case is unlikely to happen in practice due to the randomized jitter that's added when generating new positions.
If two peers both simultaneously insert a run of objects at the same location, the resulting objects may be interleaved. So this algorithm is not appropriate in situations where object adjacency is critical (e.g. using this to order photos in an album would be appropriate, but using this to order characters in a text document would not be appropriate).