Linking Schemes

Linking Schemes

This document gives an introduction to binary anti-monotone linking schemes, a class of graphs highly suitable for the design of purely functional sequence data structures. None of the ideas here are my own, I am mostly paraphrasing the contents of 0. Binary anti-monotone linking schemes were initially introduced in 1.

Throughout this text, a graph is considered to be directed.

We wish to examine linking schemes, specifications for how to link from items of a sequence to earlier items within the same sequence, such that early items can be reached from all later items by transitively following links. We want to create all outgoing links of an item when that item is created, without later changes. We assume that it is only possible to link to items which are already part of the sequence, i.e., it is impossible to link to future items. These requirements arise naturally when links are implemented as cryptographic hashes.

We can capture these requirements with the following definition:

A linking scheme is a graph $G = \sequence{\Np, A}$ satisfying the following properties:

- linkedness: For all $n, m \in \Np$ with $n \leq m$ there is a path from $m$ to $n$.
- acyclicity: There are no $n, m \in \Np$ with $n \geq m$ such that there is a path from $m$ to $n$.

The most simple example of a linking scheme is that of a linked list: every item links to its immediate predecessor, and to no other item.

The graph $\sequence{\Np, \set{\sequence{n + 1, n} \mid n \in \Np}}$, in which every number has an arc to its predecessor, is a linking scheme, called the minimal linking scheme.

Let $G = \sequence{\Np, A}$ be a linking scheme.

Then $G$ is a supergraph of the minimal linking scheme, i.e., every number has to links to its predecessor. Otherwise, there would not be a path from the violating number to its predecessor.

As links between sequence items consume space, we focus on linking schemes where the number of links per item is at most two.

Let $G = \sequence{\Np, A}$ be a linking scheme.

Then $G$ is a binary linking scheme if every vertex has at most two outgoing arcs.

Because every linking scheme has to include predecessor arcs, a binary linking scheme has only one arc per vertex that is of real interest. We can thus define a binary linking scheme by giving a function that computes for each vertex the interesting out-neighbor.

Let $\fun{f}{\Np}{\Np}$ be a function.

$\newcommand{\mAssociatedGraph}[1]{\href{##associated_graph}{G}_{#1}}$The associated graph of $f$ is the graph $\mAssociatedGraph{f} \defeq \sequence{\Np, \A(\mAssociatedGraph{f})}$ with $$\cssId{A}{\A(\mAssociatedGraph{f})} \defeq \set{\sequence{n + 1, n} \mid n \in \Np} \cup \set{\sequence{n, f(n)} \mid n \in \Np, n \geq 2}.$$

We call an arc $\sequence{n + 1, n}$ a predecessor link, and an arc $\sequence{n, f(n)}$ a skip link (because it might potentially skip over multiple numbers).

Define $\fun{f}{\Np}{\Np}, f(n) \defeq \min(n - 2, 1)$.

Then $\mAssociatedGraph{f}$ is a binary linking scheme.

Define $\fun{f}{\Np}{\Np}, f(n) \defeq \min(n - 1, 1)$.

Then $\mAssociatedGraph{f}$ is the minimal linking scheme. Every skip link coincides with a predecessor link.

Given a sequence of items adhering to a binary linking scheme, one often wants to compute a shortest path between two items. Although not always producing a shortest path, a natural starting point is a greedy algorithm that always follows the link pointing to the earliest item, unless that link overshoots the target. Formally:

Let $f$ be a function such that $\mAssociatedGraph{f}$ is a binary linking scheme, and let $n, m \in \Np, n \geq m$.

The function $\fun{\gpath{f}}{\Np \times \Np}{\Np^{\ast}}$ is defined recursively:

$\begin{align*} \gpath{f}(n, m) &= \begin{cases} \sequence{n} &\mbox{if } n = m \\ \sequence{n} \circ \gpath{f}(f(n), m) & \mbox{if } n > f(n) \geq m\\ \sequence{n} \circ \gpath{f}(n - 1, m)& \mbox{otherwise}\\ \end{cases}\\ \end{align*}$Whenever $\gpath{f}$ computes a path which is not a shortest path, there must have been a situation where following the skip link skipped over an item whose skip link would have pointed to an even earlier item. If we exclude graphs where this is possible, we obtain a class of linking schemes for which $\gpath{f}$ always computes a shortest path.

Let $f$ be a function such that $\mAssociatedGraph{f}$ is a binary linking scheme.

We call $\mAssociatedGraph{f}$ a greedy-friendly graph if for all $w, x, y, z \in \Np$ with $w < x < y < z$:

$$\sequence{z, x} \in \A(\mAssociatedGraph{f}) \implies \sequence{y, w} \notin \A(\mAssociatedGraph{f})$$Let $f$ be a function such that $\mAssociatedGraph{f}$ is a greedy-friendly graph.

Then $\gpath{f}$ always computes a shortest path.

The minimal linking scheme is a greedy-friendly graph.

The scheme given in Example 0.7 is not a greedy-friendly graph, even though $\gpath{f}$ always computes a shortest path.

The definition of greedy-friendly graphs sticks close to the intuition behind situations in which $\gpath{f}$ fails to produce shortest paths, but it certainly does not win any points for elegance. There is an equivalent characterization which only needs to consider properties of $f$ rather than examining the associated graph:

Let $f$ be a function such that $\mAssociatedGraph{f}$ is a binary linking scheme.

We call $\mAssociatedGraph{f}$ a binary anti-monotone linking scheme if $f$ is anti-monotone, i.e., if for all $n, m \in \Np, n > m$:

$$f(n) \leq m \implies f(n) \leq f(m)$$Let $f$ be a function such that $\mAssociatedGraph{f}$ is a binary linking scheme.

$\mAssociatedGraph{f}$ is a binary anti-monotone linking scheme if and only if $\mAssociatedGraph{f}$ is a greedy-friendly graph.

We now introduce two specific binary anti-monotone linking schemes, both to make things more concrete, and because they contain short paths between any two numbers.

Toward a binary anti-monotone linking scheme with short paths between vertices, we examine an inductive construction that yields exponentially large graphs with paths of linear length which do not contradict any of the properties of a binary anti-monotone linking scheme.

$\newcommand{\lstwoConstruction}[1]{\href{##lstwo_construction}{G_{#1}}}$We define the graph $\lstwoConstruction{i}$ inductively:

$\lstwoConstruction{0} \defeq \sequence{\set{1}, \set{}}$

Let $i \in \N$. Then $\lstwoConstruction{i + 1}$ is obtained from $\lstwoConstruction{i}$ as follows:

- Add a disjoint copy of $\lstwoConstruction{i}$ to the graph.
- In the copy, delete all arcs originating in a vertex of the form $2^k - 1$ for some $k \in \N$.
- In the copy, add from every such vertex an arc to the greatest vertex of the original graph.
- In the copy, add $2^{i - 1} - 1$ to every vertex.
- Add a new vertex $2^i - 1$ to the graph, with an edge to the greatest vertex of the original graph and of the copy.

Increment $i$

Decrement $i$

We can develop a function that computes the skip links which result from this construction.

$\DeclareMathOperator{\lsTwoVanilla}{ls_{2}}\newcommand{\lstwo}[0]{\href{##lstwo}{\lsTwoVanilla}}$The function $\fun{\lstwo}{\Np}{\Np}$ is recursively defined:

\begin{align*} \lstwo(n) &= \begin{cases} n - 2^{k - 1} &\mbox{if } n = 2^k - 1, k \in \N \\ n - (2^{g(n)} - 1) & \mbox{otherwise}\\ \end{cases}\\ g(n) &= \begin{cases} k &\mbox{if } n = 2^{k} - 1, k \in \N \\ g(n - (2^{k - 1} - 1)) & \mbox{if } 2^{k - 1} - 1 < n < 2^{k} - 1, k \in \N \end{cases}\\ \end{align*}Let $n, m \in \Np$.

$\sequence{n, m} \in \mAssociatedGraph{\lstwo}$ if and only if there exists $i \in \N$ such that $\sequence{n, m} \in A(\lstwoConstruction{i})$.

$\mAssociatedGraph{\lstwo}$ is a binary anti-monotone linking scheme.

Let $n, m \in \Np, n \geq m$.

Then the shortest path from $n$ to $m$ in $\mAssociatedGraph{\lstwo}$ has length at most $2 \cdot \ceil{\log_2(n)} + 2 \cdot \ceil{\log_2(m)} \in \complexity{\log(n)}$.

Furthermore, $\lstwo(n)$ can be computed in $\complexity{\log(n)}$ time and in $\complexity{1}$ space.

The following graphic is interactive; select a vertex and then hover over other vertices to see the shortest paths between them.

In 2 the authors show that the construction becomes more efficient if in every step the graph is not duplicated but tripled.

$\newcommand{\lsthreeConstruction}[1]{\href{##lsthree_construction}{H_{#1}}}$We define the graph $\lsthreeConstruction{i}$ inductively:

$\lsthreeConstruction{0} \defeq \sequence{\set{1}, \set{}}$

Let $i \in \N$. Then $\lsthreeConstruction{i + 1}$ is obtained from $\lsthreeConstruction{i}$ as follows:

- Add
disjoint copies of $\lsthreeConstruction{i}$ to the graph. - In the copies, delete all arcs originating in a vertex of the form $\frac{3^k - 1}{2}$ for some $k \in \N$.
- In the first copy, add from every such vertex an arc to the greatest vertex of the original graph, and in the second copy, add from every such vertex an arc to the greatest vertex of the first copy.
- In the first copy, add $\frac{3^{i - 1} - 1}{2}$ to every vertex, in the second copy, add $3^{i - 1} - 1$ to every vertex.
- Add a new vertex $\frac{3^i - 1}{2}$ to the graph, with an edge to the greatest vertex of the original graph and of the second copy.

Increment $i$

Decrement $i$

We can again develop a function that computes the skip links which result from this construction.

$\DeclareMathOperator{\lsThreeVanilla}{ls_{3}}\newcommand{\lsthree}[0]{\href{##lsthree}{\lsThreeVanilla}}$The function $\fun{\lsthree}{\Np}{\Np}$ is recursively defined:

\begin{align*} \lsthree(n) &= \begin{cases} n - 3^{k - 1} &\mbox{if } n = \frac{3^k - 1}{2}, k \in \N \\ n - \frac{3^{g(n)} - 1}{2} & \mbox{otherwise}\\ \end{cases}\\ g(n) &= \begin{cases} k &\mbox{if } n = \frac{3^k - 1}{2}, k \in \N \\ g(n - \frac{3^{k - 1} - 1}{2}) & \mbox{if } \frac{3^{k - 1} - 1}{2} < n < \frac{3^k - 1}{2}, k \in \N \end{cases}\\ \end{align*}Let $n, m \in \Np$.

$\sequence{n, m} \in \mAssociatedGraph{\lsthree}$ if and only if there exists $i \in \N$ such that $\sequence{n, m} \in A(\lsthreeConstruction{i})$.

$\mAssociatedGraph{\lsthree}$ is a binary anti-monotone linking scheme.

Let $n, m \in \Np, n \geq m$.

Then the shortest path from $n$ to $m$ in $\mAssociatedGraph{\lsthree}$ has length at most $3 \cdot \ceil{\log_3(n)} + 3 \cdot \ceil{\log_3(m)} \in \complexity{\log(n)}$.

Furthermore, $\lsthree(n)$ can be computed in $\complexity{\log(n)}$ time and in $\complexity{1}$ space.

The following graphic is interactive; select a vertex and then hover over other vertices to see the shortest paths between them.

Binary anti-monotone linking schemes shine when the following problem needs solving: Consider two peers in a distributed system. Both store locally a subset of some sequence of items. Each of them selects an item, and now they want to efficiently find out whether the items are indeed part of the same sequence. This boils down to the question of whether there is a path of links between the items.

The paths in a binary anti-monotone linking scheme satisfy some helpful properties for attacking this problem.

Let $f$ be a function such that $\mAssociatedGraph{f}$ is a binary anti-monotone linking scheme, and let $n, m \in \Np, n \geq m$.

Then the shortest path from $n$ to $m$ is unique.

Let $f$ be a function such that $\mAssociatedGraph{f}$ is a binary anti-monotone linking scheme, and let $n, m \in \Np, n \geq m$.

Then any path from $n$ to $m$ contains all vertices and arcs of the shortest path from $n$ to $m$.

In the problem of determining whether two items are part of the same sequence, both peers care about only a subset of the sequence. If both of them only store items they directly care about, then there will always be pairs of items for which the two peers do not, amongst themselves, have all the items of the shortest path between the two items of interest available.

A natural problem is thus to find out which items a peer has to store in addition to the items it directly cares about, so that the items for shortest paths are always available. This leads to the following definition:

Let $G$ be a linking scheme.

A certificate pool scheme for $G$ is a function $\fun{p}{\Np}{\powerset{\Np}}$ satisfying for all $n, m \in \Np, n \geq m$ that $p(n) \cup p(m)$ contains the vertices of a path from $n$ to $m$.

We call $p(n)$ the certificate pool of $n$.

Define $\fun{p}{\Np}{\powerset{\Np}}, p(n) \defeq \set{m \in \Np \mid m \leq n}$.

Then $p$ is a certificate pool scheme for every linking scheme $G$.

This trivial certificate pool scheme works for every linking scheme, but it is very inefficient because the union of two certificate pools is guaranteed to contains the *longest* path between the two items. For certain binary anti-monotone linking schemes however there is a certificate pool scheme in which only the shortest path between any two items is included in the union.

Let $G$ be a binary anti-monotone linking scheme.

We call $G$ a bounded-degree binary anti-monotone linking scheme if every vertex of $G$ has only finitely many in-neighbors.

Let $f$ be a function such that $\mAssociatedGraph{f}$ is a bounded-degree binary anti-monotone linking scheme, and let $n \in \Np$.

The tip of $n$ is $\mTip{f}(n) \defeq \max({m \in \Np \mid \sequence{m, n} \in \mAssociatedGraph{f}})$. This is well defined because of the bouned in-degree of $\mAssociatedGraph{f}$.

The head of $n$ is the path $\mHead{f}(n) \defeq \gpath{f}(\mTip{f}(n), n)$.

The tail of $n$ is the path $\mTail{f}(n) \defeq \gpath{f}(n, 1)$.

The minimal certificate pool scheme of $f$ is the function $\fun{\p{f}}{\Np}{\powerset{\Np}}, \p{f}(n) \defeq \mHead{f}(n) \cup \mTail{f}(n)$.

The following interactive graphic visualizes $\p{\lstwo}$. Hover over a vertex to see its certificate pool. Select a vertex and then hover over others to see both certificate pools, as well as the shortest path between the two vertices.

Let $f$ be a function such that $\mAssociatedGraph{f}$ is a bounded-degree binary anti-monotone linking scheme.

Then $\p{f}$ is a certificate pool scheme.

Let $n \in \Np$.

Then $\abs{\p{\lstwo}(n)} \leq 2 \cdot \ceil{\log_2(n)}$ and $\abs{\p{\lsthree}(n)} \leq 3 \cdot \ceil{\log_3(n)}$

A bounded-degree binary anti-monotone linking scheme with small shortest paths between any two numbers thus leads to small certificate pools.

Let $f$ be a function such that $\mAssociatedGraph{f}$ is a bounded-degree binary anti-monotone linking scheme.

If for all $n, m \in \Np, n \geq m$ the length of $\gpath{f}(n, m)$ is in $\complexity{\log(n)}$, then $\p{f}(n) \in \complexity{\log(n)}$ for all $n \in \Np$.

A different problem when dealing with linked sequences is that of appending new items to the end of the sequence. Doing so requires creating new links, and in order to link to an item, you need to have some information about that item — for simplicity we shall assume that you need to have that item available.

To be able to compute the predecessor link, you need to have the latest sequence item available. But the skip link could point anywhere. Is it necessary to store *all* previous sequence items? For arbitrary binary linking schemes, this is the case, but the situation is better for binary anti-monotone linking schemes.

Let $f$ be a function such that $\mAssociatedGraph{f}$ is a binary anti-monotone linking scheme, and let $n \in \Np$.

Then $f(n + 1) \in \mTail{f}(n)$.

It thus suffices to store the tail of the latest sequence item in order to be able to append new items.

Let $f$ be a function such that $\mAssociatedGraph{f}$ is a bounded-degree binary anti-monotone linking scheme.

If $\p{f}(n) \in \complexity{\log(n)}$ for all $n \in \Np$, then the number of items which need to be stored so that appending an $n$'th item is possible is in $\complexity{\log(n)}$.

Binary anti-monotone linking schemes are graphs with surprisingly nice properties. $\lstwo$ and $\lsthree$ can be constructed vertex by vertex, yet paths between any two vertices have length at most logarithmic in the greater vertex. The uniqueness of shortest paths also leads to small certificate pools. They are thus suitable for the design of protocols which rely on securely linking items and easily verifying the existence of paths between items.

I hope this document covers all the aspects of binary anti-monotone linking schemes required for such use cases. I encourage you to also look at the original literature, it includes some mathematical background I intentionally skipped. 2 in particular includes an alternative characterization of binary anti-monotone linking schemes based on a graph composition operator, as well as some funky math regarding the optimality of $\lsthree$.

- 0Lipmaa, H. (1999). Secure and efficient time-stamping systems. Tartu: Tartu University Press.
- 1Buldas, A., Laud, P., Lipmaa, H., & Villemson, J. (1998, August). Time-stamping with binary linking schemes. In Annual International Cryptology Conference (pp. 486-501). Springer, Berlin, Heidelberg
- 2Buldas, A., & Laud, P. (1998, December). New linking schemes for digital time-stamping. In ICISC (Vol. 98, pp. 3-14).