## Reversibility of binary relations, substochastic matrices, and partial functions

After the last post, I decided that the next post should contain images. Then I decided that the time had come to publish another post. Here is an image of an acceptor finite-state machine parsing the string “nice”.

How can we reverse this machine so that it parses the string “ecin”? An easy way is to remove state “6 Error”, reverse all remaining arrows, swap state “1 Start” with state “7 Success”, then add state “6 Error” back, together with arrows to it for all cases that are still undefined. This post elaborates the observation that dropping state “6 Error” altogether, and instead allowing undefined cases, makes everything simpler and more symmetric. This is relevant not just for the deterministic case, but also for the nondeterministic and the probabilistic case. Time reversal is significantly different for (non-)deterministic and probabilistic machines. Using partial functions, binary relations, and substochastic matrices as transitions gives deceptively simple representations of this fact.
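This reversal recipe can be sketched in code for a toy acceptor whose transitions are stored as a partial function (a Python dict); undefined cases play the role of the removed error state. The concrete machine below is a simplified, made-up linear acceptor for “nice”, not the exact machine from the image.

```python
# Reverse a deterministic acceptor given as a partial transition function.
# Undefined cases replace the explicit "Error" state.
# The states and transitions below are made up for illustration.

def reverse_acceptor(delta, start, success):
    """delta: dict mapping (state, symbol) -> state (a partial function).
    Returns the reversed machine, provided delta is injective per symbol."""
    reversed_delta = {}
    for (state, symbol), target in delta.items():
        key = (target, symbol)
        if key in reversed_delta:
            raise ValueError("not injective per symbol: cannot reverse deterministically")
        reversed_delta[key] = state
    # swap the roles of the start and success states
    return reversed_delta, success, start

def accepts(delta, start, success, word):
    state = start
    for symbol in word:
        if (state, symbol) not in delta:
            return False  # undefined case instead of an error state
        state = delta[(state, symbol)]
    return state == success

# A linear machine accepting exactly "nice": states 1..5, success state 5.
delta = {(1, "n"): 2, (2, "i"): 3, (3, "c"): 4, (4, "e"): 5}
rdelta, rstart, rsuccess = reverse_acceptor(delta, 1, 5)
print(accepts(delta, 1, 5, "nice"))               # True
print(accepts(rdelta, rstart, rsuccess, "ecin"))  # True
```

Note that the reversed machine is again a partial function exactly because the forward transitions are injective for each symbol; this is the simplification that dropping the error state buys.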

Volker Diekert, Manfred Kufleitner, Gerhard Rosenberger: “Diskrete algebraische Methoden: Arithmetik, Kryptographie, Automaten und Gruppen” is a very nice book. Its chapter 7, “Automatentheorie” (automata theory), gave me the idea that non-deterministic machines arise naturally if (arbitrary) binary relations are used instead of functions. If you read that chapter, you will realize that this post is a significant step back in terms of depth and generality. So the worst is yet to come…

The question “What is the difference between non-determinism and randomness?” asked for clarification of the following quote:

> A non-deterministic machine is not the same as a probabilistic machine. In crude terms, a non-deterministic machine is a probabilistic machine in which probabilities for transitions are not known.

I gave a short answer trying to highlight the fact that the probabilities on the transitions make time reversal for a probabilistic machine significantly less trivial than time reversal for a non-deterministic machine. Niel de Beaudrap commented that even non-deterministic machines are not reversible without further constraints. I agree with him in a certain complicated technical sense, but I couldn’t really judge whether he had this complicated technical sense in mind. Maybe he just doubts the basic heuristic principle according to which non-deterministic machines are much more symmetric with respect to time than deterministic machines. So I took some time to elaborate my answer and to highlight in more detail the achievable time symmetry for non-deterministic and probabilistic machines. I will try to reblog my answer here:

### Stepping backwards during debugging as a motivation for non-determinism

The notion of a non-deterministic machine suggests itself when you wish to step backward (in time) through a program while debugging. In a typical computer, each step modifies only a finite amount of memory. If you always save this information for the previous 10000 steps, then you can nicely step both forward and backward in the program, and this possibility is not limited to toy programs. If you try to remove the asymmetry between forward steps and backward steps, then you end up with the notion of a non-deterministic machine.

### Similarities and differences between non-determinism and randomness

While probabilistic machines share some characteristics with non-deterministic machines, they do not share this symmetry between forward steps and backward steps. To see this, let’s model the steps or transitions of a deterministic machine by (total or partial) functions, the transitions of a non-deterministic machine by (finite) relations, and the transitions of a probabilistic machine by (sub)stochastic matrices. For example, here are corresponding definitions for finite automata:

• a finite set of states $Q$
• a finite set of input symbols $\Sigma$
• deterministic: a transition function $\delta:Q\times \Sigma \to Q$
• non-deterministic: a transition function $\Delta:Q\times \Sigma \to P(Q)$
• non-deterministic: a transition relation $\Delta\subset Q\times \Sigma \times Q$
• non-deterministic: a function $\Delta: \Sigma \to P(Q \times Q)$
• probabilistic: a function $\delta: \Sigma \to ssM(Q)$

Here $P(Q)$ is the power set of $Q$ and $ssM(Q)$ is the space of substochastic matrices on $Q$. A (right) substochastic matrix is a nonnegative real matrix in which each row sums to at most 1.
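These representations are interchangeable. As a small sketch (the two-state automaton below is made up), a transition relation can be turned into per-symbol 0/1 matrices; the matrix for a symbol is substochastic exactly when the transitions on that symbol form a partial function:

```python
# Convert a transition relation Δ ⊂ Q × Σ × Q into per-symbol 0/1 matrices,
# then check the (right) substochastic condition: rows sum to at most 1.
# The two-state automaton below is made up for illustration.

Q = [0, 1]
Sigma = ["a", "b"]
Delta = {(0, "a", 1), (1, "a", 0), (0, "b", 0), (0, "b", 1)}  # a relation

def relation_to_matrices(Q, Sigma, Delta):
    return {s: [[1.0 if (p, s, q) in Delta else 0.0 for q in Q] for p in Q]
            for s in Sigma}

def is_substochastic(M):
    return (all(x >= 0 for row in M for x in row)
            and all(sum(row) <= 1 for row in M))

mats = relation_to_matrices(Q, Sigma, Delta)
print(is_substochastic(mats["a"]))  # True: on "a" the relation is a partial function
print(is_substochastic(mats["b"]))  # False: state 0 has two "b"-successors
```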

### There are many different reasonable acceptance conditions

The transitions are only one part of a machine; initial and final states, possible output, and acceptance conditions are also important. However, there are only very few non-equivalent acceptance conditions for deterministic machines, a number of reasonable acceptance conditions for non-deterministic machines (NP, coNP, #P, …), and many possible acceptance conditions for probabilistic machines. Hence this answer focuses primarily on the transitions.

### Reversibility is non-trivial for probabilistic machines

A partial function is reversible iff it is injective. A relation is always reversible in a certain sense, by taking the opposite relation (i.e. reversing the direction of the arrows). For a substochastic matrix, taking the transposed matrix is analogous to taking the opposite relation. In general, the transposed matrix is not a substochastic matrix. If it is, then the matrix is said to be *doubly substochastic*. In general $P P^T P\neq P$, even for a doubly substochastic matrix $P$, so one can wonder whether this is a reasonable notion of reversibility at all. It is reasonable, because the probability to reach state $B$ from state $A$ in $k$ forward steps is identical to the probability to reach state $A$ from state $B$ in $k$ backward steps: each path from $A$ to $B$ has the same probability forward and backward. If suitable acceptance conditions (and other boundary conditions) are selected, then doubly substochastic matrices are an appropriate notion of reversibility for probabilistic machines.
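Both claims can be checked numerically. The sketch below uses a made-up doubly substochastic matrix $P$: it confirms that $k$-step forward probabilities of $P$ match the $k$-step backward probabilities of $P^T$ (this is just the identity $(P^T)^k = (P^k)^T$), while $P P^T P \neq P$.

```python
# For a doubly substochastic P, the probability of reaching B from A in k
# forward steps equals the probability of reaching A from B in k backward
# steps: (P^k)[A][B] == ((P^T)^k)[B][A].  The matrix P below is made up.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(X, k):
    n = len(X)
    R = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        R = matmul(R, X)
    return R

def transpose(X):
    return [list(row) for row in zip(*X)]

P = [[0.5, 0.3, 0.0],
     [0.2, 0.4, 0.3],
     [0.1, 0.2, 0.6]]          # every row and every column sums to <= 1

Pk = matpow(P, 4)              # 4 forward steps
Qk = matpow(transpose(P), 4)   # 4 backward steps
print(abs(Pk[0][2] - Qk[2][0]) < 1e-12)  # True: same probability both ways

PPtP = matmul(matmul(P, transpose(P)), P)
print(PPtP != P)               # True: P P^T P differs from P
```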

### Reversibility is tricky even for non-deterministic machines

Just as $P P^T P\neq P$ in general, $R\circ R^{op}\circ R \neq R$ in general for a binary relation $R$. If $R$ describes a partial function, then $R\circ R^{op}\circ R = R$ and $R^{op}\circ R\circ R^{op} = R^{op}$. But even if relations $P$ and $Q$ are strictly reversible in this sense, this doesn’t imply that $P\circ Q$ is strictly reversible too. So let’s ignore strict reversibility for now (even though it feels interesting) and focus on reversal by taking the opposite relation. An explanation similar to the one for the probabilistic case shows that this reversal works fine if suitable acceptance conditions are used.
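These identities are easy to check with relations represented as Python sets of pairs; the relations $F$ and $R$ below are made up for illustration.

```python
# Relations as sets of pairs.  For a partial function F, F ∘ F^op ∘ F = F,
# but for a general relation R, R ∘ R^op ∘ R may strictly contain R.

def op(R):
    return {(b, a) for (a, b) in R}

def compose(R, S):
    # apply R first, then S (diagrammatic order)
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

F = {(1, "x"), (2, "y")}               # a partial function
R = {(1, "x"), (2, "x"), (2, "y")}     # not a function: 2 has two images

print(compose(compose(F, op(F)), F) == F)  # True: strictly reversible
print(compose(compose(R, op(R)), R) == R)  # False: the extra pair (1, "y") appears
```

The extra pair arises because $R^{op}$ connects $1$ to $2$ via the shared image “x”, and $R$ then carries $2$ on to “y”.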

### These considerations also make sense for pushdown automata

This post suggests that one motivation for non-determinism is to remove the asymmetry between forward steps and backward steps. Is this symmetry of non-determinism limited to finite automata? Here are corresponding symmetric definitions for pushdown automata:

• a finite set of states $Q$
• a finite set of input symbols $\Sigma$
• a finite set of stack symbols $\Gamma$
• deterministic: a partial transition function $\delta:Q\times\Gamma\times (\Sigma\cup\{\epsilon\}) \to Q\times\Gamma^{\{0,2\}}$ such that $\delta(q,\gamma,\epsilon)$ is defined only if $\delta(q,\gamma,\sigma)$ is undefined for all $\sigma\in\Sigma$
• non-deterministic: a transition function $\Delta:Q\times\Gamma^{\{0,1\}}\times (\Sigma\cup\{\epsilon\}) \to P(Q\times\Gamma^{\{0,1\}})$
• non-deterministic: a transition relation $\Delta\subset Q\times\Gamma^{\{0,1\}}\times (\Sigma\cup\{\epsilon\}) \times Q\times\Gamma^{\{0,1\}}$
• non-deterministic: a function $\Delta: \Sigma\cup\{\epsilon\} \to P(Q\times\Gamma^{\{0,1\}}\ \times\ Q\times\Gamma^{\{0,1\}})$
• probabilistic: a function $\delta: \Sigma\cup\{\epsilon\} \to ssM(Q\times\Gamma^{\{0,1\}})$ such that $\delta(\epsilon)+\delta(\sigma)\in ssM(Q\times\Gamma^{\{0,1\}})$ for all $\sigma\in\Sigma$

Here $\epsilon$ is the empty string, $\Gamma^{\{0,2\}}=\{\epsilon\}\cup\Gamma\cup(\Gamma\times\Gamma)$ and $\Gamma^{\{0,1\}}=\{\epsilon\}\cup\Gamma$. This notation is chosen because it resembles $\Gamma^*$, which appears in many definitions of pushdown automata.
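In the relational representation, reversal again just swaps the source and target data: state with state, and the pop part of the stack operation with the push part, matching the diagrams that follow. A minimal sketch (the $a^nb^n$-style transitions below are made up; `EPS` stands for $\epsilon$):

```python
# Reverse a PDA transition relation Δ ⊂ Q × Γ01 × (Σ∪{ε}) × Q × Γ01
# by swapping source/target state and the pop/push stack symbols.
# The transitions below are a made-up a^n b^n recognizer fragment.

EPS = ""  # stands for the empty string ε

def reverse_pda(Delta):
    # (q, pop, sym, q2, push)  ->  (q2, push, sym, q, pop)
    return {(q2, push, sym, q, pop) for (q, pop, sym, q2, push) in Delta}

# push A while reading "a", pop A while reading "b"
Delta = {(0, EPS, "a", 0, "A"), (0, "A", "b", 1, EPS), (1, "A", "b", 1, EPS)}

print((0, "A", "a", 0, EPS) in reverse_pda(Delta))  # True: push became pop
print(reverse_pda(reverse_pda(Delta)) == Delta)      # True: reversal is an involution
```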

### Diagrammed verification of reversal for (non-)advancing input and stack operations

An advancing input operation with $b\in\Sigma\subset\Sigma\cup\{\epsilon\}$ gets reversed as follows

$\begin{matrix} a|bc \to a|\boxed{b}c \to ab|c\\ a|bc \leftarrow a\boxed{b}|c \leftarrow ab|c\\ c|ba \to c|\boxed{b}a \to cb|a \end{matrix}$

A non-advancing input operation with $\epsilon\in\Sigma\cup\{\epsilon\}$, which doesn’t read any input, is reversed as follows

$\begin{matrix} a|bc \to a|bc \to a|bc\\ a|bc \leftarrow a|bc \leftarrow a|bc\\ cb|a \to cb|a \to cb|a \end{matrix}$

Here is a diagram of an advancing input operation whose reversal would look bad

For a stack operation $(s,t) \in \Gamma^{\{0,1\}}\times\Gamma^{\{0,1\}}$, there are three cases: $(s,t)=(a,\epsilon)$, $(s,t)=(\epsilon,a)$, and $(s,t)=(a,b)$. The stack operation $(a,\epsilon)$ gets reversed to $(\epsilon,a)$ as follows

$\begin{matrix} ab\ldots \to \boxed{a}b\ldots \to |b\ldots\\ \boxed{a}b\ldots \leftarrow |b\ldots \leftarrow b\ldots\\ b\ldots \to |b\ldots \to \boxed{a}b\ldots \end{matrix}$

The stack operation $(a,b)$ gets reversed to $(b,a)$ as follows

$\begin{matrix} ac\ldots \to \boxed{a}c\ldots \to \boxed{b}c\ldots\\ \boxed{a}c\ldots \leftarrow \boxed{b}c\ldots \leftarrow bc\ldots\\ bc\ldots \to \boxed{b}c\ldots \to \boxed{a}c\ldots \end{matrix}$

A generalized stack operation $(ab,cde)\in\Gamma^*\times\Gamma^*$ would be reversed to $(cde,ab)$

$\begin{matrix} abf\ldots \to \boxed{ab}f\ldots \to \boxed{cde}f\ldots\\ \boxed{ab}f\ldots \leftarrow \boxed{cde}f\ldots \leftarrow cdef\ldots\\ cdef\ldots \to \boxed{cde}f\ldots \to \boxed{ab}f\ldots \end{matrix}$

### Reversibility for Turing machines

A machine with more than one stack is equivalent to a Turing machine, and stack operations can easily be reversed. The motivation at the beginning also suggests that reversal (of a Turing machine) should not be difficult. A Turing machine with a typical instruction set is not so great for reversal, because the symbol under the head can influence whether the tape moves left or right. But if the instruction set is modified appropriately (without reducing the computational power of the machine), then reversal becomes nearly trivial again.

A reversal can also be constructed without modifying the instruction set, but it is not canonical and a bit ugly. It might seem that the existence of a reversal is just as difficult to decide as many other questions pertaining to Turing machines, but a reversal is a local construction, while the difficult questions often have a global flavor, so pessimism would probably be unjustified here.

The urge to switch to equivalent instruction sets (which are easier to reverse) shows that these questions are less obvious than they first appear. A more subtle switch happened earlier in this post, when total functions and stochastic matrices were replaced by partial functions and substochastic matrices. This switch is not strictly necessary, but the reversal would be ugly otherwise. The switch to substochastic matrices was actually the point where it became obvious that reversibility is not so trivial after all, and that one should write down details (as done above) instead of taking just a high-level perspective (as presented in the motivation at the beginning). The questions raised by Niel de Beaudrap also contributed to the realization that the high-level perspective is slightly shaky.

### Conclusion

Non-deterministic machines allow a finite number of deterministic transitions at each step. For probabilistic machines, these transitions additionally have a probability. This post conveys a different perspective on non-determinism and randomness. Ignoring global acceptance conditions, it focuses on local reversibility (as a local symmetry) instead. Because randomness preserves some local symmetries which are not preserved by determinism, this perspective reveals non-trivial differences between non-deterministic and probabilistic machines.