The integers are sometimes formally constructed as the equivalence classes of ordered pairs of natural numbers . The equivalence relation is defined via
so that gets interpreted as . This motivates the signed-digit representation. To avoid storing two numbers instead of one, the subtraction gets performed on the level of digits. This leads to signed digits. The overhead can be kept small by using unsigned 7-bit or 31-bit digits as starting point (instead of decimal or binary digits).
The overhead may be small, but the signed-digit representation remains redundant and non-unique. Comparing two integers in this representation is a non-trivial operation. But addition (and subtraction) becomes easy from a hardware perspective, since carry-bits no longer propagate. Multiplication also gets easier, since a part of multiplication consists of adding together intermediary results.
This post started as a subsection of another yet to be finished post. That subsection tried to use the trivial redundancy of representations of the computable real numbers as motivation to use similar redundant representations also for natural numbers. However, the constant references to simpler precursor ideas, to the well known practical applications, and to further complications and generalizations required in practical applications made that subsection grow totally out of proportion. It is still awfully long, even as a standalone post. The subsections try to cut it into more digestible pieces, but all the references to external material remain. The reader is not expected to read much of the linked material. It only serves as proof that this material has practical applications and is well known. This post may be seen as describing one specific alternative number format in detail, for which I dreamt about sufficient hardware support in a previous post.
Fast parallel arithmetic
In the 6502 Assembly language (C64, Apple II, …), addition is performed by the ADC instruction, which is the mnemonic for Add Memory to Accumulator with Carry. It adds the value in memory plus the carry-bit to the value in the accumulator, stores the result in the accumulator, and sets the carry-bit according to whether the addition overflowed or not. This operation could be represented as A + M + C -> A, or more precisely as
(A + M + C) % 256 -> A
(A + M + C) / 256 -> C
Since A and M can be at most 255 and C at most 1, the resulting C will again be either 0 or 1. This allows to add two arbitrarily long numbers. First one sets the carry bit to zero, then adds the lowest bytes, and then successively the higher bytes. Nice, but not very parallel.
Since the 6502 Assembly language has no instruction for multiplication, calling it fast would be an exaggeration. Let us instead imagine a machine which can perform 8 add with carry additions in parallel, where the carry-bits are stored in an additional byte. Since the meaning of the carry-bits before and after the parallel addition are different, we probably want to left-shift the carry-bits after the addition. Let us imagine that this machine can also perform 4 multiply with carry multiplications in parallel. How should it handle the carry-bits? Since (255+1)*(255+1)-1 can be represented in two bytes, it seems reasonable to treat them as a simple addition of 1 or 0 on input. The carry-bit on output might be interpreted as addition of 256 or 0, which would be consistent with the interpretation on output after addition.
Outputting a second carry-bit to be interpreted as addition of 256^2 or 0 would also be consistent, even so it is not required here. (Since (255+16+1)*(255+16+1)-16^3-256-16-1 = 16^4+16^3-16-1 cannot be represented in two bytes, allowing a second carry-bit may be useful to simplify recursive implementation of multiplication.) It is slightly inconsistent that the add with carry instruction only consumes one carry-bit per addition. After all, the idea is to represent intermediate results as M+C, in order to enable fast parallel addition of intermediate results. But consuming two carry-bits would be problematic since the result of (255+1)+(255+1) is not nicely representable.
What we just described is not new (of course). It is basically just the Carry Save Adder technique used to parallelize multiplication. However, since arithmetic with very long numbers is often used for modular arithmetic, this is not yet the end of the story. The Drawbacks section on wikipedia say that it has to be combined with the Montgomery modular multiplication technique, to be useful for public-key cryptography.
It is cool that our imagined computer can multiply 4 bytes and add 8 bytes in parallel, but how would we actually multiply two 4-bytes long numbers? Each byte of the first number must be multiplied by each byte of the second number, those results must be arranged appropriately and added together. The required 16 multiplication can be done by 4 subsequent parallel multiply with carry instructions, where the second 4-bytes (and the corresponding carry bits) are rotated in between. We have to add 1, 3, 5, 7, 7, 5, 3, 1 bytes in the different places of the result, which requires at least 24 additions. There are at most 6 additions in the last round, so we need at least 4 rounds of (8 bytes) parallel additions. And we might need even more rounds, since we get slightly too many carry-bits.
Faster multiplication and signed integers
Subtraction and signed integers are useful even for multiplication of positive integers, since subtraction is a basic operation required by fast multiplication algorithms like Karatsuba multiplication or its generalization, Toom-Cook multiplication. So what we describe next is not new either. See for example section “3.2 Multiplication” of The Complexity of Boolean Functions by Ingo Wegener, 1987, where the signed-digit representation is used for fast parallel multiplication algorithms.
The 6502 Assembly language also has an SBC instruction, which is the mnemonic for Subtract Memory from Accumulator with Borrow. This operation could be represented as A – M – ~C -> A, apparently the 6502 performs an ADC instruction with the bitwise-not of M. It makes sense, as this is the cheapest way to reduce subtraction to addition. For us, a SBB instruction where ~C got replaced by B fits better into our interpretation scheme
(A – M – B) % 256 -> A
(A – M – B) / 256 -> -B
The integer division here truncates towards , we could also write
(A – M – B + 256) % 256 -> A
(A – M – B + 256) / 256 -> 1-B
We can read A – M – B both as A – (M+B) and as (A-B) – M. The second reading is preferable, if we imagine a machine which can perform 8 SBB instructions in parallel, where the borrow-bits are stored in an additional byte. Again, we should left-shift the borrow-bits after the subtraction, since their meaning before and after the parallel subtraction is different. The parallel SBB instruction (+ left-shift) returned a signed integer in the a-b signed-digit representation, which we mentioned in the introduction.
Even so bit-twiddling may be nice, it somehow makes the trivial idea behind representing signed integers as a-b using signed-digits feel unnecessarily complicated. For D >= 3, we may just assume that we have a digit between -D+1 and D-1, after addition or subtraction we will have a result between -2*D+2 and 2*D-2, and if we stay in the range -D+2 and D-2, then we can just emit an overflow of -D, 0, or D, and the next digit can consume that overflow without overflowing itself.
More bit-twiddling with borrow-bits
Can we define a straightforward (parallel) ADB (add with borrow) instruction, which adds an unsigned integer to a signed integer, and returns a signed integer? We get
(A-B + M + 1) % 256 -> A
(A-B + M + 1) / 256 -> 1-B
if we just use ADC and replace C by (1-B). This represents the result as -1 + A + (1-B)*256. But how is this compatible with our signed-digit representation? When we left-shift the borrow-bits after the addition, we fill the rightmost place with 1 (instead of 0). In the other places, the -1 cancels with the 1 from (1-B)*256 of the previous byte. The leftmost place is also interesting: If B is set, then we can ignore it. But if it is not set, then A needs an additional byte set to 1. So signed-digit representation works with nearly no overhead.
We basically just replaced C by (1-B). But how could this trivial change enable us to represent signed integers? A-B = A – (1-C) = A+C – 1, so A and C now describe a digit in -1..255, while before it was a digit in 0..256. Even so we prefer the borrow-bits variant here, this way of looking at it (offset binary) can sometimes be helpful, for example when trying to interpret the SBC instruction of the 6502 Assembly language.
In my previous thoughts about (big) integers, there was the question whether the two’s complement representation is well suited for a variable width integer data type. One argument against it was that the addition or subtraction of a small number from a big number would take an unnecessarily long time. In the borrow-bits variant, for addition all bytes where the borrow-bit is not set must be touched, and for subtraction all bytes where the borrow-bit is set must be touched.
Before we come to parallel multiplication, what about multiplication by -1 and by a power of two? -1*(A-B)=B-A, so SBB of B (as bytes without borrow) and A gives the correct result. It is the sum of the bitwise-not of A, the byte-wise B and a carry-bit set to 1. The resulting (to be left-shifted) borrow-bit B is given by (1-C). The result of multiplication by a power of two is given by a left-shift followed by SBB. The same holds for multiplication by minus a power of two. In general, two successive multiplications by -1 will not yield the identity on the binary level, and identities about multiplication by powers of two (like associativity) will not hold on the binary level.
How to perform 4 multiply with borrow multiplications in parallel? If the borrow-bits are interpreted as simple addition of -1 or 0 on input, then the smallest result is -255, and the biggest 255*255, so the borrow-bit on output might be interpreted as addition of -256 or 0. However, what about the problem that addition cannot consume two borrow-bits? ADC outputs a digit in 0..2*255+1, SBB outputs a digit in -256..255, and ADB outputs a digit in -1..2*255. If we let ADB instead output a digit in 0..2*255-1 and an overflow of -256, 0, or 256, then the next digit can consume that overflow without overflowing itself. This extends the output range of ADB to -256…3*255, so we can now consume two borrows-bits of -1 and three bytes. (Basically the same as if we had just performed two subsequent ADB instructions, but the borrow propagation behaves nicer.)
The output range can also be extended one-sided (by 255). Extending ADB to the negative side or extending SBB to the positive side both yield a range -256..2*255, which can consume two borrow-bits of -1 and two bytes (or more precisely two positive bytes, one negative byte, and one borrow-bit of -1). ADC can be extended to the positive side to a range 0..3*255+1, so that it can consume two bytes and two carry-bits of +1.
Redundant representations of the computable real numbers
The idea of using a redundant representation for computable real numbers somehow feels trivial. There is no way to get a completely non-redundant representation anyway. The non-trivial idea is to exploit the redundancy to improve the computational properties of the computable real numbers. This idea is not new (of course, I keep repeating myself), and I tried my luck with it in an email 17 Dec 2018 TK: A signed-digit representation solves some issues to Norman Wildberger in response to the issues he raised in “Difficulties with real numbers as infinite decimals (II) | Real numbers + limits Math Foundations 92” on slide 8 at 31:24 before claiming: No such theory exists (2012)!! There are redundant representations of the computable real numbers, for which the operations of addition, subtraction and multiplication are computable. At the beginning of Problems with limits and Cauchy sequences | Real numbers and limits Math Foundations 94 on slide 1, we read
Attempts at ‘defining real numbers’:
– as infinite decimals
– as Cauchy sequences of rationals
– as Dedekind cuts
– as continued fractions
– others (Axiomatics, …)
None of these work!
In subsection “2.2. Non-termination” of Luther Tychonievich’s post on Continued Fractions, it is explained why computability (of addition and multiplication) fails for a representation by classical continued fractions. But they would be computable, if signed integers (instead of natural numbers) were allowed in the continued fraction expansion. One still needs to keep some restrictions on the numbers in the expansion, but it can be done, see section “3.3 Redundant Euclidean continued fractions” of Exact Real Computer Arithmetic with Continued Fractions by Jean Vuillemin, 1987. The signed-digit representation is also described there, in section “3.1 Finite representations of infinite sequences of numbers” at the bottom of page 13.
An infinite signed-digit representation of the computable real numbers
Signed-digit representations are not limited to integers. We will use them as representation of computable real numbers. The apparent drawback that the signed-digit representation of a number is not unique becomes a feature when it is used to represent computable real numbers.
We could try to define a computable real number as a Turing machine outputting the digits of its infinite binary extension. An objection is that addition of two such computable real numbers would not be computable. But what if we define it instead as the difference of two non-negative series of digits. It turns out that addition, subtraction and multiplication then become computable. (The reason why a signed-digit representation helps there is because it can eliminate chains of dependent carries.) However, testing whether such a number is zero turns out to be undecidable. This is fine, since
… equality testing for computable real numbers is undecidable. The idea here is to take some Turing machine for which we would like to decide whether it halts or not, and write out the real number 0.000…000… with an additional zero for every step of the machine, and output one final 1 when (and only if) the machine finally halts. Deciding whether this real number is equal to zero is equivalent to solving the halting problem for the given Turing machine.
Even so equality testing is undecidable, it would still be nice to have a representation for which addition and multiplication (and many other operations) are computable. And the signed-digit representation achieves exactly this, more or less. Division is still not computable, since it has a discontinuity at zero, and the signed-digit representation cannot easily avoid that issue. Using a representation via (a-b)/(c-d) can work around that issue, at the cost of also allowing 1/0 and 0/0. Jean Vuillemin tries to explain why this is important, but at least initially his definitions just confused me.
Even so the above text is awfully long and trivial, at least it is a proper blog post again. Maybe I should have talked more about the less trivial stuff, but the trivial stuff already filled the post. And this bit-twiddling is a real part of my thinking and acting, at least in my distant past. And I hope that having finally finished this post will allow me to also finish the other to be finished post soon, the one whose subsection grew totally out of proportion and turned into this post.