Information and Coding

Information Science: Binary Coding
by Robin Stewart | May 2001

Page 2

The next thing to consider is that sometimes accurate information is much more important than efficiently coded information. This is especially true when there is a lot of noise in the transmission material corrupting the signal, for instance in communications from space probes (the Mariner-9 code was sent with 26 bits of protection for every 6 message bits!). Coding so as to ensure accuracy is called redundancy. Redundancy, intuitively enough, is defined to be 1-efficiency &endash; accentuating the compromise between the two.

The simplest form of redundancy is to simply repeat each bit a certain number of times. This technique is denoted Rn, where n is the number of repetitions. The code 101 would become 111000111. A more general way of representing redundant codes is in the (n,k) form, where n is the codename length and k is the number of actual message digits communicated. R3 in (n,k) form is (3,1). (n,k) code can be in any format; the International Telegraph Alphabet [Table 5] is a (6,5) code with a very clever "check digit" right after the 5 message digits. The check digit is chosen so as to make the number of 1’s even (this is called even-weight code). If you receive a code with an odd number of 1’s you know to request a repeat.

However, in some situations you can’t request a repeat (like when dealing with space probes), so you have to make do with the signal you get. In majority logic decoding, "the same signal is transmitted many times (i.e. Rn) with the assumption that the one most often received is probably right" (42). If you’re using R6 and get the code 110111, the message is probably ‘1’; you can correct that one error. With R6 you can also correct two errors, but if there are three (i.e. 110100), the message could be 1 or 0; you can detect there is error but you can’t correct it. It turns out that with any Rn code, if n is even, n/2 errors can be detected and (n-2)/2 corrected. If n is odd, (n-1)/2 errors can be detected and corrected. So n should be determined based on the number of errors expected; if you expect half of the bits transmitted to come out faulty, you need R3 to be able to correct ((3)-1)/2 = 1 error out of 2. In practice, you’d want several more backup bits to make sure the message you receive is the right one.

The last thing I’ll talk about is called the Hamming distance. What it refers to is how many bits are different between two codenames of equal length. If that doesn’t make sense, stick with me. It’s called "distance" because you can picture it as a dimensional coordinate system. Using a 2-bit code, you can set up a square, with coordinates:

(0,1) (1,1)
(0,0) (1,0)

To get from (0,1) to (1,0), for instance, you have to travel across two dimensions, so the Hamming distance is 2. Looking at it symbolically, both the first digit and second digit are different between codes. A 3-bit code produces a cube, and an n-bit code produces an n-dimensional system that’s impossible to visualize, but still easy to deal with mathematically: the Hamming distance between 100100101 and 001001101 is 4 (bold digits are the ones that differ between the two codes).

It turns out that the minimum Hamming distance is the more general determiner for the number of errors a coding system can detect and correct. The formulas for this are the same as the ones above for Rn code, because Rn code has a minimum Hamming distance of n (i.e. consider R4: 0000 and 1111). It makes sense that the bigger the difference between codenames, the easier to detect and correct errors. In the code 111101111 and 111011111, one faulty digit can easily make it impossible to know which of the messages was intended.

In summary, designing codes is quite a complex and involved process, which depends a lot on the type of data involved, the transmission process, and the reliability required. Like anything else in life, it’s impossible to get something from nothing, but there are many clever ways to squeeze out as much as possible from what you have to work with.

Back to page 1