error-detecting and error- correcting codes

ERROR-DETECTING AND ERROR- CORRECTING CODES

․A simple scheme for detecting errors is to append a parity

check digit to each block of message. Even parity: the total

number of 1s in a block is even.

Thus, for the message BAT, which encoded as

01000010 01000001 01010100

will be transmitted as 010000100 010000010 010101001.

․Suppose BAT is received as 010000110 010000010

010101001. We know one digit is wrong in the first block.

․Another error-detecting scheme is to transmit each block twice. For instance, BAT is transmitted as

01000010 01000010 01000001 01000001

01010100 01010100

Again, received message is as the following, revealing the 8th digit is incorrect in the 1st block.

01000011 01000010 01000001 01000001

01010100 01010100

․Definition:

(1) message word: message block; message length k = 8

(2) code-word: bit string to be transmitted for a message word;

code-word length n = 9 (the former), n = 16 (the latter)

(3) efficiency of (k,n)-block code, i.e. k/n: 8/9 (the former),

8/16 (the latter)

․ 編碼 (E)/ 解碼 (D) 函數 : one-to-one function and E = D-1 (inverse func of D), D = E-1 message w1, w2, w3… to be encoded as E(w1), E(w2), E(w3), ….. decoded as D(E(w1)) = w1 if correct, D(E(w2)) = w2, ….

Ex31: By transmitting each block 3 times, we get 01000010 01000010 01000010 01000001 01000001 01000001 01010100 01010100 01010100 Although efficiency is low (k/n=8/24), it is possible to identify the intended code-word even if transmitted wrongly. The 8th bit is incorrectly received as follows: 01000011 01000010 01000010 …., but we assume correct code-word is 01000010 01000010 01000010 …. dut to probability.

Ex32: 兩字碼如後 : 01001100 01001101, 若錯誤發生在這兩個字碼的最後一個位元 , 系統不會察覺錯誤已發生 ; 所以欲擁有錯誤修正能力 , 不只要求編 / 解碼函數為一對一函數 , 另外還需下述的論辯 .

․Hamming distance: d(010101, 011001) = 2

d(11010100, 01111110) = 4

․The Triangle Inequality: d(c1, c3) ≤ d(c1, c2) + d(c2, c3)

if c1, c2 and c3 are code-words of the same length.

接收到一字碼 c, 系統解碼成 D(c) ; 但 , 接收到不是字碼的 c’ 碼 ,

系統還是會解碼成 D(c) , 如果只有 d(c,c’) 最小 . 如果有兩個以上的字碼一樣的最靠近 c’ , 則系統記錄有錯誤發生 ; 接收到不是字碼的 z 碼 , 若 D(z) 沒有定義 ( 例如雜訊 ), 系統應記錄 : 錯

誤發生

․ 假設一編碼系統的字碼距離都至少為 3 , 且接收到不是字碼的 c’ 碼 ; 如果已知傳送時發生單一錯誤 , 設 d(c6, c’) = 1 , 則 co

de- word c6 必為原先所傳輸的字碼 .

Suppose a code-word c7≠ c6 ; with the Triangle Inequality, 3 ≤ d(c6, c7) ≤ d(c’, c6) + d(c’, c7) = 1 + d(c’, c7) 2 ≤ d(c’, c7) 但 , 已知傳送時只發生單一錯誤 ; 故 , c7 不可能為原先傳輸的字

碼 .

Theorem: The min Hamming distance of a block code is m, then the sys (1) can detect r or fewer errors if and only if m ≥ r + 1 (2) can correct r or fewer errors if and only if m ≥ 2r + 1 ․ 好的編碼 -- 修正區塊中儘可能多的錯誤 -- 其字碼間的最小距離須愈大愈好

MATRIX CODES ․Define Wm to be the set of all strings of 0s and 1s with length

m. For example,

W3 = {000, 001, 010, 011, 100, 101, 110, 111}

the encoding func for a (k, n)-block code is a one-to-one func

E: Wk → Wn . Suppose A is a kxn matrix of 1s and 0s, and x

is a 1xn matrix of 1s and 0s. The product xA over Z2 is a 1xn matrix of 1s and 0s. Hence, an element in Wm can be regarded

as a 1xm matrix of 1s and 0s, and then the product of an

element x in Wk and matrix A is the element xA in Wn .

Ex33: Let and x = [1 1 0] , y = [0 1 0]

xA = [1 1 1 0] is an element of W4 yA = [0 1 1 1]

1101

1110

1001

A

․E defines a matrix code: when E is one-to-one and

E: Wk → Wn defined by E(x) = A

A: generator matrix

․To ensure that E is one-to-one is to choose the first k columns of A form the kxk identity matrix Ik. Denote A having this form

by writing A = [Ik |J] , where J is a kx(n-k) matrix whose columns are the last n-k columns of A. For such matrix A, the

first k entries of xA will be the same as in x. Hence, if x1A = x2A then x1 = x2 .

Ex34: then

1100100

1010010

1001001

A

1100

1010

1001

J

․ 前述的生成矩陣 A 為 3x7 矩陣 , 其對應的矩陣碼為 (3,7)- 區塊碼 ;

對任何 W3 中元素 , [w1 w2 w3], 可得 [w1 w2 w3]A = [w1 w2 w3 w1 w2 w3 w1+w2+w3]

i.e. 對訊息字元 w 而言 , 其對應字碼 , E(w) = wA, 是重複 w 兩次

再加上其同位位元 ; 編碼如下 : E([0 0 0]) = [0 0 0 0 0 0 0] = c1

E([0 0 1]) = [0 0 1 0 0 1 1] = c2

:

:

E([1 1 1]) = [1 1 1 1 1 1 1] = c8

Please note d(c1, c2) = 3, d(c1, c8) = 8 …, so it can detect 2

bits error and correct one bit error according to the theorem

above.

The Check Matrix of a Code ․Define a nx(n-k) matrix A* so that its first k rows are the

corresponding rows of J and its last n-k rows are those of

the (n-k)x(n-k) identity matrix.

Symbolically, we write and is called the check

matrix associated with A.

Ex35:

is not coincident!

knI

JA*

1110

1011

0101

100

010

001

A

1110

1011

0101

J

1000

0100

0010

00011110

1011

0101

*A

0000

0000

0000

*AA

Theorem: A = [Ik |J] be the generator matrix for a (k, n)-block code, and let A* be its check matrix. (1) In the kx(n-k) matrix AA*, each entry equals 0. (2) A word c of length n is a code-word iif cA* = 0 (a 1x(n-k) zero matrix)

Ex36: ( ref ex34)

For x = [1 0 1 0 0 1 1],

its we have xA = [1 0 0 1]

Because xA ≠0, theorem

implies x is not a codeword.

Ex37: Justify the condition that [w1 w2 w3 w4 w5 w6 w7]A*

= [0 0 0 0]

Since [w1 w2 w3 w4 w5 w6 w7]A*

= [w1+w4 w2+w5 w3+w6 w1+w2+w3+w7]

1000

0100

0010

00011100

1010

1001

*A

․We get w1 = -w4, i.e. w1= w4 (over Z2)

then w2 = w5, w3 = w6, and w1 + w2 + w3 = -w7 = w7

Hence, w7 = 0 if [w1 w2 w3] contains an even number of 1s,

and w7 = 1 if [w1 w2 w3] contains an odd number of 1s.

․Thus [w1 w2 w3 w4 w5 w6 w7]A* = [0 0 0 0] when [w4 w5 w6]

is identical to [w1 w2 w3], and w7 is a check bit of 1 or 0 according to whether the number of 1s in [w1 w2 w3] is even

or odd. But these are exactly the requirements that make xA* a codeword for this code.

MATRIX CODES THAT CORRECT ALL SINGLE-DIGIT ERRORS

Theorem: A code can correct all single-bit errors iif its check

matrix satisfies the following two conditions:

(1) No row of A* consists entirely of 0s.

(2) No two row of A* are identical.

Ex38: (ref ex35) The codeword corresponding to the message

word x = [1 0 1] is xA = [1 0 1 1 1 0 1].

Because this is a (3, 7)-block code, the first 3 bits are the

original message word.

Similarly, the codeword [0 1 1 1 0 1 0] decodes as [0 1 1].

Note: In step 1, the wA* is named as syndrome of w.

Ex39: Decode the message 0010111 1011100 0111010 1000111 1100010 which was sent using the code in ex 35. The decoding of this message, using the check matrix row

decoding method, is shown in the following table:

Since the syndrome of the last word is nonzero, but not a row of A*, there must have been 2 or more errors in the transmission.

Received

word w 0010111

1011100

0111010

1000111

1100010

Syndrome

wA*

0000

0001

0000

1101

0101

Row of A*

--

7

--

2

not a row of A*

Codeword

0010111

1011101

0111010

1100111

unknown

Message

word

001

101

011

110

unknown

Hamming Code ․ 回憶 ex37 的 (3, 7)- 碼 , 能夠修正所有單一位元錯誤 , 其效率為 3/7 ; 似乎不夠好 , 不禁要問 : 能夠修正所有單一位元錯誤且最有效率的 (3, n)- 碼為何 ?

P. 51 的定理透露端倪 , 如果 n=3 則每個收到的字元皆為字碼 ;

因而無法偵測到傳輸錯誤 . 假設 A = [Ik |J] 是 (3, 4)- 碼的生成矩陣 , 則 J 為 3x1 矩陣 , 會有兩相同列 ; 所以 A* 有兩個相同的列 ,

故無法修正所有單一位元錯誤 .

設 A = [Ik |J] 是 (3, 5)- 碼的生成矩陣 , 則 J 為 3x2 矩陣 , 會有兩相同列 ; 所以 A* 無法修正所有單一位元錯誤 .

　然而 , 若 A 是 (3, 6)- 碼的生成矩陣 , 則 J 為 3x3 矩陣 , 很容易找到

A*, 例如 , 故 (3, 6)- 碼可以修正所有單一位元錯誤 .

110

101

011

J

Theorem: A (k, n)-code exists that corrects all single-bit errors

iif 2n-k – 1 ≥ n.

Setting r = n – k, n = 2r – 1 i.e. n is 1 less than a power of 2.

We require r > 1 because r = 1 implies that k = 0 (wrong!)

The first five such values of k and n are given in the following:

For these pairs of k and n, we can construct a (k, n)-code that

corrects all single-bit errors. The check matrix, which is nxr, has

n rows, each of length r. But there are only 2r such rows for A*, so each nonzero row must be used. A matrix code for which the check matrix has this property is called a Hamming code.

r 2 3 4 5 6

k = 2r – r - 1 1 4 11 26 57

n = 2r - 1 3 7 15 31 63

Ex40: The preceding table shows that the Hamming code for r = 2 has k = 1

and n = 3. Thus the check matrix is

since the rows of A* must be nonzero

and distinct, so ? must be 1.

In this case, for every message word [x] we have [x]A = [x x x]

and so the code repeats each message word three times.

The decoding of each possible word that can be received

is given in the following:

10

01

11

*2I

JA

10

01

??

*2I

JA

Word received syndrome row of A* codeword Message word

000

001

010

011

100

101

110

111

00

01

10

11

11

10

01

00

--

3

2

1

1

2

3

--

000

000

000

111

000

111

111

111

0

0

0

1

0

1

1

1

• Suppose we have a Hamming code, and a word c’ is received.

The syndrome of c’ is s = c’A*. If s ≠ 0, then s appears as a

row of A* because every nonzero word of length r appears as

a row of A*. Let row i of A* equal s. Then c’ will be decoded as

the first k bits of c, where c is the word obtained by changing

the ith bit of c’. Thus when check matrix row decoding is used,

step 4 never happens, and every received word can be

decoded.

• The efficiency of the Hamming code for a particular value of r

is and

Thus there are Hamming codes with efficiencies that is close to 1.

121

12

12

rr

r rr

n

k rwhen

n

k1

error-detecting and error- correcting codes

Documents