Skip to Content
CSE5313CSE5313 Coding and information theory for data science (Lecture 7)

CSE5313 Coding and information theory for data science (Lecture 7)

Continue on Linear codes

Let C=[n,k,d]F\mathcal{C}= [n,k,d]_{\mathbb{F}} be a linear code.

There are two equivalent ways to describe a linear code:

  1. A generator matrix GFqk×nG\in \mathbb{F}^{k\times n}_q with kk rows and nn columns, entry taken from Fq\mathbb{F}_q. C={xGxFk}\mathcal{C}=\{xG|x\in \mathbb{F}^k\}
  2. A parity check matrix HFq(nk)×nH\in \mathbb{F}^{(n-k)\times n}_q with (nk)(n-k) rows and nn columns, entry taken from Fq\mathbb{F}_q. C={cFn:Hc=0}\mathcal{C}=\{c\in \mathbb{F}^n:Hc^\top=0\}

Dual code

Definition of dual code

CC^{\perp} is the set of all vectors in Fn\mathbb{F}^n that are orthogonal to every vector in CC.

C={xFn:xc=0 for all cC}C^{\perp}=\{x\in \mathbb{F}^n:x\cdot c=0\text{ for all }c\in C\}

Also, the alternative definition is:

  1. C={xFn:Gx=0}C^{\perp}=\{x\in \mathbb{F}^n:Gx^\top=0\} (only need to check basis of CC)
  2. C={xHxFnk}C^{\perp}=\{xH|x\in \mathbb{F}^{n-k}\}

By rank-nullity theorem, dim(C)=ndim(C)=nkdim(C^{\perp})=n-dim(C)=n-k.

Warning

CC={0}C^{\perp}\cap C=\{0\} is not always true.

Let C={(0,0),(1,1)}F22C=\{(0,0),(1,1)\}\subseteq \mathbb{F}_2^2. Then C={(0,0),(1,1)}F22C^{\perp}=\{(0,0),(1,1)\}\subseteq \mathbb{F}_2^2 since (1,1)(11)=0(1,1)\begin{pmatrix} 1\\ 1\end{pmatrix}=0.

Example of binary codes

Trivial code

Let F=F2\mathbb{F}=\mathbb{F}_2.

Let C=FnC=\mathbb{F}^n.

Generator matrix is identity matrix.

Parity check matrix is zero matrix.

Minimum distance is 1.

Parity code

Let F=F2\mathbb{F}=\mathbb{F}_2.

Let C={(c1,c2,,ck,i=1kci)}C=\{(c_1,c_2,\cdots,c_{k},\sum_{i=1}^k c_i)\}.

The generator matrix is:

G=(10001010010010100011)G=\begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 1\\ 0 & 1 & 0 & \cdots & 0 & 1\\ 0 & 0 & 1 & \cdots & 0 & 1\\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0 & 0 & \cdots & 1 & 1 \end{pmatrix}

The parity check matrix is:

H=(11111)H=\begin{pmatrix} 1 & 1 & 1 & \cdots & 1 & 1 \end{pmatrix}

Minimum distance is 2.

CC^{\perp} is the repetition code.

Lemma for minimum distance

The minimum distance of C\mathcal{C} is the maximum integer dd such that every d1d-1 columns of HH are linearly independent.

Proof

Assume minimum distance is dd. Show that every d1d-1 columns of HH are independent.

  • Fact: In linear codes minimum distance is the minimum weight (dH(x,y)=wH(xy)d_H(x,y)=w_H(x-y)).

Indeed, if there exists a d1d-1 columns of HH that are linearly dependent, then we have Hc=0Hc^\top=0 for some cCc\in \mathcal{C} with wH(c)<dw_H(c)<d.

Reverse are similar.

The Hamming code

Let mNm\in \mathbb{N}.

Take all 2m12^m-1 non-zero vectors in F2m\mathbb{F}_2^m.

Put them as columns of a matrix HH.

Example: for m=3m=3,

H=(100110101010110010111)H=\begin{pmatrix} 1 & 0 & 0 & 1 & 1 & 0 & 1\\ 0 & 1 & 0 & 1 & 0 & 1 & 1\\ 0 & 0 & 1 & 0 & 1 & 1 & 1\\ \end{pmatrix}

Minimum distance is 3.

Proof for minimum distance

Using the lemma for minimum distance. Since HH are linearly independent, the minimum distance is 3.

So the maximum number of error correction is 1.

The length of code is 2m12^m-1.

k=2mm1k=2^m-m-1.

Hadamard code

Define the code by encoding function:

E(x):F2mF22m=(xy1,,xy2m)E(x): \mathbb{F}_2^m\to \mathbb{F}_2^{2^m}=(xy_1^\top,\cdots,xy_{2^m}^\top) (yF2my\in \mathbb{F}_2^m)

Space of codewords is image of EE.

This is a linear code since each term is a linear combination of xx and yy.

If x1,x2,,xmx_1,x_2,\ldots,x_m is a basis of F2m\mathbb{F}_2^m, then E(x1),E(x2),,E(xm)E(x_1),E(x_2),\ldots,E(x_m) is a basis of C\mathcal{C}.

So the dimension of C\mathcal{C} is mm.

Minimum distance is: 2m12^{m-1}.

Proof for minimum distance

Since the code is linear, then the minimum distance is the minimum weight of the codewords.

For each xF2mx\in \mathbb{F}_2^m, there exists 2m12^{m-1} such that E(x)=1E(x)=1.

For all non-zero xx we have d(E(x),E(0))=2m1d(E(x),E(0))=2^{m-1}.

There exists a redundant yi=0y_i=0, we remove it from E(x)E(x) to get a puntured Hadamard codeword.

The length of the code is 2m12^{m-1}.

SO E:F2mF22m1E: \mathbb{F}_2^m\to \mathbb{F}_2^{2^{m-1}} is a linear code.

The generator matrix is the parity check matrix of Hamming code.

The dual of Hamming code is the (puntured) Hadamard code.

Last updated on