Skip to Content
CSE5313CSE5313 Coding and information theory for data science (Lecture 8)

CSE5313 Coding and information theory for data science (Lecture 8)

Review on Linear codes

The codeDimension kk (effective message length)Minimum distance ddDual codeMinimum distance of dual code
Fn\mathbb{F}^nnn11{0}\{0\}00
Parity coden1n-122Repetition codenn
Hamming code2mm12^m-m-133Punctured Hadamard code2m12^{m-1}

More on linear codes

Extended Hamming code

Consider the Hamming code [2m1,2mm1,3]F2[2^m-1,2^m-m-1,3]_{\mathbb{F}_2}.

Extend it to a cod eof length 2m2^m by adding a parity bit.

Recall the hamming code [7,4,3]2[7,4,3]_{2}.

HHAM=(100110101010110010111)H_{HAM}= \begin{pmatrix} 1 & 0 & 0 & 1 & 1 & 0 & 1\\ 0 & 1 & 0 & 1 & 0 & 1 & 1\\ 0 & 0 & 1 & 0 & 1 & 1 & 1\\ \end{pmatrix}

The extended Hamming code is:

HEXT=(11111111100110100101011000101110)H_{EXT}= \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1\\ 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0\\ 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0\\ 0 & 0 & 1 & 0 & 1 & 1 & 1 & 0\\ \end{pmatrix}

The minimum distance of the extended Hamming code is 4.

Proof for minimum distance

It is sufficient to show that every 3 columns are linearly independent.

Using the lemma for minimum distance, we have that the minimum distance is 4.

Notice that in F2\mathbb{F}_2, multiplication is equivalent to AND operation.

By simple observations, we know that every 2 columns are linearly independent.

Consider the following linear combination:

a1(1)+a2(1)+a3(1)=0a_1\begin{pmatrix} 1\\ \vdots\\ \end{pmatrix}+a_2\begin{pmatrix} 1\\ \vdots\\ \end{pmatrix}+a_3\begin{pmatrix} 1\\ \vdots\\ \end{pmatrix}=0

Therefore, every 3 columns are linearly independent since the top row will always be 1. if a1,a2,a3a_1,a_2,a_3 are 11.

Therefore, the minimum distance is 4.

For the dimension of this code, we have k=2mm1k=2^m-m-1, with total code length mm.

Augmented Hadamard code

Consider what is generated by the parity check matrix of the extended Hamming code.

Let xHEXTxH_{EXT}

  • If xHEXT=0xH_{EXT}=0, then (0,x2,,xm)HHAM=(y,0)(0,x_2,\ldots, x_m)H_{HAM}=(y,0), where yy is the punctured hadamard codeword.
  • If xHEXT=1xH_{EXT}=1, then (1,x2,,xm)HHAM=(1,1,,1)+(y,0)(1,x_2,\ldots, x_m)H_{HAM}=(1,1,\ldots,1)+(y,0), where yy is the punctured hadamard codeword.

Therefore, the code generated by HEXTH_{EXT} is given by:

Let C\mathcal{C} be the hadamard code.

Let C+I\mathcal{C}+\mathbb{I} be its shift by the all 1’s vector (flip all bits of all words).

– Augmented Hadamard code = CC+I\mathcal{C} \cup \mathcal{C} + \mathbb{I}.

The length of the code is 2m2^m.

The dimension of the code is m+1m+1.

Since the code is still linear, the minimum distance is the minimum weight of the codewords.

So minimum distance of C+I\mathcal{C}+\mathbb{I} is 2m12^{m-1}.

Summary for simple linear codes

The codeDimension kk (effective message length)Minimum distance ddDual codeMinimum distance of dual code
Fn\mathbb{F}^nnn11{0}\{0\}00
Parity coden1n-122Repetition codenn
Hamming code2mm12^m-m-1 (length 2m12^m-1)33Punctured Hadamard code2m12^{m-1}
Extended Hamming code2mm12^m-m-1 (length 2m2^m)44Augmented Hadamard code2m12^{m-1}

Boundary of linear codes

Natural questions:

  • Can we extend the table of linear codes infinitely?
  • What set of configuration (n,k,d)q(n,k,d)_q are impossible?
  • What set of configuration (n,k,d)q(n,k,d)_q are possible, even if we don’t know how to construct them?

Boundary I: Singleton bound

Singleton is a name for the person who discovered this bound.

Theorem: For any linear code CFqn\mathcal{C}\subseteq \mathbb{F}^n_q, we have dnk+1d\leq n-k+1.

Proof

Idea: Using the Pigeonhole principle.

Assume an code [n,k,d]q[n,k,d]_q exists.

Pigeons: All qkq^k possible code word of C\mathcal{C}.

Holes: All qq^\ell values of the first \ell entries of a codeword (for some <n\ell<n).

If q<qkq^\ell<q^k, then by Pigeonhole principle, there exists two codewords in C\mathcal{C} that agrees on the first \ell entries.

So dnd\leq n-\ell.

So the largest \ell value for which this arguments works is nk+1n-k+1.

Definition of Maximum Distance Separable (MDS) code

A code C=[n,k,d]q\mathcal{C} = [n,k,d]_q with d=nk+1d = n - k + 1 is called a Maximum Distance Separable (MDS) code.

Examples for singleton bound

Fn\mathbb{F}^n: any n,k=n,d=1n, k = n, d = 1.

  • Attains with equality!

Parity: any n,k=n1,d=2n, k = n - 1, d = 2.

  • Attains with equality!

Hamming: n=2m1,k=2mm1,d=3n = 2^m - 1, k = 2^m - m - 1, d = 3.

nk+1=m+1>3n - k + 1 = m + 1 > 3.

This creates some trade-off between kk and dd.

Boundary II: The Sphere Packing Bound

Let r=d12r=\lfloor \frac{d-1}{2}\rfloor, then i=0r(ni)(q1)iqnk\sum_{i=0}^{r}\binom{n}{i}(q-1)^i\leq q^{n-k}.

Proof

Let c=(c1,c2,,cn)Fqnc=(c_1,c_2,\ldots,c_n)\in \mathbb{F}^n_q, and let B(c,r)={yFqn:dH(c,y)r}B(c,r)=\{y\in \mathbb{F}^n_q: d_H(c,y)\leq r\} for some rnr\leq n.

Computer B(c,r)|B(c,r)|.

B(c,0)=1|B(c,0)|=1

{yFqn:dH(c,y)=1}=n(q1)|\{y\in \mathbb{F}^n_q: d_H(c,y)=1\}|=n(q-1).

{yFqn:dH(c,y)=2}=(n2)(q1)2=n(n1)2(q1)2|\{y\in \mathbb{F}^n_q: d_H(c,y)=2\}|=\binom{n}{2}(q-1)^2=\frac{n(n-1)}{2}(q-1)^2.

So, B(c,r)=i=0r(ni)(q1)i|B(c,r)|=\sum_{i=0}^{r}\binom{n}{i}(q-1)^i.

Recall that C\mathcal{C} of minimum distance dd if and only if c1,c2C,B(c1,d12)B(c2,d12)=\forall c_1,c_2\in \mathcal{C}, B(c_1,\lfloor \frac{d-1}{2}\rfloor)\cap B(c_2,\lfloor \frac{d-1}{2}\rfloor)=\emptyset.

Therefore, let r=d12r=\lfloor \frac{d-1}{2}\rfloor, we have i=0r(ni)(q1)iqnk\sum_{i=0}^{r}\binom{n}{i}(q-1)^i\leq q^{n-k}.

Definition for perfect code

A code C\mathcal{C} is called a perfect code if C=qnk|C|=q^{n-k}.

Examples for sphere packing bound

Let q=2q=2.

F2n\mathbb{F}_2^n: any n,k=n,d=1n, k = n, d = 1.

  • r=d12=0r = \frac{d-1}{2} = 0.
  • i=00(ni)(q1)i=1qnk=2nn=1\Rightarrow \sum_{i=0}^{0}\binom{n}{i}(q-1)^i = 1 \leq q^{n-k} = 2^{n-n} = 1.
  • Attained with equality!

Parity: any n,k=n1,d=2n, k = n - 1, d = 2.

  • r=d12=0r = \frac{d-1}{2} = 0.

  • i=00(ni)(q1)i=1qnk=2nk=2n(n1)=2\Rightarrow \sum_{i=0}^{0}\binom{n}{i}(q-1)^i = 1 \leq q^{n-k} = 2^{n-k} = 2^{n-(n-1)} = 2.

  • q2q \geq 2 \Rightarrow NOT attained with equality.

Exercise: Equality is attained for the repetition code (dual of parity) for odd nn.

Hamming: n=2m1,k=2mm1,d=3n = 2^m - 1, k = 2^m - m - 1, d = 3.

  • r=d12=1r = \frac{d-1}{2} = 1.
  • i=01(ni)(q1)i=1+(2m1)=2m\Rightarrow \sum_{i=0}^{1}\binom{n}{i}(q-1)^i = 1 + (2^{m}-1) = 2^{m}.
  • qnk=2m\Rightarrow q^{n-k} = 2^{m}.
  • Attained with equality!

• Attained with equality!

Fortunately, there are only 4 types of binary linear perfect codes:

  • Fn\mathbb{F}^n
  • Repetition code
  • Hamming code
  • [23,12,7]2[23,12,7]_2 Golay code

Boundary III: The Gilbert-Varshamov Bound

Let n,k,d,qn,k,d,q such that Vq(nk,d2)qnkV_q(n-k, d-2)\leq q^{n-k}, then there exists an [n,k,d]q[n,k,d]_q code.

Singleton, sphere-packing provide necessary conditions for existence of codes.

Are there sufficient conditions?

Recall:

  • Lemma: The minimum distance of C\mathcal{C} is the maximum integer such that every d1d-1 columns of the parity-check matrix HH are linearly independent.

Idea:

  • Construct HH column by column, ensuring that no dependencies occur.

Idea:

Construct HH column by column, ensuring that no dependencies occur.

Algorithm:

  • Begin with (nk)×(nk)(n-k)\times (n-k) identity matrix.
  • Assume we choose columns h1,h2,,hh_1,h_2,\ldots,h_\ell (each hih_i is in Fqn\mathbb{F}^n_q)
  • Then next column hh_{\ell} must not be in the space of any previous d2d-2 columns.
    • hh_{\ell} cannot be written as [h1,h2,,h1]x[h_1,h_2,\ldots,h_{\ell-1}]x^\top for xx of Hamming weight at most d2d-2.
    • So the ineligible candidates for hh_{\ell} is:
      • B1(0,d2)={xFq1:dH(0,x)d2}B_{\ell-1}(0,d-2)=\{x\in \mathbb{F}^{\ell-1}_q: d_H(0,x)\leq d-2\}.
      • B1(0,d2)=i=0d2(1i)(q1)i|B_{\ell-1}(0,d-2)|=\sum_{i=0}^{d-2}\binom{\ell-1}{i}(q-1)^i, denoted by Vq(1,d2)V_q(\ell-1, d-2).
    • So the candidates for hh_{\ell} is:
      • Out of which Vq(1,d2)V_q(\ell-1, d-2) are ineligible.
      • Need nn columns overall, so we need Vq(nk,d2)qnkV_q(n-k, d-2)\leq q^{n-k}.
Last updated on