CSE5313 Coding and information theory for data science (Lecture 6)
Recap
Vector spaces and subspaces over finite fields
is a vector space over .
With point-wise vector addition and scalar multiplication.
Example
is a vector space over .
Let
Then is a vector in that’s “orthogonal” to itself.
in .
In general field, the dual space and space may intersect non-trivially.
Let be a subspace of .
is a subgroup of under vector addition .
- Apply the theorem: If is finite, non-empty, and closed under the operation of , then is a subgroup of .
Proof
left to show:
Associativity: inherited from .
Unit element: .
Consider , are in . Since is finite, there exists such that .
Then .
Inverses: .
Automatically holds for unit element traversing.
Is every subgroup of a subspace?
Answer
Consider (field extension of with ).
, is a subgroup of .
But the span of is , which is not a subspace of .
Cosets in this definition are called Affine subspaces.
New content
Linear codes
A linear code is a subspace of over .
- The dimension of is denoted by .
- The minimum Hamming distance of is denoted by .
- Notation .
Two equivalent ways to constructing a linear code:
-
A generator matrix with rows and columns.
- The left image of is .
- The rows of are a basis for .
-
A parity check matrix with rows and columns.
- The right kernel of is .
- Multiplying by “checks” if .
Encoding of linear codes
Reminder:
- Encoding is the process of mapping a message to a codeword .
is a linear map.
Let be a linear code with generator matrix .
- Encoding is given by .
- It is injective (1-1). Suppose otherwise, then there exists such that . Then . Therefore, is injective.
So linear codes implies linear encoding: .
Systematic codes
Fact: Every can be brought to the form by
- Row operations.
- Permutation of columns.
Fact and are equivalent.
- Same length .
- Same dimension .
- Same minimum Hamming distance .
Encoding a systematic code:
- The input is a part of the output.
- Efficient encoding
- Immediate decoding (if no errors).
Codes, cosets, encoding, decoding
Linear code is a dimensional subspace of .
Size of the code is .
Encoding: .
Decoding: , .
Use syndrome to identify which coset that the noisy-code to belongs to.
Syndrome decoding
- Heavily depends onn the linear structure of the code.
Linear code is a -dimensional subspace of .
Shift of Linear code is a -dimensional affine subspace of .
All cosets of the same size
If , then it is possible to extract from .
by syndrome decoding, we can do better than exhaustive search.
Idea:
Let belogns to the coset .
Moreover, and are in the same coset.
Standard Array
Let and denote .
- Then .
- The number of cosets is .
Then we arrange all elements of into a array.
- So that every row is a coset (including itself)
- Lightest word in each cosets on the leftmost column
Example
Let and
So .
Then .
The standard array is:
First row is .
Second row is ,
Third row is .
Fourth row is .
| 00000 | 10110 | 01011 | 11101 |
|---|---|---|---|
| 00001 | 10111 | 01010 | 11100 |
| 00010 | 10100 | 01001 | 11110 |
| 00100 | 10010 | 01101 | 11000 |
Any two elements in a row are of the form and for some .
Same syndrome if .
Entries in different rows have different syndrome.
Proof
Choose the lightest word in each coset on the leftmost column.
Time complexity: . Space complexity: space.
Compare with exhaustive search: Time: .
Syndrome decoding - Intuition
Given , we identify the set to which belongs by computing the syndrome.
- We identify as the coset leader (leftmost entry) of the row .
- We output the codeword in which is closest () by subtracting from .
Syndrome decoding - Formal
Given , we identify the set to which belongs by computing the syndrome.
We identify as the coset leader (leftmost entry) of the row .
We output the codeword in which is closest (example ) by subtracting from .