TMA4185: Coding Theory
Finite fields
We get the preliminaries out of the way first. You are expected to know group theory and ring theory before diving in. Reading the corresponding chapter on Galois theory is probably also helpful.
We reiterate some of the most useful points of those two courses.
Definition
Definition. A finite field $\mathbb{F}$ is a ring in which addition and multiplication are both commutative, every nonzero element has a multiplicative inverse (i.e. it is a commutative division ring), and which has a finite number of elements.
The group of nonzero elements of a field
Theorem. $\mathbb{F}_q^*$ is a cyclic group, i.e. there is a $p \in \mathbb{F}_q^*$ such that
$$\mathbb{F}_q^* = \{1, p, p^2, \ldots, p^{m-1}\}.$$
Here $m = q - 1$ is called the order of the cyclic group $\mathbb{F}_q^*$, and the element $p$ is called a generator of $\mathbb{F}_q^*$.
Every generator of $\mathbb{F}_q^*$ is called a primitive element of $\mathbb{F}_q$.
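As a small illustrative sketch (not part of the original notes), the snippet below finds the generators of $\mathbb{F}_7^*$ by brute force; the prime $7$ is an arbitrary choice of mine.

```python
def is_generator(p, q):
    """Check whether p generates the multiplicative group of F_q (q prime)."""
    seen = set()
    x = 1
    for _ in range(q - 1):
        seen.add(x)
        x = (x * p) % q
    # p is a generator iff its powers hit all q-1 nonzero elements
    return len(seen) == q - 1

q = 7
print([p for p in range(1, q) if is_generator(p, q)])  # [3, 5]
```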
Polynomials
Definition
Definition. Given a field $\mathbb{F}_q$, the polynomial ring $\mathbb{F}_q[x]$ over $\mathbb{F}_q$ is the ring of all polynomials
$$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \qquad a_i \in \mathbb{F}_q,$$
under the operations of addition and multiplication of polynomials in the standard way.
The highest power of $x$ appearing with a non-zero coefficient is called the degree of the polynomial.
If the coefficient of this highest power is $1$, the polynomial is said to be monic.
Definition. A polynomial $f(x)$ is said to be irreducible over a field $\mathbb{F}_q$ if there exist no polynomials $g, h \in \mathbb{F}_q[x]$ of degree lower than $\deg f$ such that $f(x) = g(x)h(x)$.
Usage to construct finite fields
Interestingly, we can use the above definition to construct fields of certain sizes.
Theorem. Let $f(x)$ be an irreducible polynomial of degree $m$ over a field $\mathbb{F}_p$. Then the residue classes of $\mathbb{F}_p[x]$ modulo $f(x)$ form a field of order $q = p^m$:
$$\mathbb{F}_q = \mathbb{F}_p[x]/(f(x))$$
Example.
Let
Thus
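As a hedged illustration (not the example from the original notes), the sketch below builds $\mathbb{F}_4 = \mathbb{F}_2[x]/(x^2 + x + 1)$ and prints its multiplication table; the choice $f(x) = x^2 + x + 1$, the irreducible quadratic over $\mathbb{F}_2$, is my own.

```python
# Elements of F_4 = F_2[x]/(x^2 + x + 1), represented as pairs (a0, a1) meaning a0 + a1*x.
elements = [(a0, a1) for a0 in (0, 1) for a1 in (0, 1)]

def multiply(u, v):
    """Multiply two elements of F_2[x]/(x^2 + x + 1)."""
    a0, a1 = u
    b0, b1 = v
    # (a0 + a1 x)(b0 + b1 x) = a0 b0 + (a0 b1 + a1 b0) x + a1 b1 x^2,
    # and x^2 = x + 1 modulo the irreducible polynomial x^2 + x + 1.
    c0 = (a0 * b0 + a1 * b1) % 2
    c1 = (a0 * b1 + a1 * b0 + a1 * b1) % 2
    return (c0, c1)

for u in elements:
    print([multiply(u, v) for v in elements])
```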
Cyclotomic polynomials
Motivation
Basics
Codes
We wish to send some message
This message is hopefully of such a character that it allows us to successfully decode it.
Before we go about constructing an encoding scheme, we make a few definitions:
Definition. A code $C$ is a set of codewords, each a tuple of some length $n$: $(a_1, a_2, \ldots, a_n)$.
A code
In other words, a code describes a set of possible encodings of inputs (messages). Typically, the codewords are vectors in $\mathbb{F}_q^n$ for some finite field $\mathbb{F}_q$.
We often write a codeword $(a_1, a_2, \ldots, a_n)$ simply as $a_1 a_2 \ldots a_n$.
In the most general terms, this is all the structure required to define codes. However, we will almost exclusively work with linear codes:
Definition. A linear code is a code where all linear combinations of codewords result in new codewords, i.e. for all $x, y \in C$:
(i) $ax \in C$ for all $a \in \mathbb{F}_q$,
(ii) $x + y \in C$.
Example. Let
Generator matrices
We typically convert a message into a codeword by multiplying it with a generator matrix.
Definition. A generator matrix $G$ of an $[n,k]$ code $C$ is a $k \times n$ matrix such that $C = \{xG \mid x \in \mathbb{F}_q^k\}$.
Equivalently, we can say that the rows of $G$ form a basis for $C$.
Example. Let
We can calculate the entirety of
We note, to no surprise, that the number of distinct codewords in
If a linear code takes input of length $k$ and produces codewords of length $n$, we call it an $[n, k]$ code.
Definition. A generator matrix is said to be in standard form if it is of the form $[I_k \mid A]$.
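As an illustrative sketch (the matrix below is my own toy choice, not an example from the notes), here is how a message is encoded with a standard-form generator matrix over $\mathbb{F}_2$:

```python
import numpy as np

# A hypothetical [6, 3] binary code with generator matrix in standard form [I_3 | A].
G = np.array([[1, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 1],
              [0, 0, 1, 1, 0, 1]])

def encode(message, G):
    """Encode a length-k message over F_2 as the codeword x G (mod 2)."""
    return np.mod(np.array(message) @ G, 2)

print(encode([1, 0, 1], G))  # [1 0 1 0 1 1]: the first 3 coordinates repeat the message
```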
Example. The matrix
Parity check matrices
An interesting property of a linear code is that as it is a subspace of some vector space (in particular, of the subspace
So why is this of any importance to us? Looking back at our motivation for defining error-correcting codes, it is precisely that of detecting errors. As
Definition. The parity-check matrix $H$ of an $[n,k]$ code $C$ is an $(n-k) \times n$ matrix defined by $Hx^T = 0$ for all $x \in C$.
If we have a generator matrix in standard form, we have a nice way of figuring out $H$:
Theorem. If $G = [I_k \mid A]$, then $H = [-A^T \mid I_{n-k}]$.
Proof. Clearly $HG^T = -A^T I_k + I_{n-k} A^T = -A^T + A^T = 0$, so every row of $G$, and hence every codeword, is sent to zero by $H$.
Example. Let
be the generator matrix of some code
Finally we note (without proof) that the rows of
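A small sketch (re-using the hypothetical standard-form generator matrix from the earlier sketch) showing how to build $H$ from $G = [I_k \mid A]$ and check that it annihilates every codeword:

```python
import numpy as np
from itertools import product

# Hypothetical standard-form generator matrix G = [I_3 | A] over F_2.
G = np.array([[1, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 1],
              [0, 0, 1, 1, 0, 1]])
k, n = G.shape
A = G[:, k:]

# H = [-A^T | I_{n-k}]; over F_2 the minus sign is irrelevant.
H = np.hstack([(-A.T) % 2, np.eye(n - k, dtype=int)])

# Verify H x^T = 0 for every codeword x = m G (mod 2).
for m in product([0, 1], repeat=k):
    x = np.mod(np.array(m) @ G, 2)
    assert not np.any(np.mod(H @ x, 2)), "H should annihilate every codeword"
print("H x^T = 0 for all 2^k codewords")
```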
Dual codes
Definition. Let $G$ be the generator matrix of some code $C$ with parity check matrix $H$. Then the code generated by the rows of $H$ is called the dual code of $C$, and is denoted $C^\perp$.
A code is said to be self-orthogonal if $C \subseteq C^\perp$.
Weights and distances
One interesting property of codes is that of weight and distance.
Definition. The (Hamming) distance $d(x,y)$ between two codewords $x$ and $y$ of some code $C$ is defined as the number of coordinates in which $x$ and $y$ differ.
Definition. The (Hamming) weight $\operatorname{wt}(x)$ of a codeword $x$ is defined as the number of non-zero coordinates of $x$.
We make the following observation:
Theorem. If $x, y \in \mathbb{F}_q^n$, then $d(x,y) = \operatorname{wt}(x-y)$.
Proof. Should be fairly clear: simply observe that $x$ and $y$ differ in exactly those coordinates where $x - y$ is non-zero.
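A quick sketch (with toy vectors of my own choosing) verifying $d(x,y) = \operatorname{wt}(x-y)$ over $\mathbb{F}_2$:

```python
def hamming_distance(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def weight(x):
    """Number of non-zero coordinates of x."""
    return sum(a != 0 for a in x)

x = [1, 0, 1, 1, 0]
y = [1, 1, 1, 0, 0]
diff = [(a - b) % 2 for a, b in zip(x, y)]  # x - y over F_2
print(hamming_distance(x, y), weight(diff))  # both print 2
```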
Definition. The minimum distance of a code $C$ is the minimum distance between any two distinct codewords of $C$.
Theorem. If $C$ is a linear code, then the minimum distance of $C$ is the same as the minimum weight of the non-zero codewords of $C$.
Proof. The distance between a minimally weighted codeword and the zero codeword equals its weight; conversely, for any $x, y \in C$ the difference $x - y$ is again a codeword by linearity, and $d(x,y) = \operatorname{wt}(x-y)$.
If an $[n,k]$ code has minimum distance $d$, we refer to it as an $[n,k,d]$ code.
There is one particularly useful way to determine the minimum distance of a code when its parity check matrix $H$ is known.
Let
Thus the columns of
Say now, that
Theorem. If $C$ is a linear code with parity check matrix $H$ which has a set of $d$ linearly dependent columns, but no set of $d-1$ linearly dependent columns, then the minimum weight of $C$ is $d$.
Proof. Follows immediately from the above discussion.
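As a sketch (again using the hypothetical [6, 3] binary code from the earlier sketches, which is my own example), one can brute-force the minimum weight of a small code directly:

```python
import numpy as np
from itertools import product

# Hypothetical [6, 3] binary code from the earlier sketches.
G = np.array([[1, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 1],
              [0, 0, 1, 1, 0, 1]])
k = G.shape[0]

# Minimum weight over all non-zero codewords (feasible only for small k).
weights = [int(np.mod(np.array(m) @ G, 2).sum())
           for m in product([0, 1], repeat=k) if any(m)]
print("minimum distance:", min(weights))  # 3 for this particular G
```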
Puncturing and extending codes
Having already created some code
Puncturing
As might be clear from the name, puncturing a code means removing some coordinate $i$ from every codeword.
What might the minimum distance and dimension of the punctured code be?
Theorem. Let $C$ be an $[n,k,d]$ linear code, and let $C^*$ be the code $C$ punctured in the $i$'th coordinate. Then:
(i) If $C$ has a codeword of minimum weight with a non-zero entry in the punctured coordinate $i$, the minimum distance of the new code decreases by $1$: $C^*$ is an $[n-1, k, d-1]$ code.
(ii) If $C$ has minimum distance $1$, and there are two codewords of $C$ which differ only in the $i$'th coordinate, the new code $C^*$ has dimension $k-1$: $C^*$ is an $[n-1, k-1, d^*]$ code, with $d^* \geq 1$.
We prove this informally by giving some intuition through an example:
Let
Contrarily, let
For a demonstration of the second property of the theorem, let
Extension
We can in similar fashion extend a code by adding a coordinate. Typically, we do this to get a new code consisting of only even-like codewords (codewords whose coordinates sum to zero; in the binary case, codewords of even weight).
Clearly, the inverse of the previous theorem applies for extension.
Example. Take the binary code generated by the matrix
In other words, to figure out the dimension and minimum distance of the extended code, simply apply the previous theorem in reverse.
Direct sum of codes
We can create larger codes from smaller codes by concatenating them through direct sum:
Definition. Let $G_1$ and $G_2$ be generator matrices for linear codes $C_1$ and $C_2$ with parity check matrices $H_1$ and $H_2$. The direct sum of the codes $C_1$ and $C_2$ is the code with generator and parity check matrices
$$G = \begin{bmatrix} G_1 & O \\ O & G_2 \end{bmatrix}, \qquad H = \begin{bmatrix} H_1 & O \\ O & H_2 \end{bmatrix}.$$
The new code is an $[n_1 + n_2,\; k_1 + k_2,\; \min\{d_1, d_2\}]$ code.
Sadly, as the minimum distance of the new code is no larger than any of the original codes, the new code is of little use (as we will see).
The $(u \mid u+v)$-construction
In somewhat similar fashion, we can concatenate two codes using what is called the $(u \mid u+v)$-construction.
Definition. Let $G_1$ and $G_2$ be generator matrices for linear codes $C_1$ and $C_2$ of the same length $n$ (but not necessarily the same dimension) with parity check matrices $H_1$ and $H_2$. The $(u \mid u+v)$-construction of the codes $C_1$ and $C_2$ is the code $C$ with generator and parity check matrices
$$G = \begin{bmatrix} G_1 & G_1 \\ O & G_2 \end{bmatrix}, \qquad H = \begin{bmatrix} H_1 & O \\ -H_2 & H_2 \end{bmatrix}.$$
The new code is a $[2n,\; k_1 + k_2,\; \min\{2d_1, d_2\}]$ code.
We can also write the code as $C = \{(u \mid u+v) : u \in C_1, v \in C_2\}$.
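A small sketch (with two arbitrary toy generator matrices of my own choosing) assembling the $(u \mid u+v)$ generator matrix as a block matrix:

```python
import numpy as np

# Toy binary codes of the same length n = 4: a [4, 2] code C_1 and a [4, 1] code C_2.
G1 = np.array([[1, 0, 1, 0],
               [0, 1, 0, 1]])
G2 = np.array([[1, 1, 1, 1]])

def u_u_plus_v(G1, G2):
    """Generator matrix of the (u | u+v)-construction: [[G1, G1], [0, G2]]."""
    zeros = np.zeros((G2.shape[0], G1.shape[1]), dtype=int)
    return np.block([[G1, G1], [zeros, G2]])

print(u_u_plus_v(G1, G2))  # a 3 x 8 generator matrix for the new [8, 3] code
```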
Equivalence of codes
There are multiple ways in which codes can be "equivalent", or "essentially" the same. One way is to demand equality as vector spaces (i.e. that the vector spaces are isomorphic); however, in this case it should be fairly clear that properties such as weight might not be retained across the isomorphism - we could for instance create some isomorphism mapping all vectors of weight
Thus, we look at a different class of equivalence, namely permutation equivalence:
Definition. Two codes $C_1$ and $C_2$ are permutation equivalent if there is a permutation of coordinates which sends $C_1$ to $C_2$.
Example. Let
We see that we can acquire
Granted that two permutation equivalent codes are "essentially the same", we can, for any generator matrix of some code, obtain a generator matrix in standard form of some permutation equivalent code. This is handy for finding parity check matrices of an essentially identical code.
Example. Let
However, we can create a permutation equivalent code
This matrix is row-reducible to standard form:
This matrix is now in standard form. Consequently, we can easily find its parity-check matrix:
Encoding and decoding
Before we start defining specific codes and more complex methodologies, we look briefly at the process of encoding and decoding messages.
Simple example
Say we have the code
Let's say that we encode the message
This is our codeword
Say we transmit this code, receive it at some endpoint, and wish to verify that the received message is a codeword.
We feed
Thus we know that, while we can't rule out that the message has been completely mangled, at least the received message is a codeword. If the likelihood of an error during transmission is small, we know that the message is, with high probability, intact.
As
Take now a received encoded message
Thus some error has occurred during transmission. Can we retrieve the error? Well, if we assume that only one error has occurred, it is clear that it must have happened in the third coordinate, as we know
Nearest neighbor decoding
One technique for decoding erroneous messages is nearest neighbor decoding.
Definition. Let $x$ be a received message of some code $C$. Using nearest neighbor decoding, $x$ is decoded into its "most likely sent message": the codeword $c$ attaining
$$\min_{c \in C} d(x, c).$$
In other words, we decode $x$ to the codeword of $C$ closest to it in Hamming distance.
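A minimal sketch of nearest neighbor decoding (using the hypothetical [6, 3] binary code from the earlier sketches; ties would be broken arbitrarily):

```python
from itertools import product

# Hypothetical [6, 3] binary code from the earlier sketches.
G = [[1, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 1],
     [0, 0, 1, 1, 0, 1]]

def encode(m, G):
    """Codeword m G over F_2."""
    return tuple(sum(mi * gi for mi, gi in zip(m, col)) % 2 for col in zip(*G))

codewords = [encode(m, G) for m in product([0, 1], repeat=len(G))]

def nearest_neighbor_decode(received, codewords):
    """Return the codeword minimising the Hamming distance to the received word."""
    return min(codewords, key=lambda c: sum(a != b for a, b in zip(received, c)))

received = (1, 0, 1, 0, 1, 0)  # the codeword (1, 0, 1, 0, 1, 1) with its last bit flipped
print(nearest_neighbor_decode(received, codewords))  # (1, 0, 1, 0, 1, 1)
```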
Example with Hamming code
Extra Examples
Some codes
We now introduce some typical codes.
Binary repetition code
Perhaps the most obvious and simple code to construct is the binary repetition code.
Let's say we want to transmit the code
We can use this code to detect up to $n - 1$ errors.
The binary repetition code is an $[n, 1, n]$ code.
Using nearest neighbor decoding, we can correct up to $\lfloor (n-1)/2 \rfloor$ errors.
Due to the sheer length of the codewords relative to the message, i.e. the large amount of redundant data that has to be transmitted, binary repetition codes are seldom used.
Properties
Length | $n$
Dimension | $1$
Minimum distance | $n$
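A sketch of encoding and majority-vote (nearest neighbor) decoding for the binary repetition code; the length $5$ is an arbitrary choice of mine:

```python
def encode_repetition(bit, n=5):
    """Repeat a single bit n times."""
    return [bit] * n

def decode_repetition(received):
    """Majority vote: with at most floor((n-1)/2) flipped bits this recovers the message."""
    return int(sum(received) > len(received) / 2)

sent = encode_repetition(1)         # [1, 1, 1, 1, 1]
received = [1, 0, 1, 1, 0]          # two bits flipped during transmission
print(decode_repetition(received))  # 1
```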
Hamming codes
An important class of codes is the Hamming codes. They are determined uniquely by their parity check matrices. Take for instance the binary
Which gives us the parity-check matrix
What is noticeable about this parity-check matrix is that its column vectors are precisely the non-zero binary vectors of length
This property is the defining one of Hamming codes, and generalizes to any
Say we want to define a
From the last theorem of the section on weights, it is clear that any Hamming code must have minimum distance $3$: no two columns of the parity-check matrix are linearly dependent (they are distinct and non-zero), but some set of three columns is.
Interestingly, we can state the following:
Theorem. Any $[2^r - 1,\; 2^r - r - 1,\; 3]$ binary code is equivalent to the binary Hamming code $\mathcal{H}_r$.
Properties
Length | $n = 2^r - 1$
Dimension | $k = 2^r - r - 1$
Minimum distance | $3$
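A sketch of single-error correction with the binary Hamming code $\mathcal{H}_3$; the columns of $H$ below are the binary expansions of $1, \ldots, 7$, which is one common (assumed) ordering, not necessarily the one used in the notes:

```python
import numpy as np

# Parity check matrix of a [7, 4, 3] Hamming code: column j holds the binary expansion of j+1.
H = np.array([[(j >> i) & 1 for j in range(1, 8)] for i in range(3)])

def correct_single_error(received):
    """Compute the syndrome H y^T; if non-zero, flip the unique coordinate whose column matches."""
    y = np.array(received) % 2
    syndrome = np.mod(H @ y, 2)
    for j in range(7):
        if np.array_equal(H[:, j], syndrome):
            y[j] ^= 1  # flip the erroneous coordinate
            break
    return y

codeword = np.array([1, 0, 1, 1, 0, 1, 0])  # a valid codeword of this code (H c^T = 0)
received = codeword.copy()
received[4] ^= 1                            # single bit error in coordinate 5
print(correct_single_error(received))       # recovers the original codeword
```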
Reed-Muller codes
The Reed-Muller class of codes is constructed using the $(u \mid u+v)$-construction.
Let
We can describe this code directly, by
Properties
Length | $2^m$
Dimension | $\sum_{i=0}^{r} \binom{m}{i}$
Minimum weight (distance) | $2^{m-r}$
Example
Let's build
Golay codes
Bounds on codes
It is interesting for us to study the size of codes, i.e. how many codewords can possibly exist in specific codes.
$A_q(n,d)$ and $B_q(n,d)$
We base our following work around bounds relating to two values:
The size of the maximal code over $\mathbb{F}_q$ of length $n$ and minimum distance $d$ is denoted $A_q(n,d)$; the size of the maximal linear such code is denoted $B_q(n,d)$.
Sphere packing bound and perfect codes
Spheres and sphere packing
Before establishing a bound, we introduce the notion of a sphere around a codeword:
Definition. A sphere of size (radius) $r$ around the codeword $c$ of a code $C$ is defined as:
$$S_r(c) = \{v \in \mathbb{F}_q^n \mid d(v, c) \leq r\}$$
Theorem. A sphere of size $r$ around a codeword $c$ in $\mathbb{F}_q^n$ contains
$$\sum_{i=0}^{r} \binom{n}{i} (q-1)^i$$
vectors.
Proof. For each $i$, there are $\binom{n}{i}$ ways to choose the $i$ coordinates in which a vector differs from $c$, and $(q-1)^i$ ways to choose the differing values.
Definition. The packing radius of a code $C$ is the largest value of $r$ such that, when placing spheres of size $r$ around all codewords of $C$, all the spheres remain disjoint.
We are now ready to state the sphere packing bound:
Sphere packing bound
Theorem. The maximal size of any code $C$ over $\mathbb{F}_q^n$ with minimum distance $d$, $A_q(n,d)$, is constrained by the following inequality:
$$B_q(n,d) \leq A_q(n,d) \leq \frac{q^n}{\sum_{i=0}^{t} \binom{n}{i}(q-1)^i}, \qquad t = \lfloor (d-1)/2 \rfloor$$
Proof. We prove this somewhat informally, using the notions of the discussion above.
First of all, we definitely know that $B_q(n,d) \leq A_q(n,d)$, since every linear code is in particular a code.
We can tighten the bound by sphere packing: as any two codewords are separated by a distance of at least $d$, the spheres of size $t = \lfloor (d-1)/2 \rfloor$ around distinct codewords are disjoint. Since all these spheres together contain at most $q^n$ vectors, the number of codewords is at most $q^n$ divided by the size of one sphere.
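A small sketch computing the sphere packing bound; evaluated at the parameters of the binary [7, 4, 3] Hamming code it gives exactly $2^4 = 16$, illustrating that this code attains the bound:

```python
from math import comb, floor

def sphere_packing_bound(n, d, q=2):
    """Upper bound on A_q(n, d): q^n divided by the volume of a sphere of radius t."""
    t = floor((d - 1) / 2)
    sphere_size = sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))
    return q ** n // sphere_size  # integer part, since A_q(n, d) is an integer

print(sphere_packing_bound(7, 3))  # 16 = 2^4, attained by the [7, 4, 3] Hamming code
```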
Perfect codes
If the sphere packing bound yields equality for some code, that code is called a perfect code.
Some properties of $A_q(n,d)$ and $B_q(n,d)$
MDS codes and the Singleton Upper Bound
Singleton bound
Theorem. Let $d \leq n$. Then the following bound, called the Singleton bound, holds:
$$A_q(n,d) \leq q^{n-d+1}$$
Proof. We know that deleting the last $d-1$ coordinates of every codeword leaves all codewords distinct, since any two codewords differ in at least $d$ coordinates. Hence there can be at most $q^{n-d+1}$ codewords.
From this, it can be shown that if a linear $[n,k,d]$ code exists, then $d \leq n - k + 1$.
MDS codes
A maximum distance separable (MDS) code is a code which satisfies the Singleton bound with equality.
We have the following properties for MDS codes:
Theorem. Let $C$ be an $[n,k]$ code over $\mathbb{F}_q$ with $k \geq 1$. Then the following are equivalent:
(i) $C$ is an MDS code.
(ii) $C^\perp$ is an MDS code.
(iii) Every set of $k$ coordinates in $C$ is an information set.
(iv) Every set of $n-k$ coordinates in $C^\perp$ is an information set.
Gilbert Lower Bound
We've now defined a few upper bounds, and we now turn to a handy lower bound, namely the Gilbert Lower Bound.
Theorem.
$$B_q(n,d) \geq \frac{q^n}{\sum_{i=0}^{d-1} \binom{n}{i}(q-1)^i}$$
We notice that this lower bound looks much like the sphere packing bound, apart from the direction of the inequality and the upper limit of the sum, which here is $d-1$ rather than $t = \lfloor (d-1)/2 \rfloor$.
Proof. Left for you. Simply argue that spheres of size $d-1$ around the codewords of a maximal linear code necessarily cover the entire vector space $\mathbb{F}_q^n$.
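A sketch illustrating the covering idea: greedily build a binary code of minimum distance $d$ by adding any vector at distance $\geq d$ from all chosen codewords, and compare its size to the Gilbert lower bound (the parameters below are small, arbitrary choices of mine):

```python
from math import comb
from itertools import product

def gilbert_bound(n, d, q=2):
    """Lower bound on B_q(n, d) from the covering argument."""
    return q ** n / sum(comb(n, i) * (q - 1) ** i for i in range(d))

def greedy_code(n, d):
    """Greedily pick binary vectors pairwise at Hamming distance >= d (yields a maximal code)."""
    code = []
    for v in product([0, 1], repeat=n):
        if all(sum(a != b for a, b in zip(v, c)) >= d for c in code):
            code.append(v)
    return code

n, d = 7, 3
print(len(greedy_code(n, d)), ">=", round(gilbert_bound(n, d), 2))  # 16 >= 4.41
```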
Examples
Cyclic codes
Definitions
Note that we are working with linear codes.
Definition. A cyclic code is a code $C$ where every codeword $c$ has the property that when each element of $c$ is shifted one step forward (or backward), the resulting vector is also a codeword:
$$c = (c_0, c_1, \ldots, c_{n-1}) \in C \implies (c_{n-1}, c_0, c_1, \ldots, c_{n-2}) \in C$$
As shifting one direction $n-1$ times is the same as shifting once in the other direction, the forward and backward conditions are equivalent.
An alternative and, as we will see, very handy notation for cyclic codes is through polynomials.
Definition. Let $c = c_0 c_1 c_2 \ldots c_{n-1}$ be a codeword of some code. The polynomial representation of $c$ is the polynomial
$$c(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_{n-1} x^{n-1}$$
Addition and multiplication of such polynomials take place in the ring $\mathbb{F}_q[x]/(x^n - 1)$.
What is handy about this representation is that a cyclic shift of a codeword is the same as multiplying its polynomial by $x$ (and reducing modulo $x^n - 1$).
Clearly $x \cdot c(x) = c_{n-1} + c_0 x + \cdots + c_{n-2} x^{n-1} \pmod{x^n - 1}$, which is exactly the shifted codeword.
Theorem. The cyclic codes over $\mathbb{F}_q$ are precisely the ideals of $\mathbb{F}_q[x]/(x^n - 1)$.
Proof. Omitted.
We often write the latter ring as $\mathcal{R}_n = \mathbb{F}_q[x]/(x^n - 1)$.
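A sketch checking that multiplying by $x$ in $\mathbb{F}_2[x]/(x^n - 1)$ is the same as a cyclic shift (the codeword below is an arbitrary example of mine):

```python
def multiply_by_x_mod(c):
    """Coefficients of x * c(x) modulo x^n - 1 over F_2, with c given as (c_0, ..., c_{n-1})."""
    # x * (c_0 + c_1 x + ... + c_{n-1} x^{n-1}) = c_{n-1} + c_0 x + ... + c_{n-2} x^{n-1},
    # since x^n reduces to 1.
    return (c[-1],) + c[:-1]

c = (1, 0, 1, 1, 0, 0, 1)
print(multiply_by_x_mod(c))  # (1, 1, 0, 1, 1, 0, 0): the cyclic shift of c
```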
Analogously to generator matrices, we define the notion of a generator polynomial:
Theorem. Let $g(x)$ be a polynomial of a nonzero cyclic code $C$, such that $g(x)$ is monic and of minimum degree in $C$. Then:
(i) $C = (g(x))$, i.e. $g(x)$ generates $C$ as an ideal,
(ii) $g(x) \mid (x^n - 1)$.
The polynomial $g(x)$ is called the generator polynomial of $C$.
Theorem. Let $C$ be a cyclic code with generator polynomial $g(x)$. Let $k = n - \deg g(x)$ and let $g(x) = \sum_{i=0}^{n-k} g_i x^i$. Then the dimension of $C$ is $k$, and $\{g(x), xg(x), \ldots, x^{k-1}g(x)\}$ is a basis for the code.
We can "convert" the generation of our code from polynomial form into matrix form by nothing the following:
Theorem. Let $g(x)$ be the generator polynomial of a cyclic $[n,k]$ code $C$, with coefficients $(g_0, g_1, \ldots, g_{n-k})$. Then a generator matrix for the code is
$$G = \begin{bmatrix}
g_0 & g_1 & \cdots & g_{n-k} & 0 & \cdots & 0 \\
0 & g_0 & g_1 & \cdots & g_{n-k} & \cdots & 0 \\
\vdots & & \ddots & & & \ddots & \vdots \\
0 & \cdots & 0 & g_0 & g_1 & \cdots & g_{n-k}
\end{bmatrix}$$
i.e. the $k$ rows are the coefficient vectors of $g(x), xg(x), \ldots, x^{k-1}g(x)$.
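A sketch building this generator matrix from the coefficients of $g(x)$; the polynomial $g(x) = 1 + x + x^3$ below is my own choice of example, it divides $x^7 - 1$ over $\mathbb{F}_2$ and generates a $[7,4]$ cyclic code:

```python
import numpy as np

def cyclic_generator_matrix(g, n):
    """Generator matrix of the cyclic code of length n with generator polynomial coefficients g."""
    k = n - (len(g) - 1)                # k = n - deg g(x)
    G = np.zeros((k, n), dtype=int)
    for i in range(k):
        G[i, i:i + len(g)] = g          # row i holds the coefficients of x^i * g(x)
    return G

g = [1, 1, 0, 1]  # g(x) = 1 + x + x^3
print(cyclic_generator_matrix(g, 7))
```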
Connection to cyclotomic polynomials and cyclotomic cosets
Theorem. Let $n$ be a positive integer relatively prime to $q$. Let $t = \operatorname{ord}_n(q)$ and let $\alpha$ be a primitive $n$th root of unity in $\mathbb{F}_{q^t}$. Then, for any integer $s$, the minimal polynomial of $\alpha^s$ over $\mathbb{F}_q$ is
$$M_{\alpha^s}(x) = \prod_{i \in C_s} (x - \alpha^i)$$
where $C_s$ is the $q$-cyclotomic coset of $s$ modulo $n$.
This leads us to the following theorem, which is what "ties it all together":
Theorem. Let $C$ be a cyclic code with generator polynomial $g(x)$ over some field $\mathbb{F}_q$. If $\alpha$ is a primitive $n$th root of unity in some extension field of $\mathbb{F}_q$, then
$$g(x) = \prod_{s} M_{\alpha^s}(x)$$
where the product is over representatives $s$ of some subset of the $q$-cyclotomic cosets modulo $n$.
In other words, the generator polynomial is necessarily a product of some minimal polynomials, which in turn are determined by the $q$-cyclotomic cosets modulo $n$.
Definition. The union of the $q$-cyclotomic cosets defining a generator polynomial $g(x)$ is called the defining set of the code $C$ generated by $g(x)$. The defining set is commonly denoted $T$.
There might be several different such polynomials for any given field; in fact, we have the following theorem:
Theorem. The number of cyclic codes in $\mathcal{R}_n$ is $2^m$, where $m$ is the number of $q$-cyclotomic cosets modulo $n$.
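A sketch computing the $q$-cyclotomic cosets modulo $n$; for $q = 2$, $n = 7$ it prints $\{0\}$, $\{1,2,4\}$, $\{3,6,5\}$, so there are $2^3 = 8$ binary cyclic codes of length 7:

```python
def cyclotomic_cosets(q, n):
    """Partition {0, ..., n-1} into q-cyclotomic cosets C_s = {s, sq, sq^2, ...} modulo n."""
    remaining = set(range(n))
    cosets = []
    while remaining:
        s = min(remaining)
        coset = []
        i = s
        while i not in coset:
            coset.append(i)
            i = (i * q) % n
        cosets.append(coset)
        remaining -= set(coset)
    return cosets

print(cyclotomic_cosets(2, 7))  # [[0], [1, 2, 4], [3, 6, 5]]
```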
Minimum distance of cyclic codes
Finally, we can state some bounds regarding the minimum distance of cyclic codes.
Theorem (BCH bound). Let $C$ be a cyclic code of length $n$ over $\mathbb{F}_q$ with defining set $T$. Let the code have minimum weight $d$. If $T$ contains $\delta - 1$ consecutive elements (e.g. $1, 2, 3, \ldots$) for some integer $\delta$, then $d \geq \delta$.
Examples
Example. (Exer. 2 Exam 2013)
Let
(i) What is the dimension and minimum distance of
(ii) If you receive
Solution.
(i) We know that the dimension of a cyclic code is $n - \deg g(x)$.
Any linear code can always correct up to $\lfloor (d-1)/2 \rfloor$ errors.
(ii) Obviously any codeword
Thus we have
Thus the error was in the
Encoding and decoding
BCH codes
From the BCH bound defined above, we notice that if we can find a defining set of some code with many consecutive elements, we can guarantee a correspondingly large minimum distance.
Definition. A BCH code of length $n$ with designed distance $\sigma$ is a cyclic code $C$ over $\mathbb{F}_q$ with defining set
$$T = C_b \cup C_{b+1} \cup \cdots \cup C_{b+\sigma-2}$$
where $C_i$ is the $q$-cyclotomic coset modulo $n$ containing $i$.
Clearly $T$ contains the $\sigma - 1$ consecutive elements $b, b+1, \ldots, b+\sigma-2$.
Theorem. A BCH code with designed distance $\sigma$ has minimum weight at least $\sigma$,
As is clear from the BCH-bound.
Reed-Solomon codes
One family of very useful BCH codes is the Reed-Solomon codes:
Definition. A Reed-Solomon code is a BCH code $C$ over $\mathbb{F}_q$ of length $n = q - 1$.
This implies first and foremost that all irreducible factors of $x^n - 1$ over $\mathbb{F}_q$ are linear, so every $q$-cyclotomic coset modulo $n$ contains a single element.
Theorem. A Reed-Solomon code with designed distance $\sigma$ has defining set $T = \{b, b+1, \ldots, b+\sigma-2\}$.
Convolutional codes
Definition
Up until now, all codes we have discussed have been block codes: for any input block of length $k$, they produce an output block of length $n$. Convolutional codes instead encode a stream of input blocks, where each output block may also depend on previous input blocks.
Example. Let
This code takes one-dimensional input and produces two-dimensional output, i.e. it is a rate $1/2$ convolutional code (two output symbols for each input symbol).
Thus our messages are
Matrices
There is a very nice way of representing a convolutional message through polynomials. We let
As usual we call this a generator matrix.
This notation is very handy for generating codewords, as indeed we can map our messages to polynomials, and multiply these with the matrix in the obvious way.
Example. Let
Say we want to encode the message
Encoding this message is done by performing the calculation
I.e. the codewords
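As a hedged sketch (the generator polynomials $g_1(x) = 1 + x^2$ and $g_2(x) = 1 + x + x^2$ are a classic rate-1/2 example of my own choosing, not necessarily the one from the notes), encoding a message with a polynomial generator matrix over $\mathbb{F}_2$ amounts to polynomial multiplication:

```python
def poly_mult_gf2(a, b):
    """Multiply two polynomials over F_2, given as coefficient lists (lowest degree first)."""
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[i + j] ^= ai & bj
    return result

# Rate-1/2 convolutional encoder with generator matrix G(x) = [1 + x^2, 1 + x + x^2].
g1 = [1, 0, 1]
g2 = [1, 1, 1]

message = [1, 0, 1, 1]  # message polynomial m(x) = 1 + x^2 + x^3
# The two output streams are m(x) g1(x) and m(x) g2(x).
print(poly_mult_gf2(message, g1))  # [1, 0, 0, 1, 1, 1]
print(poly_mult_gf2(message, g2))  # [1, 1, 0, 0, 0, 1]
```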
Encoding & State diagrams
Decoding using Viterbi algorithm
Degrees & Canonical generator matrices
External degree
Definition. Let $G$ be a generator matrix for a convolutional code. The degree of the $i$'th row of $G$ is the degree of the highest-degree polynomial in that row.
Example. Let
The degree of the first row is
Definition. Let $G$ be a generator matrix for a convolutional code. The external degree of $G$ is the sum of the degrees of all its rows.
Example. Let
The external degree of
Definition. Let $G$ be a $k \times n$ polynomial generator matrix for a convolutional code. A $k \times k$ minor of $G$ is the determinant of the submatrix formed by some choice of $k$ columns of $G$.
Example. Let
Note that
Internal degree
Definition. Let $G$ be a $k \times n$ polynomial generator matrix for a convolutional code. The internal degree of $G$ is the maximum degree among all its $k \times k$ minors.
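As a small worked example (a matrix of my own choosing, not one from the notes), consider the $2 \times 3$ polynomial generator matrix over $\mathbb{F}_2$
$$G(x) = \begin{bmatrix} 1 & x & 1 + x \\ 0 & 1 & x^2 \end{bmatrix}.$$
The row degrees are $1$ and $2$, so the external degree is $3$. The three $2 \times 2$ minors are $\det\begin{bmatrix}1 & x\\0 & 1\end{bmatrix} = 1$, $\det\begin{bmatrix}1 & 1+x\\0 & x^2\end{bmatrix} = x^2$ and $\det\begin{bmatrix}x & 1+x\\1 & x^2\end{bmatrix} = x^3 + x + 1$, so the internal degree is $3$ as well.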
Canonical form
Definition. If a generator matrix $G$ of a code $C$ has the least external degree among all generator matrices of $C$, the matrix is said to be in canonical form.