# TMA4185: Coding Theory

# Finite fields

We get the preliminaries out of the way first. You are expected to know Group theory and Ring theory before diving into this. Reading this chapter of Galois Theory is probably also helpful.

We reiterate some of the most useful points of those two courses.

## Definition

**Definition.** A **finite field** $\F$ is a ring in which both addition and multiplication are commutative, every nonzero element has a multiplicative inverse (i.e. it is a commutative division ring), and the set of elements is finite.

The group of nonzero elements of a field

**Theorem.** $\F_q^*$, the group of nonzero elements of $\F_q$ under multiplication, is a cyclic group, i.e. there is a $p \in \F_q^*$ such that

$$ \F_q^* = \{1, p, p^2, \dots, p^{m-1}\} $$

Here $m = q - 1$ is called the **order** of the cyclic group $\F_q^*$, and the element $p$ is called a **generator** of $\F_q^*$.

Every generator of a group **primitive element** of

## Polynomials

### Definition

Given some field

**Definition.** Given a field $\F_q$, the **polynomial ring** over $\F_q$, written $\F_q[x]$, is the ring of all polynomials

$$a_n x^n + a_{n-1}x^{n-1} + \dots + a_1 x + a_0,\> a_i \in \F_q$$

under the operations of addition and multiplication of polynomials in the standard way.

The highest **degree** of a polynomial.

If this **monic**.

**Definition.** A polynomial $f(x)$ is said to be **irreducible** over a field $\F_q$ if there exist no polynomials $g, h \in \F_q[x]$ of degree lower than $f$ such that $f(x) = g(x)h(x)$.

### Usage to construct finite fields

Interestingly, we can use the above definition to construct fields of certain sizes.

**Theorem.** Let $f(x)$ be an irreducible polynomial of degree $m$ over a field $\F_p$. Then the residue classes of $\F_p[x]$ modulo $f(x)$ form a field with $q = p^m$ elements:

$$ \F_q = \F_p[x] / (f(x)) $$
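As a small illustration of the theorem, the following sketch builds $\F_8 = \F_2[x]/(x^3 + x + 1)$ and checks that the class of $x$ generates the nonzero elements. The choice of $f(x) = x^3 + x + 1$, the encoding of polynomials as integer bitmasks, and the helper name `gf_mul` are assumptions made for this example only.

```python
# Elements of F_2[x] are stored as integers: bit i is the coefficient of x^i.
F_POLY = 0b1011  # f(x) = x^3 + x + 1, irreducible over F_2 (assumed for the example)
M = 3            # degree of f, so the field has 2^3 = 8 elements


def gf_mul(a: int, b: int) -> int:
    """Multiply two residue classes modulo f(x)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in F_2[x] is XOR
        b >>= 1
        a <<= 1
        if a >> M:           # degree reached M: reduce by f(x)
            a ^= F_POLY
    return result


# Powers of the class of x: 1, x, x^2, ..., x^6.
powers = []
element = 1
for _ in range(2**M - 1):
    powers.append(element)
    element = gf_mul(element, 0b010)

print(sorted(powers))  # every nonzero element appears exactly once
```

Since the seven powers are distinct, every nonzero residue class has an inverse (another power of $x$), which is what makes the quotient a field.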

**Example.**

Let

Thus

## Cyclotomic polynomials

# Motivation

# Basics

## Codes

We wish to send some message

This message hopefully is of such a character that it allows us to successfully decode

Before we go about constructing an encoding scheme, we make a few definitions:

**Definition.** A **code** $\C$ is a set of **codewords**, each a tuple $(a_1, a_2, \ldots, a_n)$ of some fixed length $n$.

A code

Put in other words, we say that a code describes a set of possible encodings of inputs (messages). Typically, the codewords are **binary code**. If the field is **ternary code**.

We often write a codeword

In the most general of terms, this is all the structure required to define codes. However, we almost exclusively work with *linear* codes:

**Definition.** A **linear code** is a code where all linear manipulations of one or more codewords result in a new codeword, i.e. for all $\b x, \b y \in \C$:

(i) $a\b x \in \C$ for all $a \in \F$,

(ii) $\b x + \b y \in \C$.

**Example.** Let

## Generator matrices

We typically convert a message **generator matrix**. Such a matrix also imposes a (so far undefined, but definitely necessary) length requirement on the messages:

**Definition.** A **generator matrix**

Equivalently, we can say that the rows of

**Example.** Let

We can calculate the entirety of

We note to no surprise that the amount of distinct codewords in

If a linear code takes input of length

**Definition.** A generator matrix is said to be in **standard form** if it is of the form $[I_k \mid A]$.
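To make the role of a generator matrix concrete, here is a minimal sketch that enumerates all codewords $\b m G$ of a small binary code. The matrix `G` is a hypothetical example in standard form, and the helper `encode` is introduced only for illustration.

```python
from itertools import product

# Hypothetical binary generator matrix in standard form [I_2 | A].
G = [
    [1, 0, 1, 1],
    [0, 1, 0, 1],
]


def encode(message, generator, q=2):
    """Return the codeword message * generator over F_q (q prime)."""
    n = len(generator[0])
    return tuple(
        sum(m * row[j] for m, row in zip(message, generator)) % q
        for j in range(n)
    )


k = len(G)
codewords = {encode(m, G) for m in product(range(2), repeat=k)}
print(len(codewords))  # q^k = 4 distinct codewords
```

Because `G` is in standard form, the first $k$ coordinates of each codeword are the message itself, so decoding an error-free codeword amounts to reading off those coordinates.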

**Example.** The matrix

## Parity check matrices

An interesting property of a linear code is that as it is a subspace of some vector space (in particular, of the subspace *kernel* of some linear transformation

So why is this of any importance to us? Looking back at our motivation for defining error-correcting codes, it is precisely that of detecting errors. As

**Definition.** The **parity-check matrix** $H$ of a code $\C$ is an $(n-k) \times n$ matrix, defined by $H\b x^T = \b 0,\> \forall \b x \in \C$.

If we have a generator matrix on standard form, we have a nice way of figuring out

**Theorem.** If $G = [I_k \mid A]$ then $H = [-A^T \mid I_{n-k}]$.
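The theorem translates directly into a few lines of code. The sketch below builds $H$ from a standard-form $G$ and checks that every row of $G$ has zero syndrome. The ternary matrix `G` is a hypothetical example (chosen so that the minus sign in $-A^T$ matters), and the function names are assumptions of this sketch.

```python
def parity_check_from_standard(G, q=2):
    """Given G = [I_k | A] over F_q (q prime), return H = [-A^T | I_{n-k}]."""
    k, n = len(G), len(G[0])
    A = [row[k:] for row in G]
    H = []
    for i in range(n - k):
        minus_a_t = [(-A[j][i]) % q for j in range(k)]
        identity = [1 if j == i else 0 for j in range(n - k)]
        H.append(minus_a_t + identity)
    return H


def syndrome(H, x, q=2):
    """Compute H x^T over F_q; it is the zero vector exactly when x is a codeword."""
    return [sum(h * xi for h, xi in zip(row, x)) % q for row in H]


# Hypothetical ternary generator matrix in standard form.
G = [
    [1, 0, 1, 2],
    [0, 1, 1, 1],
]
H = parity_check_from_standard(G, q=3)
print(H)
print([syndrome(H, row, q=3) for row in G])  # all-zero syndromes
```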

*Proof.* Clearly

**Example.** Let

be the generator matrix of some code

Finally we note (without proof) that the rows of

## Dual codes

**Definition.** Let $G$ be the generator matrix of some code $\C$ with parity check matrix $H$. Then the code generated by the rows of $H$ is called the **dual code** of $\C$, and is denoted $\C^{\perp}$.

A code is said to be **self-orthogonal** if **self-dual** if

## Weights and distances

One interesting property of codes is that of *weight* and *distance*.

**Definition.** The (Hamming) **distance** $d(\b x, \b y)$ between two codewords $\b x$ and $\b y$ of some code $\C$ is defined as the number of coordinates in which $\b x$ and $\b y$ differ.

**Definition.** The (Hamming) **weight** $wt(\b x)$ of a codeword $\b x$ is defined as the number of non-zero coordinates of $\b x$.

We make the following observation:

**Theorem.** If $x, y \in \F_q^n$ then $d(x, y) = wt(x - y)$.

*Proof.* Should be fairly clear: simply observe that

**Definition.** The **minimum distance** of a code $\C$ is the minimum distance between any two distinct codewords of $\C$.
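The definitions of weight, distance and minimum distance can be sketched as follows. The small binary code `code` is a hypothetical example, and the brute-force search over all pairs is only meant for illustration on tiny codes.

```python
from itertools import combinations


def weight(x):
    """Hamming weight: number of non-zero coordinates."""
    return sum(1 for xi in x if xi != 0)


def distance(x, y):
    """Hamming distance: number of coordinates where x and y differ."""
    return sum(1 for xi, yi in zip(x, y) if xi != yi)


def minimum_distance(code):
    """Smallest distance between two distinct codewords (brute force)."""
    return min(distance(x, y) for x, y in combinations(code, 2))


# Hypothetical binary linear code of length 4.
code = [(0, 0, 0, 0), (1, 0, 1, 1), (0, 1, 0, 1), (1, 1, 1, 0)]

x, y = code[1], code[2]
difference = tuple((xi - yi) % 2 for xi, yi in zip(x, y))
print(distance(x, y) == weight(difference))  # d(x, y) = wt(x - y)
print(minimum_distance(code))
```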

**Theorem.** If $\C$ is a linear code, then the minimum distance of $\C$ is the same as the minimum weight of the non-zero codewords of $\C$.

*Proof.* The distance between a minimally weighted codeword

If an

There is one particularly useful way to determine the minimum distance of a code when its parity check matrix

Let

Thus the columns of

Say now, that

**Theorem.** If $\C$ is a linear code with parity check matrix $H$ which has a set of $d$ linearly dependent columns, but no set of $d-1$ linearly dependent columns, then the minimum weight of $\C$ is $d$.

*Proof.* Follows immediately from the above discussion.
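For binary codes the theorem gives a direct (if exponential-time) way to read the minimum weight off $H$: a smallest dependent set of columns must have all coefficients equal to $1$, so it is enough to look for the smallest set of columns summing to zero. The sketch below assumes a binary code and a hypothetical parity check matrix `H`.

```python
from itertools import combinations


def min_distance_from_parity_check(H):
    """Smallest d such that some d columns of the binary matrix H sum to zero."""
    n = len(H[0])
    columns = [tuple(row[j] for row in H) for j in range(n)]
    for d in range(1, n + 1):
        for subset in combinations(columns, d):
            if all(sum(bits) % 2 == 0 for bits in zip(*subset)):
                return d
    return None  # no dependent set of columns: the code is {0}


# Hypothetical binary parity check matrix with distinct non-zero columns.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]
print(min_distance_from_parity_check(H))
```

No column is zero and no two columns are equal, so no set of one or two columns is dependent; the first three columns sum to zero, which gives a minimum weight of three.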

## Puncturing and extending codes

Having already created some code *puncture* or *extension*.

### Puncturing

As might be clear from the name, **puncturing** a code implies removing some

What might the minimum distance

**Theorem.** Let $\C$ be an $[n, k, d]$ linear code, and let $\C^*$ be the code $\C$ punctured in the $i$'th coordinate. Then:

(i) If $d > 1$ and $\C$ has a codeword of minimum weight with a non-zero entry in the punctured coordinate $i$, the minimum distance of the new code decreases by $1$: $\C^*$ is an $[n - 1, k, d - 1]$ code.

(ii) If $\C$ has minimum distance $1$, $k > 1$, and there are two codewords of $\C$ which differ only in the $i$'th coordinate, the new code $\C^*$ has dimension $k-1$: $\C^*$ is an $[n - 1, k-1, d^*]$ code, with $d^* \geq 1$.

We prove this informally by giving some intuition through examples:

Let

Contrarily, let

For a demonstration of the second property of the theorem, let

### Extension

We can in similar fashion extend a code by adding a coordinate. Typically, we do this to get a new code consisting of only even-like codewords (codewords whose coordinates sum to zero; for binary codes, codewords of even weight).

Clearly, the inverse of the previous theorem applies for extension.

**Example.** Take the binary code generated by the matrix

In other words, to figure out the dimension and minimum distance of the extended code, simply apply the previous theorem in reverse.
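Puncturing and extension are easy to try out on a small code. The following sketch uses a hypothetical binary $[4, 2, 2]$ code; the helper names are assumptions of this example, and codes are represented simply as lists of tuples.

```python
def puncture(code, i):
    """Delete coordinate i from every codeword (duplicates collapse)."""
    return sorted({c[:i] + c[i + 1:] for c in code})


def extend(code, q=2):
    """Append a coordinate so that every codeword's coordinates sum to zero in F_q."""
    return [c + ((-sum(c)) % q,) for c in code]


def min_weight(code):
    """Minimum weight of the non-zero codewords."""
    return min(sum(1 for x in c if x) for c in code if any(c))


# Hypothetical binary [4, 2, 2] code.
C = [(0, 0, 0, 0), (1, 0, 1, 1), (0, 1, 0, 1), (1, 1, 1, 0)]

# The minimum-weight codeword 0101 is zero in coordinate 0 but not in coordinate 1.
print(min_weight(puncture(C, 0)))  # minimum distance is unchanged
print(min_weight(puncture(C, 1)))  # minimum distance drops by one

# Extending the punctured [3, 2, 1] code adds an overall parity coordinate.
print(min_weight(extend(puncture(C, 1))))
```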

## Direct sum of codes

We can create larger codes from smaller codes by concatenating them through *direct sum*:

**Definition.** Let $G_1$ and $G_2$ be generator matrices for linear codes $\C_1$ and $\C_2$ with parity check matrices $H_1$ and $H_2$. The **direct sum** of the codes $\C_1$ and $\C_2$ is the code with generator and parity check matrices

$$G = \bmat{G_1 & \O \\ \O & G_2}\quad H = \bmat{H_1 & \O \\ \O & H_2}$$

The new code is an $[n_1+n_2, k_1+k_2, \min\{d_1, d_2\}]$ code.

Sadly, as the minimum distance of the new code is no larger than that of either original code, the new code is of little use (as we will see).

## The $( u \mid u + v )$ -construction

In somewhat similar fashion, we can concatenate two codes using what is called the

**Definition.** Let $G_1$ and $G_2$ be generator matrices for linear codes $\C_1$ and $\C_2$ of the same length $n$ (but not necessarily the same dimension) with parity check matrices $H_1$ and $H_2$. The $(\b u \mid \b u + \b v)$ construction of the codes $\C_1$ and $\C_2$ is the code $\C$ with generator and parity check matrices

$$G = \bmat{G_1 & G_1 \\ \O & G_2}\quad H = \bmat{H_1 & \O \\ -H_2 & H_2}$$

The new code is a $[2n, k_1+k_2, \min\{2d_1, d_2\}]$ code.
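The parameters of the construction can be checked by brute force on small codes. In the sketch below, `G1` (a binary $[3, 2, 2]$ code) and `G2` (a binary $[3, 1, 3]$ code) are hypothetical inputs, and the helper names are assumptions of this example.

```python
from itertools import product


def u_u_plus_v(G1, G2):
    """Generator matrix [[G1, G1], [0, G2]] of the (u | u + v) construction."""
    zeros = [0] * len(G1[0])
    return [row + row for row in G1] + [zeros + row for row in G2]


def codewords(G):
    """All codewords of the binary code generated by G."""
    k, n = len(G), len(G[0])
    return [
        tuple(sum(m * row[j] for m, row in zip(msg, G)) % 2 for j in range(n))
        for msg in product(range(2), repeat=k)
    ]


def min_weight(G):
    """Minimum weight of the non-zero codewords generated by G."""
    return min(sum(c) for c in codewords(G) if any(c))


G1 = [[1, 1, 0], [0, 1, 1]]  # hypothetical [3, 2, 2] code
G2 = [[1, 1, 1]]             # hypothetical [3, 1, 3] code
G = u_u_plus_v(G1, G2)

print(len(G[0]), len(G), min_weight(G))           # length, dimension, minimum distance
print(min(2 * min_weight(G1), min_weight(G2)))    # the predicted min{2 d_1, d_2}
```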

We can also write the code

## Equivalence of codes

There are multiple ways in which codes can be "equivalent", or "essentially" the same. One way is to assume equality as vector spaces (i.e. the vector spaces are isomorphic), however in this case, it should be fairly clear that properties such as weight might not be retained across the isomorphism - we could for instance create some isomorphism mapping all vectors of weight

Thus, we look at a different class of equivalence, namely *permutation equivalence*:

**Definition.** Two codes $\C_1$ and $\C_2$ are **permutation equivalent** if there is a permutation of coordinates which sends $\C_1$ to $\C_2$.

**Example.** Let

We see that we can acquire

Given that two permutation equivalent codes are "essentially the same", we can, for any generator matrix of some code, obtain a generator matrix in standard form for some permutation equivalent code. This is handy for finding parity check matrices of an essentially identical code.

**Example.** Let

However, we can create a permutation equivalent code

This matrix is row-reducible to standard-form:

This matrix is now on standard-form. Consequently we can easily find its parity-check matrix:

## Encoding and decoding

Before we start defining some codes and more complex methodologies, we look briefly at the process of encoding and decoding messages.

### Simple example

Say we have the code

Let's say that we encode the message

This is our codeword

Say we transmit this code, receive it at some endpoint, and wish to verify that the received message is a codeword.

We feed

Thus we know that, while we can't rule out that the message has been completely mangled, at least the received message *is* a codeword. If the likelihood of even *one* error during transmission is small, we know that the message is intact with high probability.

As

Take now a received encoded message

Thus some error has occurred during transmission. Can we retrieve the error? Well, if we assume that only one error has occurred, it is clear that it must have happened in the third coordinate, as we know

## Nearest neighbor decoding

One technique for decoding erroneous messages is *nearest neighbor decoding*.

**Definition.** Let $x$ be a received message of some code $\C$. Using **nearest neighbor decoding** to decode $x$ into its "most likely sent message" is done by finding the codeword $\hat c$ closest to $x$:

$$\hat c = \arg\min_{c \in \C} d(x, c)$$
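The minimisation in the definition can be sketched as a direct search over all codewords. The code `code` below is a hypothetical binary code with minimum distance $3$, and ties (which the definition does not resolve) are broken here by taking the first closest codeword.

```python
def distance(x, y):
    """Hamming distance between two words of equal length."""
    return sum(1 for xi, yi in zip(x, y) if xi != yi)


def nearest_neighbor_decode(received, code):
    """Return a codeword at minimum Hamming distance from the received word."""
    return min(code, key=lambda c: distance(received, c))


# Hypothetical binary code of length 5 with minimum distance 3.
code = [
    (0, 0, 0, 0, 0),
    (1, 1, 1, 0, 0),
    (0, 0, 1, 1, 1),
    (1, 1, 0, 1, 1),
]

received = (1, 0, 1, 0, 0)  # the codeword 11100 with its second coordinate flipped
print(nearest_neighbor_decode(received, code))
```

The search visits every codeword, so it is only practical for very small codes.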

In other words, we decode

### Example with Hamming code

## Extra Examples

# Some codes

We now introduce some typical codes.

## Binary repetition code

Perhaps the most obvious and simple code to construct, is the **binary repetition code**.

Let's say we want to transmit the code

We can use this code to detect up to

The binary repetition code is a

Using nearest neighbor decoding, we can correct up to

Due to the sheer size of the generator matrix, as well as the amount of redundant data that has to be transmitted, binary repetition codes are seldom used.

### Properties

Length | |

Dimension | |

Minimum distance |

## Hamming codes

An important class of codes are the **Hamming codes**. They are determined uniquely by their parity check matrices. Take for instance the binary

Which gives us the parity-check matrix

What is noticeable about this parity-check matrix, is that its column vectors are all non-zero binary vectors of length

This property is the defining one of Hamming codes, and generalizes to any

Say we want to define a

From the last theorem of the section on weights, it is clear that any Hamming code must have minimum distance

Interestingly, we can state the following:

**Theorem.** Any $[2^r - 1, 2^r - r - 1, 3]$ binary code is equivalent to the binary Hamming code $\H_r$.
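A parity check matrix whose columns are all non-zero binary vectors of length $r$ is easy to generate. The sketch below does so with the columns ordered as the binary representations of $1, \dots, 2^r - 1$ (an ordering chosen for this example), and uses the fact that the syndrome of a single-bit error is the corresponding column of $H$. The function names are assumptions of this sketch.

```python
def hamming_parity_check(r):
    """r x (2^r - 1) binary matrix; column j (1-based) is j written in binary."""
    n = 2**r - 1
    return [[(j >> (r - 1 - i)) & 1 for j in range(1, n + 1)] for i in range(r)]


def error_position(H, received):
    """Read the syndrome H x^T as a binary number: 0 means no error detected,
    otherwise it is the (1-based) position of a single flipped bit."""
    position = 0
    for row in H:
        bit = sum(h * x for h, x in zip(row, received)) % 2
        position = (position << 1) | bit
    return position


H = hamming_parity_check(3)
for row in H:
    print(row)

received = [0, 0, 0, 0, 1, 0, 0]  # the zero codeword with coordinate 5 flipped
print(error_position(H, received))
```

With this column ordering, correcting a single error amounts to flipping the coordinate that `error_position` reports.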

### Properties

| Property | Value |
| --- | --- |
| Length | $2^r - 1$ |
| Dimension | $2^r - r - 1$ |
| Minimum distance | $3$ |

## Reed-Muller codes

The **Reed-Muller** class of codes are constructed using the

Let **Reed-Muller** code:

We can describe this code directly, by

### Properties

Length | |

Dimension | |

Minimum weight(distance) |

### Example

Let's build

## Golay codes

# Bounds on codes

It is interesting for us to study the size of codes, i.e. how many codewords can possibly exist in specific codes.

## $A_q(n,d)$ and $B_q(n, d)$

We base our following work around bounds relating to two values:

The size of the **maximal $[n, k]$ code**,

**maximal linear**$[n, k]$ code,

## Sphere packing bound and perfect codes

### Spheres and sphere packing

Before establishing a bound, we introduce the notion of a *sphere* around a codeword:

**Definition.** A sphere of size $r$ around the codeword $\b c$ of a code $\C$ is defined as:

$$S_r(\b c) = \{\b v \in \F_q^n \mid d(\b v, \b c) \leq r\}$$

**Theorem.** A sphere of size $r$ around a codeword $\b c$ in the vector space $\F_q^n$ contains

$$\sum_{i=0}^r \ncr n i (q - 1)^i$$

vectors.

*Proof.* For each

**Definition.** The **packing radius** of a code $\C$ is the largest value of $r$ such that, when placing spheres of size $r$ around all codewords of $\C$, all the spheres remain disjoint.

We are now ready to state the sphere packing bound:

### Sphere packing bound

**Theorem.** The maximal size $A_q(n, d)$ of any code $\C \subseteq \F_q^n$ with minimum distance $d$ is constrained by the following inequality:

$$B_q(n, d) \leq A_q(n, d) \leq \frac{q^n}{\sum_{i=0}^{t} \ncr n i (q - 1)^i},\quad t = \floor{(d-1)/2}$$

*Proof.* We prove this somewhat informally, using the notions of the discussion above.

First of all, we definitely know that

We can tighten the bound by sphere packing: As each codeword is separated by a distance

### Perfect codes

If the sphere packing bound yields equality for some code **perfect**.
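The sphere packing bound is straightforward to evaluate numerically. The sketch below computes the sphere size and the bound, and checks for equality with the parameters $[7, 4, 3]$ (the case $r = 3$ of the binary Hamming parameters $[2^r - 1, 2^r - r - 1, 3]$). The function names are assumptions of this example, and the bound is rounded down since code sizes are integers.

```python
from math import comb


def sphere_volume(n, r, q=2):
    """Number of vectors of F_q^n within distance r of a fixed vector."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))


def sphere_packing_bound(n, d, q=2):
    """Upper bound on A_q(n, d), rounded down to an integer."""
    t = (d - 1) // 2
    return q**n // sphere_volume(n, t, q)


n, k, d = 7, 4, 3
print(sphere_packing_bound(n, d))        # upper bound on the number of codewords
print(2**k == sphere_packing_bound(n, d))  # equality: the spheres fill the whole space
```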

### Some properties of $A_q(n,d)$ and $B_q(n, d)$

## MDS codes and the Singleton Upper Bound

### Singleton bound

**Theorem.** Let $d \leq n$. Then the following bound, called the **Singleton bound**, holds:

$$A_q(n,d) \leq q^{n-d+1}$$

*Proof.* We know that

From this, it can be shown that if a

### MDS codes

A **maximum distance separable** (MDS) code is a code which satisfies the Singleton bound with equality.

We have the following properties for MDS codes:

**Theorem.** Let $\C$ be an $[n, k]$ code over $\F_q$ with $k \geq 1$. Then the following are equivalent:

(i) $\C$ is an MDS code.

(ii) $\C^{\perp}$ is an MDS code.

(iii) Every set of $k$ coordinates in $\C$ is an information set.

(iv) Every set of $n-k$ coordinates in $\C^{\perp}$ is an information set.

## Gilbert Lower Bound

We've now defined a few upper bounds, and we now turn to a handy lower bound, namely the *Gilbert Lower Bound*.

Theorem.

$$ B_q(n, d) \geq \frac{q^n}{\sum_{i=0}^{d-1} \ncr n i (q - 1)^i}$$
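As a quick numerical illustration, the Gilbert bound can be evaluated directly from the formula. The function name is an assumption of this sketch, and the result is rounded up because the size of a code is an integer.

```python
from math import comb


def gilbert_lower_bound(n, d, q=2):
    """Lower bound on B_q(n, d) from the Gilbert bound, rounded up."""
    volume = sum(comb(n, i) * (q - 1) ** i for i in range(d))
    return -(-(q**n) // volume)  # ceiling division


print(gilbert_lower_bound(7, 3))  # 128 / (1 + 7 + 21), rounded up
```

For $n = 7$, $d = 3$ and $q = 2$ the bound guarantees a linear code with at least $5$ codewords, which is far below the $16$ allowed by the sphere packing bound; the two bounds typically leave a gap like this.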

We notice that this lower bound is virtually the same as the sphere packing bound, apart from the direction of the inequality sign and the upper limit of the sum, which here is

*Proof.* Left for you. Simply argue that the spheres defined by the sum necessarily cover the entire vector space

## Examples

# Cyclic codes

## Definitions

Note that we are working with linear codes.

**Definition.** A **cyclic code** is a code $\C$ where every codeword $c$ has the property that when each element of $c$ is shifted one step forward (or backward), the resulting vector is also a codeword:

$$(c_0, c_1,\dots,c_{n-1}) \in \C \implies (c_{n-1}, c_0, c_1, \dots, c_{n-2}) \in \C$$

As shifting one direction

An alternative and very handy notation for cyclic codes (as we will see) is through polynomials.

**Definition.** Let $c = c_0c_1c_2\dots c_{n-1}$ be a codeword of some code. The **polynomial representation** of $c$ is the polynomial

$$c(x) = c_0 + c_1 x + c_2 x^2 + \dots + c_{n-1} x^{n-1}$$

The context of addition and multiplication of such polynomials are in the field

What is handy about this representation is that a cyclic shift of a codeword is the same as multiplying by

Clearly

**Theorem.** The cyclic codes over $\F_q$ are precisely the ideals of $\F_q[x] / (x^n - 1)$.

*Proof.* Omitted.
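The link between cyclic shifts and polynomial multiplication can be checked with a short sketch. Polynomials are stored as coefficient tuples $(c_0, \dots, c_{n-1})$, and the helper `poly_mul_mod` (an assumption of this example) multiplies in $\F_q[x]/(x^n - 1)$ for prime $q$ by letting exponents wrap around modulo $n$.

```python
def poly_mul_mod(a, b, n, q=2):
    """Multiply a(x) b(x) in F_q[x] / (x^n - 1); x^n is replaced by 1."""
    result = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[(i + j) % n] = (result[(i + j) % n] + ai * bj) % q
    return tuple(result)


n = 7
c = (1, 1, 0, 1, 0, 0, 1)      # c(x) = 1 + x + x^3 + x^6
x = (0, 1, 0, 0, 0, 0, 0)      # the polynomial x

shifted = poly_mul_mod(c, x, n)
print(shifted)
print(shifted == (c[-1],) + c[:-1])  # same as the cyclic shift of c
```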

We often write the latter field as

Analogously to generator matrices, we define the notion of a *generator polynomial*:

**Theorem.** Let $\C$ be a nonzero cyclic code, and let $g(x)$ be a monic polynomial of minimum degree in $\C$. Then:

(i) $\C = (g(x))$

(ii) $g(x) \mid (x^n - 1)$

**Theorem.** Let $\C$ be a cyclic code with generator polynomial $g(x)$. Let $k = n - \deg g(x)$ and let $g(x) = \sum_{i = 0}^{n-k}g_i x^i$. Then the dimension of $\C$ is $k$, and $\{g(x), xg(x), \dots, x^{k-1}g(x)\}$ is a basis for the code.

We can "convert" the generation of our code from polynomial form into matrix form by noting the following:

**Theorem.** Let $g(x)$ be the generator polynomial of a cyclic $[n, k]$ code $\C$, with coefficients $(g_0, g_1, \dots, g_{n-k})$. Then a generator matrix for our code is

$$G = \bmat{ g_0 & g_1 & \dots & g_{n-k} & 0 & \dots & 0 \\ 0 & g_0 & \dots & g_{n-k-1} & g_{n-k} & \dots & 0 \\ \vdots & & \ddots & & & \ddots & \vdots \\ 0 & \dots & 0 & g_0 & g_1 & \dots & g_{n-k}} $$
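The matrix in the theorem is simply the coefficient vector of $g(x)$ shifted one step per row. A minimal sketch, assuming the binary generator polynomial $g(x) = 1 + x + x^3$ and $n = 7$ as a hypothetical example ($g(x)$ divides $x^7 - 1$ over $\F_2$):

```python
def cyclic_generator_matrix(g, n):
    """Rows are the coefficient vectors of g(x), x g(x), ..., x^{k-1} g(x)."""
    k = n - (len(g) - 1)          # k = n - deg g
    return [[0] * i + list(g) + [0] * (k - 1 - i) for i in range(k)]


g = (1, 1, 0, 1)                  # g(x) = 1 + x + x^3, coefficients g_0, ..., g_3
G = cyclic_generator_matrix(g, 7)
for row in G:
    print(row)
```

Each row has length $i + (n - k + 1) + (k - 1 - i) = n$, and there are $k = 4$ rows, matching the dimension given by the previous theorem.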

## Connection to cyclotomic polynomials and cyclotomic cosets

**Theorem.** Let $n$ be a positive integer relatively prime to $q$. Let $t = \mathrm{ord}_n(q)$ and let $\alpha$ be a primitive $n$th root of unity in $\F_{q^t}$. Then for any integer $s$, the minimal polynomial of $\alpha^s$ over $\F_q$ is

$$M_{\alpha^s}(x) = \prod_{i \in C_s} (x - \alpha^i)$$

where $C_s$ is the $q$-cyclotomic coset of $s$ modulo $n$.
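The $q$-cyclotomic coset of $s$ modulo $n$ is the set $\{s, sq, sq^2, \dots\}$ reduced modulo $n$, so the cosets can be listed with a short loop. The function name below is an assumption of this sketch, and $\gcd(q, n) = 1$ is assumed as in the theorem.

```python
def cyclotomic_cosets(q, n):
    """All q-cyclotomic cosets modulo n, assuming gcd(q, n) = 1."""
    remaining = set(range(n))
    cosets = []
    while remaining:
        s = min(remaining)
        coset = []
        x = s
        while x not in coset:     # multiply by q until we return to s
            coset.append(x)
            x = (x * q) % n
        cosets.append(coset)
        remaining -= set(coset)
    return cosets


print(cyclotomic_cosets(2, 7))
```

By the theorem, the size of each coset $C_s$ is the degree of the minimal polynomial $M_{\alpha^s}(x)$.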

This leads us to the following theorem, which is what "ties it all together":

**Theorem.** Let $\C$ be a cyclic code with generator polynomial $g(x)$ over some field $\F_q$. If $\alpha$ is a primitive $n$th root of unity in some extension field of $\F_q$, then

$$ g(x) = \prod_{s} M_{\alpha^s}(x)$$

where the product is over some subset of the $q$-cyclotomic cosets modulo $n$.

In other words, the generator polynomial is necessarily a product of some minimal polynomials, which again are defined by the

**Definition.** The union of the $q$-cyclotomic cosets defining a generator polynomial $g(x)$ is called the **defining set** of the code $\C$ generated by $g(x)$. The defining set is commonly denoted $T$.

There might be several different such polynomials for any given field, in fact, we have the following theorem:

**Theorem.** The number of cyclic codes in $\RC_n$ is $2^m$, where $m$ is the number of $q$-cyclotomic cosets modulo $n$.

## Minimum distance of cyclic codes

Finally, we can state some bounds regarding the minimum distance of cyclic codes.

**Theorem (BCH-Bound).** Let $\C$ be a cyclic code of length $n$ over $\F_q$ with defining set $T$. Let the code have minimum weight $d$. If $T$ contains $\delta - 1$ consecutive elements (e.g. $1, 2, 3, \dots$) for some integer $\delta$, then $d \geq \delta$.

## Examples

**Example. (Exer. 2 Exam 2013)**

Let

(i) What is the dimension and minimum distance of

(ii) If you receive

**Solution.**

(i) We know that the dimension of a cyclic code is

Any linear code can always correct

(ii) Obviously any codeword

Thus we have

Thus the error was in the

## Encoding and decoding

# BCH codes

From the BCH-Bound defined above, we notice that if we can find a defining set of some code with

**Definition.** A **BCH-code** of length $n$ with **designed distance** $\sigma$ is a cyclic code $\C$ over $\F_q$ with defining set

$$T = C_b \cup C_{b+1} \cup \dots \cup C_{b + \sigma - 2}$$

where $C_i$ is the $q$-cyclotomic coset modulo $n$ containing $i$.

Clearly

**Theorem.** A BCH-code with designed distance $\sigma$ has minimum weight at least $\sigma$.

As is clear from the BCH-bound.
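A defining set of this form can be computed mechanically from the cyclotomic cosets. The sketch below does this for a hypothetical binary example with $n = 15$, $b = 1$ and designed distance $5$; the function names are assumptions. Since the generator polynomial has degree $|T|$, the dimension of the code is $n - |T|$.

```python
def coset(s, q, n):
    """q-cyclotomic coset of s modulo n, assuming gcd(q, n) = 1."""
    result, x = [], s % n
    while x not in result:
        result.append(x)
        x = (x * q) % n
    return result


def bch_defining_set(n, q, b, designed_distance):
    """Union of the cosets C_b, C_{b+1}, ..., C_{b + designed_distance - 2}."""
    T = set()
    for i in range(b, b + designed_distance - 1):
        T.update(coset(i, q, n))
    return sorted(T)


T = bch_defining_set(n=15, q=2, b=1, designed_distance=5)
print(T)             # [1, 2, 3, 4, 6, 8, 9, 12]
print(15 - len(T))   # dimension n - |T| = 7
```

The set contains the four consecutive elements $1, 2, 3, 4$, so the BCH-bound gives a minimum weight of at least $5$ for this $[15, 7]$ code.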

## Reed-Solomon codes

One family of very useful BCH-codes are the Reed-Solomon codes:

**Definition.** A **Reed-Solomon** code is a BCH-code $\C$ over $\F_q$ of length $n = q-1$.

This implies first and foremost that all irreducible factors of

**Theorem.** A **Reed-Solomon** code with designed distance $\sigma$ has defining set

$$T = \{b, b+1,\dots,b+\sigma - 2\}$$

# Convolutional codes

## Definition

Up until now, all codes we have discussed have been *block codes*. For any input of length **memory**

**Example.** Let

This code takes one-dimensional input and produces two-dimensional output, i.e. it is a

Thus our messages are

## Matrices

There is a very nice way of representing a convolutional message through polynomials. We let

As usual we call this a **generator matrix**.

This notation is very handy for generating codewords, as indeed we can map our messages to polynomials, and multiply these with the matrix in the obvious way.
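Encoding by polynomial multiplication can be sketched as follows for the binary case. Polynomials over $\F_2$ are stored as integer bitmasks (bit $i$ is the coefficient of $x^i$); the generator matrix `G` and the message are hypothetical, and `clmul` is a helper name introduced for this example.

```python
def clmul(a, b):
    """Carry-less product: multiplication in F_2[x] with polynomials as bitmasks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


# Hypothetical 1 x 2 generator matrix G = [1 + x + x^2, 1 + x^2].
G = [0b111, 0b101]
message = 0b1101  # u(x) = 1 + x^2 + x^3

codeword = [clmul(message, g) for g in G]
print([bin(c) for c in codeword])
```

Here $u(x)(1 + x + x^2) = 1 + x + x^5$ and $u(x)(1 + x^2) = 1 + x^3 + x^4 + x^5$, one output polynomial per column of `G`.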

**Example.** Let

Say we want to encode the message

Encoding this message is done by performing the calculation

I.e. the codewords

## Encoding & State diagrams

## Decoding using Viterbi algorithm

## Degrees & Canonical generator matrices

### External degree

**Definition.** Let $G$ be a generator matrix for a convolutional code. The **degree** of the $i$'th row of $G$ is the degree of the highest-degree polynomial of that row.

**Example.** Let

The degree of the first row is

**Definition.** Let $G$ be a generator matrix for a convolutional code. The **external degree** of the generator matrix is the sum of the degrees of all the rows of $G$.

**Example.** Let

The external degree of

**Definition.** Let $G$ be a $k \times n$ polynomial generator matrix for a convolutional code. A $k \times k$ **minor** of $G$ is the determinant of the $k \times k$ submatrix formed by some $k$ columns of $G$.

**Example.** Let

Note that

### Internal degree

**Definition.** Let $G$ be a $k \times n$ polynomial generator matrix for a convolutional code. The **internal degree** of $G$ is the maximum degree of all its $k \times k$ minors.
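Both degrees can be computed directly from their definitions for small binary matrices. In the sketch below, polynomials over $\F_2$ are again stored as integer bitmasks, the matrix `G` is a hypothetical example, and the determinant is expanded over all permutations (signs vanish in characteristic $2$), which is only sensible for small $k$.

```python
from itertools import combinations, permutations


def clmul(a, b):
    """Multiplication in F_2[x] with polynomials stored as bitmasks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def degree(p):
    """Degree of a bitmask polynomial (the zero polynomial gets -1 here)."""
    return p.bit_length() - 1


def external_degree(G):
    """Sum over the rows of the largest entry degree in each row."""
    return sum(max(degree(p) for p in row) for row in G)


def minor(G, columns):
    """Determinant over F_2[x] of the k x k submatrix on the given columns."""
    det = 0
    for perm in permutations(columns):
        term = 1
        for i, j in enumerate(perm):
            term = clmul(term, G[i][j])
        det ^= term
    return det


def internal_degree(G):
    """Maximum degree among all k x k minors."""
    k, n = len(G), len(G[0])
    return max(degree(minor(G, cols)) for cols in combinations(range(n), k))


# Hypothetical 2 x 3 matrix: rows (1 + x, x, 1) and (x, 1 + x, 1).
G = [[0b11, 0b10, 0b01], [0b10, 0b11, 0b01]]
print(external_degree(G), internal_degree(G))
```

For this matrix every $2 \times 2$ minor equals $1$, so the internal degree is $0$ while the external degree is $2$; adding the first row to the second gives the row $(1, 1, 0)$ and lowers the external degree without changing the code.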

### Canonical form

**Definition.** A generator matrix $G$ of a code $\C$ is said to be in **canonical form** if it has the least external degree among all polynomial generator matrices of $\C$.