This is an old version of the compendium, written Aug. 8, 2016, 10:11 a.m. Changes made in this revision were made by trmd. View rendered version.

TMA4185: Coding Theory

$$ \newcommand{\Gr}{\mathscr{G}} \newcommand{\Ri}{\mathscr{R}} \newcommand{\Z}{\mathbb{Z}} \newcommand{\Q}{\mathbb{Q}} \newcommand{\R}{\mathbb{R}} \newcommand{\RC}{\mathcal{R}} \newcommand{\N}{\mathbb{N}} \newcommand{\F}{\mathbb{F}} \newcommand{\C}{\mathcal{C}} \newcommand{\O}{\mathcal{O}} \newcommand{\H}{\mathcal{H}} \newcommand{\div}{\,\vert\,} \newcommand{\ndiv}{\,\not\vert\,} \newcommand{\neq}{\not=} \newcommand{\b}{\textbf} \newcommand{\bmat}[1]{\begin{bmatrix}#1\end{bmatrix}} \newcommand{\barr}[2]{\left[\begin{array}{#1}#2\end{array}\right]} \newcommand{\ceil}[1]{\left\lceil #1 \right\rceil} \newcommand{\floor}[1]{\left\lfloor #1 \right\rfloor} \newcommand{\ncr}[2]{{#1 \choose #2}} $$ # Finite fields We get the preliminaries out of the way first. You are expected to know [Group theory](https://www.wikipendium.no/TMA4150_Algebra_og_tallteori/nb/) and [Ring theory](https://www.wikipendium.no/MA3201_Ringer_og_Moduler/nb/#introduksjon-ringer) before diving into this. Reading [this chapter](https://www.wikipendium.no/MA3202_Galoisteori/nb/#definisjoner-teoremer) of Galois Theory is probably also helpful. We reiterate some of the most useful points of those two courses. ## Definiton >**Definition.** A **finite field** $\F$ is a [Ring](https://www.wikipendium.no/MA3201_Ringer_og_Moduler/nb/#introduksjon-ringer) where multiplication and addition are both abelian, and each element has a multiplicative inverse (i.e. it is a division ring), and the field has a finite set of elements. The group of nonzero elements of a field $\F$ is denoted $\F^*$, and form a group under addition. If a field has $q$ elements, we write $\F_q$. >**Definition.** $\F_q^*$ is a **cyclic group**, i.e. there is an $p \in \F_q$ such that > $$ \F_q^* = \{1, p, p^2, \dots, p^{m-1}\} $$ > $m$ is called the **order** of the cyclic group $\F^*$, and the element $p$ is called **generator** of $\F^*$. Every generator of a group $\F_q^*$ derived from a field $\F_q$ is called a **primitive element** of $\F_q$.

~~Consequently, for some $p$~~ ## Polynomials

### Definition Given some field $F_q$, we can create the field of polynomials over $\F_q$, $\F_q[x]$ where we call $x$ an indeterminate. >**Definition.** Given a field $\F_q$ the **polynomial field** over $\F_q$ is the field of all polynomials >$$a_n x^n + a_{n-1}x^{n-1} \dots + a_1 x + a_0,\> a_i \in \F_q$$ > under the operations of addition and multiplication of polynomials in the standard way. The highest $n$ for which $a_n \neq 0$ is called the **degree** of a polynomial. If this $a_n = 1$ the polynomial is called **monic**. >**Definition.** A polynomial $f(x)$ is said to be **irreducible** over a field $\F_q$ if there exists no polynomials $g, h$ of degree lower than $f$ such that $f(x) = g(x)h(x)$. ### Usage to construct finite fields Interestingly, we can use the above definition to construct fields of certain sizes. >**Theorem.** Let $f(x)$ be an irreducible polynomial over a field $\F_p$ of degree $m$. Then the residue classes of $\F_p$ modulo $f(x)$ form a field of degree $q = p^m$: > $$ \F_q = \F_p[x] / (f(x)) $$ **Example.** Let $f(x) = x^3 + x^2 + 2$ over $\F_3$. $f(x)$ is irreducible, as $f(0) = 2, f(1) = 1, f(2) = 8 + 4 + 2 = 14 \equiv 2 \mod 3$. Thus $\F_3 / (x^3 + x^2 + 2)$ is a field with $3^3 = 27$ elements. ## Cyclotomic polynomials [#] TODO # Motivation [#] TODO # Basics ## Codes We wish to send some message $m$ from a source, over a channel, and receive it at some recipient. To do this we encode it, creating a new message $c$ (often denoted by $x$ aswell). This message hopefully is of such a character that it allows us to succesfully decode $m$ at the recipient. Before we go about constructing an encoding scheme, we make a few definitions: >**Definition.** A **code** $\C$ is a set of **codewords**, each a tuple of some length $n$; $(a_1, a_2\ldots a_n)$. A code $\C$ is said to be an $(n, M)$ code, if each codeword is of length $n$ and the number of codewords in $\C$ is $M$. Put in other words, we say that a code describes a set of possible encodings of inputs (messages). Typically, the codewords are $n$-tuples over some field $\F_q$. If the field is $\F_2$ we call the code a **binary code**. If the field is $\F_3$ we call the code a **ternary code**. We often write a codeword $(a_1, a_2, a_3 \dots)$ in a concatenated form $a_1a_2a_3\dots$. In the most general of terms, this is all the structure necessary required to define codes. However, we almost exclusively work with *linear* codes: >**Definition.** A **linear code** is a code where all linear manipulations of one or more codewords result in a new codeword, i.e. for all $\b x, \b y \in \C$: > (i) $ax \in C,\> a \in \F$ > (ii) $x + y \in C$. **Example.** Let $\C = \{000, 110, 011, 101, 111\}$ be a code over $\F_2$. It is a linear code as can be easily verified. ## Generator matrices We typically convert a message $x$ into a codeword $c$ through the means of a **generator matrix**. Such a matrix also imposes a (so far undefined, but definitely necessary) length requirement on the messages: **Definition.** A **generator matrix** $G$ of a code $\C$ is any matrix of dimension $n \times k$, where $n$ is the length of each codeword, and $k$ is the length of each message, whose rows form a basis for $\C$. Equivalently, we can say that the rows of $G$ must span $\C$. **Example.** Let $$G = \bmat{1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1}$$ be a generator matrix of a code $\C$ over the field $F_2$. The matrix takes as "input" any message of length 3, and outputs a code of length 4. For instance take a message $x = [1, 0, 1]$. This encodes as $xG = [1, 1, 1, 0]$. We can calculate the entirity of $\C$ by combining the rows of $G$ in arbitrary fashion: $$ \C = \{0000, 1101, 1010, 0011, 1110, 0111, 1001, 0100\} $$ We note to no surprise that the amount of distinct codewords in $\C$ is the same as the amount of possible inputs: $2^3 = 8$. If a linear code takes input of length $k$ and procudes codewords of length $n$, we say that the code has dimension $k$ and is a $[n, k]$ code. >**Definition.** A generator matrix is said to be in **standard** form if it is on the form $[I_k \mid A]$. **Example.** The matrix $$\barr{cc|cc}{ 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1}$$ is on standard form. The matrix $$\barr{cc|cc}{1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1}$$ is not. ## Parity check matrices An interesting property of a linear code is that as it is a subspace of some vector space (in particular, of the subspace $F_q^n$), it is the *kernel* of some linear transformation $H$. For each codeword $x \in \C$, $Hc^T = \b 0$. So why is this of any importance to us? Looking back at our motivation for defining error-correcting codes, it is precisely that of detecting errors. As $H$ (the construction of which we haven't touched on yet) multiplied by any codeword will yield $\b 0$, we know that if a received message of some code does not yield $\b 0$ on multiplication with $H$, then some error has occured. In fact, if our code is cleverly constructed, we can also figure out which error has occured, if the amount of errors are not too large. **Definition.** The **Parity-check matrix** $H$ of a code $C$ is a $(n-k) \times n$ matrix, defined by $H\b x^T = \b 0,\> \forall x \in \C$. If we have a generator matrix on standard form, we have a nice way of figuring out $H$: **Theorem.** If $G = [I_k \mid A]$ then $H = [-A^T \mid I_{n-k}]$. *Proof.* Clearly $HG^T = [-A^T \mid I_{n-k}]\cdot [I_k \mid A]^T =-A^T + A^T = \mathcal{O}$. **Example.** Let $$G = \barr{ccc|cc}{ 1 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 & 0 }$$ be the generator matrix of some code $\C$ over $F_2$. Then $\C$ has parity check matrix (note that $-1 = 1 \mod 2$) $$H = \barr{ccc|cc}{ 1 & 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 }$$ Finally we note (without proof) that the rows of $H$ is also necessarily independent. ## Dual codes **Definition.** Let $G$ be the generator matrix of some code $\C$ with parity check matrix $H$. Then the code defined by the rows of $H$ is called the dual code of $G$, and is denoted $G^{\perp}$. A code is said to be **self-orthagonal** if $G \subseteq G^{\perp}$, and **self-dual** if $G = G^{\perp}$. ## Weights and distances One interesting property of codes is that of *weight* and *distance*. >**Definition.** The (Hamming) **distance** $d(x, y)$ between two codewords $\b x$ and $\b y$ of some code $\C$ is defined as the number of coordinates in which $\b x$ and $\b y$ differ. $ $ >**Definition.** The (Hamming) **weight** $wt(x, y)$ of a codeword $\b x$ is defined as the number of non-zero coordinates of $\b x$. We make the following observation: >**Theorem.** If $x, y \in \F_q^n$ then $d(x, y) = wt(x - y)$. *Proof.* Should be fairly clear: simply observe that $x-y$ is zero in all coordinates where $x$ and $y$ are similar. >**Definition.** The **minimum distance** of a code $\C$ is the minumum distance between any two codewords of $\C$. $ $ >**Theorem.** If $\C$ is a linear code, then the minimum distance of $\C$ is the same as the minimum weight of the non-zero codewords of $\C$. *Proof.* The distance between a minimally weighted codeword $\b c$ and $\b 0$ is clearly the weight of $\b c$. If an $[n, k]$ code has minimum distance $d$, we say that the code is an $[n, k, d]$-code. There is one particularly useful way to determine the minimum distance of a code when its parity check matrix $H$ is known. Let $\C$ be a code with minimum weight $d$, and suppose $\b c$ is a codeword of $\C$. We notice that as $\b c$ is a codeword, $H\b c^t = 0$. Thus the columns of $H$ defined by the non-zero coordinates of $\b c$ must be linearly dependent, as they sum to $\b 0$ when multiplied by $\b c^t$. Say now, that $H$ contains a set of $d-1$ columns of $H$ that are linearly dependent. Then they in some fashion multiply to $0$, which also means that they determine a codeword of weight $d-1$, contradicting the fact that the code is of minimum weight $d$. Thus no such set of coordinates exist. This is nicely wrapped up in the following theorem: >**Theorem.** If $\C$ is a linear code with parity check matrix $H$ which has a set of $d$ linearly dependent columns, but no set of $d-1$ linearly dependent columns, then the minimum weight of $\C$ is $d$. *Proof.* Follows immediately from the above discussion. ## Puncturing and extending codes Having already created some code $\C$, a simple way to create 'new' codes from $\C$ is through either *puncture* or *extension*. ### Puncturing As might be clear from the name, **puncturing** a code implies removing some $i$'th coordinate from all codewords in the code. What might the minimum distance $d$ and dimension $k$ of a punctured code? >**Theorem.** Let $\C$ be a linear code, and let $\C^*$ be the code $\C$ punctured in the $i$'th coordinate. Then: > (i) If $\C$ has a codeword of minumum weight with a non-zero entry in the punctured coordinate $i$, the minimum distance of the new code decreases by $1$: $\C^*$ is an $[n - 1, k, d - 1]$ code. > (ii) If $\C$ has minimum distance of $1$, and there are two codewords of $\C$ which differ only in the $i$'th coordinate, the new code $\C^*$ has dimension $k-1$: $\C^*$ is an $[n - 1, k-1, d^*]$, with $d^* > 1$. We proof this informally be giving some intuition through an example: Let $\C$ be the binary code generated by $$G = \bmat{1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1}$$. We see that the minimum distance of this code is $2$ as all the codewords are $\{0000,1100,0011,1111\}$. Let $\C^*$ be the code punctured in the 3rd coordinate. The new code has generator matrix $$G^* = \bmat{1 & 1 & 0 \\ 0 & 0 & 1}$$. The minimum weight of this code is $1$, as the codewords are $\{000, 001, 110, 111\}$. E.g. the distance between $110$ and $111$ is $1$. Contrarily, let $\C$ be the binary code generated by $$G = \bmat{1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1}$$. The minimum distance of this code is $2$. Puncturing this code in the third coordinate gives us a code with generator matrix $$G^* = \bmat{1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1}$$. As already noted, the minimum distance of this code is also $2$. For a demonstration of the second property of the theorem, let $\C$ be the binary code with the generator matrix $$G = \bmat{1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1},$$ with dimension $2$. The codewords of this code are $\{0000, 0111, 1000, 1111\}$. Puncturing this code in the first coordinate, we get the code with generator matrix $$G^* = \bmat{1 & 1 & 1 \\ 1 & 1 & 1} \sim \bmat{1 & 1 & 1}..$$ Clearly this code has dimension $1$. ### Extension We can in similar fashion extend a code by adding a coordinate. Typically, we do this to get a new code consisting of only even-like codewords (codewords whose length are even). Clearly, the inverse of the previous theorem applies for extension. **Example.** Take the binary code generated by the matrix $$G = \bmat{1 & 0 & 1 \\ 0 & 1 & 1}$$ The code is a $[3, 2, 2]$ code. We extend the code by adding a new coordinate with column vector $\bmat{1 \\ 0}$ in between the second and third coordinate: $$G^* = \bmat{1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1}$$, as can be checked, this is also a $[4, 2, 2]$ code. We see that the minimum distance has not decreased, as the extended code $G^*$ has a minimum weight codeword without a $1$ in the second coordinate: $1101 + 0101 = 1000$. In other words, to figure out the dimension and minimum distance of the extended code, simply apply the previous theorem in reverse. ## Direct sum of codes We can create larger codes from smaller codes by concatenating them through *direct sum*: >**Definition.** Let $G_1$ and $G_2$ be generator matrices for linear codes $\C_1$ and $\C_2$ with parity check matrices $H_1$ and $H_2$. The **direct sum** of the codes $\C_1$ and $C_2$ is the code with generator- and parity check matrix > $$G = \bmat{G_1 & \O \\ \O & G_2}\quad H = \bmat{H_1 & \O \\ \O & H_2}$$ > The new code is a $[n_1+n_2, k_1+k_2, \min{d_1, d_2}$ code. Sadly, as the minimum distance of the new code is no larger than any of the original codes, the new code is of little use (as we will see). ## The $( u \mid u + v )$-construction In somewhat similar fashion, we can concatenate two codes using what is called the $(\b u \mid \b u + \b v)$ construction. >**Definition.** Let $G_1$ and $G_2$ be generator matrices for linear codes $\C_1$ and $\C_2$ of the same length $n$ (but not necessarily same dimension) with parity check matrices $H_1$ and $H_2$. The $\b{(u} \mid \b{u + v)}$ construction of the codes $\C_1$ and $C_2$ is the code $\C$ with generator- and parity check matrix > $$G = \bmat{G_1 & G_1 \\ \O & G_2}\quad H = \bmat{H_1 & \O \\ -H_2 & H_2}$$ > The new code is a $[2n, k_1+k_2, \min{2d_1, d_2}]$ code. We can also write the code $\C$ more concisely, by $$\C = \{ (\b u, \b u + \b v) \mid u \in \C_1,\> v \in \C_2\}.$$ ## Equivalence of codes There are multiple ways in which codes can be "equivalent", or "essentially" the same. One way is to assume equality as vector spaces (i.e. the vector spaces are isomorphic), however in this case, it should be fairly clear that properties such as weight might not be retained across the isomorphism - we could for instance create some isomorphism mapping all vectors of weight $2$ to some vector of weight $3$ and vica-versa under the only constraint that the vector spaces should be isomorphic. Thus, we look at a different class of equivalence, namely *permutation equivalence*: >**Definition.** Two codes $\C_1$ and $\C_2$ are permutation equivalent if there is a permutation of coordinates which sends $\C_1$ to $\C_2$. **Example.** Let $\C_1$ and $\C_2$ be two binary codes defined by the generator matrices $$ G_1 = \bmat{1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1},\quad G_2 = \bmat{0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1} $$ We see that we can acquire $G_2$ from $G_1$ by switching the first and second coordinate. Thus $C_1$ and $C_2$ are permutation equivalent. Granted that two permutation codes are "essentially the same", we can for any generator matrix of some code, acquire a generator matrix on standard form of some permutation equivalent code. This is indeed handy for finding parity check matrices for an essentially alike code. **Example.** Let $\C$ be the code with generator matrix $$G = \bmat{1 & 1 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1}$$ Clearly, this matrix can not be manipulated by row-operations into a matrix on standard form, as we we never be able to "separate" the first and second coordinates. However, we can create a permutation equivalent code $\C^*$ by swapping the second and fourth coordinate: $$G^* = \bmat{1 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 & 1}$$ This matrix is row-reducible to standard-form: $$ \begin{align} G^* = \bmat{ 1 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 & 1} \sim \bmat{ 1 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0} \sim \barr{ccc|cc}{ 1 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1} \end{align} $$ This matrix is now on standard-form. Consequently we can easily find its parity-check matrix: $$ H^* = \barr{ccc|cc}{ 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 1 } $$ ## Encoding and decoding Before we start defining some codes, and more complex methodologies, we look briefly on the process of encoding and decoding some messages. ### Simple example Say we have the code $\C$ defined by the generator matrix $$G = \bmat{1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1}.$$ As the matrix is on standard form, we can easily find its parity check matrix $$H = \bmat{1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1}$$. Lets say that we encode the message $x = 01$ using $G$. We get $$xG = \bmat{0 & 1}\bmat{ 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 } = 0101 $$ This is our codeword $c$. Say we transmit this code, receive it at some endpoint, and wish to verify that the received message is a codeword. We feed $0101$ into $H$, which gives us $$ H\b c^T = \bmat{ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1}\bmat{0 \\ 1 \\ 0 \\ 1} = \b 0 $$ Thus we know that, while we can't rule out that the message has been completely mangled, at least the received message *is* a codeword. If the likelyhood of *one* error during transmission is small, we know that the message with great probability is intact. As $G$ is in standard form, clearly $c_1 c_2 = x_1 x_2$, and the original code is easily retrieved. Take now a received encoded message $c_2 0111$. Again, we wish to figure out if some error has occured during transmission. We multiply by $H$: $$ H\b c_2^T = \bmat{ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1}\bmat{0 \\ 1 \\ 1 \\ 1} = \bmat{1 & 0} \neq \b 0 $$ Thus some error has occured during transmission. Can we retrieve the error? Well if we assume that only one error has occured, it is clear that it must have happened in the third coordinate, as we know $0101$ is a codeword, and changing any other coordinate will simply produce more errors. (Try for instance $0110$ or $1111$). Thus we know that the most likely original codeword was $0101$. ## Nearest neighbor decoding One technique for decoding erroneous message is *nearest neighbor decoding*. >**Definition.** Let $x$ be a received message of some code $\C$. Using **nearest neighbor decoding** to decode $x$ into its "most likely sent message* is done finding the $c$ satisfying the following equation: > $$\min_{c \in \C} d(x, c)$$ In other words, we decode $x$ into the codeword which lies closest to $x$ in regards to Hamming distance. ### Example with Hamming code [#] TODO ## Extra Examples [#] TODO # Some codes We now introduce some typical codes. ## Binary repetition code Perhaps the most obvious and simple code to construct, is the **binary repetition code**. Lets say we want to transmit the code $001$. We encode the message by repeating each digit $p$ times. For simplicity, lets take $p$ to be $4$ in this example, leading to the codeword $c = 000000001111$. We can use this code to detect up to $p-1$ errors, as any permutation of $p-1$ codewords necessarily does not yield a new codeword. The binary repition code is a $[k\cdot p, k, k]$ code. Using nearest neighbor decoding, we can correct up to $\floor{(n-1)/2}$ errors. Due to the sheer size of the generation matrix, as well as the superfluity of data having to be transmitted, binary repetition codes are seldom used. ### Properties || Length || $k\cdot p$ || || Dimension || $k$ || || Minimum distance || $k$ || ## Hamming codes And important class of codes are the **Hamming codes**. They are determined uniquely by their parity check matrices. Take for instance the binary $[7, 4]$ Hamming code, denoted $H_3$. It has generator matrix $$ G = \bmat{ 1 & 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1} $$ Which gives us the parity-check matrix $$ H = \bmat{ 0 & 1 & 1 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 & 1 \\ } $$ What is noticeable about this parity-check matrix, is that its column vectors are all non-zero binary vectors of length $3$. This property is the defining one of Hamming codes, and generalizes to any $[2^r - 1, n - r] = [2^r - 1, 2^r - r - 1]$ code, where $r \in \N$. Say we want to define a $[3, 2]$ hamming code, $\H_2$. The parity check matrix of this code must then be $$ H = \bmat{1 & 1 & 0 \\ 1 & 0 & 1} $$ From the last theorem of the section [on weights](#weights_and_distances), it is clear that any hamming code must have minimum distance $3$, as there are always a set of three linearly dependent columns; take for instance the columns $\bmat{1 & 0 & & 0 & \cdots}^T$, $\bmat{0 & 1 & & 0 & \cdots}^T$ and $\bmat{1 & 1 & 0 & \cdots}^T$. Interestingly, we can state the following: >**Theorem.** Any $[2^r - 1, 2^r - r - 1, 3]$ binary code is equivalent to the binary Hamming code $\H_r$. ### Properties || Length || $2^r - 1$ || || Dimension || $2^r - r - 1$ || || Minimum distance || $3$ || ## Reed-Muller codes The **Reed-Muller** class of codes are constructed using the $(\b u \mid \b u + \b v)$ construction, in the following way: Let $G(0, m) = \bmat{1 & 1 & \dots & 1}$ (with $2^m$ $1$s), and $G(m, m) = I_{2^m}$. We then define the general generator matrix of the $(r,m)$ **Reed-Muller** code: $$ G(r, m) = \bmat{G(r, m-1) & G(r, m-1) \\ \O & G(r-1,m-1)} $$ We can describe this code directly, by $$ \mathcal{R}(r,m) = \{ (\b u \mid \b u + \b v) \mid \b u \in \mathcal{R}(r, m-1),\>\b v \in \mathcal{R}(r-1, m-1)\} $$ ### Properties || Length || $2^m$ || || Dimension || $\ncr m 0 + \ncr m 1 + \dots + \ncr m r$ || || Minimum weight(distance) || $2^{m-r}$ || ### Example Lets build $G(1,2)$. $$ G(1, 2) = \bmat{G(1, 1) & G(1, 1) \\ \O & G(0, 1)} = \barr{cc|cc}{ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\\hline 0 & 0 & 1 & 1 } $$ ## Golay codes [#] TODO # Bounds on codes It is interesting for us to study the size of codes, i.e. how many codewords can possibly exist in specific codes. ## $A_q(n,d)$ and $B_q(n, d)$ We base our following work around bounds relating to two values: The size of the **maximal $[n, k]$ code**, $A_q(n,d)$, and the size of the **maximal linear $[n, k]$ code**, $B_q(n, d)$. ## Sphere packing bound and perfect codes ### Spheres and sphere packing Before establishing a bound, we introduce the notion of a *sphere* around a codeword: >**Definition.** A sphere of size $r$ around the codeword $\b c $ of a code $\C$ is defined as: >$$S_r(u) = \{v \in \F_q^n \mid d(v, y) \leq r\}$$ $ $ >**Theorem.** A sphere of size $r$ around a codeword $\b c$ over the field $\F_q^n$ has >$$\sum_{i=0}^r \ncr n i (q - 1)^i$$ > codewords. *Proof.* For each $s < r$ we can choose $s$ coordinates in $\ncr n s$ different ways. These coordinates can vary in $(q-1)^s$ different ways. The result follows. >**Definition.** The **packing radius** of a code $\C$ is the largest value of $r$, such that when placing spheres of size $r$ around all codewords of $\C$, all the sphere remain disjoint. We are now ready to state the sphere packing bound: ### Sphere packing bound >**Theorem.** The maximal size of any code $\C$ over $F_q^n$, $A_q(n, d)$, is contrained by the following equation: >$$B_q(n, d) \leq A_q(n, d) \leq \frac{q^n}{\sum_{i=0}^{t} \ncr n i (q - 1)^i},\quad t = \floor{(d-1)/2}$$ *Proof.* We prove this somewhat informally, using the notions of the discussion above. First of all, we definitely know that $A_q(n,d) \leq q^n$, as this is the number of vectors in the vector-space $\F_q^n$. Furthermore, we know that as there are atleast a distance $d$ between each codeword, that bound cannot be particularly tight. We can tighten the bound by sphere packing: As each codeword is separated by a distance $d$, we can cover the codewords by sphere of size $t = \floor{(d-1)/2}$. These spheres contain $\alpha = \sum_{i=0}^{t} \ncr n i (q - 1)^i$ vectors. Thus we know that at most every $\alpha$ vector in $F_q^n$ is a codeword of $\C$ (With equality if our sphere packing fills the entire space $F_q^n$). The result follows. ### Perfect codes If the sphere packing bound yields equality for some code $\C$, then the code is called **perfect**. ### Some properties of $A_q(n,d)$ and $B_q(n, d)$ $$ \begin{align} A_q(n,d) &\leq A_q(n-1,d-1),\> B_q(n,d) \leq B_q(n-1,d-1) \\ A_q(n,n) &= B_q(n, n) = q \\ A_q(n,d) &\leq qA_q(n-1, d), \> B_q(n,d) \leq qB_q(n-1, d) \end{align} $$ ## MDS codes and the Singleton Upper Bound ### Singleton bound >**Theorem.** Let $d \leq n$, then the following bound is called the **Singleton Bound**: >$$A_q(n,d) \leq q^{n-d+1}$$ *Proof.* We know that $A_q(d, d) \leq q$ from the previous section. We also know that $$A_q(n, d) \leq qA_q(n-1, d) \leq q^2A_q(n-2, d) \leq \dots \leq q^{n-d}A_q(d,d) = q^{n-d+1}.$$ QED. From this, it can be shown that if a $[n, k, d]$ linear code over $\F_q$ exists, then $k \leq n - d + 1$. ### MDS codes A **maximum distance separable** (MDS) code is a code which satisfies the Singleton bound with equality. We have the following properties for MDS codes: >**Definition.** Let $\C$ be an $[n, k]$ code over $F_q$ with $k \geq 1$. Then the following are equivalent: >(i) $\C$ is an MDS code. >(ii) $\C^{\perp}$ is an MDS code. > (iii) Every set of $k$ coordinates in $\C$ is an information set. > (iv) Every set of $n-k$ coordinates in $\C^{\perp}$ is an information set. ## Gilbert Lower Bound We've now defined a few upper bounds, and we now turn to a handy lower bound, namely the *Gilbert Lower Bound*. >**Theorem.** >$$ B_q(n, d) \geq \frac{q^n}{\sum_{i=0}^{d-1} \ncr n i (q - 1)^i}$$ We notice that this lower bound is virtually the same as the sphere packing bound, minus the inequality sign's direction, as well as the numerator of the sum, which here is $d-1$. *Proof.* Left for you. Simply argue that spheres defined by the sum necessarily covers the entire vector space $\F_q^n$. ## Examples [#] TODO # Cyclic codes ## Definitions Note that we are working with linear codes. >**Definition.** A **cyclic code** is a code $\C$ where every codeword $c$ has the property that when each element of $c$ is shifted one step forward (or backward), the resulting vector is also a codeword: >$$(c_0, c_1,\dots,c_{n-1}) c \in \C \implies (c_{n-1}, c_0, c_1, \dots c_{n-2} \in \C$$ As shifting one direction $n-1$ times equals shifting the other direction we see that it is arbitrary whether we specify "forward" (right) or "backward" (left) shifts. An alternative and very handy notation (as we will see) of cyclic codes, are through polynomials. >**Definition.** Let $c = c_0c_1c_2\dots$ be a codeword of some code. The **polynomial representation** of $c$ is the polynomial >$$c[x] = c_0 + c_1 x + c_2 x^2 + \dots + c_{n-1} x^{n-1}$$ The context of addition and multiplication of such polynomials are in the field $F_q[x] / (x^n - 1)$. What is handy about this representation is that a cyclic shift of a codeword is the same as multiplying by $x$: $$ x \b c = x(c_0 + c_1 x + c_2 x^2 + \dots + c_n x^{n-1}) = c_0x + c_1x^2 + c_2 x^3 + \dots + c_{n-1} x^n) $$ Clearly $x^n = 1 \mod (x^n - 1)$, thus $x \b c$ has been cyclically shifted one position. >**Theorem.** The cyclic codes over $F_q$ are precisely the ideals of $\F_q[x] / (x^n - 1)$. *Proof.* Omitted. We often write the latter field as $$\mathcal{R}_n = \F_q[x] / (x^n - 1)$$ Analogously to generator matrices, we define the notation of a *generator polynomial*: >**Theorem.** Let $g(x)$ be a polynomial of a nonzero cyclic code $\C$, such that $g(x)$ is monic, of minimum degree in $\C$. Then: > (i) $\C = (g(x))$ > (ii) $g(x) \mid (x^n - 1)$ $g(x)$ is called the generator polynomial of $\C$. >**Theorem.** Let $\C$ be a cyclic code with generator polynomial $g(x)$. Let $k = n - \mathrm{deg} g(x)$ and let $g(x) = \sum_{i = 0}^{n-k}g_i x^i$. Then the dimension of $\C$ is $k$, and $\{g(x), xg(x), \dots, x^{k-1}g(x)\}$ is a basis for the code. We can "convert" the generation of our code from polynomial form into matrix form by nothing the following: >**Theorem.** Let $g(x)$ be our generator polynomial for a cyclic code $[n, k]$ code, $\C$ with coefficients $(g_0, g_1, \dots, g_{n-lk})$. Then a generator matrix for our code is >$$G = \bmat{ g_0 & g_1 & \dots & g_{n-k} & 0 \dots & \\ 0 & g_0 & \dots & g_{n-k-1} & g_{n-k} & \dots \\ 0 & \dots & & && \ddots} $$ ## Connection to cyclotomic polynomials and cyclotomic cosets >**Theorem.** Let $n$ be a positive integer relatively prime to $q$. Let $t = \mathrm{ord}_n(q)$ and let $\alpha$ be a primitive nth root of unity in $\F_{q^t}$. Let then $s$ be some integer, the minimal polynomial of $\alpha^s$ over $\F_q$ is defined as > $$M_{\alpha^s} = \prod_{i \in C_s} (x - \alpha^i)$$ > where $C_s$ is the q-cyclotomic coset of $s$ modulo $n$. This leads us to the following theore, which is what "ties it all together": >**Theorem.** Let $\C$ be a cyclic code with generation polynomial $g(x)$ over some field $\F_q$. If $\alpha$ is a primitive nth root of unity in some extension field of $\F_q$, then >$$ g(x) = \prod_{s} M_{\alpha^s}(x)$$ > where the product is over some subset of the $q$-cyclotomic cosets modulo $n$. In other words, the generating polynomial is necessarily a product of some minimial polynomials, which again are defined by the $q$-cyclotomic cosets modulo n. >**Definition.** The union of the $q$-cyclotomic cosets defining a generator polynomial $g(x)$ is called the **defining set** of the code $\C$ generated by $g(x)$. The defining set is commonly denoted $T$. There might be several different such polynomials for any given field, in fact, we have the following theorem: >**Theorem.** The number of cyclic codes in $\RC_n$ is $2^m$ where $m$ is the number of $q$-cyclotomic cosets modulo $n$. ## Minimum distance of cyclic codes Finally, we can state some bounds regarding the minimum distance of cyclic codes. >**Theorem (BCH-Bound).** Let $\C$ be a cyclic code of length $n$ over $\F_q$ with a defining set $T$. Let the code have minimum weight $d$. If $T$ contains $\delta - 1$ consecutive elements (e.g. $1, 2, 3 \dots$) for some integer $\delta$ then $d \geq \delta$. ## Examples **Example. (Exer. 2 Exam 2013)** Let $\C$ be a cyclic code of length $8$ over $F_9$. Let $\alpha \in \F_9$ be a solution to the equation $\alpha^2 + \alpha + 2 = 0$, and let the generating polynomial of $\C$ be $g(x) = (x - \alpha)(x - \alpha^2)(x - \alpha^3)$. (i) What is the dimension and minimum distance of $\C$? How many codes can it always correct? (ii) If you receive $y(x) = x^5 + \alpha x^4 + (\alpha + 2)x^3 + (\alpha + 1)x + (2\alpha + 1)$, deduce if one error has occured, and if so, where? **Solution.** (i) We know that the dimension of a cyclic code is $k = n - \mathrm{deg} g(x) = 8 - 3 = 5$. It is also clear that the defining set of the generator polynomial is $T = \{1, 2, 3\}$. There are 3 consecutive elements of $T$, and consequently the minimum distance is $4$. Any linear code can always correct $\floor{(d-1)/2)}$ errors, thus our cyclic code can always detect $1$ error. (ii) Obviously any codeword $c(x)$ generated from $g(x)$ fed with $\alpha$ equals $0$. Also, as we're given $\alpha^2 + \alpha + 2 = 0$, we have the following set of equations: $$ y(\alpha) = c(\alpha) + e_j \alpha^j,\> y(\alpha^2) = c(\alpha) + e_j \alpha^j \implies y(\alpha^2) = y(\alpha)a^j $$ Thus we have $$ y(\alpha^2)/y(\alpha) = \frac{\alpha^10 + \alpha^9 + \alpha^7 + 2\alpha^6 + 2\alpha^4 + \alpha^3 + \alpha^2 + 3\alpha + 1}{2\alpha^5 + \alpha^4 + 2\alpha^3 + 2\alpha^2 + \alpha^2 + \alpha + 2\alpha + 1} = 2\alpha + 1 = \alpha^2 $$ Thus the error was in the $x^2$ term. ## Encoding and decoding [#] TODO # BCH codes From the BCH-Bound defined above, we notice that if we can find a defining set of some code with $\sigma-1$ consecutive elements, then we have minimum weight at least $\sigma$. We can flip this around, and use it to our advantage: >**Definition.** A **BCH-Code** of length $n$ with **designed distance** $\sigma$, is a code $\C$ over $\F_q$ with defining set >$$T = \C_b \cup \C_{b+1} \cup \dots \cup C_{b + \sigma - 2}$$ > where $C_i$ is the $q$-cyclotomic coset modulo $n$ containing $i$. Clearly $T$ in the definiton contains the sequence $b\dots b + \sigma - 2$. >**Theorem.** A BCH-code with designed distance $\sigma$ has minimum weight $\sigma$ As is clear from the BCH-bound. ## Reed-Solomon codes One family of very useful BCH-codes are the Reed-Solomon codes: >**Defintion.** A **Reed-Solomon** code is a BCH-code $\C$ over $\F_q$ of length $n = q-1$. This implies first and foremost that all irreducible factors of $x^n - 1$ over $F_q$ is of degree 1. Consequently, each $q$-cyclotomic coset modulo $n$ has size 1. This gives us the following result: >**Theorem.** A **Reed-Solomon** code with designed distance $\sigma$ has defining set $$T = \{b, b+1,\dots,b+\sigma - 2\}$$. # Convolutional codes ## Definition Up until now, all codes we have discussed have been *block codes*. For any input of length $k$, we create a block of length $n$. Convolutional codes however, are created by letting a codeword depend on $M$ pieces of sequential input $x$. Say we want to encode the message $\b x (i), \b x(i+1), \dots$. Then if we have a convolutional code with **memory** $M$, each codeword created by the code depends not only on each $x(i)$, but the $M-1$ preceding messages. **Example.** Let $\C$ be a ternary convolutional code defined by the equations $$ c_1(i) = x(i) + x(i-1) c_2(i) = x(i-1) - x(i-2) $$ This code takes one-dimensional input and produces two-dimensional output, i.e. it is a $[n, k]$ code. Lets try to encode the message $121$: $$c(0) = (1+0, 0+0), c(1) = (2+1 = 0, 1 + 0), c(2) = (0, 1+2 = 0)$$ Thus our messages are $\b c_1 = 100, \b c_2 = 010$, giving our message $c = 100010$. ## Matrices There is a very nice way of representing a convolutional message through polynomials. We let $D$ be an indeterminate, denoting "delay". Thus when we write $c = 1 + D + D^2$, what we mean is $c(i) = x(i) + x(i-1) + x(i-2)$. If we have multi-dimensional output (i.e. codewords), we can write this in matrix form, for instance $$ G = \bmat{1 + D & D - D^2} $$ As usual we call this a **generator matrix**. This notation is very handy for generating codewords, as indeed we can map our messages to polynomials, and multiply these with the matrix in the obvious way. **Example.** Let $\C$ be a binary convolutional code with generator matrix $$\bmat{1 + D + D^2 & 1 + D}$$ Say we want to encode the message $\b x = 11010$, this corresponds to the matrix $1 + D + D^3$. Encoding this message is done by performing the calculation $\b x G$, remember: $$ \b x G = \bmat{ (1 + D + D^2) \cdot (1 + D + D^3) & (1 + D)\cdot (1 + D + D^3)} = \bmat{1 + D^4 + D^5 & 1 + D^2 + D^3 + D^4} $$ I.e. the codewords $100011$ and $101110$. We interleave the messages to produce $100011$ and $101110$. ## Encoding & State diagrams [#] TODO ## Decoding using Viterbi algorithm [#] TODO ## Degrees & Cannonical generator matrices ### External degree >**Definition.** Let $G$ be a generator matrix for a convolutional code. The **degree** of any $i$'th row of $G$ is the degree of the highest-degree polynomial of that row. **Example.** Let $$G = \bmat{1 + D & 1 + D^2 \\ 1 + D^3 & 1 + D + D^2}$$ The degree of the first row is $2$ and the degree of the second row is $3$. >**Definition.** Let $G$ be a generator matrix for a convolutional code. The **external degree** of the generator matrix is the sum of the degrees of all the rows of $G$. **Example.** Let $$G = \bmat{1 + D & 1 + D^2 \\ 1 + D^3 & 1 + D + D^2}$$ The external degree of $G$ is $5$. >**Definition.** Let $G$ be a $k \times n$ polynomial generator matrix for a convolutional code. A $k \times x$ **minor** of $G$ is the determinant of some $k$ columns. **Example.** Let $$G = \bmat{1 + D & 1 + D + D^2 & 1 + D^2 \\ 1 + D^2 && 1 + D && 1}$$ $G$ has $3$ different $2 \times 2$ minors: $$ \begin{align} (1+D)(1+D^2) - (1+D^2)(1+D+D^2) = D^4 \\ (1+D)(1) - (1+D^2)(1+D^2) = D + D^4 \\ (1+D+D^2)(1) - (1+D)(1+D^2) = D^3 \end{align} $$ Note that $G$ has $\ncr{n,k}$ such minors. ### Internal degree >**Definition.** Let $G$ be a $k \times n$ polynomial generator matrix for a convolutional code. The **internal degree** of $G$ is the maximum degree of all its $k \times k$ minors. ### Cannonical form >**Definition.** If a generator matrix $G$ of a code $\C$ is the generator matrix of the code with the least external degree, the matrix is said to be on **cannonical form**. ## Catastrophic codes