
TDT4125: Algorithm Construction, Advanced Course

# Practical information

* The exam is held on the 22nd of May 2013 at 9 am.
* All printed or hand-written materials may be used during the exam (including the textbook, and even this compendium!). The Citizen SR-270X calculator is also permitted.
* [Old exams](http://algkons-wiki.idi.ntnu.no/index.php/Old_Exams)

# Curriculum / reading list

This year (2013), the curriculum is chapters 3, 4 and 6, except 4.3.6-4.5, of Hromkovič's _Algorithmics for hard problems_. The topics covered are:

* Deterministic approaches
    * Pseudo-polynomial-time algorithms
    * Parametrized complexity
    * Branch and bound
    * Lowering the worst-case complexity of exponential algorithms
    * Local search
    * Relaxation to linear programming
* Approximation algorithms
    * Fundamentals: stability, dual approximation etc.
    * Algorithm design: lots of approximations for known hard problems
* Heuristics
    * Simulated annealing
    * Tabu search
    * Genetic algorithms

# Formal languages

## Alphabet

> Any non-empty, finite set is called an **alphabet**. Every element of an alphabet $\Sigma$ is called a **symbol** of $\Sigma$.

Example alphabets:

* $\Sigma_{\text{bool}} = \left\{0, 1\right\}$
* $\Sigma_{\text{latin}} = \left\{a, b, c, d, e, \ldots, z\right\}$
* $\Sigma_{\text{logic}} = \left\{0, 1, (, ), \wedge, \vee, \neg, x\right\}$

## Word

> Let $\Sigma$ be an alphabet. A **word** over $\Sigma$ is any finite sequence of symbols of $\Sigma$. The **empty word** $\Lambda$ is the only word consisting of zero symbols. The set of all words over the alphabet $\Sigma$ is denoted by $\Sigma^\*$.

## Language

> Let $\Sigma$ be an alphabet. Every set $L \subseteq \Sigma^\*$ is called a **language** over $\Sigma$.

## $\text{Time}_A(x)$ and $\text{Space}_A(x)$

> Let $\Sigma_{\text{input}}$ and $\Sigma_{\text{output}}$ be alphabets. Let $A$ be an algorithm that realizes a mapping from $\Sigma_{\text{input}}^\*$ to $\Sigma_{\text{output}}^\*$. For every $x \in \Sigma_{\text{input}}^\*$, $\text{Time}_A(x)$ denotes the time complexity of the computation of $A$ on the input $x$, and $\text{Space}_A(x)$ denotes the space complexity of the computation of $A$ on $x$.

## Time complexity of an algorithm

> Let $\Sigma_{\text{input}}$ and $\Sigma_{\text{output}}$ be two alphabets. Let $A$ be an algorithm that computes a mapping from $\Sigma_{\text{input}}^\*$ to $\Sigma_{\text{output}}^\*$. The **worst-case time complexity of $A$** is the function $\text{Time}_A: (\mathbb{N} - \left\{0\right\}) \to \mathbb{N}$ defined by
> $$\text{Time}_A(n) = \max\left\{\text{Time}_A(x) \mid x \in \Sigma_{\text{input}}^n\right\}$$
> for every positive integer $n$.
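A small worked example of the last definition (ours, not the textbook's): consider an algorithm $A$ that compares every pair of symbols in its input, each comparison costing one time unit. Then

$$\text{Time}_A(x) = \binom{|x|}{2}, \qquad \text{Time}_A(n) = \max\left\{\text{Time}_A(x) \mid x \in \Sigma_{\text{input}}^n\right\} = \frac{n(n-1)}{2} = O(n^2).$$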
# Complexity classes

## Time

|| **Complexity class** || **Model of computation** || **Resource constraint** ||
|| DTIME($f(n)$) || Deterministic Turing machine || Time $f(n)$ ||
|| P || Deterministic Turing machine || Time $\text{poly}(n)$ ||
|| EXPTIME || Deterministic Turing machine || Time $2^{\text{poly}(n)}$ ||
|| NTIME($f(n)$) || Non-deterministic Turing machine || Time $f(n)$ ||
|| NP || Non-deterministic Turing machine || Time $\text{poly}(n)$ ||
|| Co-NP || Non-deterministic Turing machine || Time $\text{poly}(n)$ (languages whose complements are in NP) ||
|| NEXPTIME || Non-deterministic Turing machine || Time $2^{\text{poly}(n)}$ ||

## Space

|| **Complexity class** || **Model of computation** || **Resource constraint** ||
|| DSPACE($f(n)$) || Deterministic Turing machine || Space $f(n)$ ||
|| L || Deterministic Turing machine || Space $O(\log n)$ ||
|| PSPACE || Deterministic Turing machine || Space $\text{poly}(n)$ ||
|| EXPSPACE || Deterministic Turing machine || Space $2^{\text{poly}(n)}$ ||
|| NSPACE($f(n)$) || Non-deterministic Turing machine || Space $f(n)$ ||
|| NL || Non-deterministic Turing machine || Space $O(\log n)$ ||
|| NPSPACE || Non-deterministic Turing machine || Space $\text{poly}(n)$ ||
|| NEXPSPACE || Non-deterministic Turing machine || Space $2^{\text{poly}(n)}$ ||

# Cost measurement

Time efficiency estimates depend on what is defined to be a single step, which takes a single time unit to execute. Two cost models are generally used:

## Uniform-cost measurement

Every machine operation is assigned a single constant time cost. This means that an addition of two integers, for example, is assumed to take the same amount of time regardless of the size of the integers. This is often the case in practical systems for sensibly sized integers. This is the cost measurement used throughout Hromkovič.

## Logarithmic-cost measurement

Cost is assigned according to the number of bits involved in each operation. This is more cumbersome to use, and is therefore usually only applied when necessary.

# Decision problems

A decision problem is a question with a yes-or-no answer, depending on the input parameters. Formally, it is represented as a triple $(L, U, \Sigma)$ where $\Sigma$ is an alphabet, $L$ and $U$ are languages, and $L \subseteq U \subseteq \Sigma^\*$. When $U = \Sigma^\*$, which happens quite often, $(L, \Sigma)$ can be used as a shorthand. An algorithm $A$ solves $(L, U, \Sigma)$ if for every $x \in U$:

* $A(x) = 1$ if $x \in L$
* $A(x) = 0$ if $x \in U - L$

A decision problem is equivalent to a language.

## Famous NP-complete problems

This is a list of important or otherwise famous NP-complete problems. They are nice to know by heart, as they are assumed to be known, and can therefore be used freely in proofs.

### Hamiltonian cycle problem (HC)

The Hamiltonian cycle problem is the problem of determining whether a given graph contains a Hamiltonian cycle (a cycle that visits every vertex exactly once).

### Satisfiability problem (SAT)

The boolean satisfiability problem is the problem of determining whether a given boolean expression in CNF can be satisfied. An expression in CNF (conjunctive normal form) is an expression of the form $(A \vee B \vee \ldots) \wedge (\neg A \vee \ldots) \wedge \ldots$. SAT was the first known NP-complete problem.

### Clique problem (CLIQUE)

The clique problem is the problem of determining whether there exists a clique of size $k$ in a given graph $G = (V, E)$. A clique is a subset $C$ of $V$ such that all vertices in $C$ are connected to each other by edges in $E$.
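To make the decision-problem view concrete, here is a minimal brute-force decider for CLIQUE (our own sketch, not from the textbook). It answers yes/no for every instance, as the definition requires, but its running time is exponential:

```python
from itertools import combinations

def has_clique(vertices, edges, k):
    """Decide CLIQUE by brute force: check every k-subset of V.

    `edges` is a set of frozensets {u, v}. Runs in time O(C(|V|, k) * k^2),
    i.e. exponential in k -- as expected for an NP-complete problem.
    """
    return any(
        all(frozenset((u, v)) in edges for u, v in combinations(subset, 2))
        for subset in combinations(vertices, k)
    )

# Example: a triangle plus a pendant vertex has a 3-clique but no 4-clique.
V = [1, 2, 3, 4]
E = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (3, 4)]}
assert has_clique(V, E, 3) and not has_clique(V, E, 4)
```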
### Vertex cover problem (VCP)

The vertex cover problem is the problem of determining whether a given graph $G = (V, E)$ has a vertex cover of size $k$. A vertex cover is a subset $C$ of $V$ such that every edge in $E$ has at least one endpoint in $C$. VCP is a special case of SCP. VCP can be approximated using a maximal matching with a factor $\rho = 2$ (a code sketch is given under Approximation algorithms below).

### Set cover problem (SCP)

The set cover problem is the problem of determining whether, given a set of elements $U$ and a collection $S$ of subsets of $U$ whose union equals $U$, it is possible to select $k$ or fewer sets from $S$ such that their union equals $U$. The optimization version tries to minimize $k$.

SCP can be approximated greedily by repeatedly adding the set that contains the largest number of uncovered elements to the set cover. This has an approximation factor of $\rho = H(s)$, where $s$ is the size of $U$ and $H(n)$ is the $n$-th harmonic number.

SCP can be represented as an IP: minimize $\sum_{s \in S} \text{cost}(s) \cdot x_s$ subject to

* $\sum_{s : e \in s} x_s \ge 1$ for all $e \in U$
* $x_s \in \left\{0, 1\right\}$ for all $s \in S$

### Travelling salesman problem (TSP)

The travelling salesman problem is the problem of finding the shortest Hamiltonian cycle in a complete weighted graph. While the optimization version of TSP is NP-hard, the decision version of TSP is NP-complete. TSP with the restriction that the edge weights satisfy the triangle inequality can be approximated by Christofides' algorithm with a ratio $\rho = 1.5$. The regular MST-based constructive heuristic has $\rho = 2$.

# Pseudo-polynomial-time algorithms

A numeric algorithm runs in pseudo-polynomial time if its running time is polynomial in the numeric value of the input (but not necessarily in the length of the input's representation). NP-complete problems with pseudo-polynomial-time algorithms are called weakly NP-complete. NP-complete problems that are proven not to have a pseudo-polynomial-time algorithm (unless $P = NP$) are called strongly NP-complete. Strong and weak NP-hardness are defined analogously. Formally:

> Let $U$ be an integer-valued problem, and let $A$ be an algorithm that solves $U$. $A$ is a **pseudo-polynomial-time algorithm for $U$** if there exists a polynomial $p$ of two variables such that $\text{Time}_A(x) = O(p(|x|, \text{Max-Int}(x)))$ for every instance $x \in U$.

## Example: Primality test

Consider the decision problem of whether a number $n$ is a prime number. The naïve approach of checking whether each number from $2$ to $\sqrt{n}$ evenly divides $n$ is sub-linear in the value of $n$, but exponential in the size (number of bits) of $n$.

# Parametrized complexity

## Fixed-parameter polynomial-time algorithms

A parametrized problem is a language $L \subseteq \Sigma^\* \times \mathbb{N}$. The second component is called the parameter of the problem. A parametrized problem $L$ is _fixed-parameter tractable_ if the question "$(x, k) \in L$?" can be decided in running time $f(k) \cdot |x|^{O(1)}$, where $f$ is an arbitrary function depending only on $k$. In such cases, it is often practical to fix the parameter $k$ to a small(ish) constant.

### Example: SAT

SAT can be parametrized by the number of variables. A given boolean expression of size $|x|$ with $k$ variables can be checked by brute force in time $O(2^k \cdot |x|)$: try all $2^k$ assignments and evaluate the expression for each. Here, $f(k) = 2^k$ and $|x|^{O(1)} = |x|$.

# Approximation algorithms

Approximation algorithms are algorithms that calculate an approximate answer to an optimization problem. Informally, an approximation algorithm is said to have an approximation ratio of $\rho$ if it at worst produces an answer which is $\rho$ times worse than the optimal answer.
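As a concrete example of an approximation ratio, here is a sketch (ours, not the textbook's) of the maximal-matching approximation for VCP mentioned earlier: pick any maximal matching and output both endpoints of every matched edge. Every edge is covered (otherwise the matching would not be maximal), and any cover must contain at least one endpoint of each matched edge, so the output is at most twice the optimum:

```python
def vertex_cover_2approx(vertices, edges):
    """2-approximation for vertex cover via a greedy maximal matching.

    Scan the edges once; whenever both endpoints are still uncovered,
    add them both to the cover. The matched edges are vertex-disjoint,
    so an optimal cover needs at least one vertex per matched edge,
    giving rho = 2.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))  # edge (u, v) joins the maximal matching
    return cover

# Example: a star graph. The optimum is {0}; the sketch returns {0, 1}.
print(vertex_cover_2approx([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)]))
```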
In a manner analogous to numerical stability, approximation algorithms are said to be stable if small changes in the approximation parameters result in correspondingly small changes in the answer.

## Polynomial-time approximation scheme (PTAS)

A polynomial-time approximation algorithm $A$, taking both an instance $x$ and a desired error bound as input, is a PTAS if the approximation error of its output is bounded by the given error for each input, and $\text{Time}_A(x, \frac{1}{\text{error}})$ is polynomial in $|x|$ for every fixed error.

## Fully polynomial-time approximation scheme (FPTAS)
A PTAS is an FPTAS if its running time is polynomial in both $|x|$ and $\frac{1}{\text{error}}$.
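The canonical FPTAS example is the knapsack problem (the appendix lists an FPTAS for KP on page 280 of the textbook). Here is a minimal sketch of the profit-scaling idea, our own code and not the textbook's:

```python
import math

def knapsack_fptas(profits, weights, capacity, eps):
    """FPTAS sketch for 0/1 knapsack via profit scaling.

    Scale profits down by K = eps * max(profits) / n, then run the exact
    dynamic program over scaled profit values; total time is O(n^3 / eps).
    The selected items have profit >= (1 - eps) times the optimum.
    """
    n = len(profits)
    K = eps * max(profits) / n
    scaled = [math.floor(p / K) for p in profits]

    # dp[q] = minimum weight needed to reach scaled profit exactly q
    total = sum(scaled)
    INF = float("inf")
    dp = [0] + [INF] * total
    for p, w in zip(scaled, weights):
        for q in range(total, p - 1, -1):  # backwards: each item used once
            if dp[q - p] + w < dp[q]:
                dp[q] = dp[q - p] + w

    best_q = max(q for q in range(total + 1) if dp[q] <= capacity)
    # The chosen item set has true profit at least best_q * K.
    return best_q * K

# Example usage: with eps = 0.1 the answer is within 10% of the optimum (220).
print(knapsack_fptas([60, 100, 120], [10, 20, 30], capacity=50, eps=0.1))
```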
# Local search

Local search is a metaheuristic method for optimization problems. Local search moves from solution to solution in the search space by applying local changes, until a sufficiently good solution is found or a time bound has elapsed.

## Straight-up regular local search

For your basic local search you need the following:

* A neighborhood graph showing the relation between neighboring candidate solutions
* A fitness function evaluating candidate solutions
* Good luck

Straight-up regular local search will quickly find a local optimum near where the search is started, but it neither guarantees finding a global optimum, nor does it know whether a global optimum has been found.

There are several ways of selecting a neighbor to follow when doing a local search, two of which are *first improvement* and *steepest descent*. Steepest descent evaluates all neighbors and selects the best one. First improvement selects the first neighbor which is better than the current candidate solution. First improvement is often faster to compute per step, but steepest descent usually converges to a local optimum in fewer steps.

## Variable-depth search

Variable-depth search provides a nice compromise between first improvement and steepest descent. In variable-depth search, changed variables in a solution are locked, preventing their further modification in that branch. Kernighan-Lin is the canonical example of variable-depth search.

## Simulated annealing

Simulated annealing introduces probability and randomness into the choice of which neighbor to move to: worse neighbors can be accepted with a probability that decreases over time (the 'temperature'). This reduces the chance of getting stuck in a non-global local optimum.

## Tabu search

Tabu search marks certain solutions as tabu after processing, and never revisits them. This is often done on an LRU or LFU basis. A common implementation is a search which remembers the last $k$ visited candidate solutions and avoids revisiting them. Tabu search prevents a candidate solution from being considered repeatedly, which would get the search stuck in plateaus.

## Randomized tabu search

Like regular tabu search, but randomized (or so the name seems to suggest).

## Intensification vs. diversification

Intensification is the intense search of a relatively small area; that is, the exploitation of the discovery of a good area. Diversification is looking at many diverse regions, exploring uncharted territories. Intensification is often quicker at reaching a local optimum, but diversification is better geared towards discovering global optima.

# Branch-and-bound (BB)

A branch-and-bound algorithm consists of a systematic enumeration of all candidate solutions, where large subsets of fruitless candidates are discarded _en masse_ by using upper and lower estimated bounds on the quantity being optimized. Canonically, BB is used to minimize a function $f(x)$, where $x$ ranges over some set $S$ of candidate solutions. BB requires:

* For branching: a splitting procedure that splits a candidate solution set $S$ into $S_1, S_2, \ldots, S_n$ such that $S = \bigcup \left\{S_1, S_2, \ldots, S_n\right\}$, $n \ge 2$, and the minimum of $f(x)$ over $S$ is $\min\left\{v_1, v_2, \ldots, v_n\right\}$, where each $v_i$ is the minimum of $f(x)$ within $S_i$.
* For bounding: a procedure that computes upper and lower bounds for the minimum value of $f(x)$ over a given subset of $S$.
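To make the scheme concrete, here is a best-first sketch of branch and bound for 0/1 knapsack (our own code, not the textbook's; knapsack is a maximization problem, so the roles of the bounds are mirrored relative to the minimization formulation above). The bounding function is the fractional (LP) relaxation of the remaining items:

```python
import heapq

def knapsack_branch_and_bound(profits, weights, capacity):
    """Best-first branch and bound for 0/1 knapsack (maximization).

    Branch: at depth i, either take item i or skip it.
    Bound: the fractional relaxation of the remaining items, which is never
    worse than the best integral completion, so subsets whose bound does not
    beat the incumbent are discarded en masse.
    """
    # Sort by profit density so the fractional bound is easy to compute.
    items = sorted(zip(profits, weights), key=lambda pw: pw[0] / pw[1], reverse=True)

    def bound(i, profit, room):
        for p, w in items[i:]:
            if w <= room:
                room -= w
                profit += p
            else:
                return profit + p * room / w  # fractional part of one item
        return profit

    best = 0
    # Max-heap via negated bound; nodes are (-bound, depth, profit, room).
    heap = [(-bound(0, 0, capacity), 0, 0, capacity)]
    while heap:
        neg_b, i, profit, room = heapq.heappop(heap)
        if -neg_b <= best or i == len(items):  # prune fruitless subtrees
            continue
        p, w = items[i]
        if w <= room:  # branch 1: take item i
            best = max(best, profit + p)
            heapq.heappush(
                heap, (-bound(i + 1, profit + p, room - w), i + 1, profit + p, room - w)
            )
        # branch 2: skip item i
        heapq.heappush(heap, (-bound(i + 1, profit, room), i + 1, profit, room))
    return best

print(knapsack_branch_and_bound([60, 100, 120], [10, 20, 30], 50))  # -> 220
```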
For an example of branch and bound used with integer programming, see [Algorithm Animations for _Practical Optimization: A Gentle Introduction_](http://optlab-server.sce.carleton.ca/POAnimations2007/BranchAndBound.html).

# Genetic algorithms

Genetic algorithms mimic evolution to obtain practically acceptable solutions to optimization problems. See [the genetic algorithm template](#genetic-algorithm-template) for an overview of how to make a genetic algorithm. Some genetics-domain words need a mapping into the algorithmics domain:

|| **Genetics term** || **Corresponding algorithmics term** ||
|| an individual || a string (or vector) representation of a candidate solution ||
|| fitness value || cost function ||
|| population || a subset of the set of feasible candidate solutions ||
|| mutation || a random local transformation of an individual ||

Here, the different stages of the genetic algorithm as described in the [template](#genetic-algorithm-template) are explained in more detail:

## Initialization

Initialization is the step of creating a starting population $P = \left\{a_1, a_2, \ldots, a_k\right\}$ which becomes the first generation of the algorithm. Random generation is a good way of doing this, but there are other approaches.

## Selection

Selection is the step of selecting $n \le k$ individuals from the population, on which genetic operators will be applied. A good selection strategy is crucial for success in a genetic algorithm. Using the fitness value to select the $n$ best individuals from the population is common ('survival of the fittest'), and throwing in some randomness as well is usually a good move.

## Genetic operators

Genetic operators work on individuals to make new, different individuals.

### Crossover/recombination

Crossover is when you combine two individuals to make an offspring individual which contains combined parts from both its parents. One way of performing a crossover is to use the first half of the first individual (a string or vector representation of a candidate solution, remember) and the second half of the second individual. As a simple example: consider a genetic algorithm trying to find a string of length 10 that contains only vowels. The two individuals **lsjifasdfw** and **areighifpo**, recombined using the previously described method, would make a new individual **lsjifhifpo**.

### Mutation

Mutation does a slight modification of a single individual to form a new individual. Methods include flipping a single bit of the individual, adding a random number to one of the individual's vector elements, generation-number-dependent modifications, etc. Following the offspring individual from the recombination example: **lsjifhifpo** could mutate to **lsjifaifpo**, using the mutation rule that a single character in the string is swapped with a random character.

### (regrouping)

### (colonization-extinction)

### (migration)

## Termination

Termination is the step of deciding when to stop. Common criteria include reaching a fixed number of generations, reaching a fitness threshold, or the best fitness stagnating for several generations.
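A minimal sketch (ours, not from the textbook) of the operators above, applied to the vowels-only toy problem from the crossover example; the hypothetical fitness function simply counts vowels:

```python
import random

def fitness(individual):
    """Count vowels: the toy goal is a length-10 string of only vowels."""
    return sum(ch in "aeiou" for ch in individual)

def crossover(a, b):
    """One-point crossover: first half of a, second half of b."""
    mid = len(a) // 2
    return a[:mid] + b[mid:]

def mutate(individual):
    """Swap a single random character with a random letter."""
    i = random.randrange(len(individual))
    return individual[:i] + random.choice("abcdefghijklmnopqrstuvwxyz") + individual[i + 1:]

offspring = crossover("lsjifasdfw", "areighifpo")
print(offspring, fitness(offspring))  # lsjifhifpo 3
print(mutate(offspring))              # e.g. lsjifaifpo
```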
# Linear programming (LP)

Also known as linear optimization. LP is the process of finding the "best" values obtainable for a linear function of variables constrained by linear inequalities. LP is solvable in polynomial time.

## Integer programming (IP)

IP is linear programming where the variables can only take integer values. IP is NP-hard.

## Binary linear programming (01LP)

01LP is linear programming where the variables can only take 0 or 1 as values. 01LP is NP-hard.

## Relaxation to LP

Many interesting optimization problems with useful solutions are easily reduced to IP or 01LP. Because IP and 01LP are both NP-hard, while LP is in P, relaxing problems to LP is a good idea when possible. Here is how:

1. Express an optimization problem $U$ as an instance $I$ of IP or 01LP.
2. Pretend that $I$ is an instance of LP, and find an (in general fractional) optimal solution $a$.
3. Use $a$ to solve the original problem, typically by rounding it back to an integral solution.

Easy, right?
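As a concrete illustration (our own example, not the textbook's), here are the three steps applied to VCP, using SciPy's `linprog` to solve the relaxed LP; rounding every variable at $\frac{1}{2}$ yields the classic LP-based 2-approximation for vertex cover:

```python
import numpy as np
from scipy.optimize import linprog

def vertex_cover_lp_rounding(n, edges):
    """Relaxation-to-LP sketch for VCP on vertices 0..n-1.

    1. 01LP: minimize sum(x_v) s.t. x_u + x_v >= 1 per edge, x_v in {0, 1}.
    2. Relax x_v in {0, 1} to 0 <= x_v <= 1 and solve the LP.
    3. Round: take every vertex with x_v >= 0.5. Each edge constraint forces
       at least one endpoint up to 0.5, so the result is a feasible cover,
       and rounding at most doubles the LP optimum: a 2-approximation.
    """
    A = np.zeros((len(edges), n))
    for row, (u, v) in enumerate(edges):
        A[row, u] = A[row, v] = -1.0  # -(x_u + x_v) <= -1
    res = linprog(c=np.ones(n), A_ub=A, b_ub=-np.ones(len(edges)), bounds=(0, 1))
    return {v for v in range(n) if res.x[v] >= 0.5}

# Example: a triangle. The LP optimum is 1.5 (all x_v = 0.5); rounding gives all 3.
print(vertex_cover_lp_rounding(3, [(0, 1), (1, 2), (0, 2)]))
```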
# Layman's guides

This section contains some useful guides to various methods and techniques for algorithm design.

## Parametrized complexity applications

So you have an NP-hard problem. Nice. And difficult. So difficult that it is impossible to solve in polynomial time (unless $P = NP$). Anyway, here is a general plan that might help:

1. Define a mechanism for defining subproblems according to their supposed difficulty.
2. Use this mechanism to define a class of easy subproblems.
3. Define a requirement for an algorithm to have a nice time complexity with respect to the subproblem difficulty, in such a way that a nice algorithm can solve an easy subproblem in polynomial time.
4. Then, either:
    * design a nice algorithm, and prove that it is nice, or
    * prove that it is impossible to design a nice algorithm, unless $P = NP$.

Look at these categories for ideas for the different subproblem mechanisms, proofs, etc.:

### Pseudo-polynomial-time algorithms

|| Subproblem mechanism || Include only instances where the maximal integer in the input is bounded by a non-decreasing function $h$ of the input size. ||
|| Easy subproblems || Those where the bounding function $h$ is a polynomial. ||
|| Nice algorithms || Pseudo-polynomial-time algorithms. ||
|| Designing the algorithm || Just design it. ||
|| Proving that there are no nice algorithms || Prove that there exists a polynomial $h$ such that the subproblem $\text{Value}(h)\text{-}U$ is NP-hard. Then the problem is strongly NP-hard. ||

### Parametrized complexity

|| Subproblem mechanism || Define a parametrization function _Par_ that gives each input instance an integer score measuring its difficulty. Higher _Par_(_x_) means a harder instance. Subproblems can be defined by including only instances where _Par_(_x_) takes a specific value or range. ||
|| Easy subproblems || Those where _Par_(_x_) is small and does not depend on the input size. ||
|| Nice algorithms || _Par_-parametrized polynomial-time algorithms. ||
|| Designing the algorithm || Just design it. ||
|| Proving that there are no nice algorithms || Choose a constant _k_ and prove that the subproblem where _Par_(_x_) = _k_ is NP-hard. ||

## Local search algorithm design

0. Make sure you have an optimization problem and not a decision problem.
1. Define a neighborhood function. For an input instance _x_, this function maps each feasible solution _a_ for _x_ to other feasible solutions for _x_, called the neighbors of _a_. Typically, neighboring solutions differ only by some small modification.
2. Start the search anywhere you want. Randomly choosing some place is a nice strategy.
3. Repeat the following: go to the best neighbor (or don't, if all neighbors suck, comparatively).
4. In the case where all neighbors suck, give up. You have a local optimum and the search is done.

## Genetic algorithm template

1. Randomly generate a population of individuals.
2. Apply a fitness function to calculate a fitness score for each individual in the current generation.
3. Select individuals for reproduction based on fitness and a little randomness.
4. Apply crossover and mutation to the selected individuals to produce the next generation.
5. Stop whenever you feel like it.

## Proving that a decision problem is undecidable

You have a decision problem _E_ which you suspect is undecidable. Unfortunately, you need proof. A general method is:

1. Find a different decision problem _H_ which is known to be undecidable. I recommend the halting problem.
2. Suppose a decider _R_ decides _E_. Define a decider _S_ that decides _H_ using _R_ as a subroutine.
3. If _R_ exists, then _S_ decides _H_. However, _H_ is undecidable. Therefore, _R_ cannot exist.

## The primal-dual schema (taken from cmu.edu)

We typically devise algorithms for minimization problems in the following way:

1. Write down an LP relaxation of the problem, and find its dual. Try to find some intuitive meaning for the dual variables.
2. Start with vectors $x = 0$, $y = 0$, which will be dual feasible, but primal infeasible.
3. Until the primal is feasible:
    1. Increase the dual values $y_i$ in some controlled fashion until some dual constraint(s) go tight (i.e. until $\sum_i y_i a_{ij} = c_j$ for some $j$), while always maintaining the dual feasibility of $y$.
    2. Select some subset of the tight dual constraints, and increase the primal variables corresponding to them by an integral amount.
4. For the analysis, prove that the output pair of vectors $(x, y)$ satisfies $c^T x \le \rho \cdot y^T b$ for as small a value of $\rho$ as possible. Keep this goal in mind when deciding how to raise the dual and primal variables.

## Christofides' algorithm

Use this to approximate the TSP on $G = (V, w)$ with $\rho = 1.5$, where the edge weights satisfy the triangle inequality. (A networkx-based code sketch is given at the end of this compendium.)

1. Create the MST $T$ of $G$.
2. Let $O$ be the set of vertices with odd degree in $T$, and find a minimum-weight perfect matching $M$ in the complete graph over $O$.
3. Combine the edges of $M$ and $T$ to form a multigraph $H$.
4. Form an Eulerian circuit in $H$.
5. Make the circuit found in the previous step Hamiltonian by skipping visited nodes ('shortcutting').

You now have an approximation of TSP with $\rho = 1.5$. Why does this approximate TSP with $\rho = 1.5$, you ask? Well: removing one edge from the optimal tour gives a spanning tree, so $w(T) \le \text{OPT}$; and the optimal tour shortcut down to $O$ splits into two perfect matchings on $O$, so $w(M) \le \text{OPT}/2$. Hence the Eulerian circuit, and by the triangle inequality also the shortcut Hamiltonian cycle, has weight at most $1.5 \cdot \text{OPT}$.

## Converting problems

### VCP to SCP

This is how to turn a VCP instance into an SCP instance. We use the variables $G = (V, E)$ for the VCP graph, and $U$ and $S$ for the SCP variables.

1. Let $U$ be $E$.
2. Let $S_i$ be the set of edges touching vertex $i$.

And that's it! (The size bound $k$ stays the same.)

# Appendix: list of algorithms featured in the textbook

* 156, DPKP
* 165, Ford-Fulkerson
* 172, Vertex cover algorithm
* 177, B&B for MAXSAT and TSP
* 187, D&C-3SAT
* 191, Local search scheme
* 197, Kernighan-Lin variable-depth search
* 434, Metropolis algorithm
* 436, Simulated annealing
* 442, Tabu search
* 443, Randomized tabu search
* 446, Genetic algorithm scheme
* 214, Methods for relaxing problems to LP
* 226, Simplex
* 228, SCP(k) relaxation to LP
* 237, Primal-dual scheme
* 238, Primal-dual (MINVCP)
* 250, Greedy makespan schedule
* 262, Algorithm for VCP
* 264, Algorithm for SCP
* 268, Algorithm for WEIGHT-VCP
* 269, Algorithm for MAX-CUT
* 272, Greedy-simple KP
* 273, PTAS for SKP
* 278, Modified PTAS for SKP
* 280, FPTAS for KP
* 283, TSP △-ineq 2-approx
* 288, Christofides' algorithm
* 301, Sekanina's algorithm

# Appendix: proofs

(TODO: write the actual proofs)

* 201, HC $\le_p$ RHC
* 201, RHC $\le_p$ SUBOPT\_TSP
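Finally, the promised sketch of Christofides' algorithm, written against the networkx graph library (our own code; recent networkx versions also ship an implementation in `networkx.algorithms.approximation`):

```python
import networkx as nx

def christofides(G):
    """Sketch of Christofides' 1.5-approximation for metric TSP.

    G: complete nx.Graph with a 'weight' attribute on every edge,
    satisfying the triangle inequality. Follows the five steps above.
    """
    # 1. Minimum spanning tree.
    T = nx.minimum_spanning_tree(G)
    # 2. Min-weight perfect matching on the odd-degree vertices of T,
    #    computed as a max-weight matching on negated weights.
    odd = [v for v, d in T.degree() if d % 2 == 1]
    K = nx.Graph()
    K.add_weighted_edges_from(
        (u, v, -G[u][v]["weight"]) for i, u in enumerate(odd) for v in odd[i + 1:]
    )
    M = nx.max_weight_matching(K, maxcardinality=True)
    # 3. Combine T and M into an Eulerian multigraph, 4. find an Euler
    #    circuit, and 5. shortcut repeated vertices into a Hamiltonian cycle.
    H = nx.MultiGraph(T)
    H.add_edges_from(M)
    tour, seen = [], set()
    for u, _ in nx.eulerian_circuit(H):
        if u not in seen:
            seen.add(u)
            tour.append(u)
    return tour + tour[:1]
```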