This is an old version of the compendium, written April 26, 2015, 9:04 p.m. Changes made in this revision were made by ilsegv. View rendered version.

TDT4230: Graphics & Visualization

# Practical information * The exam is held on the 4th of June 2015 at 9 am. * [Old exams](http://www.idi.ntnu.no/emner/tdt4230/) # Curriculum _Last updated in 2015._ * Chapter 3: Everything from 3.7 and out (this includes 3.7) * Chapter 4: Complete chapter * Chapter 5: Complete chapter * Chapter 6: Complete chapter * Chapter 7: Complete chapter except for 7.5.3 * Chapter 10: Complete chapter except for 10.6 * Chapter 12: 12.1, 12.4, 12.5, 12.6 (except 12.6.4), 12.11 * Chapter 14: Complete chapter, except for 10.7.3, 10.7.4, 10.7.5, 10.8. This means 10.9 is curriculum. * Chapter 15: Complete chapter except for 15.2.3 * Chapter 18: Complete chapter # Chapter 3 2D and 3D Coordinate Systems and Transformations ## 2D and 3D Coordinate Systems and Transformations A point in euclidian space can be defined as 3D vector. Linear transformation is achieved by post-multiplying the point to a 3x3 matrix. ### Affine transformations Affine transformations are transformations which preserve important geometric properties of the objects being transformed There are four basic affine transformations: - Translation - Scaling - Rotation - Shearing ### Transformation Matrices Quick note: a _hyperplane_ refers to a substance of one dimensionality less than its ambient space. I.e. in a 3-dimensional space, the $xy$-plane is a hyperplane. In a 2-dimensional space, the $x$ and $y$ axis are hyperplanes. #### Translation Translation defines movement by a certain distance in a certain direction. Translation of a point __p__ by a vector $\vec{\textbf{d}}$ results in a new point $\textbf{p'} = \textbf{p} + \vec{\textbf{d}}$. All we do is add the vector to the point. The translation transformation matrix is an instantiation of the general affine transformation, $\Phi(p) = A \cdot p + \vec{t}$ with $A = I$ and $\vec{t} = \vec{d}$. The 3D homogenous translation matrix is: $$ T(\vec{d}) = \begin{bmatrix} 1 & 0 & 0 & d_x \\ 0 & 1 & 0 & d_y \\ 0 & 0 & 1 & d_z \\ 0 & 0 & 0 & 1 \end{bmatrix} $$ The $n$-dimensional translation matrix is derived by instantiating the general affine transformation with $A = I(n)$. ($I(n)$ is the $n$-dimensional identity matrix) If you need the homogenous variant, use $I(n+1)$. The inverse translation of $T(\vec{d})$ is $T^{-1}(\vec{d}) = T(-\vec{d})$. #### Scaling We have $n$ scaling factors $s_{i}$, $i = 1, 2, 3, ..., n$ corresponding to the dimensionality of the space. If a scaling factor $s_{i} < 1$ the object's size is reduced in the respective dimension, whileas if $s_{i} > 1$ it is increased. $s_{i} = 1$ has no effect. $s_{i} = 0$ reduces the object's size in the respective dimension to zero. If $s_{i} = 0$, $\forall i$ the object disappears. _Mirroring_ about a major hyperplane can be done by using $s_{i} = -1$ as the factor for the hyperplane which isn't involved in the mirroring. I.e. to perform 2D mirroring about the $x$ axis, use $S(1, -1)$. To perform 3D mirroring about the $xy$-plane, use $S(1, 1, -1)$. The 3D homogenous scaling matrix is: $$ S(s_x, s_y, s_z) = \begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$ For 3D non-homogenous scaling, remove the fourth row and fourth column. The inverse of the scaling matrix $S(s_x, s_y, s_z, ...)$ is $S^{-1}(s_x, s_y, s_z, ...) = S(\frac{1}{s_x}, \frac{1}{s_y}, \frac{1}{s_z}, ...)$ #### Rotation The book defines _positive rotation_ about an axis $a$ as one which is in the _counterclockwise_ direction when looking from the positive part of $a$ towards the origin. The three dimensional rotation transformation is defined as $R_{a}(\theta)$, where $a$ is the axis to rotate about and $\theta$ is the measure of how much you want to rotate it. The 3D homogenous rotation matrices are: $$ R_{x} (\theta) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & cos(\theta) & -sin(\theta) & 0 \\ 0 & sin(\theta) & cos(\theta) & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} $$ $$ R_{y} (\theta) = \begin{bmatrix} cos(\theta) & 0 & sin(\theta) & 0 \\ 0 & 1 & 0 & 0 \\ -sin(\theta) & 0 & cos(\theta) & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} $$ $$ R_{z} (\theta) = \begin{bmatrix} cos(\theta) & -sin(\theta) & 0 & 0 \\ sin(\theta) & cos(\theta) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} $$ For the inverse 3D rotation transformation, use $-\theta$. In 2 dimensions we rotate about a point. The standard rotation transformations specify rotation about the origin. The 2D homogenous rotation matrix is: $$ R(\theta) = \begin{bmatrix} cos(\theta) & -sin(\theta) & 0 \\ sin(\theta) & cos(\theta) & 0 \\ 0 & 0 & 1 \end{bmatrix} $$ #### Shearing Shearing increases $n-1$ of an $n$-dimensional object's coordinates by an amount equal to the $n$th coordinate times a shearing factor (specified seperately for each $n-1$ dimensions). That is, $SH_{xy} (a, b) \cdot p$ increases $p_x$ by $p_z * a$ and $p_y$ by $p_z * b$. The 2D homogenous shearing matrices are: $$ SH_x (a) = \begin{bmatrix} 1 & a & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$ $$ SH_y (b) = \begin{bmatrix} 1 & 0 & 0 \\ b & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$ The 3D homogenous shearing matrices are: $$ SH_{xy} (a, b) = \begin{bmatrix} 1 & 0 & a & 0 \\ 0 & 1 & b & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} $$ $$ SH_{xz} (a, b) = \begin{bmatrix} 1 & a & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & b & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} $$ $$ SH_{yz} (a, b) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ a & 1 & 0 & 0 \\ b & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} $$ The inverse of a shear is obtained by negating the shear factors. ## Quaternions Quaternions are used to express rotation. They consist of 4 real numbers. $q = (s,x,y,z)$. Here, $s$ is the scalar part, and $\vec{v} = (x,y,z)$ is the vector part of quaternion $q$. The conjugate quaternion of $q$ is $$\bar{q} = (s, -\vec{v})$$ Quaternions can express rotation by an angle $\theta$ as: $$q = (cos \frac{\theta}{2}, sin \frac{\theta}{2}\hat{n}$$ The direction is specified by unit vector $\hat{n}$. ### Rotation We can use quaternions to rotate points around a vector. For the calculation we need 1. $p$, the point that is to be rotated 2. $\vec{v}$, the vector that $p$ is to be rotated around. - Normalize $\vec{v}$ to the unit vector $\hat{v}$. 3. $\theta$, the arc of the rotation (how many "degrees" to rotate if you will). $$ q = \cos{\frac{\theta}{2}} + \hat{v} \sin{\frac{\theta}{2}} $$ $$ q^{-1} = \cos{\frac{\theta}{2}} - \hat{v} \sin{\frac{\theta}{2}} $$ The rotated point $p'$ is then $$ p' = q * p * q^{-1} $$ This is equal to $ p' = q * p * \bar{q} $, which is easier to calculate. Quaternion multiplication required fewer operations and is numerically more stable than rotation matrix multiplication. Note that quaternion multiplication is non-commutative, just like matrix multiplication. ### Conversion between Quaternions and Rotation Matrices Quaternion $q = (s,x,y,z)$ corresponds to $$ R_q = \begin{bmatrix} 1-2y^2 & 2xy-2sz & 2xz+2sy & 0 \\ 2xy+2sz & 1-2x^2-2z^2 & 2yz-2sx & 0 \\ 2xz-2sy & 2yz+2sx & 1-2x^2-2y^2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} $$ To convert rotation matrix $$ R = \begin{bmatrix} m_{00} & m_{01} & m_{02} & 0 \\ m_{10} & m_{11} & m_{12} & 0 \\ m_{20} & m_{21} & m_{22} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} $$ to a quaternion $q = (s,x,y,z)$, calculate the following values: $$s = \frac{1}{2}\sqrt{m_{00}+m_{11}+m_{22}+1}$$ $$x=\frac{m_{21}-m_{12}}{4s}, y=\frac{m_{02}-m_{20}}{4s}, z=\frac{m_{10}-m_{01}}{4s}$$ If $s$ equals or is close to 0, a different set of relations can be used. # Chapter 4 Projections and Viewing Transformations To view 3D models on 2D output devices, we need to project the models. Projection usually happens after culling states and before rendering. A viewing transformation defines the transition from the world coordinate system (WCS) to canonical screen space (CSS), via the eye coordinate system (ECS). It also defines the clipping boundaries for frustum culling in ECS. ## Projections There are two types of projection that are interesting in this course, perspective and parallel projection. The distance of the center of projection from the plane of projection is finite in perspective projection, and infinite in parallel projection. There are some other differences: || __Property preserved__ || __Affine__ || __Projective__ || || Angles || No || No || || Distances || No || No || || Ratios of distances || Yes || No || || Parallel lines || Yes || No || || Affine combinations || Yes || No || || Straigt lines || Yes || No || || Cross ratios || Yes || Yes || Projection images of objects defined by a set of points need one more element than affine images. With these extra points, we can perserve properties like depth and color. Ratios of distances are not preserved, but cross ratios are preserved. ### Perspective Projection To project a point $p = [x,y,z]^T$ in 3D onto a point $q = [x,y,d]^T$ on the plane of projection, we need to find the distance between $p$ and $q$, $d$. Then we can use the matrix $$P_{PER} = \begin{bmatrix} d & 0 & 0 & 0 \\ 0 & d & 0 & 0 \\ 0 & 0 & d & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} $$ After that, we divide the result by the homogeneous coordinate, $z$. The final result for $p$ is $$ \begin{bmatrix} \frac{x*d}{z} \\ \frac{y*d}{z} \\ d \\ 1 \end{bmatrix} $$ ### Parallel projection To perform a parallel projection, we need to know the plane of projection and the direction of projection (a vector). The book mentions two types of parallel projection, orthographic and oblique. #### Orthographic Projection All the projection lines in an orthographic projection are orthogonal to the projection plane. Usually, one of the main planes is used as the plane of projection. To project onto the xy-plane, we can use the following matrix: $$P_{ORTHO} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} $$ #### Oblique Projection The direction of the projection is given by $$\overrightarrow{DOP} = [DOP_x, DOP_y, DOP_z]^T$$ This direction is not necessarily normal to the plane of projection. When the plane of projection is the xy-plane, the projection matrix becomes $$P_{OBLI} = \begin{bmatrix} 1 & 0 & -\frac{DOP_x}{DOP_Z} & 0 \\ 0 & 1 & -\frac{DOP_y}{DOP_Z} & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} $$

## Viewing TransformationA viewing transformation defines the transition from the WCS to ECS to CSS. It also defines the clipping boundaries for frustum culling in ECS. A point $X_w = [x_w, y_w, z_w, 1]$ can be converted into CSS multiplying matrices in this order: $$ X_{CSS} = M_{ECS to CSS} * M_{WCS to ECS} * X_w$$ ### WCS to ECS

World coordinate system to eye coordinate system is the first transition in viewing transformations. First, we need to define the ECS axes. ECS is usually defined by E : The point of view, the location of an observer $\overrightarrow{g}$ : The direction of view, a vector $\overrightarrow{up}$ : The up direction, which does not need to be perpendicular to $\overrightarrow{g}$ The axis $x_e$ is horizontal, increasing to the right. $y_e$ is vertical, and increases upwards. $z_e$ points towards the observer. $$\overrightarrow{z_e} = -\overrightarrow{g} $$ $$\overrightarrow{x_e} = \overrightarrow{up} \times \overrightarrow{z_e} $$ $$\overrightarrow{y_e} = \overrightarrow{z_e} \times \overrightarrow{x_e} $$ TODO: math for actual transformation

### ECS to CSS There are two ways of converting from the eye coordinate system to the canonical screen space, orthographic projection and perspective projection. #### Orthographic Projection We need to choose parallelepiped-shaped a part of the scene that will be mapped to CSS, called the view volume. It can be defined by two opposite vertices, $(l, b, n)$ and $(r, t, f)$. $$M_{ECS\leftarrow CSS_{ORTHO}} = \begin{bmatrix} \frac{2}{r-l} & 0 & 0 & -\frac{r+l}{r-l} \\ 0 & \frac{2}{t-b} & 0 & -\frac{t+b}{t-b} \\ 0 & 0 & \frac{2}{f-n} & -\frac{n+f}{f-n} \\ 0 & 0 & 0 & 1 \end{bmatrix} $$ #### Perspective projection The view volume is a truncated pyramid symmetrical about the $-z_e$-axis. To define its shape, we need $\theta$, the angle of the field of view, _aspect_, the ratio of width to height in the pyramid, $Z_e = n$, the near clipping plane and $z_e = f$, the far clipping plane. $$t = |n|*\tan{\frac{\theta}{2}}$$ $$b + -t$$ $$r = t*aspect$$ $$l = -r$$ The matrix for this projection is given by $$M_{ECS\leftarrow CSS_{PERSP}} = \begin{bmatrix} \frac{2}{r-l} & 0 & 0 & 0 \\ 0 & \frac{2}{t-b} & 0 & 0 \\ 0 & 0 & \frac{2}{f-n} & -\frac{2nf}{f-n} \\ 0 & 0 & 1 & 0 \end{bmatrix} $$

# Chapter 5 Culling and Hidden Surface Elimination Algorithms ## Hidden Surface Elimination ### Ray-Casting Algorithm A ray is followed for every pixel **p**. Checks intersection with every primitive **P** and return the closest intersection for each pixel. Slow: O(**pP**) but easily parallelizable. # Chapter 6 Model representation and simplification Models are approximate representations of objects. Polygonal models are most common for surfaces, but mathematical models are also used. Models usually use _surface representation_, but in some cases _volume representation_ is better. For example when the object is semi-transparent or the internal structure is important. When an object is not closed, we cannot use volume representation. ## Properties of Polygonal Models A surface model is 2-manifold if every point on the surface has a neighborhood homeomorphic (topologic equivalent) to an open disk. This means that, on the surface, every edge is shared by exactly 2 faces and around each vertex exists a closed loop of faces. The manifold can have a boundary. In that case, the edges on the boundary belong to one face and the vertices on the boundary have a open loop of faces. If the manifold does not have a boundary, it is closed. Closed manifold models homeomorphic to a sphere satisfy Euler's formula, $ V - E + F = 2$. Here, V is the number of vertices, E edges and F faces. For manifolds in general, $ V - E + F = 2 - 2G$. G is the number of genus. That is, the number of holes penetrating the model. For example, a doughnut has genus 1. ## Data structures Primitive data structures store the coordinates of each vertice, edge and face separately. ### Winged-Edge Every edge is on the form $e(v_t,v_b,f_l,f_r,e_{tl},e_{tr},e_{bl},e_{br})$. The v's are the edges vertexes, f's are each face and the e's are each edge connected to the vertexes which also borders one of the faces. In other words, each edge stores references to its 2 vertices, its 2 adjacent faces and its 4 neighboring edges. This data structure makes it possible to traverse the model since we know every adjacent edge. ### Half-Edge The half-edge is an extension of the winged edge, where each edge is decomposed into two edges. Each half-edge stores its vertices, the neighboring edges of its face and a reference to the other half-edge (which stores the other face next to the edge). This structure is more efficient than winged-edge for some adjacency queries. ## Iterative Edge Collapse The idea is to provide [dynamic model simplification](https://www.youtube.com/watch?v=g4uJ02THpRA), without the need of precomputed models. This is achieved by storing the edges in a sorted priority queue. Usually, edges further away have a higher priority. sort the edges in a priority queue. iteratively: collapsing the edge of highest priority re-compute priority of all affected edges The quadratic error metric is mentioned as an example of how to order the edges in a priority, it also provides where to position the resulting vertex. In rough terms the idea of the quadratic error metric is to minimize the square distance of the new vertex from the faces around it. # Chapter 7 Parametric Curves and Surfaces ## Bézier Curves ### Properties of Bézier Curves * Every n-th degree polynomial can be represented by a Bézier curve * Convex Hull property. The curve is inside the convex hull of its control points * Invariance under affine combinations. * Invariance under affine combinations of its parameter. * Symmetry with respect to comtrol points. * Linear precision. If all control points lie on a straight line, then the curve is also a straight line. * Variation diminishing property. A bezier curve is smoother than its control polygon. If you draw a straight line the number of intersections with the curve will be less than or equal to the number of intersections with the control polygon. A curve with a convex control polygon is also convex. A convex curve may have a non-convex control polygon. * Endpoint interpolation. A curve starts at its first point and ends at its last. * Derivative, since the curve can be represented by the Bernstein polynomials and they are derivative, so is the curve. * Tangents at endpoints are parallel to the first edge and last edge of the control polygon. * Pseudo-local control. Moving a control point has effect on the whole curve, but more pronounced effect close to the control point. ## B-Spline Curves ### Properties of B-Spline Curves * Local control * Strong convex hull property * Invariance under affine transformation. Applying an affine transformation to a B-Spline curve transforms only its control points. ## Rational Bézier & B-Spline Curves Rational curves are curves that use homogeneous coordinates. The weights are always positive because negative weights give unpredictable behaviors and some properties is not maintained, like the convex hull property. ### Properties of Rational Bézier Curves * Convex Hull property * Variation diminishing property * Invariance under affine transformation ### Properties of Rational B-Spline Curves (NURBS) * Convex Hull property * Variation diminishing property * Invariance under affine transformation NURBS offer all the properties of the other types of curves. Most used in practice. ## Bézier & B-Spline Surfaces The surfaces features the same properties as the curves respectivly. With the notable exeption of the variation diminishing property. # Chapter 10 Visualization Principles How do we visualize data? Plain matrices might be boring or difficult to look at, here are some alternatives. ## Tone Mapping ### Transfer functions Takes our data as input and outputs a value representing intensity. This function can take many forms, polynomials or step functions are examples. ### Color maps Makes use of a transfer function but includes color for additional visual effect. # Chapter 12 Illumination Models and Algorithms # Chapter 14 Texturing # Chapter 15 Ray Tracing # Chapter 18 Scientific Visualization Algorithms