TDT4171: Artificial Intelligence Methods
* Strong/weak AI
* Agents
* Rationality
* Bayesian networks
* Syntax and semantics
* Uses and problems
* Conditional independence
* Decision networks/Influence diagrams
* Reasoning
* Case-based reasoning
* Instance-based reasoning
* Markov processes
* Markov assumption
* Neural networks
* Decision trees
# Bayesian Networks
A Bayesian network models a set of random variables and their conditional dependencies as a directed acyclic graph: the nodes represent the random variables and the edges represent conditional dependencies.
Much of the calculation in a Bayesian network is based on Bayes' theorem: $$ P(A|B)=\frac{P(B|A)P(A)}{P(B)} $$
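As an illustration with made-up numbers: if a disease has prior probability $P(D)=0.01$, a test detects it with $P(+|D)=0.9$, and gives false positives with $P(+|\neg D)=0.05$, then $$ P(D|+)=\frac{0.9\cdot 0.01}{0.9\cdot 0.01+0.05\cdot 0.99}\approx 0.15 $$ so even after a positive test, the disease is still fairly unlikely.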
* Good: Reasons with uncertain knowledge. "Easy" to set up with domain experts (medicine, diagnostics).
* Bad: Many dependencies to specify for most meaningful networks.
* Ugly: Non-discrete (continuous) models.
Poorly suited to, for example, image analysis and enormous networks.
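A minimal sketch of inference by enumeration in a two-node network Rain → WetGrass; the probabilities are made up for illustration:

```python
# Two-node Bayesian network: Rain -> WetGrass.
# All probabilities are made up for illustration.
P_rain = {True: 0.2, False: 0.8}            # prior P(Rain)
P_wet_given_rain = {True: 0.9, False: 0.1}  # P(WetGrass=true | Rain)

def posterior_rain_given_wet():
    """P(Rain=true | WetGrass=true) via Bayes' theorem."""
    joint = {r: P_wet_given_rain[r] * P_rain[r] for r in (True, False)}
    return joint[True] / sum(joint.values())  # divide by P(WetGrass=true)

print(posterior_rain_given_wet())  # ~0.69
```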
# Markov Chains
A Markov chain is a random (and typically discrete) process where the next state depends only on the current state, not on earlier states. Higher-order Markov chains also exist, where the next state depends on the previous n states. Note that any k-th-order Markov process can be expressed as a first-order Markov process (by letting the state include the previous k states).
When the state itself cannot be observed directly, the chain is extended with observable variables (evidence) that tell us something about the hidden state; this is a hidden Markov model, and the operations below are defined on it.
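A minimal sketch of sampling from a first-order Markov chain; the states and transition probabilities are made up:

```python
import random

# Illustrative two-state weather chain.
TRANSITIONS = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},  # P(next | current="rainy")
    "sunny": {"rainy": 0.2, "sunny": 0.8},  # P(next | current="sunny")
}

def step(state):
    """Sample the next state; it depends only on the current state."""
    r, cumulative = random.random(), 0.0
    for nxt, p in TRANSITIONS[state].items():
        cumulative += p
        if r < cumulative:
            return nxt
    return nxt  # guard against floating-point rounding

state = "sunny"
chain = [state]
for _ in range(10):
    state = step(state)
    chain.append(state)
print(chain)
```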
## Operations on Markov Chains
### Filtering
Computing the distribution over the current hidden state given all evidence (the observable variables) seen so far.
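A sketch of the filtering (forward) update for a two-state hidden Markov model, with made-up transition and sensor probabilities:

```python
# States: 0 = rain, 1 = no rain. Evidence: umbrella seen or not.
T = [[0.7, 0.3],            # T[i][j] = P(X_t=j | X_{t-1}=i)
     [0.3, 0.7]]
O = {True:  [0.9, 0.2],     # P(umbrella | state)
     False: [0.1, 0.8]}

def forward(belief, umbrella):
    """One filtering step: predict with T, weight by evidence, normalise."""
    predicted = [sum(T[i][j] * belief[i] for i in range(2)) for j in range(2)]
    unnorm = [O[umbrella][j] * predicted[j] for j in range(2)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

belief = [0.5, 0.5]               # prior over the hidden state
for e in [True, True, False]:     # observation sequence
    belief = forward(belief, e)
print(belief)                     # P(state | all evidence so far)
```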
### Prediction
Computing the distribution over a future state given the evidence seen so far.
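Formally, prediction is filtering without the evidence update: $$ P(X_{t+k+1}\mid e_{1:t}) = \sum_{x_{t+k}} P(X_{t+k+1}\mid x_{t+k})\,P(x_{t+k}\mid e_{1:t}) $$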
### Smoothing
Estimating the distribution over a past state using all evidence up to the present, including evidence observed after that state.
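This combines the forward (filtering) message with a backward message over the later evidence: $$ P(X_k\mid e_{1:t}) \propto P(X_k\mid e_{1:k})\,P(e_{k+1:t}\mid X_k) $$ for $k < t$.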
### Most likely explanation
Finding the most likely sequence of hidden states given the whole evidence sequence.
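This is what the Viterbi algorithm computes. A sketch on the same kind of two-state model as in the filtering example, again with illustrative numbers:

```python
# Most likely hidden-state sequence for a two-state HMM (Viterbi).
T = [[0.7, 0.3], [0.3, 0.7]]               # T[i][j] = P(X_t=j | X_{t-1}=i)
O = {True: [0.9, 0.2], False: [0.1, 0.8]}  # P(evidence | state)
prior = [0.5, 0.5]

def viterbi(evidence):
    probs = [O[evidence[0]][j] * prior[j] for j in range(2)]
    backpointers = []
    for e in evidence[1:]:
        # For each state j, pick the best predecessor i.
        best = [max(range(2), key=lambda i: probs[i] * T[i][j])
                for j in range(2)]
        probs = [probs[best[j]] * T[best[j]][j] * O[e][j] for j in range(2)]
        backpointers.append(best)
    # Walk the back-pointers from the best final state.
    state = max(range(2), key=lambda j: probs[j])
    path = [state]
    for best in reversed(backpointers):
        state = best[state]
        path.append(state)
    return list(reversed(path))

print(viterbi([True, True, False]))  # [0, 0, 1]: rain, rain, no rain
```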
## Bernoulli scheme
A special case of Markov chains where the next state is independent of the current state, i.e. a sequence of independent, identically distributed trials (like repeated die rolls).
# Learning
## Decision trees
Decision trees are a way of making decisions when the different cases are describable by attribute-value pairs. The target function should be discrete-valued. Decision trees may also perform well with noisy training data.
A decision tree is simply a tree where each internal node represents an attribute and its edges represent the different values that attribute may take. To reach a decision, you follow the edges matching your case's attribute values down the tree until you reach a leaf node, which holds the decision.
### Building decision trees
The general procedure for building decision trees is a simple recursive algorithm: select an attribute to split the training data on, then recurse until all the data is classified or there are no attributes left to split on. How big the tree becomes depends on how you select the attribute to split on.
To keep the tree small, you want to split on the attribute that carries the most information, i.e. the one that reduces the entropy the most (the highest information gain); see the sketch below.
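A sketch of the entropy and information-gain computations used to choose the split attribute; the (attributes, label) example representation is an assumption for illustration:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total)
                for c in Counter(labels).values())

def information_gain(examples, attribute):
    """Entropy reduction from splitting on `attribute`.
    `examples` is a list of (attributes_dict, label) pairs."""
    labels = [label for _, label in examples]
    remainder = 0.0
    for value in {attrs[attribute] for attrs, _ in examples}:
        subset = [label for attrs, label in examples
                  if attrs[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return entropy(labels) - remainder

# At each node, split on max(attributes, key=lambda a: information_gain(data, a)).
```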
#### Overfitting
If the tree is grown until it fits the training data perfectly, it may end up incorporating noise from the training data, which increases the error rate on unseen data. A way of avoiding this is pruning: evaluate how well the tree classifies held-out data with each node (including the subtree below it) removed, then greedily remove the node whose removal increases accuracy the most, repeating until no removal helps.
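This is essentially reduced-error pruning. A rough sketch, assuming trees are nested dicts of the form {attribute: {value: subtree}} with plain labels as leaves, and simplifying by always using one global majority label for the replacement leaf (proper pruning would use the majority label at each node):

```python
def classify(tree, attrs):
    while isinstance(tree, dict):                  # internal node
        attribute, branches = next(iter(tree.items()))
        tree = branches[attrs[attribute]]
    return tree                                    # leaf: a class label

def accuracy(tree, validation):
    return sum(classify(tree, attrs) == label
               for attrs, label in validation) / len(validation)

def subtree_paths(tree, path=()):
    """Yield the path (sequence of branch values) to every internal node."""
    if isinstance(tree, dict):
        yield path
        attribute, branches = next(iter(tree.items()))
        for value, child in branches.items():
            yield from subtree_paths(child, path + (value,))

def replaced(tree, path, leaf):
    """Copy of `tree` with the subtree at `path` replaced by `leaf`."""
    if not path:
        return leaf
    attribute, branches = next(iter(tree.items()))
    new_branches = dict(branches)
    new_branches[path[0]] = replaced(branches[path[0]], path[1:], leaf)
    return {attribute: new_branches}

def prune_once(tree, validation, majority_label):
    """Greedily replace the internal node whose removal helps most."""
    best, best_acc = tree, accuracy(tree, validation)
    for path in subtree_paths(tree):
        candidate = replaced(tree, path, majority_label)
        acc = accuracy(candidate, validation)
        if acc > best_acc:
            best, best_acc = candidate, acc
    return best  # call repeatedly until the tree stops changing
```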
# Case-based reasoning
Case-based reasoning assumes that similar problems have similar solutions and that what works today will still work tomorrow. It uses these assumptions as a framework for learning and for deciding what to do in both known and unknown circumstances.
## Main components
* Case base
    * Previous cases
* A method for retrieving relevant cases and their solutions
* A method for adapting a retrieved solution to the current case
* A method for learning from the solved problem

These methods correspond to the classic four-step CBR cycle: retrieve, reuse, revise, retain.
## Approaches
### Instance-based
An approach based on classical machine learning for *classification* tasks: cases are attribute-value pairs compared with a similarity metric. The focus is on automated learning without manual intervention, which requires a large base of stored instances.
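Instance-based reasoning is essentially nearest-neighbour classification. A minimal 1-NN sketch with a made-up similarity metric over attribute-value pairs:

```python
def similarity(a, b):
    """Illustrative similarity metric: fraction of matching attribute values."""
    return sum(a[k] == b[k] for k in a) / len(a)

def classify(case_base, query):
    """1-nearest-neighbour: return the solution of the most similar stored case.
    `case_base` is a list of (attributes_dict, solution) pairs."""
    best_case = max(case_base, key=lambda case: similarity(case[0], query))
    return best_case[1]

# Hypothetical case base and query.
cases = [({"fever": True, "cough": True}, "flu"),
         ({"fever": False, "cough": True}, "cold")]
print(classify(cases, {"fever": True, "cough": False}))  # -> "flu"
```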
### Memory-based reasoning
Like instance-based reasoning, but massively parallel: the distance from the query to every case in the database is computed, which suits parallel hardware.
### (Classic) Case-based reasoning
Motivated by psychological research. Uses background knowledge to specialise systems and can adapt similar cases on a larger scale than instance-based reasoning.
### Analogy-based
Similar to case-based, but with emphasis on solving cross-domain cases, reusing knowledge from other domains.