TDT4280: Distributed AI and Multi Agent Systems (Multi Agent Systems and Game Theory)
Three main types of agent arcitechtures this course deals with: - Reactive - Deliberative - Hybrid
Means end reasoning
Is reasoning about how to reach ends with available means. Essentially it is planning.
Brooks Subsumption architecture
Brooks Subsumption architecture is purely reactive. It has layers, and the layers process the environment independently. Higher layers subsume the action of lower layers (e.g. a exploration layer subsumes a collision aversion layer and doesn't bother with handlig collision detection), therefore lower layers get priority over higher layers when signals are sent to the actuators (An Introduction to MultiAgent Systems, Michael Wooldridge, 2002, page 91)
Belief, Desire, Intention agents
In the BDI arcitechture, the agent has beliefs (its base of knowledge), desires (what it wants to do with its life), and intentions (plans on how to achieve its desires). Through deliberating on the beliefs and desires it possesses (possesses it?) it forms intentions about what it wants to do next.
A layered acitechture, thoug different layers than the subsumption, and unlike subsumption, some of the layers are resposnsible for deliberation. The layers are Model (M), Planing (P), and Reactive (R). They are situated in a control framework, that handles some conflict resolution and message passing. M is responsible for deliberation, reflection, and the long term. P is responsible for the middle term, making intentions and executing plans. R is much like subsumption, handles events that need immediate responses. All layers are directly linked to all sensors and actuators.
Is about players/agents in games. Assumptions are that all agents are rational, and know the rules of the game, and the strategies of the other players. Payoff matrices are used for representing outcomes of the different combinations of strategies agents can employ.
Strategies can dominate other strategies
Is when all outcomes of one strategy are strictly better than the outcomes of the strategy you're comparing it to.
Is when all outcomes of one strategy are better than or equal to the outcomes of another strategy
A players outcome is dependent on the chance of what the other player chooses. The best strategy in mixed strategy is to play so that the opponents strategies all have the same payoff (expected value).
If strongly remove strategies that are strongly dominated one by one. If weakly, remove strategies that are weakly dominated one by one. In weakly itterated elimination order of removal matters, as there can be more than one viable weakly dominant strategy for each player.
Nobody wants to change strategy.
You cant get better outcomes for any player without screwing over at least one player.
Zero sum games
Games where the social welfare of all outcomes is 0. The games are completely adverserial (no cooperation makes any sense), and there are no pure Nash equilibria (only mixed strategies available).
Only relevant in cooperation games, but makes alot of sense there.
Evolutionary game theory
EGT takes time into account, which makes it better suited for learning. The only things that are updated are beliefs and strategies. An evolutionary stable strategy/state is one where no new population member can appear and suddenly change everything (again).
Are about distributing goods. They come in many forms, single item, bundles, partial items, one to many, many to one, many to many, etc. The purpose is to try to maximize the profit of both the seller and the buyer.
Personal value/true value is what an item is worth to you. Common value is how the market values an item. Correlated value is the personal value scewed by the common value.
Auctions can be used to valuate an item if there is no priorly known common value (good for seller, best types of auctions for this is english and I think one other).
Also known as open cry ascending. Each bidder must bid higher than the previous bid, all bidders know the bids made by others, the one with the highers bid when the auction ends gets the item.
Also known as open cry descending. The auctioner starts with a high price, and reduces his price untill one bidder bids the price and wins the auction.
First price sealed bid
All bids are made simultaneously, nobody gets to know the bids of others. The one with the highest bid wins the item.
Second price sealed bid (Vickrey)
Same as first price sealed bid, except that the winner pays the price of the second highest bid. The winning strategy in this auction form is to bid ones own true price.
Auctions where many items are being sold at once, and the bidders can bid on sets they make up themselves. It is therefore both complicated to know what to bid, and what combination of bidders to sell to.
Is when a group of biddes cooperate to win the auction at an artificcially low price, and then somehow split the item and the difference among themselves.
Is a qualitative way of reaching agreements. The conflict deal (
Negotiation for resource division
Negotiation done in rounds, where one part comes with an offer, and the other part can either accept, or reject and come with counteroffer. If known (finite) number of rounds, only the last round matters, because the agent reviewing the deal will accept whatever is proposed that is as good or better than the conflict deal. Fair if unknown ammount of rounds, but can possibly last forever if number of rounds is not limmited.
Task Oriented Domains
There are tasks instead of resources in deals. The negotiation set is the set of reasonable deals there is any meaning to negotiate about (deals that are better than
Monotonic Concession Protocol
Is a way of reaching deals. The way it works is that in the first round, everybody proposes deals. If no deal is accepted, each agent can either come with a new deal or wait in the subsequent rounds. If every agent passes the negotiation ends and the
Is an approach to the monotonic concession situation. It says that your first bit should be the one that is best for yourself. Thereafter you should not concede so little that you waste time by having to concede again because the other players are unlikely to accept the deal, but no so much that give up utility for yourself you wouldn't have had to give up otherwise.
Zeuthen strategy uses the concept of risk willingness for wheterher you should wait or make a proposal in any given round. The agent with the lowest risk willingness should concede, the others should wait. In ties both concede. You have high risk willingness if you have little to lose by the conflict deal becoming the result. If the utility of your current deal is the same as that of the conflict deal to you your risk willingness is 1. Risk willingness is otherwise calculated like this:
Risk willingness = (utility of my proposed deal to me - utility of the deal the other agent proposed to me ) / utility of my proposed deal to me.
Tries to find deals in a graph G=(V,E). Deals are nodes. Edges are offers that takes you to other deals.
Making Group Decisions (Voting)
Decision games have agents and outcomes. Social choice function: Single winner. Social welfare function: Final ranking of candidates. Votes are agents rankings of the available candidate. An example vote is candidate 1 > candidate 2 > candidate 3.
Is a graphical represenation of votes (G=(V,E)). Nodes are candidates, and edges go from candidates that are ranked higher to those that are ranked lower. In case of ties there is no edge.
Four main kinds discussed in this course: Plurality, Iterated plurality, Borda count, and Slather ranking. A condorcet winner is a winner that wins iterated plurality wotes no matter how the matchups are ordered (I.e. the one clear winner of the plurality wote).
Selects one winner by seeing who has most wotes among those most prefferes. With only two candidates simple majority.
Two and two candidates are woted over, one candidate is left as final winner. Is succeptible to manipulation because oredring of pairs may matter.
Assings points based on rankins in votes. Sums points over all votes, and produces one final ordering of candidates as outputs
Looks at the majority graph, and tries to make it acyclic (this will imply there is one clear winner). Does this by trying to come up with new graphs by flipping edges. The new graph with the fewest flips is the winning graph, and the social choice is extracted.
A criterion for good elections that say that if two candidates appear in a certain order across all votes, changing order of any of the other candidates in any of the votes should not affect the relative ranking of the two in the final outcome.
Only a dictatorship (only one agents wote is taken into account when producing the outcome) satisfies both pareto optimality and IIA.
Is when an agent wotes not according to its personal preference, but on similar vote preceived as a more likely outcome.
Coalitional games has agents, goals, and a utility function for goals. Agents make coalitions to fulfill more goals / better. There is a marginal benefit to the coalition of an agent joining, and a marginal benefit to the agent of joining a certain coalition. Coalitions are only feasible if the marginal benefit of joining is positive for each member.
Induced subgraphs are graphs representing coalitions, where nodes are agents, and edges are the benefits the coalitions get from the member the edge goes to.
Marginal nets are attmepting to estimate the benefits of a coalition by saying that the benefit are the sum of the benefits of all possible subcoalitions (down to individual agents).
The Shapley value is a value that tries to say something about how the profits of a coalitions should be shared. It gives a number based on the contributions to the coalition by each agent.
Qualitative Coalitional Games
Games where the agents have goals, and they only requre that a coalition fullfills one of their goals for them to consider it to be worth it to join.
Coalitional Resource Games
Assume that achieving stuff requires resources, and that each agent initially only has a finite set of resources at the beginning of the game.
Is about splitting tasks on distributed agents because the tasks are too large for one agent to handle, or no one can be bothered to do it alone.
In iterated prisoners dilemma, it is agents with different strategies playing against each other, such as allways defect, allways cooperate, allways do what the other agent(s) did the previous round, etc.
The sum of an outcome. E.g. if player p1 plays strategy s1 for which he earns e1 and player p2 plays strategy s2 for which he earns e2, one sais that the social welfare of the outcome of s1 and s2 is e1 + e2.