CSCE475/875 Multiagent Systems

Handout 15: Topics Covered

November 8, 2011

1. Agents

· Agent

· Intelligent Agent

· The five characteristics of environments for agent-based solutions

o Complete vs. incomplete (fully vs. partially observable)

o Certain vs. uncertain (deterministic vs. stochastic)

o Episodic vs. non-episodic

o Static vs. dynamic

o Discrete vs. continuous

2. Chapter 1: Distributed Constraint Satisfaction

· Constraint satisfaction problems

· Solution approaches

o Least-commitment approach

o Backtracking approach

· Filtering algorithms

· Variable and value ordering, minimum remaining values heuristic, degree heuristic, least-constraining-value heuristic

· Min-conflicts algorithm

· Relation to multiagent systems
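
As a study aid, the backtracking approach, filtering, and the minimum remaining values heuristic listed above can be sketched together on a small map-coloring problem. This is a minimal illustration, not the course's reference algorithm: the function name `solve_coloring`, the map data, and the restriction to "neighbors must differ" constraints are all assumptions made for the example.

```python
def solve_coloring(variables, domains, neighbors):
    """Backtracking search with MRV ordering and forward checking.

    Assumption: every constraint is binary and says "neighbors differ".
    """
    def backtrack(assignment, current):
        if len(assignment) == len(variables):
            return assignment
        # MRV heuristic: choose the unassigned variable with the fewest values left.
        unassigned = [v for v in variables if v not in assignment]
        var = min(unassigned, key=lambda v: len(current[v]))
        for value in current[var]:
            # Filtering (forward checking): remove `value` from unassigned neighbors.
            pruned = {v: list(d) for v, d in current.items()}
            pruned[var] = [value]
            wiped_out = False
            for n in neighbors[var]:
                if n not in assignment:
                    pruned[n] = [x for x in pruned[n] if x != value]
                    if not pruned[n]:
                        wiped_out = True  # a neighbor has no value left
                        break
            if not wiped_out:
                result = backtrack({**assignment, var: value}, pruned)
                if result is not None:
                    return result
        return None  # every value failed, so the caller backtracks

    return backtrack({}, {v: list(domains[v]) for v in variables})


# Hypothetical example data: the regions of Australia and their borders.
neighbors = {
    "WA": ["NT", "SA"],
    "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"],
    "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"],
    "V": ["SA", "NSW"],
    "T": [],
}
variables = list(neighbors)
domains = {v: ["red", "green", "blue"] for v in variables}
solution = solve_coloring(variables, domains, neighbors)
print(solution)
```

In a distributed setting each variable would typically be owned by a separate agent; the sketch above is centralized and only illustrates the search ideas.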

3. Chapter 2: Distributed Optimization

· Different from constraint satisfaction – now looking for optimal solutions

· Four general families of approaches:

o Distributed dynamic programming

§ Asynchronous Dynamic Programming

§ Learning Real-Time A*

o Distributed solutions to Markov Decision Problems (MDPs)

§ Action selection in MDP, using a value iteration algorithm

o Optimization algorithms with an economic flavor (as applied to matching and scheduling problems)

§ Contract net and auction – See later Chapter on Auctions

o Coordination via social laws and conventions

§ Voting, social preferences
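
For review, action selection in an MDP using value iteration can be sketched as follows. This is a single-agent, centralized sketch under stated assumptions: the two-state MDP, the names `q_value` and `value_iteration`, and the discount factor `gamma=0.9` are all made up for illustration.

```python
def q_value(s, a, V, P, R, gamma):
    """Expected return of taking action a in state s, then following V."""
    return R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a].items())


def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-10):
    """Repeat Bellman backups until the values change by less than tol."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(s, a, V, P, R, gamma) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Action selection: act greedily with respect to the converged values.
    policy = {
        s: max(actions, key=lambda a: q_value(s, a, V, P, R, gamma))
        for s in states
    }
    return V, policy


# Hypothetical MDP: staying in B pays 1 per step; everything else pays 0.
states = ["A", "B"]
actions = ["stay", "go"]
P = {  # P[s][a] maps next states to probabilities
    "A": {"stay": {"A": 1.0}, "go": {"B": 1.0}},
    "B": {"stay": {"B": 1.0}, "go": {"A": 1.0}},
}
R = {
    "A": {"stay": 0.0, "go": 0.0},
    "B": {"stay": 1.0, "go": 0.0},
}
V, policy = value_iteration(states, actions, P, R)
print(V, policy)
```

With these numbers the values approach V(B) = 1 / (1 - 0.9) = 10 and V(A) = 0.9 * 10 = 9, so the greedy policy moves from A to B and then stays.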

4. Chapter 3: Noncooperative Game Theory

· Self-interested agents

o Axioms on Completeness, Transitivity, Substitutability, Decomposability, Monotonicity, Continuity, and the von Neumann and Morgenstern Theorem.

· Games in normal form

o Prisoner’s dilemma

o Common-payoff games

o Constant-sum games

· Strategies in normal-form games

o Pure strategy vs. mixed strategy profiles

o Definitions for Support and Expected utility of a mixed strategy

· Analyzing games: from optimality to equilibrium

o The notion of an optimal strategy for a given agent is not meaningful; the best strategy depends on the choices of others

o Pareto domination

o Pareto optimality

o Best response

o Nash equilibrium

o Strict Nash

o Weak Nash
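
The definitions of best response, Nash equilibrium, and Pareto domination can be checked mechanically on a small normal-form game. The sketch below uses one common choice of Prisoner's dilemma payoffs; the specific numbers and the function names `pure_nash` and `pareto_dominated` are assumptions for illustration, and only pure strategies are considered.

```python
def pure_nash(payoffs, actions1, actions2):
    """Return all pure-strategy profiles where each action is a best response."""
    equilibria = []
    for a1 in actions1:
        for a2 in actions2:
            u1, u2 = payoffs[(a1, a2)]
            # Player 1 cannot gain by deviating while player 2 keeps a2.
            best1 = all(u1 >= payoffs[(b1, a2)][0] for b1 in actions1)
            # Player 2 cannot gain by deviating while player 1 keeps a1.
            best2 = all(u2 >= payoffs[(a1, b2)][1] for b2 in actions2)
            if best1 and best2:
                equilibria.append((a1, a2))
    return equilibria


def pareto_dominated(profile, payoffs):
    """True if some other outcome is at least as good for all, better for one."""
    u = payoffs[profile]
    return any(
        all(x >= y for x, y in zip(w, u)) and any(x > y for x, y in zip(w, u))
        for w in payoffs.values()
    )


# Hypothetical Prisoner's dilemma payoffs: C = cooperate, D = defect.
actions = ["C", "D"]
payoffs = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-4, 0),
    ("D", "C"): (0, -4),
    ("D", "D"): (-3, -3),
}
equilibria = pure_nash(payoffs, actions, actions)
print(equilibria)
print(pareto_dominated(("D", "D"), payoffs))
```

Under these payoffs the only pure-strategy Nash equilibrium is mutual defection, which is also the only outcome that is Pareto dominated.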

5. Chapter 7: Learning and Teaching

· The interaction between learning and teaching

o Stackelberg game

· What constitutes learning?

· Two categories of theories of learning in MAS: Descriptive and Prescriptive

· Descriptive

o Realism, Convergence

o Convergence properties

§ Show convergence to stationary strategies which form a Nash equilibrium of the stage game

§ Require that the empirical frequency of play converge to a Nash equilibrium

§ Seek convergence to a correlated equilibrium of the stage game

§ Require that the non-stationary policies converge to an interesting state

· Prescriptive

o Strategic normative games – where agents are self-motivated

o Notion of self-play

o Safety, Rationality, and No-Regret, informal

· Fictitious play – an instance of model-based learning, in which the learner explicitly maintains beliefs about the opponent’s strategy

· Rational learning

· Reinforcement learning

o Q-learning

§ Alpha (learning rate), beta (discount factor)

o Belief-based reinforcement learning

o Targeted learning, no-regret learning
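
The Q-learning update can be sketched in a few lines. The code assumes that alpha is the learning rate and beta is the discount factor, and it replaces sampled experience with a fixed sweep over every state-action pair of a made-up deterministic environment, so it illustrates the update rule rather than a full single-agent or multiagent learner. The names `q_update` and `step` are hypothetical.

```python
def q_update(Q, s, a, r, s_next, actions, alpha, beta):
    """One Q-learning update: blend the old estimate with the new target."""
    v_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * (r + beta * v_next)


# Hypothetical deterministic environment: (state, action) -> (reward, next state).
step = {
    ("A", "stay"): (0.0, "A"),
    ("A", "go"): (0.0, "B"),
    ("B", "stay"): (1.0, "B"),
    ("B", "go"): (0.0, "A"),
}
actions = ["stay", "go"]
Q = {sa: 0.0 for sa in step}

# Assumption: instead of exploring, visit every pair once per sweep.
for _ in range(1000):
    for (s, a), (r, s_next) in step.items():
        q_update(Q, s, a, r, s_next, actions, alpha=0.5, beta=0.9)

print(Q)
```

With beta = 0.9 the estimates approach Q(B, stay) = 10, Q(A, go) = 9, and Q(A, stay) = Q(B, go) = 8.1, so acting greedily on Q means moving to B and staying there.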

6. Chapter 9: Aggregating Preferences: Social Choice

· Plurality voting

o Condorcet condition

· Social choice function, Social choice correspondence, Condorcet winner, Smith set, Social welfare function

· Voting

o Plurality voting, cumulative voting, approval voting, plurality with elimination, Borda voting, pairwise elimination

o Voting paradoxes: Condorcet condition not met, sensitivity to a losing candidate, sensitivity to the agenda setter

· Social welfare functions (ordering!)

o Pareto efficiency (PE), Independence of irrelevant alternatives (IIA), Nondictatorship

o Arrow’s Impossibility Theorem

· Social choice functions (top-ranked outcome!)

o Weak Pareto efficiency, Monotonicity, Nondictatorship

o Muller-Satterthwaite’s Impossibility Theorem

· Ranking system

o Agents are asked to vote to express their opinions about each other, with the goal of determining a social ranking

§ Agents who are ranked higher by others have more weighted votes.

o Approval voting satisfies IIA, PE, and nondictatorship

o Approval voting satisfies Ranked IIA, positive response and anonymity
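
The voting paradox "Condorcet condition not met" can be reproduced with a short script that compares plurality voting, Borda voting, and the Condorcet winner on one preference profile. The profile and the function names below are illustrative assumptions, and ties are not handled carefully.

```python
from collections import Counter


def plurality_winner(profile):
    """Candidate with the most first-place votes."""
    counts = Counter()
    for ranking, n in profile:
        counts[ranking[0]] += n
    return max(counts, key=counts.get)


def borda_winner(profile):
    """Candidate with the highest Borda score (m-1 points for first, 0 for last)."""
    scores = Counter()
    for ranking, n in profile:
        m = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] += n * (m - 1 - position)
    return max(scores, key=scores.get)


def condorcet_winner(profile):
    """Candidate who beats every other candidate in a pairwise majority vote."""
    candidates = profile[0][0]
    total = sum(n for _, n in profile)
    for c in candidates:
        if all(
            2 * sum(n for r, n in profile if r.index(c) < r.index(d)) > total
            for d in candidates
            if d != c
        ):
            return c
    return None  # no Condorcet winner exists for this profile


# Hypothetical profile: (ranking from best to worst, number of voters).
profile = [
    (("a", "b", "c"), 499),
    (("b", "c", "a"), 3),
    (("c", "b", "a"), 498),
]
print(plurality_winner(profile), borda_winner(profile), condorcet_winner(profile))
```

Here plurality elects a, even though b beats both a and c in pairwise comparisons; Borda voting happens to agree with the Condorcet winner on this profile.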

7. Chapter 10: Protocols for Strategic Agents: Mechanism Design

· Strategic! Assume that agents will behave so as to maximize their individual payoffs

· Why is mechanism design so important to MAS designers?

· Local decision making vs. global, emergent coherence

o Autonomy vs. social chaos

· Bayesian game setting and mechanism

o Implementation in dominant strategies

o Implementation in Bayes-Nash equilibrium

· The truthfulness property

o The revelation principle

· Gibbard-Satterthwaite’s Impossibility Theorem

· A way to get around the impossibility: Quasilinear Preferences

o Rewards and payments

o Risk attitudes: neutral, averse, and seeking

o Conditional utility independence, valuation, payment

o Basic constraints: Truthfulness, Efficiency, Budget Balance, Ex interim Individual Rationality, Ex post Individual Rationality, Tractability

o Optimization properties: Revenue Maximization, Revenue Minimization, Maxmin Fairness, and Price-Of-Anarchy Minimization

· Groves Mechanism

o Truth telling is a dominant strategy under any Groves mechanism

· The Vickrey-Clarke-Groves (VCG) Mechanism

o a.k.a. Pivot Mechanism

o Clarke tax

o How does the mechanism work?

o Drawbacks of VCG
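
One way to review how the VCG (pivot) mechanism works is to compute the chosen outcome and the Clarke tax for a tiny example. The sketch below assumes quasilinear preferences, a finite set of outcomes, and reported valuations given as dictionaries; the function name `vcg` and the numbers are hypothetical.

```python
def vcg(valuations, outcomes):
    """Return the welfare-maximizing outcome and each agent's Clarke payment.

    valuations[i][x] is agent i's reported value for outcome x.
    """
    def welfare(x, exclude=None):
        return sum(v[x] for i, v in enumerate(valuations) if i != exclude)

    chosen = max(outcomes, key=welfare)
    payments = []
    for i in range(len(valuations)):
        # Welfare the others could get if agent i were absent...
        best_without_i = max(welfare(x, exclude=i) for x in outcomes)
        # ...minus the welfare the others actually get under the chosen outcome.
        payments.append(best_without_i - welfare(chosen, exclude=i))
    return chosen, payments


# Hypothetical example: outcome k means "agent k receives the single good".
outcomes = [0, 1, 2]
valuations = [
    {0: 10, 1: 0, 2: 0},
    {0: 0, 1: 7, 2: 0},
    {0: 0, 1: 0, 2: 3},
]
chosen, payments = vcg(valuations, outcomes)
print(chosen, payments)
```

Each agent pays the externality it imposes on the others: agent 0 wins and pays 7, the value the others lose because of its presence, while the non-pivotal agents pay nothing. In this single-good case the result coincides with a second-price auction.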

8. Chapter 11: Protocols for Multiagent Resource Allocation: Auctions

· How can auctions be used to allocate tasks or resources?

· Single-good auctions

o English, Japanese, Dutch, and sealed-bid auctions

§ Open-cry, Open-exit, First price vs. second price, Vickrey

· Auctions as negotiations

· Auctions as Bayesian mechanisms

o Independent private value (IPV) setting (as opposed to common value or interdependent value settings)

· In a second-price auction where bidders have independent private values, truth telling is a dominant strategy

· Strategic equivalence, time complexity, communication complexity

· Revenue equivalence theorem

o Risk attitudes

o Relationships between revenues of various single-good auction protocols

· Other auctions: Reverse auctions, Double auctions, All-pay (with entry costs) auctions

· Collusion

o How does a bidding ring survive?

o How does the revenue equivalence theorem factor into a bidder’s decision to join a ring?

· Contract net protocol (CNP)

o Task announcement, Task announcement processing, Bidding, Bid processing, Contract processing, Reporting results, Termination, and Negotiation tradeoffs
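
The claim that truth telling is a dominant strategy in a second-price auction with independent private values can be spot-checked numerically. The sketch below is not a proof: it compares the truthful bid against a grid of alternative bids for two made-up sets of competing bids, and it assumes that ties are lost. All names and numbers are hypothetical.

```python
def second_price_utility(value, bid, other_bids):
    """Bidder's utility in a sealed-bid second-price auction (ties are lost)."""
    highest_other = max(other_bids)
    if bid > highest_other:
        return value - highest_other  # winner pays the second-highest bid
    return 0.0


def truthful_is_weakly_best(value, other_bids, candidate_bids):
    """Check that bidding the true value does at least as well as each candidate."""
    truthful = second_price_utility(value, value, other_bids)
    return all(
        truthful >= second_price_utility(value, b, other_bids)
        for b in candidate_bids
    )


candidate_bids = [b / 2 for b in range(0, 41)]  # 0.0, 0.5, ..., 20.0
# Case 1: the bidder's value exceeds all other bids.
check_low = truthful_is_weakly_best(8.0, [5.0, 6.0], candidate_bids)
# Case 2: another bid exceeds the bidder's value, so overbidding would hurt.
check_high = truthful_is_weakly_best(8.0, [9.0, 3.0], candidate_bids)
print(check_low, check_high)
```

In the second case, bidding 10 wins the good but costs 9, for a utility of -1, which is why shading the bid upward never helps.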

9. Chapter 12: Teams of Selfish Agents: An Introduction to Coalitional Game Theory

· How self-interested agents can combine to form effective teams

o Which coalitions to form?

o How to distribute payoffs?

· Coalitional game with transferable utility

o The payoffs to a coalition may be freely redistributed among its members

· Examples: voting game, airport game

· Superadditive game, Additive game, Constant-Sum game, Convex game, Simple game

· Analyzing coalitional games in terms of payoffs to members

o Feasible payoff, Pre-imputation, Imputation, Individual rationality

· Payoffs should be divided fairly

o Axioms: Symmetry, Dummy player, Additivity

o Given a coalitional game, there is a unique pre-imputation that satisfies the symmetry, dummy player, and additivity axioms

o Shapley value!

§ Average marginal contribution

§ Why is it fair?

o The Core

§ The stability issue

§ A payoff vector is in the core of a coalitional game iff, for each coalition, the sum of its members’ payoffs is at least the value of that coalition.

§ Is the core always nonempty?

§ Characterizing when a coalitional game has a nonempty core

· Balanced Weights and Bondareva-Shapley

· Veto player’s role in simple games – core?

· Every convex game has a nonempty core; in a convex game, the Shapley value is in the core.
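
The Shapley value as an average marginal contribution, and the core membership test, can both be sketched by brute force for a small game. The weighted voting game below (four parties, 51 votes needed to win) and the function names are assumptions chosen for illustration; enumerating all orderings only works for a handful of players.

```python
from itertools import combinations, permutations


def shapley_values(players, v):
    """Average each player's marginal contribution over all join orders."""
    phi = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            phi[p] += v(with_p) - v(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in phi.items()}


def in_core(payoff, players, v, eps=1e-9):
    """True if the payoff is efficient and no coalition gets less than its value."""
    if abs(sum(payoff[p] for p in players) - v(frozenset(players))) > eps:
        return False
    for size in range(1, len(players)):
        for S in combinations(players, size):
            if sum(payoff[p] for p in S) < v(frozenset(S)) - eps:
                return False
    return True


# Hypothetical voting game: a coalition wins (value 1) with at least 51 votes.
weights = {"A": 45, "B": 25, "C": 15, "D": 15}


def v(S):
    return 1.0 if sum(weights[p] for p in S) >= 51 else 0.0


players = list(weights)
phi = shapley_values(players, v)
print(phi)
print(in_core(phi, players, v))
```

In this game A receives 1/2 and each of B, C, and D receives 1/6, even though B has more votes than C or D. The game has no veto player and is not convex, and the Shapley payoff is not in the core: the coalition {A, B} is worth 1 but receives only 2/3.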