CSCE 475/875
Handout 5: From Optimality to Equilibrium
September 8, 2011
(Based on Shoham and Leyton-Brown 2011)
Introduction
In single-agent decision theory the key notion is that of an optimal strategy, that is, a strategy that maximizes the agent’s expected payoff for a given environment in which the agent operates. The situation in the single-agent case can be fraught with uncertainty, since the environment might be stochastic, partially observable, and spring all kinds of surprises on the agent. However, the situation is even more complex in a multiagent setting.
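Important: In a multiagent setting the environment includes other agents, each of whom is also trying to maximize its own payoff. Thus the notion of an optimal strategy for a given agent is not meaningful; the best strategy depends on the choices of others.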
Game theorists deal with this problem by identifying certain subsets of outcomes, called solution concepts, that are interesting in one sense or another. Here we describe two of the most fundamental solution concepts: Pareto optimality and Nash equilibrium.
Pareto Optimality
Definition 3.3.1 (Pareto domination) Strategy profile $s$ Pareto dominates strategy profile $s'$ if for all $i \in N$, $u_i(s) \geq u_i(s')$, and there exists some $j \in N$ for which $u_j(s) > u_j(s')$.
In other words, in a Pareto-dominated strategy profile some player can be made better off without making any other player worse off.
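The domination test can be sketched in a few lines of Python. This is a minimal illustration, assuming each strategy profile is represented only by its payoff vector (one entry per player); the function name `pareto_dominates` is hypothetical and not part of the source text.

```python
from typing import Sequence


def pareto_dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Return True if payoff vector u Pareto dominates payoff vector v.

    Assumption: u[i] and v[i] are player i's payoffs under the two
    strategy profiles being compared.
    """
    if len(u) != len(v):
        raise ValueError("payoff vectors must have one entry per player")
    # No player is worse off under u than under v ...
    no_one_worse = all(a >= b for a, b in zip(u, v))
    # ... and at least one player is strictly better off.
    someone_better = any(a > b for a, b in zip(u, v))
    return no_one_worse and someone_better
```

For instance, payoffs (3, 3) dominate (3, 2), while (3, 0) and (0, 3) are noncomparable: neither dominates the other.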
Pareto domination gives us a partial ordering over strategy profiles. Thus we cannot generally identify a single “best” outcome; instead, we may have a set of noncomparable optima.
Definition 3.3.2 (Pareto optimality) Strategy profile $s$ is Pareto optimal, or strictly Pareto efficient, if there does not exist another strategy profile $s' \in S$ that Pareto dominates $s$.
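As a rough illustration of the definition, the sketch below filters the pure strategy profiles of a small game down to those not dominated by any other pure profile. It is a simplification: the definition ranges over all strategy profiles, including mixed ones, whereas this code compares pure profiles only. The Prisoner's Dilemma payoff numbers and all names (`pareto_optimal_pure`, `PRISONERS_DILEMMA`) are assumptions chosen for the example.

```python
from typing import Dict, List, Sequence, Tuple

Profile = Tuple[str, ...]
Payoffs = Tuple[float, ...]


def pareto_dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if no player is worse off under u and some player is better off."""
    return (all(a >= b for a, b in zip(u, v))
            and any(a > b for a, b in zip(u, v)))


def pareto_optimal_pure(game: Dict[Profile, Payoffs]) -> List[Profile]:
    """Pure profiles that no other pure profile Pareto dominates."""
    return [
        profile
        for profile, payoffs in game.items()
        if not any(pareto_dominates(other, payoffs) for other in game.values())
    ]


# Hypothetical Prisoner's Dilemma payoffs (C = cooperate, D = defect).
PRISONERS_DILEMMA: Dict[Profile, Payoffs] = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-4, 0),
    ("D", "C"): (0, -4),
    ("D", "D"): (-3, -3),
}

optima = pareto_optimal_pure(PRISONERS_DILEMMA)
```

With these payoffs, only ("D", "D") is filtered out, because ("C", "C") makes both players better off.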
We can easily draw several conclusions about Pareto optimal strategy profiles.
· First, every game must have at least one such optimum, and there must always exist at least one such optimum in which all players adopt pure strategies.
· Second, some games will have multiple optima. For example, in zero-sum games, all strategy profiles are strictly Pareto efficient.
· Finally, in common-payoff games, all Pareto optimal strategy profiles have the same payoffs.
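The second and third conclusions can be checked on two small games. The sketch below is only a sanity check over pure strategy profiles, not a proof; the payoff tables for Matching Pennies (zero-sum) and a coordination game (common-payoff) are assumed for illustration, as is the helper name `undominated`.

```python
from typing import Dict, List, Sequence, Tuple

Profile = Tuple[str, str]
Payoffs = Tuple[float, float]


def pareto_dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if no player is worse off under u and some player is better off."""
    return (all(a >= b for a, b in zip(u, v))
            and any(a > b for a, b in zip(u, v)))


def undominated(game: Dict[Profile, Payoffs]) -> List[Profile]:
    """Pure profiles not Pareto dominated by any other pure profile."""
    return [s for s, u in game.items()
            if not any(pareto_dominates(v, u) for v in game.values())]


# Zero-sum: one player's gain is exactly the other's loss, so improving
# one payoff always worsens the other and nothing is dominated.
MATCHING_PENNIES: Dict[Profile, Payoffs] = {
    ("H", "H"): (1, -1), ("H", "T"): (-1, 1),
    ("T", "H"): (-1, 1), ("T", "T"): (1, -1),
}

# Common-payoff: both players always receive the same payoff.
COORDINATION: Dict[Profile, Payoffs] = {
    ("L", "L"): (2, 2), ("L", "R"): (0, 0),
    ("R", "L"): (0, 0), ("R", "R"): (2, 2),
}

zero_sum_optima = undominated(MATCHING_PENNIES)
common_payoff_values = {COORDINATION[s] for s in undominated(COORDINATION)}
```

Here every pure profile of Matching Pennies survives the filter, while the surviving profiles of the coordination game all yield the same payoff vector (2, 2).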
Best Response and Nash Equilibrium
Now we will look at games from an individual agent’s point of view, rather than from the vantage point of an outside observer.
Intuition: Our first observation is that if an agent knew how the others were going to play, his or her strategic problem would become simple. Specifically, he or she would be left with the single-agent problem of choosing a utility-maximizing action!
Formally, define $s_{-i} = (s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_n)$, a strategy profile $s$ without agent $i$’s strategy. Thus we can write $s = (s_i, s_{-i})$. If the agents other than $i$ (whom we denote $-i$) were to commit to play $s_{-i}$, a utility-maximizing agent $i$ would face the problem of determining his best response.
Definition 3.3.3 (Best response) Player $i$’s best response to the strategy profile $s_{-i}$ is a mixed strategy $s_i^* \in S_i$ such that $u_i(s_i^*, s_{-i}) \geq u_i(s_i, s_{-i})$ for all strategies $s_i \in S_i$.
The best response is not necessarily unique. Further, when the support of a best response includes two or more actions, the agent must be indifferent among them; otherwise, the agent would prefer to reduce the probability of playing at least one of the actions to zero. Thus any mixture of these actions is also a best response. Similarly, if there are two pure strategies that are individually best responses, any mixture of the two is necessarily also a best response.
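To make the indifference point concrete, here is a minimal sketch for a two-player game, assuming the row player's payoffs are given as a matrix and the opponent's mixed strategy as a probability vector. The function names and the Matching Pennies payoffs (index 0 = Heads, 1 = Tails) are illustrative assumptions, and the tolerance `tol` is there only to absorb floating-point rounding.

```python
from typing import List, Sequence


def expected_utilities(payoff: Sequence[Sequence[float]],
                       opponent_mix: Sequence[float]) -> List[float]:
    """Expected utility of each pure action against the opponent's mixture.

    payoff[a][b] is the agent's payoff when it plays a and the opponent
    plays b; opponent_mix[b] is the probability that the opponent plays b.
    """
    return [sum(p * u for p, u in zip(opponent_mix, row)) for row in payoff]


def pure_best_responses(payoff: Sequence[Sequence[float]],
                        opponent_mix: Sequence[float],
                        tol: float = 1e-9) -> List[int]:
    """Indices of all pure actions that maximize expected utility."""
    values = expected_utilities(payoff, opponent_mix)
    best = max(values)
    return [a for a, v in enumerate(values) if best - v <= tol]


# Hypothetical Matching Pennies payoffs for the player who wants to match.
ROW_PAYOFF = [[1, -1],
              [-1, 1]]

against_uniform = pure_best_responses(ROW_PAYOFF, [0.5, 0.5])
against_biased = pure_best_responses(ROW_PAYOFF, [0.7, 0.3])
```

Against the uniform mixture both actions are best responses, so the agent is indifferent and any mixture of the two is also a best response. Against an opponent who plays Heads with probability 0.7, only Heads is a best response.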
Important: Of course, in general an agent will not know what strategies the other players plan to adopt. Thus, the notion of best response is not a solution concept—it does not identify an interesting set of outcomes in this general case.
However, we can leverage the idea of best response to define what is arguably the most central notion in noncooperative game theory, the Nash equilibrium.
Definition 3.3.4 (Nash equilibrium) A strategy profile $s = (s_1, \ldots, s_n)$ is a Nash equilibrium if, for all agents $i$, $s_i$ is a best response to $s_{-i}$.
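For two-player games with finitely many actions, the definition can be checked directly on pure strategy profiles. The sketch below enumerates every pure profile and keeps those where each player's action is a best response to the other's. It does not find mixed-strategy equilibria, and the payoff matrices and names are assumptions made for the example. Checking pure deviations is enough here because the utility of any mixed deviation is a weighted average of the utilities of pure deviations.

```python
from itertools import product
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def pure_nash_equilibria(A: Matrix, B: Matrix) -> List[Tuple[int, int]]:
    """Pure-strategy Nash equilibria of a two-player game.

    A[r][c] is the row player's payoff and B[r][c] the column player's
    payoff when row plays r and column plays c.
    """
    n_rows, n_cols = len(A), len(A[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        # Row's action r must be a best response to column's action c.
        row_ok = all(A[r][c] >= A[k][c] for k in range(n_rows))
        # Column's action c must be a best response to row's action r.
        col_ok = all(B[r][c] >= B[r][k] for k in range(n_cols))
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


# Hypothetical Prisoner's Dilemma (0 = cooperate, 1 = defect).
PD_ROW = [[-1, -4], [0, -3]]
PD_COL = [[-1, 0], [-4, -3]]

# Hypothetical Matching Pennies (0 = Heads, 1 = Tails).
MP_ROW = [[1, -1], [-1, 1]]
MP_COL = [[-1, 1], [1, -1]]

pd_equilibria = pure_nash_equilibria(PD_ROW, PD_COL)
mp_equilibria = pure_nash_equilibria(MP_ROW, MP_COL)
```

With these payoffs the Prisoner's Dilemma has the single pure equilibrium (1, 1), mutual defection, while Matching Pennies has no pure-strategy equilibrium at all.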
Intuitively, a Nash equilibrium is a stable strategy profile: no agent would want to change his strategy if he knew what strategies the other agents were following. We can divide Nash equilibria into two categories, strict and weak, depending on whether or not every agent’s strategy constitutes a unique best response to the other agents’ strategies.
Definition 3.3.5 (Strict Nash) A strategy profile $s = (s_1, \ldots, s_n)$ is a strict Nash equilibrium if, for all agents $i$ and for all strategies $s_i' \neq s_i$, $u_i(s_i, s_{-i}) > u_i(s_i', s_{-i})$.
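Definition 3.3.6 (Weak Nash) A strategy profile $s = (s_1, \ldots, s_n)$ is a weak Nash equilibrium if, for all agents $i$ and for all strategies $s_i' \neq s_i$, $u_i(s_i, s_{-i}) \geq u_i(s_i', s_{-i})$, and $s$ is not a strict Nash equilibrium.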
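The difference between the two definitions is only whether the inequality is strict. The sketch below classifies a pure strategy profile of a two-player game as strict, weak, or not a Nash equilibrium. The payoff matrices and the function name `classify_pure_profile` are hypothetical. For a pure profile it suffices to compare against pure deviations: a mixed deviation different from the equilibrium action puts positive probability on some other action, so it is strictly worse whenever every other action is strictly worse.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def classify_pure_profile(A: Matrix, B: Matrix, r: int, c: int) -> str:
    """Classify pure profile (r, c) as 'strict', 'weak', or 'not Nash'.

    A[r][c] and B[r][c] are the row and column players' payoffs.
    """
    # Payoffs each player would get from a unilateral pure deviation.
    row_devs = [A[k][c] for k in range(len(A)) if k != r]
    col_devs = [B[r][k] for k in range(len(B[r])) if k != c]

    # Some deviation is strictly profitable: not an equilibrium.
    if any(d > A[r][c] for d in row_devs) or any(d > B[r][c] for d in col_devs):
        return "not Nash"
    # Every deviation is strictly worse: strict equilibrium.
    if all(d < A[r][c] for d in row_devs) and all(d < B[r][c] for d in col_devs):
        return "strict"
    # No deviation is profitable, but some deviation ties.
    return "weak"


# Hypothetical game: both players get 1 at profile (0, 0) and 0 elsewhere.
A = [[1, 0], [0, 0]]
B = [[1, 0], [0, 0]]

label_00 = classify_pure_profile(A, B, 0, 0)
label_11 = classify_pure_profile(A, B, 1, 1)
label_01 = classify_pure_profile(A, B, 0, 1)
```

In this game (0, 0) is strict, since any deviation drops the deviator's payoff from 1 to 0. Profile (1, 1) is only weak: deviating leaves each player's payoff at 0, so each player's action is a best response but not the unique one. Profile (0, 1) is not an equilibrium because the column player gains by switching to action 0.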