Entangled Equilibria and the Twin Prisoners' Dilemma post by Scott Garrabrant 283 days ago | Vadim Kosoy and Patrick LaVictoire like this | 2 comments In this post, I present a generalization of Nash equilibria to non-CDT agents. I will use this formulation to model mutual cooperation in a twin prisoners’ dilemma, caused by the belief that the other player is similar to you, and not by mutual prediction. (This post came mostly out of a conversation with Sam Eisenstat, as well as contributions from Tsvi Benson-Tilsen and Jessica Taylor) This post is sketchy. If someone would like to go through the work of making it more formal/correct, let me know. Also, let me know if this concept already exists. Let $$A_1,\ldots,A_n$$ be a finite collection of players. Let $$M_i$$ be a finite collection of moves available to $$A_i$$. Let $$U_i:\prod_i M_i\rightarrow\mathbb{R}$$ be a utility function for player $$A_i$$. Let $$\Delta_i$$ be the simplex of all probability distributions over $$M_i$$. Let $$V_i$$ denote the vector space of functions $$v:M_i\rightarrow\mathbb{R}$$, with $$\sum_{m\in M_i}v(m)=0$$. A vector given a distribution in $$p\in\Delta_i$$ and a vector $$v\in V_i$$ we say that $$v$$ is unblocked from $$p$$ if there exists an $$\varepsilon>0$$, such that $$p+\varepsilon v\in\Delta_i$$. i.e., it is possible to move in the direction of $$v$$ from $$p$$, and stay in $$\Delta_i$$. Given a strategy profile $$P=(p_1,\ldots,p_n)$$ in $$\prod_{j\leq n} \Delta_j$$, and a vector $$V=(v_1,\ldots,v_n)$$ in $$\prod_{j\leq n} V_j$$, we say that $$V$$ improves $$P$$ for $$A_i$$ if $\lim_{\varepsilon\rightarrow 0}\frac{\sum_{m_1\in M_1}\dots\sum_{m_n\in M_n}U_i(m_1,\dots,m_n)\left(\prod_{j\leq n}p_j(m_j)+\varepsilon v_j(m_j)-\prod_{j\leq n}p_j(m_j)\right)}{\varepsilon}>0.$ We call this limit the utility differential for $$A_i$$ at $$P$$ in the direction of $$V$$. i.e., $$V$$ improves $$P$$ for $$A_i$$ if $$U_i$$ is increased when $$P$$ is moved an infinitesimal amount in the direction of $$V$$. (not that this is defined even when the vectors are not unblocked from the distributions) A “counterfactual gradient” for player $$A_i$$ is a linear function from $$V_i$$ to $$\prod_{j\leq n} V_j$$, such that the function into the $$V_i$$ component is the identity. This represents how much player $$A_i$$ expects the probabilities of the other players to move when she moves her probabilities. A “counterfactual system” for $$A_i$$ is a continuous function which takes is a strategy profile $$P$$ in $$\prod_{j\leq n} \Delta_j$$, and outputs a counterfactual gradient. This represents the fact that a player’s counterfactual gradient could be different depending on the strategy profile the game ends up in. We fix a counterfactual system $$C_i$$ for each player. Claim: There exists a strategy profile $$P=(p_1,\ldots,p_n)$$ in $$\prod_{j\leq n} \Delta_j$$ such that for all players $$A_i$$, if $$v_i\in V_i$$ is unblocked from $$p_i$$, then $$C_i(P)(v_i)$$ does not improve $$P$$. I will call such a point an entangled equilibrium. Proof Sketch: (Very sketchy, and I have not verified this. There is probably a better way. It might not be true.) We will construct a continuous function from $$\prod_{j\leq n} \Delta_j$$ to itself. We do this by moving $$p_i$$ by adding the gradient of function from $$p_i$$ to to $$U_i$$, assuming all other players probabilities change according to the linear function $$C_i(P)$$. If adding this gradient would take the point out of the simplex $$\Delta_i$$, you hit a boundary after moving some proportion $$\alpha$$ of the gradient. Then, now that you are on the boundary, you use move by $$1-\alpha$$ times the gradient of the same function restricted to the boundary, and repeat. This function has a fixed point, by Brouwer. For each player, this fixed point must have 0 gradient, or be on the boundary, pointing outward. If it is on a boundary, it has the same property when restricted to that boundary. If there was an unblocked direction a player could move that would improve its utility, then the gradient would be nonzero. If the player is on a boundary, the gradient is pointing outward, and there is an unblocked direction that would be an improvement, there will be such a direction that stays on that boundary, and the gradient restricted to that boundary would be nonzero. $$\square$$ Example 1 Consider a twin prisoners’ dilemma game. Two players can either cooperate or defect. They get 0 utility for being exploited, 3 for exploiting, 2 for mutual cooperation, and 1 for mutual defection. Both players believe that the other is using a decision procedure that is entangled with their own, but they are not completely sure. Formally, both players think that when they increase their probability by $$\varepsilon$$, the other player increases by $$2\varepsilon/3$$, regardless of where the probabilities start. Thus $$C_1(p,q)(v_1)=(v_1,2v_1)/3$$, where $$v_1$$ is the vector ($$A_1$$ cooperates)-($$A_1$$ defects) in $$V_1$$, and $$v_2$$ is the analogous vector for player 2. Similarly, $$C_2(p,q)(v_2)=(2v_2/3, v_2)$$. Since both players always think that increasing their probability of cooperation increases utility, the only equilibrium is when both players cooperate with probability 1. Example 2 Now let look at a case where the gradients are a function of the distributions. Again, both players believe that the other is using a decision procedure that is entangled with their own, but they are not completely sure. Formally, player 1 thinks that when they increase (or decrease) their probably of cooperation by $$\varepsilon$$, the other player will increase (or decrease) their probability by $$\delta\varepsilon$$, where $$\delta$$ is one minus the square of the difference between their two probabilities. Player 2 on the other hand, thinks the other player will increase (or decrease) their probability by $$\delta\varepsilon$$, where $$\delta$$ is one minus the absolute value of the difference between their two probabilities. Thus $$C_1(p,q)(v_1)=(v_1,(1-(p-q)^2)\cdot v_2)$$, where $$v_1$$ is the vector ($$A_1$$ cooperates)-($$A_1$$ defects) in $$V_1$$, and $$v_2$$ is the analogous vector for player 2. Similarly, $$C_2(p,q)(v_2)=((1-|p-q|)\cdot v_1, v_2)$$. Thus, if player 1 cooperates with probability $$p$$, and player 2 cooperates with probability $$q$$, then player 1 expects to lose gain $$(2(1-(p-q)^2)-1)\varepsilon$$ utility by increasing his probability by $$\varepsilon$$. This function is only 0 if $$|p-q|=1/\sqrt{2}$$. Thus the only entangled equilibria with mixed strategies for player 1 have $$(p-q)=1/\sqrt{2}$$. Similarly, the only entangled equilibria with mixed strategies for player 2 have $$|p-q|=1/2$$. Thus, there are no mixed strategies for both players in equilibria. The only pure equilibrium is $$p=q=1$$ If $$p=0$$, then $$(2(1-(p-q)^2)-1)$$ must be nonpositive, so $$q$$ must be at least $$1/\sqrt{2}$$. None of these points are equilibria for player 2. If $$p=1$$, then $$(2(1-(p-q)^2)-1)$$ must be nonnegative, so $$q$$ must be at least $$1-1/\sqrt{2}$$. This is in equilibrium if $$q=1/2$$. If $$q=0$$, then $$p$$ must be at least $$1/2$$, This in in equilibrium if $$p=1/\sqrt{2}$$. If q=1, then $$p$$ must be at least $$1/2$$, and there is no equilibrium. Thus, there are three entangled equilibria (1,1), (1,1/2), (1/$$\sqrt{2}$$,0).

 by Vadim Kosoy 282 days ago | link I think you meant to divide by $$\epsilon$$ in the equation for the utility differential? A quick thought (haven’t thought much about this): It seems natural to require that the counterfactual system is involutive (of course for two strategy games the condition is trivial). I don’t know what the game theoretic implications would be. reply
 by Scott Garrabrant 282 days ago | link Fixed the $$\varepsilon$$, thanks. reply

### NEW DISCUSSION POSTS

Unfortunately, it's not just
 by Vadim Kosoy on Catastrophe Mitigation Using DRL | 0 likes

>We can solve the problem in
 by Wei Dai on The Happy Dance Problem | 1 like

Maybe it's just my browser,
 by Gordon Worley III on Catastrophe Mitigation Using DRL | 2 likes

At present, I think the main
 by Abram Demski on Looking for Recommendations RE UDT vs. bounded com... | 0 likes

In the first round I'm
 by Paul Christiano on Funding opportunity for AI alignment research | 0 likes

Fine with it being shared
 by Paul Christiano on Funding opportunity for AI alignment research | 0 likes

I think the point I was
 by Abram Demski on Predictable Exploration | 0 likes

(also x-posted from
 by Sören Mindermann on The Three Levels of Goodhart's Curse | 0 likes

(x-posted from Arbital ==>
 by Sören Mindermann on The Three Levels of Goodhart's Curse | 0 likes

>If the other players can see
 by Stuart Armstrong on Predictable Exploration | 0 likes

 by Abram Demski on Predictable Exploration | 0 likes

> So I wound up with
 by Abram Demski on Predictable Exploration | 0 likes

Hm, I got the same result
 by Alex Appel on Predictable Exploration | 1 like

Paul - how widely do you want
 by David Krueger on Funding opportunity for AI alignment research | 0 likes

I agree, my intuition is that
 by Abram Demski on Smoking Lesion Steelman III: Revenge of the Tickle... | 0 likes