by Jessica Taylor 611 days ago | Stuart Armstrong likes this

You might be interested in a way of ensuring that two players always have the same mixed strategy in all Nash equilibria of some game.

Assume we have a player $$A$$ and a player $$B$$. Player $$A$$ has some already-specified utility function; we would like player $$B$$ to play the same mixed strategy as $$A$$. Introduce a new player $$C$$ who observes a single action, which is $$A$$’s or $$B$$’s with 50% probability each, without being told whose it is. $$C$$ tries to determine who took this action, getting a utility of 1 for guessing correctly and 0 otherwise. $$B$$’s utility is 1 if $$C$$ guesses incorrectly and 0 if $$C$$ guesses correctly. $$B$$ will use the same mixed strategy as $$A$$ in all Nash equilibria: if the two strategies differed, $$C$$’s best response would guess correctly more than half the time, whereas $$B$$ can always hold $$C$$ to exactly one half by copying $$A$$’s strategy.

A similar method is used in Appendix A of the reflective oracles paper.
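
The construction above can be checked numerically. This is a minimal sketch, not a proof of the equilibrium claim: it only computes how often $$C$$ guesses correctly when best-responding to given mixed strategies for $$A$$ and $$B$$. The function name `best_response_accuracy` and the example strategies are assumptions made for illustration.

```python
def best_response_accuracy(p, q):
    """Probability that C guesses correctly when best-responding.

    p and q are the mixed strategies of A and B over the same action set,
    given as lists of probabilities (hypothetical representation).
    """
    # C sees action a coming from A with probability 0.5 * p[a] and from B
    # with probability 0.5 * q[a]. Guessing the likelier source is correct
    # with probability 0.5 * max(p[a], q[a]); sum this over all actions.
    return sum(0.5 * max(pa, qa) for pa, qa in zip(p, q))


p = [0.7, 0.3]  # example strategy for A (illustrative values)

matched = best_response_accuracy(p, [0.7, 0.3])     # B copies A
mismatched = best_response_accuracy(p, [0.5, 0.5])  # B deviates from A

print(matched, mismatched)
```

When $$B$$ copies $$A$$, $$C$$ does no better than a coin flip, so $$B$$ gets expected utility one half; any mismatch lets $$C$$ guess correctly more often, which lowers $$B$$’s utility.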
