Strategy Nonconvexity Induced by a Choice of Potential Oracles

Discussion post by Alex Appel | 22 days ago | Abram Demski likes this

One notable aspect of the extant cooperative oracle results is that they are in a different setting from the usual reflective oracle results. The usual reflective oracle results show a fixed point in the space of potential oracles, $$[0,1]^{M\times([0,1]\cap\mathbb{Q})}\times[0,1]^{M}$$. The first component of this is a function $$query:M\times([0,1]\cap\mathbb{Q})\to[0,1]$$, which dictates the probability of the potential oracle outputting 1 (and 0 otherwise) when asked whether a given Turing machine outputs 1 with probability greater than some rational number. The second component is a function that maps each Turing machine using the oracle to its probability of outputting a 1.

The cooperative oracle results elsewhere on IAFF show a fixed point in the space of strategies for a game, $$\prod_{i\in I}\Delta(S_{i})$$, where $$S_{i}$$ is the set of strategies for the i'th player. I was able to successfully translate the cooperative oracle existence proof from strategy-space to function-space. However, the further step of taking an arbitrary fixed point in strategy-space and explicitly building a true reflective oracle in function-space under which that fixed point is played in the associated game ran into some problems. Several assumptions were needed to make that work out, and the most unrealistic by far was the requirement of strategy continuity. As an example, in the game of matching pennies, when the probability of the other player outputting "heads" is exactly 0.5, the set of acceptable strategies goes from a simple "heads" or "tails" to "any possible mixture of heads and tails", while this requirement mandates that the chosen strategy merely shift very fast from "all heads" to "all tails" in the vicinity of 0.5.
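To make the continuity requirement concrete, here is a minimal sketch (not from the post; the function names and the width parameter are mine) contrasting the discontinuous best-response correspondence in matching pennies with the fast-but-continuous transition that the strategy-continuity assumption forces:

```python
def best_response_heads(p_opponent_heads):
    """The true best-response correspondence for the 'matcher':
    all heads above 0.5, all tails below, and *any* mixture of
    heads and tails is acceptable at exactly 0.5."""
    if p_opponent_heads > 0.5:
        return 1.0
    if p_opponent_heads < 0.5:
        return 0.0
    return None  # the whole interval [0, 1] is acceptable here


def continuous_response_heads(p_opponent_heads, eps=1e-6):
    """What strategy continuity mandates instead: a single continuous
    strategy that ramps from all-tails to all-heads inside a
    width-2*eps window around 0.5 (eps is a hypothetical parameter)."""
    t = (p_opponent_heads - (0.5 - eps)) / (2 * eps)
    return min(1.0, max(0.0, t))
```

The correspondence itself is not a function at 0.5; the continuous version papers over that point by shifting very fast, which is exactly the assumption the post flags as unrealistic.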
At the heart of this insufficiently strong result is the question: just what does it mean to have a convex set of possible responses without one picked out in particular?

One possible interpretation is that the convex set comes from the Turing machine/player having some probability of not halting, leaving a degree of freedom in which probability distribution the oracle reports for it. This doesn't seem realistic. An oracle-equipped player playing matching pennies against coin-flip bot has a "set of acceptable strategies" that includes all mixtures of heads and tails, but it doesn't make sense to say "ah, well, the player just doesn't ever halt and select a move, and can only be said to play heads with 50% probability by consulting the oracle about their behavior, upon which the oracle hallucinates an arbitrary nonstandard play at the end of time."

Assuming halting, then, it's possible to interpret the "convex set of possible responses" as being produced by the choice of oracle. (Here an "oracle" is defined as something that has the sharp 0-1 transition behavior around the probability of another Turing machine outputting something, with the transition accurately taking place at the true probability; no further restrictions are imposed on the randomization behavior at the threshold.) In the previous example against coin-flip bot, the strategy "call the oracle on (coinflipbot, 0.5) and respond with the opposite of what the oracle outputs" does indeed produce a convex set of possible responses, induced by what probability the oracle reports back when given a query that lies exactly on the threshold. As a quick refresher, reflective oracles can randomize when given a query where the rational number is the exact probability of the machine outputting 1, and there's a degree of freedom in what probability they randomize with.

So, these "convex sets of possible responses" come from the choice of oracle, right?
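As a sketch of how the oracle's threshold behavior generates that convex set in the coin-flip bot example (the helper name is mine, not from the post):

```python
def anti_oracle_response(q):
    """P(heads) for the strategy 'call the oracle on (coinflipbot, 0.5)
    and play the opposite of its answer', where q is the probability
    the oracle answers 1 on this exact-threshold query."""
    return 1.0 - q

# Sweeping q over [0, 1] traces out every mixture of heads and tails.
# Here, any mixture of two achievable points is itself achievable,
# because the response is affine in the single oracle parameter q:
lo, hi = anti_oracle_response(1.0), anti_oracle_response(0.0)  # 0.0 and 1.0
mid = 0.5 * lo + 0.5 * hi                                      # 0.5
q_for_mid = 1.0 - mid  # the oracle choice that realizes the mixture
```

With one oracle call the induced set really is convex; the counterexample below turns on what happens with two calls.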
If you specify oracle access to a particular strategy, there are lots of different possible oracles that could be chosen, and when you plot all the responses produced by each choice of oracle back in strategy-space, they make a convex set, don't they?

Well, no. Here's an explicit counterexample: a player/Turing machine whose set of "possible responses" to another player (where a possible response is a strategy that could be played under an appropriate choice of oracle) is nonconvex in strategy-space. Player A just flips a coin to decide their move, and Player B implements the following code:

```
IF (oracle(A, 0.5) = 1)
    IF (oracle(A, 0.5) = 1)
        Output Move_1
    ELSE
        Output Move_2
ELSE
    Output Move_3
```

Now, since Player A flips a coin, the probability distribution over moves 1, 2, and 3 is dictated by how the oracle randomizes on the query (A, 0.5). Select the oracle that outputs 1 with 50% probability (else 0). Then the distribution is 25% for move 1, 25% for move 2, and 50% for move 3. Now select the oracle that outputs 1 with 100% probability. Then the distribution is 100% for move 1. A 50/50 mixture of these two points is the distribution 62.5% for move 1, 12.5% for move 2, and 25% for move 3. To produce it, the oracle would have to output 1 on the query (A, 0.5) a $$\sqrt{.625}$$ fraction of the time, to match move 1's probability. But that oracle yields approximate probabilities of 62.5% for move 1, 16.6% for move 2, and 20.9% for move 3, so our 50/50 mixture of those two points has no associated oracle that could produce it.

Ok, so what's the takeaway? The requirement that "the strategy of a player is continuous" in the theorem linking cooperative oracles in strategy-space to cooperative oracles in function-space can't be disposed of, until someone figures out an interpretation of "convex set of possible responses" that isn't "the player doesn't halt" or "the choice of oracle dictates which of the possible responses will be chosen".
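The arithmetic in the counterexample can be checked with a short sketch (the helper name is mine; it assumes, as the post does, that Player B's two oracle calls are answered independently):

```python
def b_distribution(p):
    """Player B's distribution over (Move_1, Move_2, Move_3) when the
    oracle answers 1 on (A, 0.5) with probability p. With independent
    calls: P(Move_1) = p^2, P(Move_2) = p(1-p), P(Move_3) = 1-p."""
    return (p * p, p * (1.0 - p), 1.0 - p)

fair = b_distribution(0.5)  # (0.25, 0.25, 0.5)
sure = b_distribution(1.0)  # (1.0, 0.0, 0.0)

# The 50/50 mixture of those two achievable points:
mix = tuple(0.5 * a + 0.5 * b for a, b in zip(fair, sure))

# Matching Move_1's probability forces p = sqrt(0.625), but that oracle
# produces a different distribution, so no single oracle realizes the
# mixed point and the set of possible responses is nonconvex.
p = mix[0] ** 0.5
nearest = b_distribution(p)  # roughly (0.625, 0.166, 0.209)
```

Because P(Move_1) is quadratic in the single free parameter p, the image of [0, 1] under `b_distribution` is a curve in the simplex, not a convex set.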
