Intelligent Agent Foundations Forumsign up / log in
Forum Digest: Reflective Oracles
post by Jessica Taylor 1064 days ago | Kaya Stechly, Sam Eisenstat, Benja Fallenstein, Nate Soares and Patrick LaVictoire like this | discuss

Summary: This is a quick expository recap, with links, of writing (in papers and on this forum) on reflective oracles, through 3/21/15. Read this if you want to learn more about reflective oracles, or if you’re curious about what we’ve been working on lately!

Introduction

Solomonoff induction can be used to predict a sequence of bits produced by a computable environment. However, if the environment that produces the bits contains the Solomonoff inductor (or other Solomonoff inductors), then the environment will be uncomputable, and Solomonoff induction will no longer have any guarantee of accuracy. This motivates the study of the naturalized induction problem. This is still an open problem, but we have recently made progress using the framework of reflective oracles.

In this framework, programs are allowed to call a reflective oracle that (probabilistically) predicts the output of other programs (which may also call reflective oracles). Roughly speaking, the oracle answers questions of the form: “Is the probability that machine \(M\) returns 1 greater than \(p\)?”? If the probability that \(M\) returns 1 is less than \(p\), then the oracle must return 0; if it is greater than \(p\), it must return 1; and if it is exactly \(p\), then it may randomize between 0 and 1 arbitrarily.

It is due to the randomization that paradoxes do not arise. For example, we could define a program that asks the oracle if it returns 1 with “greater than” 50% probability, and then outputs 0 if this is true or 1 if it is false. In this case, the oracle returns 1 half the time and 0 the other half of the time for the query the program makes. Then the program also returns 1 half the time, so the oracle was indeed allowed to randomize its answer to the query.

While reflective oracles aren’t a complete solution to the naturalized induction problem, they are a useful model that lets us define naturalized algorithms and prove interesting properties about them. In particular, it is possible to define reflective variants of causal decision theory, Solomonoff induction, and AIXI.

Papers

  • Reflective oracles as a Foundation for Classical Game Theory, Benja Fallenstein, Jessica Taylor, and Paul Christiano. This paper formalizes reflective oracles, shows that they always exist, and defines causal decision theory for agents in a world containing reflective oracles. When multiple agents use this reflective version of CDT, they will play a Nash equilibrium. In fact, there is a one-to-one correspondence between reflective oracles and Nash equilibria.

  • Reflective variants of Solomonoff Induction and AIXI, Benja Fallenstein, Nate Soares, and Jessica Taylor. This paper uses reflective oracles to define reflective variants of Solomonoff induction and AIXI. A reflective Solomonoff inductor will make predictions consistent with the fact that its output affects its environment, and a reflective AIXI will take actions based on predictions consistent with AIXI being part of the environment.

Development of reflective oracles on the forum

Although you only need to read the papers to know about the current state of research in reflective oracles, forum posts on the subject show how the research developed.

  • Topological truth predicates: Towards a model of perfect Bayesian agents, Benja Fallenstein. This post introduces the setting where agents can predict each other using reflective truth predicates, and shows that a reflective truth predicate exists. This is similar to reflective oracles, but using logic instead of Turing machines. Paul commented on this post with a suggestion to use Turing machines instead of logic.

  • Oracle machines instead of topological truth predicates, Benja Fallenstein. This version of the reflective oracles formalism is similar to the one that is currently used, but is different in that it specifies that the oracle returns 0 for any program that does not halt for every oracle. The post shows that these reflective oracles always exist.

  • Simplicity priors with reflective oracles, Benja Fallenstein. It is possible to formalize a variant of Solomonoff induction using the reflective oracles from the previous post. There are some complications here that require “protecting” queries.

  • Multibit reflective oracles, Benja Fallenstein. This is a new version of reflective oracles that (a) extends the formalism to programs outputting a stream of bits (not just a single bit) and (b) changes how the oracles behave on non-halting machines. Instead of answering 0 in response to a query about a machine that does not halt for all oracles, a reflective oracle will now have somewhat arbitrary behavior for programs that do not halt (depending on the probability that the program does not halt). Formalizing Solomnoff induction is significantly simpler with these new oracles.

  • Single-bit reflective oracles are enough, Benja Fallenstein. It turns out that we can use single-bit reflective oracles to get the same functionality as multi-bit reflective oracles by asking them for conditional probabilities given previous bits.

Exploratory applications of reflective oracles

  • Oracle machines for automatic philosophy, Nisan Stiennon (writing up Paul Christiano’s idea). We can use reflective oracles to define a method to automate philosophy by setting up a thought experiment where a person can ask themselves questions and get responses back immediately, in a recursive fashion. This is similar to an earlier approach that has been dubbed “Paul-boxing”, but seems to have some nicer properties. See Paul’s comment for more of his posts on the subject.

  • UDT in the land of probabilistic oracles, Jessica Taylor. This is a first step towards trying to define a UDT variant using reflective oracles. However, the formalism in this post is unsatisfactory because when making a decision, an agent will only act as if it its decision determines the actions of exact copies of itself.



NEW LINKS

NEW POSTS

NEW DISCUSSION POSTS

RECENT COMMENTS

[Delegative Reinforcement
by Vadim Kosoy on Stable Pointers to Value II: Environmental Goals | 1 like

Intermediate update: The
by Alex Appel on Further Progress on a Bayesian Version of Logical ... | 0 likes

Since Briggs [1] shows that
by 258 on In memoryless Cartesian environments, every UDT po... | 2 likes

This doesn't quite work. The
by Nisan Stiennon on Logical counterfactuals and differential privacy | 0 likes

I at first didn't understand
by Sam Eisenstat on An Untrollable Mathematician | 1 like

This is somewhat related to
by Vadim Kosoy on The set of Logical Inductors is not Convex | 0 likes

This uses logical inductors
by Abram Demski on The set of Logical Inductors is not Convex | 0 likes

Nice writeup. Is one-boxing
by Tom Everitt on Smoking Lesion Steelman II | 0 likes

Hi Alex! The definition of
by Vadim Kosoy on Delegative Inverse Reinforcement Learning | 0 likes

A summary that might be
by Alex Appel on Delegative Inverse Reinforcement Learning | 1 like

I don't believe that
by Alex Appel on Delegative Inverse Reinforcement Learning | 0 likes

This is exactly the sort of
by Stuart Armstrong on Being legible to other agents by committing to usi... | 0 likes

When considering an embedder
by Jack Gallagher on Where does ADT Go Wrong? | 0 likes

The differences between this
by Abram Demski on Policy Selection Solves Most Problems | 1 like

Looking "at the very
by Abram Demski on Policy Selection Solves Most Problems | 0 likes

RSS

Privacy & Terms