Intelligent Agent Foundations Forumsign up / log in
Forum Digest: Reflective Oracles
post by Jessica Taylor 975 days ago | Kaya Stechly, Sam Eisenstat, Benja Fallenstein, Nate Soares and Patrick LaVictoire like this | discuss

Summary: This is a quick expository recap, with links, of writing (in papers and on this forum) on reflective oracles, through 3/21/15. Read this if you want to learn more about reflective oracles, or if you’re curious about what we’ve been working on lately!

Introduction

Solomonoff induction can be used to predict a sequence of bits produced by a computable environment. However, if the environment that produces the bits contains the Solomonoff inductor (or other Solomonoff inductors), then the environment will be uncomputable, and Solomonoff induction will no longer have any guarantee of accuracy. This motivates the study of the naturalized induction problem. This is still an open problem, but we have recently made progress using the framework of reflective oracles.

In this framework, programs are allowed to call a reflective oracle that (probabilistically) predicts the output of other programs (which may also call reflective oracles). Roughly speaking, the oracle answers questions of the form: “Is the probability that machine \(M\) returns 1 greater than \(p\)?”? If the probability that \(M\) returns 1 is less than \(p\), then the oracle must return 0; if it is greater than \(p\), it must return 1; and if it is exactly \(p\), then it may randomize between 0 and 1 arbitrarily.

It is due to the randomization that paradoxes do not arise. For example, we could define a program that asks the oracle if it returns 1 with “greater than” 50% probability, and then outputs 0 if this is true or 1 if it is false. In this case, the oracle returns 1 half the time and 0 the other half of the time for the query the program makes. Then the program also returns 1 half the time, so the oracle was indeed allowed to randomize its answer to the query.

While reflective oracles aren’t a complete solution to the naturalized induction problem, they are a useful model that lets us define naturalized algorithms and prove interesting properties about them. In particular, it is possible to define reflective variants of causal decision theory, Solomonoff induction, and AIXI.

Papers

  • Reflective oracles as a Foundation for Classical Game Theory, Benja Fallenstein, Jessica Taylor, and Paul Christiano. This paper formalizes reflective oracles, shows that they always exist, and defines causal decision theory for agents in a world containing reflective oracles. When multiple agents use this reflective version of CDT, they will play a Nash equilibrium. In fact, there is a one-to-one correspondence between reflective oracles and Nash equilibria.

  • Reflective variants of Solomonoff Induction and AIXI, Benja Fallenstein, Nate Soares, and Jessica Taylor. This paper uses reflective oracles to define reflective variants of Solomonoff induction and AIXI. A reflective Solomonoff inductor will make predictions consistent with the fact that its output affects its environment, and a reflective AIXI will take actions based on predictions consistent with AIXI being part of the environment.

Development of reflective oracles on the forum

Although you only need to read the papers to know about the current state of research in reflective oracles, forum posts on the subject show how the research developed.

  • Topological truth predicates: Towards a model of perfect Bayesian agents, Benja Fallenstein. This post introduces the setting where agents can predict each other using reflective truth predicates, and shows that a reflective truth predicate exists. This is similar to reflective oracles, but using logic instead of Turing machines. Paul commented on this post with a suggestion to use Turing machines instead of logic.

  • Oracle machines instead of topological truth predicates, Benja Fallenstein. This version of the reflective oracles formalism is similar to the one that is currently used, but is different in that it specifies that the oracle returns 0 for any program that does not halt for every oracle. The post shows that these reflective oracles always exist.

  • Simplicity priors with reflective oracles, Benja Fallenstein. It is possible to formalize a variant of Solomonoff induction using the reflective oracles from the previous post. There are some complications here that require “protecting” queries.

  • Multibit reflective oracles, Benja Fallenstein. This is a new version of reflective oracles that (a) extends the formalism to programs outputting a stream of bits (not just a single bit) and (b) changes how the oracles behave on non-halting machines. Instead of answering 0 in response to a query about a machine that does not halt for all oracles, a reflective oracle will now have somewhat arbitrary behavior for programs that do not halt (depending on the probability that the program does not halt). Formalizing Solomnoff induction is significantly simpler with these new oracles.

  • Single-bit reflective oracles are enough, Benja Fallenstein. It turns out that we can use single-bit reflective oracles to get the same functionality as multi-bit reflective oracles by asking them for conditional probabilities given previous bits.

Exploratory applications of reflective oracles

  • Oracle machines for automatic philosophy, Nisan Stiennon (writing up Paul Christiano’s idea). We can use reflective oracles to define a method to automate philosophy by setting up a thought experiment where a person can ask themselves questions and get responses back immediately, in a recursive fashion. This is similar to an earlier approach that has been dubbed “Paul-boxing”, but seems to have some nicer properties. See Paul’s comment for more of his posts on the subject.

  • UDT in the land of probabilistic oracles, Jessica Taylor. This is a first step towards trying to define a UDT variant using reflective oracles. However, the formalism in this post is unsatisfactory because when making a decision, an agent will only act as if it its decision determines the actions of exact copies of itself.



NEW LINKS

NEW POSTS

NEW DISCUSSION POSTS

RECENT COMMENTS

Unfortunately, it's not just
by Vadim Kosoy on Catastrophe Mitigation Using DRL | 0 likes

>We can solve the problem in
by Wei Dai on The Happy Dance Problem | 1 like

Maybe it's just my browser,
by Gordon Worley III on Catastrophe Mitigation Using DRL | 2 likes

At present, I think the main
by Abram Demski on Looking for Recommendations RE UDT vs. bounded com... | 0 likes

In the first round I'm
by Paul Christiano on Funding opportunity for AI alignment research | 0 likes

Fine with it being shared
by Paul Christiano on Funding opportunity for AI alignment research | 0 likes

I think the point I was
by Abram Demski on Predictable Exploration | 0 likes

(also x-posted from
by Sören Mindermann on The Three Levels of Goodhart's Curse | 0 likes

(x-posted from Arbital ==>
by Sören Mindermann on The Three Levels of Goodhart's Curse | 0 likes

>If the other players can see
by Stuart Armstrong on Predictable Exploration | 0 likes

Thinking about this more, I
by Abram Demski on Predictable Exploration | 0 likes

> So I wound up with
by Abram Demski on Predictable Exploration | 0 likes

Hm, I got the same result
by Alex Appel on Predictable Exploration | 1 like

Paul - how widely do you want
by David Krueger on Funding opportunity for AI alignment research | 0 likes

I agree, my intuition is that
by Abram Demski on Smoking Lesion Steelman III: Revenge of the Tickle... | 0 likes

RSS

Privacy & Terms