Intelligent Agent Foundations Forumsign up / log in
Resource-Limited Reflective Oracles
post by Alex Appel 47 days ago | Abram Demski and Jessica Taylor like this | discuss

Reflective oracles accurately answer questions about what arbitrary halting probabilistic oracle machines output. It is possible to make a variant of a reflective oracle that accurately answers questions about what sufficiently short-running Turing machines with access to the same oracle output.

These oracles are explicitly computable, fairly powerful in a computational complexity sense, and can probably be used to make a reflective version of AIXItl, (which will not happen in this particular post)

The theorems in this post are not exhaustive at all, just the stuff I was able to figure out in ~2 hours, so there is almost certainly low-hanging fruit in figuring out how these interact with the standard arsenal of theorems in computational complexity theory.


Let’s say we want to make an oracle that is reflective over all TM’s that run in polynomial time. How do we do that?

The obvious approach is to let a query consist of a tuple of a turing machine \(T\), a number \(n\), a bitstring \(x\), and a probability \(p\). If you build the tree of all possible execution paths of the turing machine that are poly(\(n\)) steps long(for some polynomial), and come up with the probability of every possible output (including a probability for timing out) the oracle must return 0 if the true probability of \(x\) is less than \(p\), 1 if the true probability of \(x\) is greater than \(p\), and randomize otherwise.

This obvious approach doesn’t work, because it’s possible to find out whether any halting turing machine, no matter how long-running, returns \(x\), by recursive oracle calls. Let \(A\) be some inputless turing machine that runs for a very long number of turns, like the \(BB(6)\) machine. Let \(Em(A)\) be the turing machine that takes a number \(n\) as input, simulates \(A\) for \(n\) turns, and if \(A\) halts in \(n\) turns, outputs what \(A\) would. If \(A\) doesn’t halt in \(n\) turns, this algorithm queries the oracle about whether \(Em(A)(2^{n})\) outputs \(x\), and if so, outputs \(x\).

It is possible to emulate \(n\) steps of a turing machine in \(\mathcal{O}(n\log n)\) time, and it is possible to write down \(2^{n}\) in \(\mathcal{O}(n)\) time, and it is possible to write down \(x\) in constant time, so \(Em(A)\) runs in poly(\(n\)) time. Therefore, the oracle should be accurate about what this turing machine outputs for any given \(n\). However, if you look at what this turing machine is doing, it is giving itself an exponentially longer time bound on each oracle call, so it will eventually be able to fully simulate whatever turing machine it wants.

The obvious patch to this is ensuring that, given some monotonically increasing function \(f\) for the oracle (linear, polynomial, exponential), when calling the oracle again, \(n_{1}\le n_{0}\). You can’t have a chain where the oracles keep recursively calling themselves and giving themselves more computational resources, so all chains of recursive oracle calls have to be bounded above by \(f(n)\).

So, the complexity class \(\mathcal{F}^{FRO}\) is the set of all problems solvable in \(f(n)\) time, with access to an \(f(n)\)-reflective oracle.

A \(C\)-reflective oracle (where C is some constant) is an oracle that accurately answers any query posed about a probabilistic turing machine that halts with probability 1 in \(C\) time or less, and has access to an \(C\)-reflective oracle.

Pretty much, once \(n\) is put into the starting algorithm, every algorithm it calls the oracle upon must have runtime and oracle query length bounded by \(f(n)\), and all of the algorithms that one calls, and so on. This prevents an unboundedly long chain of different oracle calls to simulate arbitrary turing machines, because eventually you’ll get either a finite set of turing machines all calling each other, and an equilibrium exists (there are only finitely many oracle queries of length f(n)), or the condition of bounded runtime and oracle query length will be violated, and the oracle can return anything at that point.

Immediate Theorems

Theorem 1: There is some constant c such that, for any function \(f\), \(\mathcal{FSPACE}\subseteq\mathcal{cF}^{cFRO}\)

Proof Sketch: We’ll be having a sequence of oracle machines, where each one simulates a finite number of steps, and then passes the intermediate state to the next oracle machine, via the oracle query. The overall result is something like how, if an algorithm can be carried out on a sheet of paper, it can be split up among a long chain of people where everyone does some work on the paper, and passes it to the next person.

Proof: Let \(A_{k}\) be the turing machine \(A\), advanced by \(k\) time steps. Consider the turing machine \(TM(A_{0})_{k}\) that emulates \(A\) for \(k\) steps, and then queries the oracle about \(TM(A_{k})_{c}\). Because \(A\) is in \(\mathcal{FSPACE}\), all the intermediate states \(A_{c}\) that are recursively called have length bounded above by \(\mathcal{O}(f)\), and adding the code to simulate it ahead \(c\) steps is an \(\mathcal{O}(1)\) increase in length. Furthermore, emulating \(k\) steps of a turing machine can be done in \(\mathcal{O}(k\log k)\) time, which is a constant.

Therefore, since it takes constant time to emulate the turing machine, and takes \(cf(n)\) time to write down the query to the oracle, none of the oracle queries or turing machines run over time, and once an answer is found, it gets passed back along the chain by the condition where the oracle must be accurate about any algorithm where the runtime stays under the time bound.

Theorem 2: There is some constant \(c\) such that for any function \(f\) that is computable in \(\mathcal{O}(f)\) time, \(\mathcal{NF}\subseteq{cF}^{cFRO}\).

Any nondeterministic turing machine \(A\) that implements a function in \(\mathcal{NF}\) has a corresponding randomized turing machine \(A'\) that generates a random string of length \(\mathcal{O}(f)\), and uses that to decide which path to go down. Now consider the turing machine that makes (and returns the answer to) the single oracle query \((A',1,2^{-cf(n)})\). Pretty much, this asks whether there is any random string that leads to \(A'\) accepting, and it perfectly correlates to whether \(A\) accepts or rejects. Writing down \(A'\) takes constant time, and writing down \(cf(n)\) takes \(\mathcal{O}(f(n))\) time, and computing \(f(n)\) in the first place takes \(\mathcal{O}(f(n))\) time by assumption.


This is super-preliminary, there’s certainly more computational complexity results to be found by pushing in this direction. The ability to recursively call algorithms with access to the same oracle is powerful, being able to encompass both space and nondeterminism, but is still a step down from true omniscience. It’s pretty likely that a variant of AIXItl can be constructed with this tool.





Note: I currently think that
by Jessica Taylor on Predicting HCH using expert advice | 0 likes

Counterfactual mugging
by Jessica Taylor on Doubts about Updatelessness | 0 likes

What do you mean by "in full
by David Krueger on Doubts about Updatelessness | 0 likes

It seems relatively plausible
by Paul Christiano on Maximally efficient agents will probably have an a... | 1 like

I think that in that case,
by Alex Appel on Smoking Lesion Steelman | 1 like

Two minor comments. First,
by Sam Eisenstat on No Constant Distribution Can be a Logical Inductor | 1 like

A: While that is a really
by Alex Appel on Musings on Exploration | 0 likes

> The true reason to do
by Jessica Taylor on Musings on Exploration | 0 likes

A few comments. Traps are
by Vadim Kosoy on Musings on Exploration | 1 like

I'm not convinced exploration
by Abram Demski on Musings on Exploration | 0 likes

Update: This isn't really an
by Alex Appel on A Difficulty With Density-Zero Exploration | 0 likes

If you drop the
by Alex Appel on Distributed Cooperation | 1 like

Cool! I'm happy to see this
by Abram Demski on Distributed Cooperation | 0 likes

Caveat: The version of EDT
by 258 on In memoryless Cartesian environments, every UDT po... | 2 likes

[Delegative Reinforcement
by Vadim Kosoy on Stable Pointers to Value II: Environmental Goals | 1 like


Privacy & Terms