Intelligent Agent Foundations Forum
Second Failure of Inductive Learning: The Entangled Benford Test
post by Scott Garrabrant 1271 days ago | Nate Soares and Patrick LaVictoire like this

This is a follow-up to the concrete failure of the Solomonoff Induction inspired approach to inductive learning with a delay in feedback. This second failure is one that we do not yet have an algorithm to deal with.

Let \(f:\mathbb{N}\rightarrow \{0,1\}\) be a function such that (for large \(n\)) \(f(n)\) can be computed in time \(2^n\), but there is no algorithm that can predict \(f(n)\) in time less than \(2^n\) consistently better than a coin. Let \(E(n)=f(\lfloor\sqrt{n}\rfloor)\).
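To make the setup concrete, here is a minimal sketch of the environment. The real \(f\) is assumed to be uncomputable in less than \(2^n\) time and unpredictable any faster; the hash-based stand-in below is my own illustrative substitute, not part of the construction, and exists only to show the block structure of \(E\).

```python
import hashlib
import math

def f(n: int) -> int:
    # Stand-in for the hard function f.  The real f is assumed to take
    # time 2^n to compute and to be unpredictable any faster than that;
    # this hash-based fake is purely to illustrate the structure.
    return hashlib.sha256(str(n).encode()).digest()[0] & 1

def E(n: int) -> int:
    # The environment E(n) = f(floor(sqrt(n))): consecutive inputs n
    # share the same floor(sqrt(n)), so the bits of E arrive in blocks
    # of 2k + 1 identical, individually unpredictable values.
    return f(math.isqrt(n))
```

The point of the composition with \(\lfloor\sqrt{n}\rfloor\) is exactly this block structure: each unpredictable bit \(f(k)\) is repeated on all inputs \(n\) with \(k^2 \le n < (k+1)^2\).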

Now, consider an algorithm which tries to inductively learn to assign probabilities to \(E(n)\) in polynomial time. Ideally, such an algorithm will converge to assigning probability \(1/2\).

What I mean by this is an algorithm which considers Turing machines that run in a given amount of time (say \(O(n^2)\)), somehow has them compete with each other, and uses the results of this competition to output probabilities, favoring the algorithms that did well.

This is a fuzzy category, but examples of such algorithms include the Solomonoff Induction Inspired Approach, the Benford Learner, as well as several other algorithms that have been discussed at MIRI, but have not made it to the forum yet.

I claim that all the algorithms we have considered so far fail on this problem. The reason they all fail is that there is a single algorithm, \(G\), which runs quickly and does unboundedly better than the constant \(1/2\) algorithm.

\(G(n)\) outputs 2/3 if \(\lfloor\sqrt n\rfloor\) is of the form \(3\uparrow^m 3\), and 1/2 otherwise.
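A sketch of \(G\), assuming Knuth up-arrow notation with \(m \ge 1\). Only the first two values of \(3\uparrow^m 3\) could ever fit in a machine word or a bignum of realistic size, so the sketch hard-codes them:

```python
import math
from fractions import Fraction

# The only values 3↑^m 3 (Knuth up-arrow, m >= 1) that could ever occur
# as floor(sqrt(n)) in practice: 3↑3 = 27 and 3↑↑3 = 3^27.  The next,
# 3↑↑↑3, is a power tower of 3s of height 3^27, far beyond any input.
UP_ARROW_THREES = {27, 3 ** 27}

def G(n: int) -> Fraction:
    # The troublemaking predictor from the post: output 2/3 whenever
    # floor(sqrt(n)) has the form 3↑^m 3, and 1/2 otherwise.
    return Fraction(2, 3) if math.isqrt(n) in UP_ARROW_THREES else Fraction(1, 2)
```

Note that \(G\) runs in polynomial time: computing \(\lfloor\sqrt n\rfloor\) and testing membership against the (extremely sparse) up-arrow values is cheap.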

When comparing \(G\) to the constant \(1/2\) algorithm, you only need to consider their scores on the places where they make different predictions. Note that when \(\lfloor\sqrt n\rfloor\) is of the form \(3\uparrow^m 3\), \(G\) will output \(2/3\) on a large number of predictions in a row, all of which will turn out to be the same bit. If that bit is shown to be a 1, there are enough predictions to dwarf any mistakes \(G\) has made by saying \(2/3\) in the past, causing \(G\) to appear to do much better than the constant \(1/2\) algorithm (until \(G\) later outputs \(2/3\) on a bit which turns out to be a 0).
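The arithmetic behind "enough predictions to dwarf any mistakes" can be sketched as follows (the function name is mine, not the post's): on the block of inputs with \(\lfloor\sqrt n\rfloor = k\), there are \(2k+1\) predictions, all revealing the same bit, so \(G\)'s log-score advantage or deficit versus the constant \(1/2\) scales linearly in \(k\).

```python
import math

def block_score_gap(k: int, bit: int) -> float:
    # Cumulative log-score advantage of G over the constant-1/2
    # predictor on the block of inputs n with floor(sqrt(n)) == k,
    # assuming G predicts 2/3 there.  The block has
    # (k+1)^2 - k^2 = 2k + 1 members, all carrying the same bit f(k).
    block_len = 2 * k + 1
    if bit == 1:
        per_step = math.log(2 / 3) - math.log(1 / 2)   # +log(4/3) per step
    else:
        per_step = math.log(1 / 3) - math.log(1 / 2)   # -log(3/2) per step
    return block_len * per_step
```

Since consecutive up-arrow values grow so fast, a single winning block at \(k = 3\uparrow\uparrow 3\) contributes roughly \(2 \cdot 3^{27} \log(4/3)\), which swamps every loss accumulated at smaller up-arrow values.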

\(G\) does not consistently do better than the constant \(1/2\) algorithm, but it does do arbitrarily better infinitely often, which prevents the constant \(1/2\) algorithm from taking over in the inductive learners.

Note that this is not an impossibility result. I am just pointing out a concrete way that the algorithms we have now are failing.

I suspect that the way we get past this problem will take advantage of the fact that the constant \(1/2\) is special, in that no other algorithm can do consistently better than it. That is, there is no algorithm \(M\) such that the difference in Bayes score between \(M\) and the constant \(1/2\) converges to infinity. (Note that it is important to say that it converges to infinity, not just that it is unbounded. We already have an algorithm (which has not been posted yet) which can find hypotheses for which no other algorithm can cause the difference in Bayes score to be unbounded. The problem is that such hypotheses often do not exist.)

\(G\) does not have this property: an algorithm which does what \(G\) says, except that it says \(1/2\) when the input is of the form \((3\uparrow^m 3)^2\), will do consistently better than \(G\), since the two algorithms make the same predictions except on an infinite sequence of independent fair coins where \(G\) says \(2/3\).
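Why the modification wins consistently: on a fair coin, predicting \(1/2\) has a strictly higher expected log score than predicting \(2/3\), so by the law of large numbers the modified algorithm's Bayes-score advantage over \(G\) grows roughly linearly in the number of such coins, and hence converges to infinity. A quick check of the per-coin gap:

```python
import math

def expected_log_score(p: float) -> float:
    # Expected log score of predicting probability p on a fair coin:
    # (1/2) log p + (1/2) log(1 - p), uniquely maximized at p = 1/2.
    return 0.5 * math.log(p) + 0.5 * math.log(1 - p)

# Per-coin expected advantage of saying 1/2 rather than 2/3:
gap = expected_log_score(0.5) - expected_log_score(2 / 3)
# gap = log(1/2) - (1/2) log(2/9), approximately 0.0589 > 0.
```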

This new failure can also be expressed in the more complicated environment of assigning probabilities to logical sentences. We say that an algorithm passes “The Entangled Benford Test” if the limit, as \(n\) goes to infinity, of the probability assigned at time \(2^n\) to “The first digit of \(3\uparrow^{\lfloor\sqrt n\rfloor}3\) is a 1” is \(\log_{10}(2)\).
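The target value \(\log_{10}(2)\) is the Benford probability of a leading digit 1. As motivation (this empirical check is mine, not part of the test itself): the leading digit of \(3^k\) is 1 exactly when the fractional part of \(k\log_{10}3\) lies in \([0,\log_{10}2)\), and since \(\log_{10}3\) is irrational these fractional parts equidistribute, so the long-run frequency is \(\log_{10}2 \approx 0.301\).

```python
import math

# The leading digit of 3^k is 1 exactly when the fractional part of
# k * log10(3) falls in [0, log10(2)).  Since log10(3) is irrational,
# these fractional parts equidistribute, so the long-run frequency is
# log10(2) ~= 0.301, the Benford value the test asks for.
N = 10_000
hits = sum((k * math.log10(3)) % 1 < math.log10(2) for k in range(1, N + 1))
freq = hits / N
```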

Is there an algorithm which overcomes this failure and satisfies the Entangled Benford Test?


