Intelligent Agent Foundations Forumsign up / log in
by Patrick LaVictoire 608 days ago | link | parent

Sam Eisenstat produced the following counterexample to a stronger version of the revived trolljecture (one using soundness rather than consistency in the escalated theories):

Take a statement \(X\) in the language of PA such that neither \(X\) nor \(\neg X\) has a short proof (less than length \(L\)) in PA+Sound(PA); \(X\) can be provably true, provably false, or undecidable in that system.

Define \(U\) to equal 10 if \(X\wedge A()=a\), 0 if \(\neg X\wedge A()=a\), and 5 if \(A()\neq a\).

Define \(A\) to equal \(a\) if there is a short proof in PA (less than length \(L\)) that \(A()=a\to U()=10\), and \(b\) otherwise. Clearly \(A()=b\) and \(U()=5\).

Now I claim that in the formal system PA+Sound(PA)+\((A()=a)\), there is a very short proof of \(U()=10\). And this is the case whether \(X\) is true, false, or undecidable!

In order to prove \(U()=0\), take the axiom \(A()=a\). It follows from the definition of \(A\) that there is a short proof in PA (less than length \(L\)) that \(A()=a\to U()=10\). And it follows from the soundness of PA that in fact \(A()=a\to U()=10\), and thus \(U()=10\).

On the other hand, we cannot quickly prove \(U()=0\), for if we could, then we could prove \(\neg X\). Things get tricky because we’re using the extra (inconsistent) axiom \(A()=a\), but intuitively, if \(X\) is some arithmetical assertion that has nothing to do with the decision problem, then there shouldn’t be any short proof of \(\neg X\) in this system either, and so the shortest proof of \(U()=0\) should be at least comparable to length \(L\).





NEW LINKS

NEW POSTS

NEW DISCUSSION POSTS

RECENT COMMENTS

The AI defers to anything
by Paul Christiano on Corrigibility thoughts II: the robot operator | 0 likes

Thus anything that can
by Stuart Armstrong on Corrigibility thoughts II: the robot operator | 0 likes

Ah, thanks! That seems more
by Stuart Armstrong on Loebian cooperation in the tiling agents problem | 0 likes

It doesn't mean computation
by Vladimir Slepnev on Loebian cooperation in the tiling agents problem | 1 like

I'm not sure this would work,
by Stuart Armstrong on Loebian cooperation in the tiling agents problem | 0 likes

>How can the short term
by Stuart Armstrong on Humans are not agents: short vs long term | 0 likes

I expect a workable approach
by Paul Christiano on Corrigibility thoughts II: the robot operator | 0 likes

Not sure what your argument
by Stuart Armstrong on Corrigibility thoughts II: the robot operator | 0 likes

It is ‘a preference for
by Stuart Armstrong on Humans are not agents: short vs long term | 0 likes

Note that we don't need to
by Paul Christiano on ALBA requires incremental design of good long-term... | 0 likes

If I want my boat to travel
by Paul Christiano on Corrigibility thoughts II: the robot operator | 0 likes

I don't think it's much like
by Abram Demski on An Approach to Logically Updateless Decisions | 0 likes

Yeah, I like tail dependence.
by Sam Eisenstat on An Approach to Logically Updateless Decisions | 0 likes

This is basically the
by Paul Christiano on Cooperative Oracles: Stratified Pareto Optima and ... | 1 like

I think AsDT has a limited
by Abram Demski on An Approach to Logically Updateless Decisions | 2 likes

RSS

Privacy & Terms