Intelligent Agent Foundations Forumsign up / log in
A Loophole for Self-Applicative Soundness
discussion post by Alex Appel 71 days ago | Abram Demski likes this | 4 comments

See comments. This is a rediscovery of a result from the 1980’s that allow concluding \(\phi\) from \(\Box_{n}(\phi)\) via a \(\mathcal{O}(n)\)-length proof, and even the statement that the theory has no disproof of length \(n\) or less has a single \(\mathcal{O}(n)\)-length proof.

This result is just an idea developed very recently, and I’d put ~2:1 odds on it having a fatal flaw, but it looks extremely promising if it works out. EDIT: It works. None of the proof theory checks have been done yet, but it does causes both Lob’s theorem, and Critch’s Bounded Parametric Lob result to fail.

So, to begin with, if a theory thinks it is sound, then it is inconsistent. Proof by Lob’s theorem.


Well that didn’t work.

What if we give the theory a soundness schema over any proof which is of bounded length? Maybe the “there exists a proof” in the standard provability predicate is causing problems.

Well then Critch’s Bounded Parametric Lob comes in to ruin our day. The entire proof will be reproduced below.

Let \(g(k)\), \(f(k)\), and \(h(k)\) be such that \(f(k)\ge g(k)+h(k)+\log(k)\), \(\mathcal{E}g(k)<h(k)\), and \(\log(f(k))<g(k)\), asymptotically.

As a specific example, this can be done by \(g(k)=k\), \(h(k)=k^{2}\), and \(f(k)=k^{3}\).

If it takes a constant number of steps to derive a specific proof regardless of \(k\), the number on it will be suppressed for readability. Also, technically, the original proof has \(\mathcal{O}\log(k)\) instead of \(\log(k)\), but this change doesn’t alter much.

\[G(\lceil\psi\rceil,k):=\Box_{g(k)}\psi(k)\to\bot\] \[\vdash\forall k:\psi(k)\leftrightarrow G(\lceil\psi\rceil,k)\] (Parametric Diagonal Lemma) \[\vdash\Box\forall k:\psi(k)\leftrightarrow G(\lceil\psi\rceil,k)\] (Bounded Necessitation) \[\vdash\forall k:\Box_{\log(k)}(\psi(k)\to G(\lceil\psi\rceil,k))\] (Quantifier Distribution) \[\vdash\forall k,a:\Box_{a}\psi(k)\to\Box_{a+\log(k)}G(\lceil\psi\rceil,k)\] (Implication Distribution) \[\vdash\forall k,a,b:\Box_{a}\psi(k)\to(\Box_{b}\Box_{g(k)}\psi(k)\to\Box_{a+b+\log(k)}\bot)\] (Implication Distribution) Now specialize to a=g(k), b=h(k). Also, \(f(k)\ge g(k)+h(k)+\log(k)\) for sufficiently large k above \(k_{1}\). \[\vdash\forall k>k_{1}:\Box_{g(k)}\psi(k)\to(\Box_{h(k)}\Box_{g(k)}\psi(k)\to\Box_{f(k)}\bot)\] \[\vdash\forall k,a:\Box_{a}\psi(k)\to\Box_{\mathcal{E}a}\Box_{a}\psi(k)\] (Bounded Inner Necessitation) Now, Specializing to a=g(k), we get \[\vdash\forall k:\Box_{g(k)}\psi(k)\to\Box_{\mathcal{E}g(k)}\Box_{g(k)}\psi(k)\] Now, since \(\mathcal{E}g(k)<h(k)\) after some time \(k_{2}\), \[\vdash\forall k>k_{2}:\Box_{g(k)}\psi(k)\to\Box_{f(k)}\bot\] Pick a specific value of k, \(k'\), which is sufficiently large. \[\vdash\Box_{g(k')}\psi(k')\to\Box_{f(k')}\bot\] By the soundness schema, \(\Box_{f(k')}\bot\to\bot\). The length of this proof isn’t constant, because \(f(k')\) might be really big, so then it’d take about \(\log(f(k'))\) characters to write down the single application of the soundness schema. \[\vdash_{\log(f(k'))}\Box_{g(k')}\psi(k')\to\bot\] \[\vdash_{\log(f(k'))}\psi(k')\] (By the definition of \(\psi(k')\)) \[\vdash_{\mathcal{E}\log(f(k'))}\Box_{\log(f(k'))}\psi(k')\] \(\log(f(k))<g(k)\) eventually. Since \(k'\) was previously selected to be sufficiently big, \[\vdash_{\mathcal{E}\log(f(k'))}\Box_{g(k')}\psi(k')\] Now we no longer care about \(\log(f(k'))\)’s size, because \(k'\) has been fixed, so \[\vdash\bot\]

However, not all is lost. If you look carefully at this, you see that the introduction of the soundness scheme lead to a minor proof blowup. Sure, it’s not enough of a blowup to surpass \(g(k)\), so the proof of \(\bot\) still goes through, but this seems like it might be exploitable….

So then the next step is to make sure that \(\Box_{n}\phi\to\phi\) can only ever be proved by a proof of \(n\) steps or more. The agent can conclude its own soundness for bounded proofs, but it may take a while. I’m pretty sure this is doable by having an axiom schema of the form \(\Box_{n}\phi\to\Box_{n-1}\phi\) for each individual \(n\) and \(\phi\) seperately, but I’m not entirely sure of that. EDIT: Doesn’t work, see comments.

Regardless, assume that any proof of \(\Box_{n}\phi\to\phi\) via iterated soundness axioms will always take \(c\) or more steps. EDIT: You don’t need to assume this, there’s an explicit \(\mathcal{O}(n)\) proof of that statement.

Then, if we go through the proof again, something interesting happens. Instead of \(\log(f(k))\) making its way to the length of the proof, we get \(f(k)\) (or more). \(f(k)\ge g(k)\) in order to get an earlier step of the proof to go through, but then the resulting proof of \(\psi(k)\) takes at least \(f(k)\) steps to work, so that proof cannot be used with the definition of \(\psi(k)\) to conclude \(\bot\).

I’d suspect there’s some more subtle argument that can conclude inconsistency, but breaking the proof of Lob’s theorem is promising.

What’s the use of this, though, if it doesn’t shorten proofs?

Well, there’s a lot of \(n\)-length proofs, but the “royal road” of abstractly establishing soundness or consistency would allow concluding the sentence directly without having to embark on an exhaustive search for the original proof.

If you spent a while proving something, and then wrote down the result in your notebook, and forgot about it, then, upon observing the notebook page, you can conclude \(\Box\phi\) and, since you can bound how good you are at math, you can establish an upper bound on the length of the proof, \(\Box_{n}\phi\). This provides a way to establish \(\phi\) without having to reprove the thing from scratch, by just thinking for a while about “Do I trust my own bounded proofs? Yes I do.”

This solves the Notebook Problem in Vingean Reflection.

by Sam Eisenstat 71 days ago | Alex Appel likes this | link

As you say, this isn’t a proof, but it wouldn’t be too surprising if this were consistent. There is some \(k \in \mathbb{N}\) such that \(\square_n \phi \to \phi\) has a proof of length \(n^k\) by a result of Pavel Pudlák (On the length of proofs of finitistic consistency statements in first order theories). Here I’m making the dependence on \(n\) explicit, but not the dependence on \(\phi\). I haven’t looked at it closely, but the proof strategy in Theorems 5.4 and 5.5 suggests that \(k\) will not depend on \(\phi\), as long as we only ask for the weaker property that \(\square_n \phi \to \phi\) will only be provable in length \(n^k\) for sentences \(\phi\) of length at most \(k\).


by Alex Appel 64 days ago | link

I found an improved version by Pavel, that gives a way to construct a proof of \(\phi\) from \(\Box_{n}\phi\) that has a length of \(\mathcal{O}(n)\). The improved version is here.

There are restrictions to this result, though. One is that the C-rule must apply to the logic. This is just the ability to go from \(\exists x:\phi(x)\) to instantiating a \(c\) such that \(\phi(c)\). Pretty much all reasonable theorem provers have this.

The second restriction is that the theory must be finitely axiomatizable. No axiom schemas allowed. Again, this isn’t much of a restriction in practice, because NBG set theory, which proves the consistency of ZFC, is finitely axiomatizable.

The proof strategy is basically as follows. It’s shown that the shortest proof of a statement with quantifier depth n must have a length of \(\Omega(n^2)\), if the maximum quantifier depth in the proof is \(2n\) or greater.

This can be flipped around to conclude that if there’s a length-n proof of \(\phi\), the maximum quantifier depth in the proof can be at most \(\mathcal{O}(\sqrt{n})\).

The second part of the proof involves constructing a bounded-quantifier version of a truth predicate. By Tarski’s undefinability of truth, a full truth predicate cannot be constructed, but it’s possible to exhibit a formula for which it’s provable that \[qd(\overline{\psi})\le n\to(Sat_{n}(\overline{\psi},x)\leftrightarrow\Sigma(Sat_{n},\overline{\psi},x))\] (\(\Sigma\) is the formula laying out Tarski’s conditions for something to be a truth predicate). Also, if \(n\ge\) quantifier depth of \(\psi\), there’s a proof of \[Sat_{n}(\overline{\psi},x)\leftrightarrow\psi[x]\] (\(\psi[x]\) is the sentence \(\psi\) with its free variables substituted for the elements enumerated in the list \(x\)) Also, there’s a proof that \(Sat_{n}\) is preserved under inference rules and logical axioms, as long as everything stays below a quantifier depth of \(n\).

All these proofs can be done in \(\mathcal{O}(n^2)\) lines. One factor of \(n\) comes from the formula abbreviated as \(Sat_{n}(x,y)\) getting longer at a linear rate, and the other factor comes from having to prove \(Sat_{n}\) for each \(n\) seperately as an ingredient for the next proof.

Combining the two parts, the \(\mathcal{O}(\sqrt{n})\) bound on the quantifier depth and the \(\mathcal{O}(n^2)\) bound on how long it takes to prove stuff about the truth predicate, make it take \(\mathcal{O}(n)\) steps to prove all the relevant theorems about a sufficiently large bounded quantifier depth truth predicate, and then you can just go “the statement that we are claiming to have been proved must have \(Sat_{n}\) apply to it, and we’ve proved this is equivalent to the statement itself”

As a further bonus, a single \(\mathcal{O}(n)\)-length proof can establish the consistency of the theory itself for all \(n\)-length proofs.

It seems like a useful project to develop a program that will automatically write a proof of this form, to assess whether abstract unpacking of bounded proofs is usable in practice, but it will require digging into a bunch of finicky details of exactly how to encode a math theory inside itself.


by Alex Appel 71 days ago | link

Caught a flaw with this proposal in the currently stated form, though it is probably patchable.

When unpacking a proof, at some point the sentence \(\Box_{1}\phi\) will be reached as a conclusion, which is a false statement.


by Sam Eisenstat 70 days ago | link

I misunderstood your proposal, but you don’t need to do this work to get what you want. You can just take each sentence \(\square_n \phi \to \phi\) as an axiom, but declare that this axiom takes \(n\) symbols to invoke. This could be done by changing the notion of length of a proof, or by taking axioms \(\psi_{\phi,n} \to (\square_n \phi \to \phi)\) and \(\psi_{\phi,n}\) with \(\psi_{\phi,n}\) very long.






There should be a chat icon
by Alex Mennen on Meta: IAFF vs LessWrong | 0 likes

Apparently "You must be
by Jessica Taylor on Meta: IAFF vs LessWrong | 1 like

There is a replacement for
by Alex Mennen on Meta: IAFF vs LessWrong | 1 like

Regarding the physical
by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

I think that we should expect
by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

I think I understand your
by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

This seems like a hack. The
by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

After thinking some more,
by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Yes, I think that we're
by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

My intuition is that it must
by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

To first approximation, a
by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Actually, I *am* including
by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Yeah, when I went back and
by Alex Appel on Optimal and Causal Counterfactual Worlds | 0 likes

> Well, we could give up on
by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

> For another thing, consider
by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes


Privacy & Terms