Intelligent Agent Foundations Forum
Waterfall Truth Predicates
post by Abram Demski 1050 days ago | Benja Fallenstein, Jessica Taylor, Patrick LaVictoire and Scott Garrabrant like this | 2 comments

The waterfall-based approaches to the Löbian obstacle offer a way around the finitely terminating chains of trust that we get by adding towers of soundness schemas to \(PA\). They create a kind of illusion of self-trust by way of a non-well-founded chain of trust.

Another familiar situation where we can construct arbitrarily high towers, but not a single self-referential system, is that of truth predicates. Tarski’s undefinability theorem blocks the existence of a full truth predicate within the same language as the one it describes. Perhaps a similar waterfall construction can be applied to get an infinite descending chain of languages.

Extend the language of \(PA\) with a family of truth predicates \(Tr_n\). A Tarski-style approach would assert a T-schema \(Tr_n(\ulcorner \phi \urcorner) \leftrightarrow \phi\) for \(\phi\) which contain truth predicates indexed strictly lower than \(n\). (\(\ulcorner \phi \urcorner\) is the Gödel number of \(\phi\).) Here, we wish to flip this, and assert a T-schema which allows strictly higher \(n\).
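
For concreteness, the two schemas can be written side by side with their side conditions spelled out (this only restates the paragraph above; it is not an additional axiom):

\[\text{Tarski-style:}\quad Tr_n(\ulcorner \phi \urcorner) \leftrightarrow \phi \quad \text{for } \phi \text{ containing only } Tr_m \text{ with } m < n\]

\[\text{Flipped:}\quad Tr_n(\ulcorner \phi \urcorner) \leftrightarrow \phi \quad \text{for } \phi \text{ containing only } Tr_m \text{ with } m > n\]

In the first, the hierarchy bottoms out at the truth-predicate-free language. In the second, every level refers only to levels above it, so there is no base level.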

This brings to mind Yablo’s Paradox. A contradiction can likely be worked out in a way resembling that, but instead I’ll note that this theory implies the naive soundness waterfall in which we construct a sequence of theories \(T_{n} = T_{n+1} + Sound(T_{n+1})\). This is because we can use the truth predicate \(Tr_n\) to carry out a proof of soundness for the axioms and inference rules, with the exception of instances of the T-schema involving \(Tr_{m \leq n}\). This gives us the naive soundness waterfall, which we know to be inconsistent. (Note that I have not checked this in detail, however.)

My idea for fixing the T-schema, then, is to introduce the same \(\psi(n)\) predicate which asserts that \(n\) is not the Gödel number of a proof of contradiction in \(ZFC\). We make the new schema:

  • \(\psi(n) \rightarrow \big[ Tr_n(\ulcorner \phi \urcorner) \leftrightarrow \phi \big]\), where \(\phi\) contains only \(Tr_{m>n}\).

Because we can prove any particular \(\psi(n)\), we can still apply the schema in specific cases. It seems likely that we can still carry out soundness arguments, as well, constructing the consistent version of the soundness waterfall. If so, the theory ends up being unsound as a result.
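
As a sketch of what "applying the schema in specific cases" looks like, fix a particular index \(k\) (the letter \(k\) is just an illustrative choice here) and a \(\phi\) meeting the side condition. Then:

\[\vdash \psi(\overline{k}), \qquad \vdash \psi(\overline{k}) \rightarrow \big[ Tr_k(\ulcorner \phi \urcorner) \leftrightarrow \phi \big] \quad\Longrightarrow\quad \vdash Tr_k(\ulcorner \phi \urcorner) \leftrightarrow \phi\]

What is not available is the guard discharged for all indices at once: \(\forall n.\, \psi(n)\) is the statement that \(ZFC\) is consistent, which the theory is not assumed to prove.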

Here’s a proof that the theory is unsound. Consider again Yablo’s paradox. We construct an infinite sequence of statements, \(A_0, A_1, \ldots\), each of which asserts that the subsequent statements in the sequence are all false. Specifically: \[A_n \leftrightarrow \forall_{m>n}: \neg Tr_n(\ulcorner A_m \urcorner)\]

Considering any particular \(A_n\), we see that it implies \(\neg Tr_n(\ulcorner A_{n+1} \urcorner)\), and also \(\forall_{m>n+1}: \neg Tr_n(\ulcorner A_m \urcorner)\). By an application of the T-schema, however, these two statements are negations of each other. Therefore, the theory proves \(\neg A_n\). The choice of \(n\) was arbitrary, so the system proves every sentence in the sequence false. From outside, however, we can see that this implies that each of them is true: if every later sentence is false, then what \(A_n\) asserts is correct.

If the system proposed here can also carry out that reasoning, then it will be inconsistent.
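
The step in which the two consequences of \(A_n\) turn out to be negations of each other can be unpacked as follows. This is only a sketch, and it assumes that the T-schema (with its \(\psi\) guards discharged) can be applied uniformly in \(m\), whereas the schema as stated provides it only one instance at a time:

\[\begin{aligned}
A_n &\rightarrow \neg Tr_n(\ulcorner A_{n+1} \urcorner) && \text{(instance } m = n+1\text{)} \\
&\rightarrow \neg A_{n+1} && \text{(T-schema for } Tr_n\text{)} \\
A_n &\rightarrow \forall_{m>n+1}: \neg Tr_n(\ulcorner A_m \urcorner) && \text{(restricting the quantifier)} \\
&\rightarrow \forall_{m>n+1}: \neg Tr_{n+1}(\ulcorner A_m \urcorner) && \text{(since } Tr_n(\ulcorner A_m \urcorner) \leftrightarrow A_m \leftrightarrow Tr_{n+1}(\ulcorner A_m \urcorner)\text{)} \\
&\rightarrow A_{n+1} && \text{(definition of } A_{n+1}\text{)}
\end{aligned}\]

So \(A_n\) yields both \(A_{n+1}\) and \(\neg A_{n+1}\), which gives \(\neg A_n\). The fourth line is the one that needs the T-schema uniformly in \(m\), so whether the system itself can carry out the argument depends on how much uniformity it has.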

Even if it is consistent, it’s still unsound, so it’s unlikely to be very useful. It would be interesting if truth-predicate versions of the other solutions to the Löbian obstacle could be constructed. (My intuition is that this won’t be possible for the consistency waterfall, but is likely possible for model polymorphism.)

by Benja Fallenstein 1049 days ago | Patrick LaVictoire likes this

I would suggest changing this system by defining \(\psi(n)\) to mean that no \(m\le n\) is the Gödel number of a proof of an inconsistency in ZFC (instead of just asserting that \(n\) isn’t). The purpose of this is to make it so that if ZFC were inconsistent, then we only end up talking about a finite number of levels of truth predicate. More specifically, I’d define \(T_n\) to be PA plus the axiom schema

\[\forall m\ge n.\;\psi(m)\to\forall x.\;\mathrm{Tr}_m(\ulcorner\varphi(\overline x)\urcorner)\leftrightarrow\varphi(x).\]

Then, it seems that Jacob Hilton’s proof that the waterfalls are consistent goes through for this waterfall:

Work in ZFC and assume that ZFC is inconsistent. Let \(n\) be the lowest Gödel number of a proof of an inconsistency. Let \(M\) be the following model of our language: start with the standard model of PA; it remains to give interpretations of the truth predicates. If \(m\ge n\), then \(\mathrm{Tr}_m(k)\) is false for all \(k\). If \(m<n\), then \(\mathrm{Tr}_m(k)\) is true iff \(k\) is the Gödel number of a true formula involving only \(\mathrm{Tr}_{m'}\) for \(m'>m\). Then it’s clear that \(T_0\) is sound on \(M\), and hence so is every \(T_m\) (since \(T_0\) is the strongest of the systems); therefore they are all consistent.
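
Written out in one line, the interpretation of the truth predicates in \(M\) is roughly the following (a restatement of the construction above, with \(n\) the least Gödel number of a proof of an inconsistency):

\[M \models \mathrm{Tr}_m(k) \iff m < n \text{ and } k = \ulcorner\varphi\urcorner \text{ for some } \varphi \text{ true in } M \text{ involving only } \mathrm{Tr}_{m'} \text{ with } m' > m.\]

The definition is well-founded because only finitely many levels are nontrivial: \(\mathrm{Tr}_m\) is empty for \(m \ge n\), so truth for \(\mathrm{Tr}_{n-1}\) is determined, then for \(\mathrm{Tr}_{n-2}\), and so on down to \(\mathrm{Tr}_0\). For \(m \ge n\), \(\psi(m)\) is false in \(M\), so the guarded instances of the schema hold vacuously; for \(m < n\), \(\psi(m)\) is true and the biconditional holds by the definition of \(\mathrm{Tr}_m\).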

Thus, we have proven in ZFC that if ZFC is inconsistent, then \(T_0\) is consistent; or equivalently, that if \(T_0\) is inconsistent, then ZFC is consistent. Stepping out of ZFC, we can see that if \(T_0\) is inconsistent, then ZFC proves this, and therefore in this case ZFC proves its own consistency, implying that it is inconsistent. Hence, if ZFC is consistent, then so is \(T_0\).

(Moreover, we can formalize this reasoning in ZFC. Hence, we can prove in ZFC (i) that if ZFC is inconsistent, then \(T_0\) is consistent, and (ii) that if ZFC is consistent, then \(T_0\) is consistent. By the law of the excluded middle, ZFC proves that \(T_0\) is consistent.)
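
The logical skeleton of the argument, writing \(\mathrm{Con}(\cdot)\) as shorthand for the arithmetized consistency statement (notation introduced here only for this summary), is roughly:

\[\begin{aligned}
&\text{(i)} && \mathrm{ZFC} \vdash \neg\mathrm{Con}(\mathrm{ZFC}) \rightarrow \mathrm{Con}(T_0) \\
&\text{(ii)} && \mathrm{ZFC} \vdash \mathrm{Con}(\mathrm{ZFC}) \rightarrow \mathrm{Con}(T_0) \\
&\text{hence} && \mathrm{ZFC} \vdash \mathrm{Con}(T_0)
\end{aligned}\]

Step (ii) is the formalized version of the outside argument: if \(T_0\) is inconsistent, then ZFC can verify the finite proof of a contradiction and so proves \(\neg\mathrm{Con}(T_0)\); by the contrapositive of (i) it then proves \(\mathrm{Con}(\mathrm{ZFC})\), and by Gödel's second incompleteness theorem it is inconsistent.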


by Benja Fallenstein 1049 days ago

We should be more careful, though, about what we mean by saying that \(\varphi(x)\) only depends on \(\mathrm{Tr}_{m}\) for \(m>n\), since this cannot be a purely syntactic criterion if we allow quantification over the subscript (as I did here). I’m pretty sure that something can be worked out, but I’ll leave it for the moment.





