Intelligent Agent Foundations Forum
Reflective probabilistic logic cannot assign positive probability to its own coherence and an inner reflection principle
post by Jessica Taylor 872 days ago | Kaya Stechly, Benja Fallenstein, Nate Soares, Patrick LaVictoire and Stuart Armstrong like this | 5 comments

Summary: although we can find an assignment of probabilities to logical statements that satisfies an outer reflection principle, we have more difficulty satisfying both the outer reflection principle and an inner reflection principle. This post presents an impossibility result.


Introduction

In probabilistic logic, we construct a \(\mathbb{P}\) that assigns a probability to each logical proposition (in, say, the language of set theory plus a \(\mathbb{P}\) symbol) in a way that satisfies some coherence axioms. Additionally, we can construct a \(\mathbb{P}\) that assigns probability 1 to \(\mathbb{P}\)’s coherence and satisfies an “outer” reflection principle*:

\[\forall \varphi \in L', a \in \mathbb{Q}, b \in \mathbb{Q} : (a < \mathbb{P}(\varphi) < b) \Rightarrow \mathbb{P}(a < \mathbb{P}(\ulcorner \varphi \urcorner) < b) = 1 \]

Using the contrapositive, the outer reflection principle equivalently states

\[\forall \varphi \in L', a \in \mathbb{Q}, b \in \mathbb{Q} : (a \leq \mathbb{P}(\varphi) \leq b) \Leftarrow \mathbb{P}(a \leq \mathbb{P}(\ulcorner \varphi \urcorner) \leq b) > 0 \]
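
One direction of this equivalence can be sketched as follows. The sketch assumes that \(\mathbb{P}\) assigns probability 1 to basic facts about the ordering of the rationals, so that sentences placing \(\mathbb{P}(\ulcorner \varphi \urcorner)\) in disjoint intervals are treated as mutually exclusive; the auxiliary rational \(b'\) is introduced here for illustration. Suppose \(\mathbb{P}(\varphi) < a\) (the case \(\mathbb{P}(\varphi) > b\) is symmetric):

\[\begin{aligned}
&\mathbb{P}(\varphi) < a \\
&\Rightarrow \exists b' \in \mathbb{Q} : -1 < \mathbb{P}(\varphi) < b' \leq a \\
&\Rightarrow \mathbb{P}(-1 < \mathbb{P}(\ulcorner \varphi \urcorner) < b') = 1 \quad \text{(outer reflection principle)} \\
&\Rightarrow \mathbb{P}(a \leq \mathbb{P}(\ulcorner \varphi \urcorner) \leq b) = 0 \quad \text{(coherence; the two intervals are disjoint since } b' \leq a\text{)}
\end{aligned}\]

So whenever \(\mathbb{P}(\varphi)\) lies outside \([a, b]\), the sentence \(a \leq \mathbb{P}(\ulcorner \varphi \urcorner) \leq b\) receives probability 0, which is the contrapositive of the closed-interval form.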

Note that the outer reflection principle is not stated within \(\mathbb{P}\); instead, it is stated in a meta-language that talks about \(\mathbb{P}\). To push reasoning about the system into the system itself, we might want an “inner” reflection principle:

\[ \forall \varphi \in L', a \in \mathbb{Q}, b \in \mathbb{Q} : \mathbb{P}((a < \mathbb{P}(\ulcorner \varphi \urcorner ) < b) \rightarrow \mathbb{P}(\ulcorner a < \mathbb{P}(\ulcorner \varphi \urcorner) < b \urcorner) = 1) = 1\]

There exists a coherent \(\mathbb{P}\) assigning probability 1 to its own coherence and satisfying this inner reflection principle (but not the outer reflection principle)**. However, it has been previously proven that no coherent \(\mathbb{P}\) satisfies both the outer reflection principle and this inner reflection principle.

Impossibility result

In this post, we will show a more general result: no coherent \(\mathbb{P}\) can satisfy the outer reflection principle while assigning positive probability to its own coherence and the inner reflection principle. That is, there is no coherent \(\mathbb{P}\) satisfying the outer reflection principle and

\[\exists \epsilon > 0 : \forall \varphi \in L', a \in \mathbb{Q}, b \in \mathbb{Q} : \mathbb{P}(\text{Coherent}(\mathbb{P}) \wedge ((a < \mathbb{P}(\ulcorner \varphi \urcorner ) < b) \Rightarrow \mathbb{P}(\ulcorner a < \mathbb{P}(\ulcorner \varphi \urcorner) < b \urcorner) = 1)) > \epsilon\]

Note that this version of the inner reflection principle plus self-coherence is weaker than

\[\mathbb{P}(\text{Coherent}(\mathbb{P}) \wedge \forall \varphi \in L', a \in \mathbb{Q}, b \in \mathbb{Q} : ((a < \mathbb{P}(\ulcorner \varphi \urcorner ) < b) \Rightarrow \mathbb{P}(\ulcorner a < \mathbb{P}(\ulcorner \varphi \urcorner) < b \urcorner) = 1)) > 0\]

Proof:

Suppose such a coherent \(\mathbb{P}\) and \(\epsilon\) exist. Using the diagonal lemma, construct a fixed-point statement \[\varphi : \Leftrightarrow \mathbb{P}(\ulcorner \varphi \urcorner) < 1 - \epsilon\]

Also define as shorthand \[IRP :\Leftrightarrow \text{Coherent}(\mathbb{P}) \wedge (\mathbb{P}(\ulcorner \varphi \urcorner) < 1 - \epsilon \Rightarrow \mathbb{P}(\ulcorner \mathbb{P}(\ulcorner \varphi \urcorner) < 1 - \epsilon \urcorner) = 1)\]

We know that \(\mathbb{P}(IRP) > \epsilon\) due to the inner reflection principle.
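
A sketch of how \(IRP\) arises as an instance of the hypothesis, under assumptions the post leaves implicit. Take \(\epsilon\) to be rational (harmless, since if the hypothesis holds for some \(\epsilon\), it holds for every smaller positive \(\epsilon\)) and instantiate \(a = -1\), \(b = 1 - \epsilon\):

\[\mathbb{P}(\text{Coherent}(\mathbb{P}) \wedge ((-1 < \mathbb{P}(\ulcorner \varphi \urcorner) < 1 - \epsilon) \Rightarrow \mathbb{P}(\ulcorner -1 < \mathbb{P}(\ulcorner \varphi \urcorner) < 1 - \epsilon \urcorner) = 1)) > \epsilon\]

Since probabilities are non-negative, the lower bound \(-1 < \mathbb{P}(\ulcorner \varphi \urcorner)\) carries no information, and dropping it gives \(IRP\). Treating the two-sided and one-sided bounds as interchangeable inside the quotation is an assumption of this sketch rather than something the post states.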

Suppose \(\mathbb{P}(\mathbb{P}(\ulcorner\varphi\urcorner) < 1 - \epsilon \wedge IRP) > 0\). Then:

\[\mathbb{P}(\mathbb{P}(\ulcorner \varphi \urcorner) < 1 - \epsilon \wedge IRP \wedge \mathbb{P}(\ulcorner\mathbb{P}(\ulcorner \varphi \urcorner) < 1 - \epsilon\urcorner) = 1) > 0\]

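Since \(IRP\) implies \(\text{Coherent}(\mathbb{P})\), which in turn implies \(\mathbb{P}(\ulcorner \varphi \urcorner) = \mathbb{P}(\ulcorner \mathbb{P}(\ulcorner \varphi \urcorner) < 1 - \epsilon \urcorner)\) (because \(\varphi\) is equivalent to \(\mathbb{P}(\ulcorner \varphi \urcorner) < 1 - \epsilon\)), we get: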

\[\mathbb{P}(\mathbb{P}(\ulcorner\varphi\urcorner) < 1 - \epsilon \wedge IRP \wedge \mathbb{P}(\ulcorner\varphi\urcorner) = 1) > 0\]

\[\mathbb{P}(\bot) > 0\]

which contradicts \(\mathbb{P}\)’s coherence. Therefore, we must have \(\mathbb{P}(\mathbb{P}(\ulcorner\varphi\urcorner) < 1 - \epsilon \wedge IRP) = 0\).

Since \(\mathbb{P}(IRP) > \epsilon\), it follows that \(\mathbb{P}(\mathbb{P}(\ulcorner\varphi\urcorner) < 1 - \epsilon) < 1 - \epsilon\). Equivalently, \(\mathbb{P}(\varphi) < 1 - \epsilon\). By the outer reflection principle, \(\mathbb{P}(\mathbb{P}(\ulcorner\varphi\urcorner) < 1 - \epsilon) = 1\), or equivalently \(\mathbb{P}(\varphi) = 1\), which is a contradiction when combined with \(\mathbb{P}(\varphi) < 1 - \epsilon\).
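
The first inequality in the paragraph above can be spelled out as follows. This is a sketch using only finite additivity and monotonicity of a coherent \(\mathbb{P}\); the letter \(A\) is shorthand introduced here (not in the original post) for the sentence \(\mathbb{P}(\ulcorner\varphi\urcorner) < 1 - \epsilon\):

\[\begin{aligned}
\mathbb{P}(A) &= \mathbb{P}(A \wedge IRP) + \mathbb{P}(A \wedge \neg IRP) \\
&= 0 + \mathbb{P}(A \wedge \neg IRP) \\
&\leq \mathbb{P}(\neg IRP) = 1 - \mathbb{P}(IRP) < 1 - \epsilon
\end{aligned}\]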

Q. E. D.

Conclusion

This post developed out of some work with Paul and Benja to try to create a probabilistic logic that can reason about reflective oracles and thereby reason about itself in a useful way. The idea would be that it would (indirectly) assign nonzero probability to its own reflection principle, so that we could gain more reflective power through repeated conditioning. Due to the impossibility result in this post, this cannot work. It might still be possible to construct a system satisfying some weaker inner reflection principle.


* This result is stronger than the one stated in the original probabilistic logic paper (which only proves that some coherent \(\mathbb{P}\) satisfies the outer reflection principle, not that it also assigns probability 1 to its own coherence); its proof might be shown in a future post.

** This result has not been written up either.



by Alex Appel 870 days ago | Benja Fallenstein and Jessica Taylor like this | link

If you are looking for a weaker inner reflection principle, does \(\mathbb{P}((a < \mathbb{P}(\ulcorner \varphi \urcorner ) < b) \rightarrow \mathbb{P}(\ulcorner a-\epsilon < \mathbb{P}(\ulcorner \varphi \urcorner) < b+\epsilon \urcorner) = 1) = 1\) for some finite \(\epsilon\) sound viable, or are there fatal flaws with it?

This came about while trying to figure out how to break the proof in the probabilistic procrastination paper. Making the reflection principle unable to prove that P(eventually presses button) is above \(1-\epsilon\) came up as a possible way forward.

by Benja Fallenstein 867 days ago | Jessica Taylor likes this | link

If you replace the inner “\(= 1\)” by “\(> 1-\epsilon\)”, then the literal thing you wrote follows from the reflection principle: Suppose that the outer probability is \(< 1\). Then

\[\mathbb{P}[(a < \mathbb{P}[\varphi] < b) \wedge \mathbb{P}[a-\epsilon < \mathbb{P}[\varphi] < b+\epsilon] \le 1 - \epsilon] > 0.\]

Now, \(\mathbb{P}[a < \mathbb{P}[\varphi] < b] > 0\) implies \(\mathbb{P}[a \le \mathbb{P}[\varphi] \le b] > 0\), which by the contrapositive form of the outer reflection principle yields \(a \le \mathbb{P}[\varphi] \le b\), whence \(a - \epsilon < \mathbb{P}[\varphi] < b + \epsilon\). Now, by the forward direction of the outer reflection principle, we have

\[\mathbb{P}[a - \epsilon < \mathbb{P}[\varphi] < b + \epsilon] = 1 > 1 - \epsilon,\]

which, by the outer reflection principle again, implies

\[\mathbb{P}[\mathbb{P}[a - \epsilon < \mathbb{P}[\varphi] < b + \epsilon] > 1 - \epsilon] = 1,\]

a contradiction to the assumption that \(\cdots \le 1 - \epsilon\) had outer probability \(> 0\).

However, what we’d really like is an inner reflection principle that assigns probability one to the statement quantified over all \(a\), \(b\), and Gödel numbers \(\ulcorner\varphi\urcorner\). I think Paul Christiano has a proof that this is impossible for small enough \(\epsilon\), but I don’t remember how the details worked.

by Paul Christiano 867 days ago | Benja Fallenstein likes this | link

Here is the basic problem. I think that you can construct an appropriate liar’s sentence by using a Lipschitz function without an approximate fixed point. But someone might want to check that more carefully and write it up, to make sure and to see where the possible loopholes are. I think that it may not have ruled out this particular principle, just something slightly stronger (but the two were equivalent for the kinds of proof techniques we were considering).

by Stuart Armstrong 868 days ago | link

Neat. What if the \(\epsilon\) were allowed to vary as a function of \(\varphi\)?

by Jessica Taylor 868 days ago | link

We’d need to have \(\epsilon\) for the fixed point \(\phi_\delta : \Leftrightarrow \mathbb{P}(\phi_\delta) < \delta\) be less than \(\delta\). Maybe this will work, but I don’t really see a good rule for determining \(\epsilon\) that would lead to this.
