Fixed point theorem in the finite and infinite case
discussion post by Victoria Krakovna 1328 days ago | Jim Babcock and Patrick LaVictoire like this | discuss

Janos and I started working on extending the fixed point theorem to the infinite case at MIRI’s June decision theory workshop with Patrick and Benja. This post attempts to exhibit the finite version of the theorem and speculates about a plausible extension.

# Algorithm for the finite case

Suppose we’re given variables $$p_1,\dots,p_n$$, and statements $$p_i\leftrightarrow \phi_i(p_1,\dots,p_n)$$ where each $$\phi_i$$ is fully modalized (no variables occur outside modal subformulas). We will describe an algorithm for constructing sentences $$\psi_i$$ with no free variables, such that the original statements are equivalent to $$p_1\leftrightarrow \psi_1,\dots,p_n\leftrightarrow \psi_n$$.

For simplification purposes, we will assume that each $$\phi_i$$ is only singly modalized (none of the modal subformulas contain further modal subformulas). If not, we can introduce new variables for each subformula of the form $$\square \phi_{i,j}$$ that occurs in a fully modalized context.

Now consider a sequence of theories $$W_0,W_1,\dots$$, where $$W_i\equiv PA\cup\{\square^{i+1} \bot,\lnot\square^i\bot\}$$.

In $$W_0$$, it’s easy to determine the truth value of each $$p_i$$: every modal subformula can be replaced with $$\top$$, leaving a propositional formula with no variables.

Now suppose we’ve done this for $$W_0,\dots,W_{n-1}$$. Then, a statement $$\square \phi$$ will be true in $$W_n$$ iff

$PA\vdash(\square^{n+1}\bot\land\lnot\square^n\bot)\to\square\phi \Leftrightarrow\square^n\bot\lor\square(\square^n \bot \rightarrow \phi) \Leftrightarrow\square^n\bot\lor\square\bigwedge_{i=0}^{n-1}((\square^{i+1} \bot\land \lnot\square^i \bot)\rightarrow \phi)$

Therefore $$\square \phi$$ will be true in $$W_n$$ iff $$\phi$$ is true in $$W_0,\dots,W_{n-1}$$; this will let us evaluate the truth value of $$p_i$$ in all of the theories $$W_1,\dots,W_n$$.

It’s clearly the case that every modal subformula will have truth value stabilizing, therefore every $$p_i$$ will also. So there is an $$N$$ such that $$W_n$$ has the same truth values as $$W_N$$ for $$n>N$$. Now if $$W_N\vdash p_i$$, construct $$\psi_i\equiv\lnot\bigvee_{i:W_i\vdash \lnot p_i} (\square^{i+1}\bot\land\lnot\square^i\bot)$$; otherwise construct $$\psi_i\equiv\bigvee_{i:W_i\vdash p_i} (\square^{i+1}\bot\land\lnot\square^i\bot)$$. These are variable-free formulas.

# Infinite case example: Procrastination Bot

def ProcrastinationBot$$_N$$(X):

if $$PA+N \vdash \square X(PB_{N+1}) = C$$:

return C

else: return D

Constructing the fixed point

Let $$p_{ij}$$ denote whether $$PB_i$$ cooperates with $$PB_j$$. Then $p_{ij} \leftrightarrow \square_i p_{j, i+1} =\square_i \square_j p_{i+1,j+1}$ where $$\square_i$$ stands for $$\lnot \square^i \bot \rightarrow \square$$ (e.g. $$\square_0 = \square$$).

Then the following statements hold $p_{00} \leftrightarrow\square \square p_{11}$ $p_{11} \leftrightarrow \square_1 \square_1 p_{22} \equiv \lnot \square \bot \rightarrow \square (\lnot \square \bot \rightarrow \square p_{22})$ $\dots$

In $$W_0$$, any statement with a box in front of it is true, so any statement where a boxed statement is implied is also true. Thus, all the relevant statements are true in $$W_0$$:

Theory $$p_{00}$$ $$\square \square p_{11}$$ $$p_{11}$$ $$\lnot \square \bot \rightarrow \square (\lnot \square \bot \rightarrow \square p_{22})$$ $$\lnot \square \bot \rightarrow \square p_{22}$$ $$p_{22}$$ $$\dots$$
$$W_0$$ $$\top$$ $$\top$$ $$\top$$

In $$W_{i+1}$$, all the implied statements are boxed statements that were true in $$W_i$$, e.g. $$\square p_{22}$$ or $$\lnot \square \bot \rightarrow \square p_{22}$$. Thus, all the relevant statements are true in $$W_{i+1}$$:

Theory $$p_{00}$$ $$\square \square p_{11}$$ $$p_{11}$$ $$\lnot \square \bot \rightarrow \square (\lnot \square \bot \rightarrow \square p_{22})$$ $$\lnot \square \bot \rightarrow \square p_{22}$$ $$p_{22}$$ $$\dots$$
$$W_0$$ $$\top$$ $$\top$$ $$\top$$ $$\top$$ $$\top$$ $$\top$$
$$\vdots$$ $$\vdots$$ $$\vdots$$ $$\vdots$$ $$\vdots$$ $$\vdots$$ $$\vdots$$
$$W_{i+1}$$ $$\top$$ $$\top$$ $$\top$$

Thus, any Procrastination Bot cooperates with any other Procrastination Bot.

Existence and uniqueness of the fixed point via quining

Given a formula $$\psi(m,n)$$, there exists formula $$\phi(n)$$ such that $$\vdash\phi(n)\leftrightarrow\psi(\ulcorner\phi\urcorner,n)$$. The quine construction (thanks Benja!) is $\phi(n)\equiv\psi(\mbox{let }k=\ulcorner\psi(\mbox{let }k=x\mbox{ in }\operatorname{subst}_{\ulcorner x\urcorner}(\operatorname{quote}(k),k))\urcorner\mbox{ in }\operatorname{subst}_{\ulcorner x\urcorner}(\operatorname{quote}(k),k),n)$

Let $$\psi(\ulcorner \phi \urcorner, n) = \square\ulcorner \phi(\overline{n+1}) \urcorner$$. Then we have $PA \vdash \forall n: (\phi(n) \leftrightarrow \square\ulcorner \phi(\overline{n+1}) \urcorner).$

Now consider that $$\square \ulcorner \forall n: \phi(n) \urcorner \rightarrow \square\ulcorner \phi(\overline{n+1}) \urcorner$$. Using the above, we have $PA \vdash \square \ulcorner \forall n: \phi(n) \urcorner \rightarrow \forall n: \phi(n).$

By Lob’s theorem, $$PA \vdash \forall n: \phi(n)$$, so a fixed point exists.

Assume there are two fixed points $$\phi$$ and $$\phi'$$. Then we have $PA \vdash \forall n: [\phi(n) \leftrightarrow \psi(\ulcorner \phi \urcorner,n)] \land [\phi'(n) \leftrightarrow \psi(\ulcorner \phi' \urcorner,n)],$ $PA \vdash \forall \ulcorner \phi \urcorner, \ulcorner \phi' \urcorner: \square \ulcorner \forall n: \phi(n) \leftrightarrow \phi'(n) \urcorner \rightarrow [\psi(\ulcorner \phi \urcorner, n) \leftrightarrow \psi(\ulcorner \phi' \urcorner, n)]$ Thus, $$PA \vdash \forall n: \phi(n) \leftrightarrow \phi'(n)$$, so the fixed points are the same.

### NEW DISCUSSION POSTS

[Note: This comment is three
 by Ryan Carey on A brief note on factoring out certain variables | 0 likes

There should be a chat icon
 by Alex Mennen on Meta: IAFF vs LessWrong | 0 likes

Apparently "You must be
 by Jessica Taylor on Meta: IAFF vs LessWrong | 1 like

There is a replacement for
 by Alex Mennen on Meta: IAFF vs LessWrong | 1 like

Regarding the physical
 by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

I think that we should expect
 by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

I think I understand your
 by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

This seems like a hack. The
 by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

After thinking some more,
 by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Yes, I think that we're
 by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

My intuition is that it must
 by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

To first approximation, a
 by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Actually, I *am* including
 by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Yeah, when I went back and
 by Alex Appel on Optimal and Causal Counterfactual Worlds | 0 likes

> Well, we could give up on
 by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes