Intelligent Agent Foundations Forumsign up / log in
Thoughts on Logical Dutch Book Arguments
post by Abram Demski 1301 days ago | Jessica Taylor, Patrick LaVictoire and Scott Garrabrant like this | discuss

This post examines the application of Dutch Book arguments to logical uncertainty, as part of an attempt to fill out the ideas I speculated on in this post.


An important question we’re currently facing is how to relate logical uncertainty to logical priors.

By logical prior, I mean an assignment of real numbers representing degree of belief to sentences, \(\mathbb{P}(\phi)\). This assignment is usually required to be coherent; I’ll briefly review what Dutch Book arguments have to say about this before dealing with logical uncertainty.

By logical uncertainty, I mean the problem of making predictions under time constraints, so that we are unable to use all of our relevant knowledge. We can formalize this as the problem of designing a quickly computable probability function \(p(x)\) predicting a more slowly computable \(f(x)\). I have in mind things like Scott’s three levels of difficulty.

This division between the problem of logical priors and logical uncertainty has been implicit in Scott and I’s work for a while now, but we initially didn’t make a full distinction between them. These really are different problems. We want the two to be connected, but that’s an open problem for now.

The most obvious way for them to be connected would be for \(P_t\) to have the kinds of good properties which are discussed in connection with logical uncertainty. (Level 3 of Scott’s hierarchy is especially interesting in this respect.) However, in this post I’ll take a different route, and discuss what kinds of Dutch book arguments we might apply to \(P_t\).

Doubting Coherence

I won’t review the whole of the classic Dutch Book argument, but it is detailed here. The conclusion is this. A function \(\mathbb{P}(\phi)\) expressing degrees of belief for an agent must satisfy three conditions:

  1. \(0 \leq \mathbb{P}(\phi) \leq 1\)
  2. If \(\phi\) is a tautology, then \(\mathbb{P}(\phi)=1\).
  3. (Additivity) If \(\phi\) and \(\psi\) are mutually exclusive, then \(\mathbb{P}(\phi \vee \psi) = \mathbb{P}(\phi) + \mathbb{P}(\psi)\).

The argument proceeds by showing that if any of the constraints are violated, then a bookie can put together a set of bets which the agent will be willing to buy or sell, such that the bookie is guaranteed to profit on the whole no matter what the outcome of the bets may be. This set of bets is the “Dutch Book”.

One might assume that this argument can be employed to justify coherence for logical priors. However, this isn’t clear. The paper From Classical to Intuitionistic Probability points out that Dutch Book arguments assume every bet will eventually be decided one way or the other, which is quite unrealistic in the logical setting. An open conjecture might never be conclusively proved or disproved, so that mathematicians taking best on the conclusion would never collect.

Consider the equation for additivity. The usual argument for this rule goes as follows. \(\phi \vee \psi\) pays out exactly when one (and only one) of \(\phi\) or \(\psi\) does. This means a bookie can profit the difference, if the two sides of the equation aren’t equal. If \(\mathbb{P}(\phi \vee \psi) > \mathbb{P}(\phi) + \mathbb{P}(\psi)\), the bookie can sell the agent a bet on \(\phi \vee \psi\), while buying one on \(\phi\) and \(\phi\) individually. If \(\mathbb{P}(\phi \vee \psi) < \mathbb{P}(\phi) + \mathbb{P}(\psi)\), the reverse strategy works. In the case where not all bets eventually get decided, however, the trick only works in one direction. We might be able to verify \(\phi \vee \psi\) without ever verifying which of \(\phi\) or \(\psi\) is true. For example, we can deduce \(\phi \vee \neg \phi\) without having any information about which side is true. So, we can only get:

Sub-additivity: For mutually exclusive sentences \(\phi\) and \(\psi\), we must have \(\mathbb{P}(\phi) + \mathbb{P}(\psi) \leq \mathbb{P}(\phi \vee \psi)\).

(If this is murky from my brief description, I suggest looking through the details of the Dutch book argument here.)

From Classical to Intuitionistic Probability, which I mentioned earlier, argues that we ought to allow for undecidable bets but keep additivity, and provides a way of doing so: intuitionistic probability. The paper gives a Dutch Book argument justifying intuitionistic probability in the case of potentially undecidable bets. (The argument makes the assumption that the agents use intuitionistic logic, and are logically omniscient.)

So, it seems that we can’t quite argue for coherence of \(\mathbb{P}\); we can use the argument to advocate sub-additive beliefs, or intuitionistic ones. (Both of these allow a classically coherent distribution as a special case – this isn’t a proof that we shouldn’t have a coherent distribution, only that we may not have a reason to want one.)

Logical Bets at Finite Time

Another objection which has been raised to Dutch Book arguments is that they assume sufficient processing power, and so may not bear on what beliefs should look like on normal timescales. This is only made worse by the fact that coherent logical priors are uncomputable, not merely intractable. No computable function can assign probability 0 to contradictions and probability 1 to tautologies. What we actually must use to make gambles is a computable approximation \(P_t(\phi)\). This is asked to converge to the uncomputable prior: \(\lim_{t \rightarrow \infty} P_t(\phi) = \mathbb{P}(\phi)\). The argument for this setup would seem to be that \(P_t(\phi)\) will always be vulnerable to some Dutch Book, due to its incoherence, but approximating \(\mathbb{P}(\phi)\) as well as we can will eliminate any particular vulnerability over time. However, this argument seems somewhat unclear. Can we translate Dutch-Book arguments to the finite-time case and conclude anything like this?

In principle, \(P_t(\phi)\) could be using any sort of crazy algorithm to approximate \(\mathbb{P}(\phi)\). In practice, however, it makes sense to assume that \(P_t\) has access to a theorem prover, and is making its estimates based on what has so far been proved.

Let’s suppose that the theorem prover access can be reduced to a computable function \(\texttt{con}(\Gamma, n)\) which takes a set of sentences, \(\Gamma\), and a time limit \(n\), and searches for a contradiction. If one is found by time \(n\), it outputs \(1\); otherwise, \(0\).

Further suppose that we can decompose \(P_t(\phi)\) into two parts. First, a part \(I(t)\) which runs \(\texttt{con}\) a certain amount based on \(t\), outputting a set of input/output pairs \(\mathcal{S}=I(t)\); I’ll write these pairs as equations, like \(\texttt{con}(\Gamma, n)=0\). Second, a part \(\mathcal{P}_\mathcal{S}(\phi)\) which takes \(\mathcal{S}\) and turns it into an approximate probability distribution. This is a large assumption about the structure of \(P_t\), but not unusual in terms of the kinds of \(P_t\) we’ve explored. Note that this also gives us a form of counterfactual, because we can use fake \(\mathcal{S}\).

I’ll call the output-sets \(\mathcal{S}\) logical knowledge. The constraining information in the logical knowledge, those outputs of the form \(\texttt{con}(\Gamma, n)=1\), will be called constraints.

The idea behind a Dutch Book argument in this setting will be that based on our current logical knowledge we’d be predictably exploitable if our beliefs didn’t conform to certain axioms. We can think of this as a Dutch Book argument where the bookie is equally logically ignorant, or we can think of it as defending ourselves against a bookie as well as we can given our logical ignorance.

First I’ll discuss \(P_t(\phi)\) for fixed \(t\), and then I’ll discuss constraints which apply across \(t\).

To determine the behavior at fixed \(t\), we consider fixed \(\mathcal{S}\). We can apply the basic Dutch Book to argue for the following modified axioms:

Relative Coherence:

  1. \(0 \leq \mathcal{P}_\mathcal{S}(\phi) \leq 1\)

  2. If \(\texttt{con}(\{ \neg \phi \} ,n)=1 \in \mathcal{S}\), then \(\mathcal{P}_\mathcal{S}(\phi)\)=1.

  3. If \(\texttt{con}(\{ \phi, \psi \} ,n)=1 \in \mathcal{S}\), then \(\mathcal{P}_\mathcal{S}(\phi) + \mathcal{P}_\mathcal{S}(\psi) = \mathcal{P}_\mathcal{S}(\phi \vee \psi)\).

I.E., \(\mathcal{P}_\mathcal{S}\) ought to be coherent with respect to the currently known constraints. Notice that I haven’t shown that this is always possible, however. In fact, it’s not possible for arbitrary \(\mathcal{S}\); for example, it’s not possible if \(\mathcal{S}\) contains both \(\texttt{con}(\{ \phi \}, n)\) and \(\texttt{con}(\{ \neg \phi \}, m)\).

Question: How can we characterize the conditions under which a probability distribution satisfying 1-3 exist?

Since \(P_t\) handles information coming from a real proof system, it should always be nice enough that it’s possible to satisfy 1-3. We conclude that \(P_t(\phi)\) should always be coherent with respect to the constraints \(I(s)\).

As before, we might argue for the sub-additive version of (3) or for intuitionistic beliefs, if we want to account for the possibility that bets on sentences are never decided one way or the other.

I also note that the properties 1-3 don’t seem to fit very strongly with the structure of our logical knowledge. While our logical knowledge tells us about the inconsistency of sets of sentences, the relative consistency properties only involve such sets containing one or two sentences. This makes it feel like a poor fit. Dutch Book arguments only argue that if properties were to be violated, then a Dutch Book would exist. An inverse Dutch Book argument shows that when a set of properties are satisfied, no Dutch Book exists. Inverse Dutch Book arguments are typically somewhat harder to make, because they require careful specification of the kinds of bets the bookie is allowed to use, to show that no combination of these yields a Dutch Book.

Question: Are the properties 1-3 sufficient to make an inverse Dutch Book argument? If not, how might they be reformulated?

Now, let’s constrain how beliefs change as \(t\) increases, by varying \(\mathcal{S}\). We will consider the move from \(\mathcal{S}_1\) to a new set of information \(\mathcal{S}^*\), such that \(\mathcal{S}^*\) is an extension of \(\mathcal{S}\) – it contains all the information of \(\mathcal{S}\), plus possibly more.

Notice, conditions (1-3) above did not make use of information of the form \(\texttt{con}(\{ \phi, \psi \} ,n)=0\). This will become important now. \(\mathcal{S}^*\) may contain a newly discovered contradiction \(\texttt{con}(\Gamma, n)=1\) where \(\mathcal{S}_1\) only had \(\texttt{con}(\Gamma, m)=0\) for \(m<n\), or on the other hand information that a contradiction still hasn’t been found after more searching (represented by the knowledge \(\texttt{con}(\Gamma, n)=1\)). Intuitively, we want something like conservation of expected evidence (CEE) to apply; if the discovery of a new contradiction moves the belief function in one direction, searching longer without finding a contradiction should make an opposite change. To put it formally:

(CEE) \[\mathcal{P}_{\mathcal{S}}(\phi) = \sum_{\mathcal{S}_i} \mathcal{P}_{\mathcal{S}_i}(\phi) w(\mathcal{S}_i)\]

where the \(\mathcal{S}_i\) are a set of mutually exclusive and exhaustive extensions of \(\mathcal{S}\), with \(\mathcal{S}^*\) among them.

A Diachronic Dutch Book argument looks potentially relevant; an argument like that could show that the move from \(\mathcal{P}_{\mathcal{S}}\) to \(\mathcal{P}_{\mathcal{S}^*}\) should be a Bayesian update. However, in order to establish this it makes the assumption that we already have a probability distribution over possible \(\mathcal{S}^*\); these become the weights \(w(\mathcal{S}_i)\). Furthermore, it assumes probabilities have been assigned for joint events between sentences and consistency information. Where does this come from? What kind of coherence constraints does it need to follow?

To address this, let’s assume that the agent possesses a predicate \(C(n,\ulcorner \Gamma \urcorner)\) which represents the behavior of the function \(\texttt{con}(n,\ulcorner \Gamma \urcorner)\). For example, if the language we are expressing beliefs in includes \(PA\), \(C(n,\ulcorner \Gamma \urcorner)\) could be an encoding in \(PA\) of \(\texttt{con}(n,\Gamma)\). \(\ulcorner \Gamma \urcorner\) can be the Goedel number of a conjunction of the sentences in the set.

This gives us a way for the agent to talk about its logical information in its own language, so that we already have a (relatively coherent) distribution here. I don’t want to assume that this predicate reflects direct self-knowledge (although our argument will conclude that it should), only that if \(\texttt{con}(n,\Gamma)\) becomes true or false, a bookie could eventually obtain enough information to adjudicate bets on \(C(n, \Gamma)\).

Now we can apply the diachronic Dutch Book to conclude that adding new information to our set must result in a Bayesian update. For one added statement: \(\mathcal{P}_{\mathcal{S} \cup \texttt{con}(n,\Gamma)}(\psi) = \mathcal{P}_\mathcal{S}(\psi \wedge C(n,\gamma)) / \mathcal{P}_\mathcal{S}(C(n,\gamma))\). Similarly, for any set of added information \(\mathcal{G}\) which differentiates \(\mathcal{S}\) from \(\mathcal{S}^*\), calling the conjunction of the \(C\)-sentences representing that set \(G\), we have:

Information-Adding Coherence: \[\mathcal{P}_{\mathcal{S} \cup \mathcal{G}}(\psi) = \mathcal{P}_\mathcal{S}(\psi \wedge G) \mathcal{P}_\mathcal{S}(G)\]

A reasonable objection to this application of the diachronic Dutch Book argument is that we said earlier (when dealing with fixed \(t\)) that the bookie should be as logically ignorant as the agent. I’m giving the bookie additional powers here, to observe things about \(\texttt{con}(n,\Gamma)\) and conclude things about \(C(n,\Gamma)\). This step is not justified except that it makes a diachronic argument go through, and that it’s apparently still possible to satisfy this requirement.

This doesn’t actually let me get (CEE), though. Because the probability distribution is still logically ignorant, it might not follow nice rules; for example, it might be that \(P_t(\phi | \psi \wedge \chi) \neq P_t(\phi | \chi \wedge \psi)\) because equivalence of re-ordered conjunctions isn’t recognized in every case. I need a weaker rule, which I’ll call reflective conservation of expected evidence (RCEE). First, a definition. For arbitrary \(\mathcal{S}_i\), let \(\mathcal{G}_i = \mathcal{S}_i - \mathcal{S}\), and \(G_i\) be the conjunction of \(C\)-sentences representing the information in \(\mathcal{G}_i\). Now:

Reflective Partition: Consider a sequence of output-sets, \(\mathcal{S}_i\) for \(i = 1 ... n\). Define \(H_i\) as \(\phi \wedge G_i\), and \(J_i\) as the disjunction of all \(H_j\) where \(j \geq i\). The sequence \(\mathcal{S}_i\) is a reflective partition of \(\phi\) in \(\mathcal{S}\) iff \(\mathcal{S}\) includes \(\texttt{con}( \{ H_i, J_{i+1} \} , t_i)=1\) for \(i = 1 ... n-1\), plus \(\texttt{con}(\neg (\phi \vee \neg J_1),t_j)=1\), \(\texttt{con}(\{ \phi, \neg J_1 \},t_k)=1\), \(\texttt{con}(\neg (J_1 \vee \neg J_1),t_l)=1\), and \(\texttt{con}(\{ J_1, \neg J_1 \} ,t_m)=1\) (for some \(t_{ \{ 1...n \} }\), \(t_j\), \(t_k\), \(t_l\), and \(t_m\)).

Theorem (RCEE): \(\mathcal{P}_{\mathcal{S}}(\phi) = \sum_{\mathcal{S}_i} \mathcal{P}_{\mathcal{S}_i}(\phi) w(\mathcal{S}_i)\) where \(\mathcal{S}_i\) is a reflective partition of \(\phi\) in \(\mathcal{S}\).

Proof: We know \(\mathcal{P}_{\mathcal{S}_i}(\phi) = \mathcal{P}_\mathcal{S}(\phi | G_i)\). We can take \(w(\mathcal{S}_i) = \mathcal{P}_\mathcal{S}(G_i)\). It now suffices to show that \(\mathcal{P}_{\mathcal{S}}(\phi) = \Sigma_{G_i} \mathcal{P}_\mathcal{S}(\phi \wedge G_i)\). This follows from the definition of reflective partition. We can apply our relative coherence properties (2) and (3) to the constraints \(\texttt{con}(\neg (\phi \vee \neg J_1),t_j)=1\), \(\texttt{con}(\{ \phi, \neg J_1 \},t_k)=1\), \(\texttt{con}(\neg (J_1 \vee \neg J_1),t_l)=1\), and \(\texttt{con}(\{ J_1, \neg J_1 \} ,t_m)=1\) to show that \(\mathcal{P}_\mathcal{S}(\psi) = \mathcal{P}_\mathcal{S}(J_1)\). We then decompose \(\mathcal{P}_\mathcal{S}(J_1)\) into a sum by applying (3) with the statemens \(\texttt{con}( \{ H_i, J_{i+1} \} , t_i)=1\). \(\Box\)

Importantly, I still haven’t shown that something exists satisfying these properties.

Question: Under what conditions can both relative coherence and information-adding coherence be satisfied?

I attempted to construct such a distribution, but it turned out to be more complex that I had hoped, and it will have to wait for another post. Intuitively, it seems possible to do this by combining (1) a computable distribution over infinite sets of constraints (for which relative consistency must always be possible), (2) a simple waiting-time distribution to model when the consistency checker is likely to find a constraint, and (3) a process building random theories consistent with the constraints, analogous to the process which defines the Demski prior.

If it works, this seems promising for solving Paul Christiano’s Paradox of Ignorance as well. In the present formalism, observing a proof from a trusted source is exactly the same as getting the information from the built-in consistency checker, due to the way I’ve defined information-adding coherence in terns of the reflective predicate \(C\).

Uniform Coherence

Uniform coherence has been proposed as a stronger desirable property than coherence, which places conditions on \(P_t\) rather than only \(\mathbb{P}\). In light of the current discussion, it is interesting to note that uniform coherence is not justified by a Dutch Book argument at present. I haven’t thought a great deal about this, but it seems interesting to explore such a justification.



NEW LINKS

NEW POSTS

NEW DISCUSSION POSTS

RECENT COMMENTS

[Note: This comment is three
by Ryan Carey on A brief note on factoring out certain variables | 0 likes

There should be a chat icon
by Alex Mennen on Meta: IAFF vs LessWrong | 0 likes

Apparently "You must be
by Jessica Taylor on Meta: IAFF vs LessWrong | 1 like

There is a replacement for
by Alex Mennen on Meta: IAFF vs LessWrong | 1 like

Regarding the physical
by Vanessa Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

I think that we should expect
by Vanessa Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

I think I understand your
by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

This seems like a hack. The
by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

After thinking some more,
by Vanessa Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Yes, I think that we're
by Vanessa Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

My intuition is that it must
by Vanessa Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

To first approximation, a
by Vanessa Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Actually, I *am* including
by Vanessa Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Yeah, when I went back and
by Alex Appel on Optimal and Causal Counterfactual Worlds | 0 likes

> Well, we could give up on
by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

RSS

Privacy & Terms