Intelligent Agent Foundations Forumsign up / log in
The Two-Update Problem, Part 1
post by Abram Demski 1036 days ago | Jessica Taylor, Patrick LaVictoire and Scott Garrabrant like this | discuss

This is a problem which Scott Garrabrant and I were thinking about quite a bit last fall, but moved on from as other problems in logical uncertainty captured our interest. As far as I know, it remains unsolved!

Scott and I were working on the problem of probabilistic priors, or as we’ve started calling them now, random extensions. Suppose you have a set (possibly empty) of axioms in a logic, and you want a “reasonable” probability distribution over the rest of the sentences, which are neither proved nor disproved by your set of sentences. This probability assignment is required to be coherent in the sense given by Paul Christiano or me: they “respect the logic”, namely (using Paul’s version):

  • For any \(\phi\) and \(\psi\), \(\mathbb{B}(\phi) = \mathbb{B}(\phi \wedge \psi)+\mathbb{B}(\phi \wedge \neg \psi)\).
  • For any tautology \(\phi\), \(\mathbb{B}(\phi)=1\).
  • For any contradiction \(\phi\), \(\mathbb{B}(\phi)=0\).

Paul proves that this is equivalent to requiring \(\mathbb{P}\) to be derived from a probability distribution on models (so that the probability of a sentence is the probability that the sentence is true).

Denote the random extension starting with theory \(T\) as \(\mathbb{P}_T (\phi)\). We additionally require that for \(\phi \in T\), \(\mathbb{B}(\phi)=1\).

We also wanted these probability distributions to be approximable, among other desirable properties.

The two-update problem began as the observation that, for my prior, adding a sentence \(\phi\) to the initial theory is quite different from performing a Bayesian update on \(\phi\). As our research progressed, we noticed that this seemed to be true of all the proposals we came up with. There were two distinct ways to “update” on a sentence: either do a Bayesian update to make its probability 1, or add it to the initial set of axioms which are true before the prior’s construction.

Stated as a formal problem, this may look easy to “solve”. For finite non-empty \(T\), we can simply re-define \(\mathbb{P}_T (\phi) = \mathbb{P}_{\emptyset}(\phi|T)\). (There is a complication if you try to do this for infinite \(T\), which is explained here. ) However, this seems to miss the point. Despite re-defining things in this way, if you look at the actual specification of the priors we were considering, a kind of non-bayesian update was quite visible. The non-bayesian update doesn’t go away if we take away the set of axioms (AKA the theory we’re extending into a probability distribution). Rather, it is visible in how we update on proofs.

As we approximate \(\mathbb{P}\) to increasing degree, we observe proofs of one thing or another and incorporate this into the distribution. This “incorporation” step was different depending on the prior \(\mathbb{P}\), but all of our options seemed to have it, and none of them looked like a Bayesian update. This “incorporation” step is a different way of understanding the meaning of the two-update problem, but harder to formalize exactly. It seems to us that this is really the same two-update problem, if you don’t make the ad-hoc patch mentioned above.

To put it simply: updating on an observed proof does not look like a Bayesian update. This seems potentially concerning.

Paul Christiano hit upon the same idea in his paradox of ignorance (Section 6.1.5 of the non-omniscience paper). He presented the following scenario:

Suppose two agents are trying to determine the truth of \(\forall x: \phi (x)\) for computable \(\phi\). These agents are both using the same definition of \(\mathbb{P}\), and the same set of axioms (we can suppose these are \(PA\)). However, one agent has a large amount of computing power with which to approximate \(\mathbb{P}\), and the other has relatively little. Both agents have been given a resource: a machine which computes \(\phi(x)\) for any value of \(x\). Unfortunately, this machine gets slower and slower the larger the value of \(x\) is asked for.

Suppose that \(\forall x: \phi (x)\) is actually true, but that this fact is not provable in \(PA\). Now, it naively seems that the agent with a large amount of computing power is at an advantage. However, it’s not quite so clear. What that computing power does is allow \(\mathbb{P}\) to be computed more accurately. In particular, the strong agent will have time to prove many more cases of \(\phi(x)\) than the weak agent. This means that the strong agent cannot update on those cases of \(\phi(x)\) when computed by the machine: the small examples already have probability 1, because the agent has proved them. To improve its knowledge, this agent must wait for large examples to slowly compute. The small agent, on the other hand, can look at many small values very quickly.

This is far from a formal proof of the paradox, and Paul suggests several possible ways out. It’s possible that the approximation of \(\mathcal{P}\) raises the probability of \(\forall x: \phi (x)\) just as much when it computes a new case of \(\phi (x)\) as it’s raised if we do a Bayesian update on its provability (observing the machine compute it). However, this is harder than it may initially sound.

Continued in part 2.





I found an improved version
by Alex Appel on A Loophole for Self-Applicative Soundness | 0 likes

I misunderstood your
by Sam Eisenstat on A Loophole for Self-Applicative Soundness | 0 likes

Caught a flaw with this
by Alex Appel on A Loophole for Self-Applicative Soundness | 0 likes

As you say, this isn't a
by Sam Eisenstat on A Loophole for Self-Applicative Soundness | 1 like

Note: I currently think that
by Jessica Taylor on Predicting HCH using expert advice | 0 likes

Counterfactual mugging
by Jessica Taylor on Doubts about Updatelessness | 0 likes

What do you mean by "in full
by David Krueger on Doubts about Updatelessness | 0 likes

It seems relatively plausible
by Paul Christiano on Maximally efficient agents will probably have an a... | 1 like

I think that in that case,
by Alex Appel on Smoking Lesion Steelman | 1 like

Two minor comments. First,
by Sam Eisenstat on No Constant Distribution Can be a Logical Inductor | 1 like

A: While that is a really
by Alex Appel on Musings on Exploration | 0 likes

> The true reason to do
by Jessica Taylor on Musings on Exploration | 0 likes

A few comments. Traps are
by Vadim Kosoy on Musings on Exploration | 1 like

I'm not convinced exploration
by Abram Demski on Musings on Exploration | 0 likes

Update: This isn't really an
by Alex Appel on A Difficulty With Density-Zero Exploration | 0 likes


Privacy & Terms