I’d like to credit Daniel Demski for helpful discussion.
So, the Eisenstat prior mentioned earlier seemed like a rather fundamental advance on the state of the art in logical uncertainty. Given that it was essentially a modification of the Demski prior, the obvious next question was “what’s the analogue of the modified Demski prior in this setting?”
As a quick refresher, the Demski prior sampled sentences, while the modified Demski prior sampled turing machines that would spit out a (potentially infinite) list of sentences. The key advance of the modified Demski prior over the Demski prior is that the modified Demski prior enables assigning positive probability to certain computable infinite sequences of sentences. The prototypical example was assigning positive probability to firstorder PA with its infinite axiom schema, but this property also permitted the result in the logical induction paper where a turing machine could be asymptotically trusted to only output true things, even if the proof length of the statements grows everfurther beyond our ability to check.
The Demski and modified Demski prior, in their true forms, involve calling a halting oracle to determine whether a contradiction is present when a sentence/infinite sequence of sentences is added to the “sentence pool”.
A rather simple and general (though not very efficient) process for approximating the probability of an event under some prior involves specifying some sort of “priorgenerating process”, and then montecarlo sampling it a whole lot of times and seeing how many runs generated the event. In the case of the Demski and modified Demski prior, the use of a halting oracle means the priorgenerating process can’t be emulated. However, since a halting oracle is limitcomputable, throwing a whole lot of computing resources at consistency checking yields a better and better ability to emulate what the true version of the Demski/modified Demski priors are doing.
The key property of the Eisenstat prior that makes it \(\varepsilon\)approximable instead of just limitcomputable is that there’s a particular time by which you’re nearly guaranteed to run into either \(\phi\) or \(\neg\phi\). Any sort of “modified Eisenstat prior” should preserve this property.
So, without further ado, I’ll present the algorithm, go into the philosophy of why it is the way it is, handwave at a few nice properties it probably has, and point out future directions.
DEFINITIONS:
\(\mu_{\Phi}\) is a probability measure over sentences, s.t. \(\forall n\in\mathbb{N}:\mu_{\Phi}((\neg)^{2n}\phi)=\mu_{\Phi}((\neg)^{2n+1}\phi)\). In short, you’re as likely to run into \(\phi\) as \(\neg\phi\).
\(\mu_{M}\) is a probability measure over turing machines.
If x is a string of sentences, \(x\phi\) is the string of sentences concantated with \(\phi\).
\(\lambda\) is the empty string.
\(y\sqsubseteq x\) means y is a prefix of x (a prefix in the sense of a substring of sentences, not a prefix of x interpreted as a bitstring).
\(r(t(x))\) is the runtime of turing machine \(t\) on input \(x\).
\(t(x)\) is the bitstring that is the output of turing machine \(t\) on input \(x\).
\(f(x)\) is an arbitrary function that bounds the runtime of a turing machine in terms of the length of the input sentence string as measured in bits (so a long sentence would increase \(x\) more than a short one would).
In short, we sample a turing machine, check to make sure it obeys our runtime bounds and accepts the past history of sentences as valid and hasn’t been blacklisted yet, and throw it into the pool of “possibly valid constraints”. Then we generate a sentence \(\phi\), test all turing machines in the pool of “possibly valid constraints” against \(x\phi\) and \(x\neg\phi\) and blacklist every TM that times out or rejects both extensions. Then we divide the remaining turing machines into “team A” and “team B”, which favor \(\phi\) and \(\neg\phi\) as the extensions, use the ratio of measures of the teams to add either \(\phi\) or \(\neg\phi\) as a sentence, and blacklist all members of the losing team. Then just iterate.
In pseudocode, this is:
\(T^{}:=\emptyset\)
\(T:=\emptyset\)
\(x:=\lambda\)
FOR \(n\in\mathbb{N}\)
. . Sample a turing machine \(t\) from \(\mu_{M}\)
. . IF \(t\not\in T^{}\)
. . . . Failurebool:=0
. . . . . . FOR all \(y\sqsubseteq x\)
. . . . . . . . IF \(r(t(y))>f(y)\vee t(y)\neq 1\)
. . . . . . . . . . Failurebool:=1, END
. . . . IF Failurebool=0
. . . . . . \(T:=T\cup{t}\)
. . . . . . ELSE \(T^{}:=T^{}\cup{t}\)
. . Sample a sentence \(\phi\) from \(\mu_{\Phi}\)
. . FOR all \(t\in T\)
. . . . IF \(r(t(x\phi))>f(x\phi)\vee r(t(x\phi))>f(x\neg\phi)\vee(t(x\phi)\neq 1\wedge t(x\phi)\neq 1)\)
. . . . . . \(T:=T/{t}\)
. . . . . . \(T^{}:=T^{}\cup{t}\)
. . \(T_{A}:=\emptyset\)
. . \(T_{B}:=\emptyset\)
. . FOR all \(t\in T\)
. . . . IF \(t(x\phi)=1\wedge t(x\neg\phi)\neq 1\)
. . . . . . \(T_{A}:=T_{A}\cup{t}\)
. . . . ELSE IF \(t(x\neg\phi)=1\wedge t(x\phi)\neq 1\)
. . . . . . \(T_{B}:=T_{B}\cup{t}\)
. . FLIP with probability \(\frac{\sum_{t\in T_{A}}\mu_{M}(t)}{\sum_{t\in T_{A}\cup T_{B}}\mu_{M}(t)}\)
. . . . IF heads
. . . . . . \(x:=x\phi\)
. . . . . . \(T:=T/{T_{B}}\)
. . . . ELSE
. . . . . . \(x:=x\neg\phi\)
. . . . . . \(T:=T/{T_{A}}\)
Ok, so why is this the way it is?
PHILOSOPHY:
Well, we want to have a certain time by which we are pretty much guaranteed to either run into \(\phi\) or \(\neg\phi\), as this property is what permits \(\varepsilon\)approximation.
Consider the standard modified Demski prior, except that we just sample one turing machine and that’s it, and also it’ll magically stop cold as soon as it hits the first sentence that’s propositionally inconsistent with all those that came before. \(\mathbb{P}(\phi)\) is the probability that x will be output by this process. Note that as described, \(\mathbb{P}(\phi)+ \mathbb{P}(\neg\phi)<1\). Even in this really simple case where you don’t have to worry about inconsistency testing or multiple turing machines, there are sufficiently small \(\varepsilon\) where it’s impossible to \(\varepsilon\)approximate these probabilities within a knownaheadoftime time bound, because of the (much greater than \(\varepsilon\)) probability of sampling a turing machine that outputs a propositionally consistent list of sentences, and also outputs \(\phi\) ahead of whatever time bound you named.
Thus, we seem to have a fully general problem with trusting in turing machines that enumerate infinite, or even very big, lists of sentences, because we have uncertainty about whether they’ll ever output some \(\phi\). Is this a fundamental obstacle to having a prior that assigns positive probability to some infinite list of sentences?
In my opinion, yes. Also, I don’t think the lack of this property is fatal, and if we look at what “positive probability to an infinite list of sentences” actually buys us, we can get something just about as good.
The “positive probability to an infinite list of sentences” property seems to cash out in practice as “if we verify a bunch of sentences spat out by a particular turing machine, we should boost our probability assigned to the rest of them”. It is what allows inductively learning that something in the environment has superior computing powers, and can be trusted.
Imagine the concrete case where you see a sentence get spat out by a trusted turing machine. You go from uncertainty in it to nearcertainty that it’s true. You’ve just updated on the logical fact \(t(n)=\phi\) and it boosted your probability in \(\phi\). But you don’t know in advance what the turing machine will output on larger inputs of n. So it sure seems like it’s possible to believe that a turing machine outputs true things, but the prior assigns 0 probability to the whole infinite list of sentences because you don’t know in advance what it will output, so you’d have to update on infinitely many things in order to assign positive probability to the whole conjunction, ie, the sequence of observations of the form “turing machine outputted \(\phi\)”.
You don’t actually need positive probability to infinite conjunctions to have abstract trust in the output of things with long runtime! Or at least it sure seems that way from our thought experiment.
Ok, but what sort of thing could the constraint “trust in turing machine t” correspond to? Well, maybe, in addition to a propositional consistency checker, we also have another checker that throws out all strings of theorems that contain both \(t(n)=\phi\) and \(\neg\phi\). This wouldn’t take much runtime at all to implement.
Wait, put that way, propositional consistency and trust in a turing machine both seem like the same sort of thing! The original Eisenstat prior was just the one where propositional consistency was the only constraint.
Also, if we’re using the “turing machine outputs a list of sentences” thing, we may be waiting arbitrarily long, but if we instead consider turing machines that take a string of sentences and accept or reject them, and operate within a known time (both constraints we came up with are of this form), we’re back to being able to generate strings of sentences and have a known time by which we are nearly guaranteed to hit \(\phi\) or \(\neg\phi\)! So \(\varepsilon\)approximation is preserved if all your (finite) set of constraints have a known runtime bound.
We want to learn all constraints. So, the obvious next step is having a pool of constraints/turing machines that is added to by random sampling, and turing machines are kicked out of it for running over time, rejecting the past history of the string of sentences (this condition suggested by Daniel Demski), or rejecting both possible extensions of a string (this is to nearly guarantee hitting \(\phi\) or \(\neg\phi\) after known time).
And, by doing Bayesian updates on sentences that are more likely to be present when certain constraints are present, it’s possible to inductively learn which constraints apply to the true environment!
There is a bit of weirdness here, because in full generality, these turing machines can care about order, unlike the P.C. and turingmachinetrust constraints. Also, since the list of constraints gets added to over time, this prior expects the future to be more ordered/obey more constraints. Also, since a “pseudoconstraint” that only times out or rejects both extensions in very rare circumstances can hang around for a long time before being blacklisted, this prior has parts that have a certain constraint apply for a long time, and then it lifts.
Still to do: What happens when constraints disagree? (Daniel Demski pointed out that my initial stab at this was wrong, and I generated this next stretch as a fix)
Well, the intended use case of this is to update on a list of theorems. Image that you know that either \(\phi\) or \(\neg\phi\) is the next theorem coming in, and the constraints disagree. You can split them into three teams. Team A says that \(x\phi\) is the proper extension, Team B says that \(x\neg\phi\) is the proper extension, and Team C accepts both as valid, and we can just ignore them.
Although it steps outside the intended use case, if you imagine generating constraints way faster than the theorems come to you, it’s generically expected that your constraints will disagree on the upcoming thing because you used enough search power to find some constraints that got the entire past history right, but disagree on the upcoming sentence. If you decide between the teams by flipping a coin, then you’d get half of all problems wrong, forever. So, deciding by the probability mass of the turing machines was used instead.
If we imagine updating on the true list of theorems, the true constraints that reality follows will stick around forever. Call that team A. And define team \(B_{n}\) as the turing machines that disagree with team A on turn n, and team \(B\) as the union of all of them. Each turn, everyone in team \(B_{n}\) dies. So the total probability mass that team \(B\) can muster limits to 0 as time goes on and more and more theorems get seen, while team A’s probability mass remains constant. Thus, in the limit, the constraints on team A will be trusted and used to generate extensions that follow the “rules of reality” for increasingly long times.
HANDWAVING:
It should be possible to \(\varepsilon\)approximate the probability of a sentence in knowable time.
It should be possible to implement Bayesian updates on either sentences or constraints by the same path mentioned in the original post.
In the limit of updating on theorems, all “constraints” on sets of sentences that are computable in [insert time bound here] time that apply to reality should be learned by this prior.
I’m pretty dang sure you don’t need the full strength of propositional consistency to get a coherent distribution in the limit of updating on evidence. Every propositionally inconsistent world has a disproof of the form \(a\wedge b\wedge c…\to\neg\phi\) when \(\phi\) is true, so a sufficient (though maybe not minimal) constraint that imposes a coherent distribution would be “the string cannot contain \(a\wedge b\wedge c…\to\neg\phi\), and \(a\), and \(b\)… and \(\phi\) and if it does, reject it” because that should take out every propositionally inconsistent world in the limit.
Note that the above permits logical uncertainty over very complicated Booleans that would take a long time to evaluate, like hard 3SAT instances. Full propositional consistency renders logical uncertainty over those sorts of things impossible, while it very plainly exists in reality.
NEXT STEPS:
Find a hole in this thing if one is present.
How can this sort of framework be made to interact sensibly with sequence prediction? Does it already interact sensibly with sequence prediction, or do important modifications need to be made?
Actually, in general, sequence prediction and “what is true?” prediction seem quite hard to marry in this framework, because a nearguarantee that you’ll hit either \(\phi\) or \(\neg\phi\) after a known amount of time doesn’t interact well with “I believe that neither \(\phi\) or \(\neg\phi\) will show up in my observed list of theorems, unless an astronomical amount of time has gone by”. Some sort of resolution must be found.
What should it be named if it does turn out to be important? I was considering the “EisenstatAppel Prior”, or EA Prior for short, but apparently it’s bad taste to name things after yourself and someone probably has a better idea.
