Intelligent Agent Foundations Forumsign up / log in

I misunderstood your proposal, but you don’t need to do this work to get what you want. You can just take each sentence \(\square_n \phi \to \phi\) as an axiom, but declare that this axiom takes \(n\) symbols to invoke. This could be done by changing the notion of length of a proof, or by taking axioms \(\psi_{\phi,n} \to (\square_n \phi \to \phi)\) and \(\psi_{\phi,n}\) with \(\psi_{\phi,n}\) very long.


As you say, this isn’t a proof, but it wouldn’t be too surprising if this were consistent. There is some \(k \in \mathbb{N}\) such that \(\square_n \phi \to \phi\) has a proof of length \(n^k\) by a result of Pavel Pudlák (On the length of proofs of finitistic consistency statements in first order theories). Here I’m making the dependence on \(n\) explicit, but not the dependence on \(\phi\). I haven’t looked at it closely, but the proof strategy in Theorems 5.4 and 5.5 suggests that \(k\) will not depend on \(\phi\), as long as we only ask for the weaker property that \(\square_n \phi \to \phi\) will only be provable in length \(n^k\) for sentences \(\phi\) of length at most \(k\).


by Alex Appel 369 days ago | link

I found an improved version by Pavel, that gives a way to construct a proof of \(\phi\) from \(\Box_{n}\phi\) that has a length of \(\mathcal{O}(n)\). The improved version is here.

There are restrictions to this result, though. One is that the C-rule must apply to the logic. This is just the ability to go from \(\exists x:\phi(x)\) to instantiating a \(c\) such that \(\phi(c)\). Pretty much all reasonable theorem provers have this.

The second restriction is that the theory must be finitely axiomatizable. No axiom schemas allowed. Again, this isn’t much of a restriction in practice, because NBG set theory, which proves the consistency of ZFC, is finitely axiomatizable.

The proof strategy is basically as follows. It’s shown that the shortest proof of a statement with quantifier depth n must have a length of \(\Omega(n^2)\), if the maximum quantifier depth in the proof is \(2n\) or greater.

This can be flipped around to conclude that if there’s a length-n proof of \(\phi\), the maximum quantifier depth in the proof can be at most \(\mathcal{O}(\sqrt{n})\).

The second part of the proof involves constructing a bounded-quantifier version of a truth predicate. By Tarski’s undefinability of truth, a full truth predicate cannot be constructed, but it’s possible to exhibit a formula for which it’s provable that \[qd(\overline{\psi})\le n\to(Sat_{n}(\overline{\psi},x)\leftrightarrow\Sigma(Sat_{n},\overline{\psi},x))\] (\(\Sigma\) is the formula laying out Tarski’s conditions for something to be a truth predicate). Also, if \(n\ge\) quantifier depth of \(\psi\), there’s a proof of \[Sat_{n}(\overline{\psi},x)\leftrightarrow\psi[x]\] (\(\psi[x]\) is the sentence \(\psi\) with its free variables substituted for the elements enumerated in the list \(x\)) Also, there’s a proof that \(Sat_{n}\) is preserved under inference rules and logical axioms, as long as everything stays below a quantifier depth of \(n\).

All these proofs can be done in \(\mathcal{O}(n^2)\) lines. One factor of \(n\) comes from the formula abbreviated as \(Sat_{n}(x,y)\) getting longer at a linear rate, and the other factor comes from having to prove \(Sat_{n}\) for each \(n\) seperately as an ingredient for the next proof.

Combining the two parts, the \(\mathcal{O}(\sqrt{n})\) bound on the quantifier depth and the \(\mathcal{O}(n^2)\) bound on how long it takes to prove stuff about the truth predicate, make it take \(\mathcal{O}(n)\) steps to prove all the relevant theorems about a sufficiently large bounded quantifier depth truth predicate, and then you can just go “the statement that we are claiming to have been proved must have \(Sat_{n}\) apply to it, and we’ve proved this is equivalent to the statement itself”

As a further bonus, a single \(\mathcal{O}(n)\)-length proof can establish the consistency of the theory itself for all \(n\)-length proofs.

It seems like a useful project to develop a program that will automatically write a proof of this form, to assess whether abstract unpacking of bounded proofs is usable in practice, but it will require digging into a bunch of finicky details of exactly how to encode a math theory inside itself.


Two minor comments. First, the bitstrings that you use do not all correspond to worlds, since, for example, \(\rm{Con}(\rm{PA}+\rm{Con}(\rm{PA}))\) implies \(\rm{Con}(\rm{PA})\), as \(\rm{PA}\) is a subtheory of \(\rm{PA} + \rm{Con}(\rm{PA})\). This can be fixed by instead using a tree of sentences that all diagonalize against themselves. Tsvi and I used a construction in this spirit in A limit-computable, self-reflective distribution, for example.

Second, I believe that weakening #2 in this post also cannot be satisfied by any constant distribution. To sketch my reasoning, a trader can try to buy a sequence of sentences \(\phi_1, \phi_1 \wedge \phi_2, \dots\), spending \(\$2^{-n}\) on the \(n\)th sentence \(\phi_1 \wedge \dots \wedge \phi_n\). It should choose the sequence of sentences so that \(\phi_1 \wedge \dots \wedge \phi_n\) has probability at most \(2^{-n}\), and then it will make an infinite amount of money if the sentences are simultaneously true.

The way to do this is to choose each \(\phi_n\) from a list of all sentences. If at any point you notice that \(\phi_1 \wedge \dots \wedge \phi_n\) has too high a probability, then pick a new sentence for \(\phi_n\). We can sell all the conjunctions \(\phi_1 \wedge \dots \wedge \phi_k\) for \(k \ge n\) and get back the original amount payed by hypothesis. Then, if we can keep using sharper continuous tests of the probabilities of the sentences \(\phi_1 \wedge \dots \wedge \phi_n\) over time, we will settle down to a sequence with the desired property.

In order to turn this sketch into a proof, we need to be more careful about how these things are to be made continuous, but it seems routine.


by Sam Eisenstat 507 days ago | Abram Demski likes this | link | parent | on: An Untrollable Mathematician

I at first didn’t understand your argument for claim (2), so I wrote an alternate proof that’s a bit more obvious/careful. I now see why it works, but I’ll give my version below for anyone interested. In any case, what you really mean is the probability of deciding a sentence outside of \(\Phi\) by having it announced by nature; there may be a high probability of sentences being decided indirectly via sentences in \(\Phi\).

Instead of choosing \(\Phi\) as you describe, pick \(\Phi\) so that the probability \(\mu(\Phi)\) of sampling something in \(\Phi\) is greater than \(1 - \mu(\psi) \cdot \varepsilon / 2\). Then, the probability of sampling something in \(\Phi - \{\psi\}\) is at least \(1 - \mu(\psi) \cdot (1 + \varepsilon / 2)\). Hence, no matter what sentences have been decided already, the probability that repeatedly sampling from \(\mu\) selects \(\psi\) before it selects any sentence outside of \(\Phi\) is at least

\begin{align*} \sum_{k = 0}^\infty (1 - \mu(\psi) \cdot (1 + \varepsilon / 2))^k \cdot \mu(\psi) & = \frac{\mu(\psi)}{\mu(\psi) \cdot (1 + \varepsilon / 2)} \\ & > 1 - \varepsilon / 2 \end{align*}

as desired.

Furthermore, this argument makes it clear that the probability distribution we converge to depends only on the set of sentences which the environment will eventually assert, not on their ordering!

Oh, I didn’t notice that aspect of things. That’s pretty cool.


A few thoughts:

I agree that the LI criterion is “pointwise” in the way that you describe, but I think that this is both pretty good and as much as could actually be asked. A single efficiently computable trader can do a lot. It can enforce coherence on a polynomially growing set of sentences, search for proofs using many different proof strategies, enforce a polynomially growing set of statistical patterns, enforce reflection properties on a polynomially large set of sentences, etc. So, eventually the market will not be exploitable on all these things simultaneously, which seems like a pretty good level of accurate beliefs to have.

On the other side of things, it would be far to strong to ask for a uniform bound of the form “for every \(\varepsilon > 0\), there is some day \(t\) such that after step \(t\), no trader can multiply its wealth by a factor more than \(1 + \varepsilon\)”. This is because a trader can be hardcoded with arbitrarily many hard-to-compute facts. For every \(\delta\), there must eventually be a day \(t' > t\) on which the belief of your logical inductor assign probability less than \(\delta\) to some true statement, at which point a trader who has that statement hardcoded can multiply its wealth by \(1/\delta\). (I can give a construction of such a sentence using self-reference if you want, but it’s also intuitively natural - just pick many mutually exclusive statements with nothing to break the symmetry.)

Thus, I wouldn’t think of traders as “mistakes”, as you do in the post. A trader can gain money on the market if the market doesn’t already know all facts that will be listed by the deductive process, but that is a very high bar. Doing well against finitely many traders is already “pretty good”.

What you can ask for regarding uniformity is for some simple function \(f\) such that any trader \(T\) can multiply its wealth by at most a factor \(f(T)\). This is basically the idea of the mistake bound model in learning theory; you bound how many mistakes happen rather than when they happen. This would let you say a more than the one-trader properties I mentioned in my first paragraph. In fact, \(\tt{LIA}\) has this propery; \(f(T)\) is just the initial wealth of the trader. You may therefore want to do something like setting traders’ initial wealths according to some measure of complexity. Admittedly this isn’t made explicit in the paper, but there’s not much additional that needs to be done to think in this way; it’s just the combination of the individual proofs in the paper with the explicit bounds you get from the initial wealths of the traders involved.

I basically agree completely on your last few points. The traders are a model class, not an ensemble method in any substantive way, and it is just confusing to connect them to the papers on ensemble methods that the LI paper references. Also, while I use the idea of logical induction to do research that I hope will be relevant to practical algorithms, it seems unlikely than any practical algorithm will look much like a LI. For one thing, finding fixed points is really hard without some property stronger than continuity, and you’d need a pretty good reason to put it in the inner loop of anything.


Universal Prediction of Selected Bits solves the related question of what happens if the odd bits are adversarial but the even bits copy the preceding odd bits. Basically, the universal semimeasure learns to do the right thing, but the exact sense in which the result is positive is subtle and has to do with the difference between measures and semimeasures. The methods may also be relevant to the questions here, though I don’t see a proof for either question yet.


Yeah, the 5 and 10 problem in the post actually can be addressed using provability ideas, in a way that fits in pretty natually with logical induction. The motivation here is to work with decision problems where you can’t prove statements \(A = a \to U = u\) for agent \(A\), utility function \(U\), action \(a\), and utility value \(u\), at least not with the amount of computing power provided, but you want to use inductive generalizations instead. That isn’t necessary in this example, so it’s more of an illustration.

To say a bit more, if you make logical inductors propositionally consistent, similarly to what is done in this post, and make them assign things that have been proven already probability 1, then they will work on the 5 and 10 problem in the post.

It would be interesting if there was more of an analogy to explore between the provability oracle setting and the inductive setting, and more ideas could be carried over from modal UDT, but it seems to me that this is a different kind of problem that will require new ideas.


It’s hard to analyze the dynamics of logical inductors too precisely, so we often have to do things that feel like worst-case analysis, like considering an adversarial trader with sufficient wealth. I think that problems that show up from this sort of analysis can be expected to correspond to real problems in superintelligent agents, but that is a difficult question. The malignancy of the universal prior is part of the reason.

As to your proposed solution, I don’t see how it would work. Scott is proposing that the trader makes conditional contracts, which are in effect voided if the event that they are conditional on doesn’t happen, so the trader doesn’t actually lose anything is this case. (It isn’t discussed in this post, but conditional contracts can be built out of the usual sort of bets, with payoffs given by the definition of conditional probability.) So, in order to make the trader lose money, the events need to happen sometimes, not just be expect to happen with some nonnegligable probability by the market.


In counterfactual mugging with a logical coin, AsDT always uses a logical inductor’s best-estimate of the utility it would get right now, so it sees the coin as already determined, and sees no benefit from giving Omega money in the cases where Omega asks for money.

The way I would think about what’s going on is that if the coin is already known at the time that the expectations are evaluated, then the problem isn’t convergent in the sense of AsDT. The agent that pays up whenever asked has a constant action, but it doesn’t receive a constant expected utility. You can think of the averaging as introducing artificial logical uncertainty to make more things convergent, which is why it’s more updateless. (My understanding is that this is pretty close to how you’re thinking of it already.)


by Abram Demski 734 days ago | Sam Eisenstat and Jack Gallagher like this | link

I think AsDT has a limited notion of convergent problem. It can only handle situations where the optimal strategy is to make the same move each time. Tail-dependence opens this up, partly by looking at the limit of average payoff rather than the limit of raw payoff. This allows us to deal with problems where the optimal strategy is complicated (and even somewhat dependent on what’s done in earlier instances in the sequence).

I wasn’t thinking of it as introducing artificial logical uncertainty, but I can see it that way.


by Sam Eisenstat 734 days ago | link

Yeah, I like tail dependence.

There’s this question of whether for logical uncertainty we should think of it more as trying to “un-update” from a more logically informed perspective rather than trying to use some logical prior that exists at the beginning of time. Maybe you’ve heard such ideas from Scott? I’m not sure if that’s the right perspective, but it’s what I’m alluding to when I say you’re introducing artificial logical uncertainty.


by Abram Demski 733 days ago | link

I don’t think it’s much like un-updating. Un-updating takes a specific fact we’d like to pretend we don’t know. Plus, the idea there is to back up the inductor. Here I’m looking at average performance as estimated by the latest stage of the inductor. The artificial uncertainty is more like pretending you don’t know which problem in the sequence you’re at.


This isn’t too related to your main point, but every ordered field can be embedded into a field of Hahn series, which might be simpler to work with than surreals.

That page discusses the basics of Hahn series, but not the embedding theorem. (Ehrlich, 1995) treats things in detail, but is long and introduces a lot of definitions. The embedding theorem is stated on page 23 (24 in the pdf).







[Note: This comment is three
by Ryan Carey on A brief note on factoring out certain variables | 0 likes

There should be a chat icon
by Alex Mennen on Meta: IAFF vs LessWrong | 0 likes

Apparently "You must be
by Jessica Taylor on Meta: IAFF vs LessWrong | 1 like

There is a replacement for
by Alex Mennen on Meta: IAFF vs LessWrong | 1 like

Regarding the physical
by Vanessa Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

I think that we should expect
by Vanessa Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

I think I understand your
by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

This seems like a hack. The
by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

After thinking some more,
by Vanessa Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Yes, I think that we're
by Vanessa Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

My intuition is that it must
by Vanessa Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

To first approximation, a
by Vanessa Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Actually, I *am* including
by Vanessa Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Yeah, when I went back and
by Alex Appel on Optimal and Causal Counterfactual Worlds | 0 likes

> Well, we could give up on
by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes


Privacy & Terms