Intelligent Agent Foundations Forumsign up / log in
No Constant Distribution Can be a Logical Inductor
discussion post by Alex Appel 16 days ago | Abram Demski, Jessica Taylor and Stuart Armstrong like this | 1 comment

The title says it all.

Consider some valuation over sentences, \(\mathbb{P}:\mathcal{S}\to [0,1]\).

Now consider some set of unproved sentences that make a binary tree, where it is propositionally provable that exactly one of the sentences at the m’th level in the tree are true. The most conceptually clear example is sentences of the form “the first m bits of an infinite bitstring \(\sigma\) are [bitstring]”, as in universal inductors.

However, every deductive process for theorems of a consistent theory has something like this, by the incompleteness theorem. In particular, for the instance where the deductive process outputs all theorems of \(\mathcal{PA}\), you can translate bitstrings into unprovable sentences that fulfill this property. 010 would translate to \[\neg\text{con}(\mathcal{PA})\wedge \text{con}(\mathcal{PA}+\neg\text{con}(\mathcal{PA}))\wedge \neg\text{con}(\mathcal{PA}+\neg\text{con}(\mathcal{PA})+\text{con}(\mathcal{PA}+\neg\text{con}(\mathcal{PA})))\]

Now, consider the trader that, on day n, sells back all the shares it has purchased, and buys 1 dollar worth of shares in the n’th sentence (or just a very large number of shares). This trader has a worst-case value of -1.

There cannot be any arbitrageable propositional inconsistencies in the constant distribution, otherwise, a trader could exploit this for unbounded money.

Because all the sentences in the m’th “stage” of the binary tree are provably mutually exclusive and exhaustive, their prices must add up to 1. Because there are \(2^{m}\) sentences in the m’th stage, at least one must have a price less than \(2^{-m}\). When the trader gets around to buying that one, it will have a best-case value of \(2^{m}\) or more.

As the trader goes through all sentences, its best-case value will be unbounded, as it buys up larger and larger piles of sentences with lower and lower prices. This behavior is forbidden by the logical induction criterion. And it isn’t necessarily in a different world each time. By propositional consistency, there’s some infinite bitstring where the price of the prefix drops by a factor of 2 or more on each time step (always pick the lowest-price continuation of the bitstring prefix), so if we consider that as the world of interest, the supremum of the value of the trader is unbounded above.

This doesn’t seem like much, but it gets extremely weird when you consider that the limit of a logical inductor, \(\mathbb{P}_{\infty}\), is a constant distribution, and by this result, isn’t a logical inductor! If you skip to the end and use the final, perfected probabilities of the limit, there’s a trader that could rack up unboundedly high value!

This appears to be indicating that the logical induction criterion is too strong. It seems to be a restriction on the dance of the traders as they approach the limit, instead of a constraint on the limiting behavior. As Sam pointed out, there seem to be two key criteria to a logical inductor, and looking at the logical inductor paper, the proofs also split into these two categories. The first is a “no arbitrage” condition, where it is impossible to gain value according to all possible worlds, and the second is an Occamian condition, where the logical inductor can’t severely underprice anything, because otherwise some trader could get unbounded value in some possible world by buying shares in that sentence for cheap.

This particular example seems to point to a subtle flaw in the Occam condition. Defending against unbounded value in any possible world is too restrictive to permit a logical inductor limit to be a logical inductor. Two ideas we came up with for weakenings of the concept of “exploitation” were (this isn’t exhaustive, there’s probably more options)

1: The set of worlds where a trader has unbounded value must be positive-measure, according to some appropriate measure on worlds. If you go all-in on some measure 0 world, where the price of various sentences limit to 0, it isn’t reasonable to not get exploited in that case. In particular, note that the world where the trader has unbounded value has no finite description! If that bitstring was computably enumerable, all the prefixes would have a price bounded below by some \(\epsilon\), by Uniform Non-Dogmatism (Theorem 4.6.3 in the LI paper). Maybe it isn’t reasonable to demand not getting exploited on worlds like that, that are impossible to name.

2: The value of the trader must actually limit to infinity, not just have a supremum that limits to infinity. Note that for the trader in this example, if we look at the value according to the world where it exploits the market, it’s at -1 most of the time, and occasionally spikes up to a large value.

by Sam Eisenstat 9 days ago | Jessica Taylor likes this | link

Two minor comments. First, the bitstrings that you use do not all correspond to worlds, since, for example, \(\rm{Con}(\rm{PA}+\rm{Con}(\rm{PA}))\) implies \(\rm{Con}(\rm{PA})\), as \(\rm{PA}\) is a subtheory of \(\rm{PA} + \rm{Con}(\rm{PA})\). This can be fixed by instead using a tree of sentences that all diagonalize against themselves. Tsvi and I used a construction in this spirit in A limit-computable, self-reflective distribution, for example.

Second, I believe that weakening #2 in this post also cannot be satisfied by any constant distribution. To sketch my reasoning, a trader can try to buy a sequence of sentences \(\phi_1, \phi_1 \wedge \phi_2, \dots\), spending \(\$2^{-n}\) on the \(n\)th sentence \(\phi_1 \wedge \dots \wedge \phi_n\). It should choose the sequence of sentences so that \(\phi_1 \wedge \dots \wedge \phi_n\) has probability at most \(2^{-n}\), and then it will make an infinite amount of money if the sentences are simultaneously true.

The way to do this is to choose each \(\phi_n\) from a list of all sentences. If at any point you notice that \(\phi_1 \wedge \dots \wedge \phi_n\) has too high a probability, then pick a new sentence for \(\phi_n\). We can sell all the conjunctions \(\phi_1 \wedge \dots \wedge \phi_k\) for \(k \ge n\) and get back the original amount payed by hypothesis. Then, if we can keep using sharper continuous tests of the probabilities of the sentences \(\phi_1 \wedge \dots \wedge \phi_n\) over time, we will settle down to a sequence with the desired property.

In order to turn this sketch into a proof, we need to be more careful about how these things are to be made continuous, but it seems routine.






I think that in that case,
by Alex Appel on Smoking Lesion Steelman | 1 like

Two minor comments. First,
by Sam Eisenstat on No Constant Distribution Can be a Logical Inductor | 1 like

A: While that is a really
by Alex Appel on Musings on Exploration | 0 likes

> The true reason to do
by Jessica Taylor on Musings on Exploration | 0 likes

A few comments. Traps are
by Vadim Kosoy on Musings on Exploration | 1 like

I'm not convinced exploration
by Abram Demski on Musings on Exploration | 0 likes

Update: This isn't really an
by Alex Appel on A Difficulty With Density-Zero Exploration | 0 likes

If you drop the
by Alex Appel on Distributed Cooperation | 1 like

Cool! I'm happy to see this
by Abram Demski on Distributed Cooperation | 0 likes

Caveat: The version of EDT
by 258 on In memoryless Cartesian environments, every UDT po... | 2 likes

[Delegative Reinforcement
by Vadim Kosoy on Stable Pointers to Value II: Environmental Goals | 1 like

Intermediate update: The
by Alex Appel on Further Progress on a Bayesian Version of Logical ... | 0 likes

Since Briggs [1] shows that
by 258 on In memoryless Cartesian environments, every UDT po... | 2 likes

This doesn't quite work. The
by Nisan Stiennon on Logical counterfactuals and differential privacy | 0 likes

I at first didn't understand
by Sam Eisenstat on An Untrollable Mathematician | 1 like


Privacy & Terms