1. An Untrollable Mathematician
 Followup to All Mathematicians are Trollable.
It is relatively easy to see that no computable Bayesian prior on logic can converge to a single coherent probability distribution as we update it on logical statements. Furthermore, the nonconvergence behavior is about as bad as could be: someone selecting the ordering of provable statements to update on can drive the Bayesian’s beliefs arbitrarily up or down, arbitrarily many times, despite only saying true things. I called this wild nonconvergence behavior “trollability”. Previously, I showed that if the Bayesian updates on the provability of a sentence rather than on the sentence itself, it is still trollable. I left open the question of whether some other side information could save us. Sam Eisenstat has closed this question, providing a simple logical prior and a way of doing a Bayesian update on it which (1) cannot be trolled, and (2) converges to a coherent distribution.
 
7. Smoking Lesion Steelman
 It seems plausible to me that any example I’ve seen so far which seems to require causal/counterfactual reasoning is more properly solved by taking the right updateless perspective, and taking the action or policy which achieves maximum expected utility from that perspective. If this were the right view, then the aim would be to construct something like updateless EDT.
I give a variant of the smoking lesion problem which overcomes an objection to the classic smoking lesion, and which is solved correctly by CDT, but which is not solved by updateless EDT.
 
20. Existence of distributions that are expectation-reflective and know it
We prove the existence of a probability distribution over a theory \({T}\) with the property that for certain definable quantities \({\varphi}\), the expectation of the value of a function \({E}[{\ulcorner {\varphi}\urcorner}]\) is accurate, i.e. it equals the actual expectation of \({\varphi}\); and with the property that it assigns probability 1 to \({E}\) behaving this way. This may be useful for self-verification, by allowing an agent to satisfy a reflective consistency property and at the same time believe itself or similar agents to satisfy the same property. Thanks to Sam Eisenstat for listening to an earlier version of this proof, and pointing out a significant gap in the argument. The proof presented here has not been vetted yet.

21. A limit-computable, self-reflective distribution
We present a \(\Delta_2\)-definable probability distribution \({\Psi}\) that satisfies Christiano’s reflection schema for its own defining formula. The strategy is analogous to the chicken step employed by modal decision theory to obfuscate itself from the eyes of \({\mathsf{PA}}\); we will prevent the base theory \({T}\) from knowing much about \({\Psi}\), so that \({\Psi}\) can be coherent over \({T}\) and also consistently believe in reflection statements. So, the method used here is technical and not fundamental, but it does at least show that limit-computable and reflective distributions exist. These results are due to Sam Eisenstat and me, and this post benefited greatly from extensive notes from Sam; any remaining errors are probably mine.
Prerequisites: we assume familiarity with Christiano’s original result and the methods used there. In particular, we will freely use Kakutani’s fixed point theorem. See Christiano et al.’s paper.
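For orientation, the reflection schema from Christiano et al.'s paper is usually stated roughly as follows, for every sentence \({\varphi}\) and all rationals \(a < b\). The symbol \(\mathbb{P}\) is notation assumed here for the distribution and its internal symbol; in this post the role of \(\mathbb{P}\) would be played by \({\Psi}\) and its defining formula. This is a paraphrase, not a quotation from the post.

\[
a < \mathbb{P}(\varphi) < b \;\implies\; \mathbb{P}\big(\ulcorner a < \mathbb{P}(\ulcorner \varphi \urcorner) < b \urcorner\big) = 1
\]

In words: whenever the distribution's actual probability for \({\varphi}\) lies strictly inside an interval, the distribution is certain of that fact about itself.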
 
24. Exploiting EDT
The problem with EDT is, as David Lewis put it, its “irrational policy of managing the news” (Lewis, 1981): it chooses actions not only because of their effects on the world, but also because of what the fact that it’s taking these actions tells it about events the agent can’t affect at all. The canonical example is the smoking lesion problem.
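As a rough illustration of "managing the news", here is a minimal sketch of the classic smoking lesion with made-up numbers. Every probability, utility, and function name below is a hypothetical choice for illustration and does not come from the post. EDT conditions on the action as evidence about the lesion, while CDT keeps the prior on the lesion fixed because the action cannot cause it.

```python
# Toy smoking lesion. All numbers are hypothetical, chosen only for illustration.
P_LESION = 0.5                                     # prior probability of the lesion
P_SMOKE_GIVEN_LESION = {True: 0.9, False: 0.1}     # how often agents with/without the lesion smoke
P_CANCER_GIVEN_LESION = {True: 0.8, False: 0.05}   # cancer depends on the lesion only
U_SMOKE = 10.0                                     # utility of smoking
U_CANCER = -100.0                                  # utility of cancer


def posterior_lesion(smoke: bool) -> float:
    """P(lesion | action), by Bayes' rule."""
    def likelihood(lesion: bool) -> float:
        p = P_SMOKE_GIVEN_LESION[lesion]
        return p if smoke else 1.0 - p

    numerator = likelihood(True) * P_LESION
    denominator = numerator + likelihood(False) * (1.0 - P_LESION)
    return numerator / denominator


def expected_utility(smoke: bool, p_lesion: float) -> float:
    """Expected utility of an action given a belief about the lesion."""
    p_cancer = (p_lesion * P_CANCER_GIVEN_LESION[True]
                + (1.0 - p_lesion) * P_CANCER_GIVEN_LESION[False])
    return (U_SMOKE if smoke else 0.0) + p_cancer * U_CANCER


def edt_value(smoke: bool) -> float:
    # EDT treats the action as evidence about the lesion.
    return expected_utility(smoke, posterior_lesion(smoke))


def cdt_value(smoke: bool) -> float:
    # CDT holds the lesion belief at its prior: smoking does not cause the lesion.
    return expected_utility(smoke, P_LESION)


def best_action(value) -> bool:
    """Return True (smoke) or False (refrain), whichever scores higher."""
    return max([True, False], key=value)


if __name__ == "__main__":
    print("EDT smokes:", best_action(edt_value))
    print("CDT smokes:", best_action(cdt_value))
```

With these particular numbers the EDT agent refrains from smoking, giving up utility to obtain better news about a lesion it cannot affect, while the CDT agent smokes.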
I’ve long been uncomfortable with the smoking lesion problem as the case against EDT, because an AI system would know its own utility function, and would therefore know whether or not it values “smoking” (presumably in the AI case it would be a different goal), and if it updates on this fact it would behave correctly in the smoking lesion. (This is an AI-centric version of the “tickle defense” of EDT.) Nate and I have come up with a variant I find much more convincing: a way to get EDT agents to pay you for managing the news for them, which works by the same mechanism that makes these agents one-box in Newcomb’s problem. (It’s a variation of the thought experiment in my LessWrong post on “the sin of updating when you can change whether you exist”.)
 
 

