  2.  Logical Inductors Converge to Correlated Equilibria (Kinda)   post by Alex Appel 197 days ago  Sam Eisenstat and Jessica Taylor like this  1 comment  
 Logical inductors of “similar strength”, playing against each other in a repeated game, will converge to correlated equilibria of the one-shot game, for the same reason that players that react to the past plays of their opponent converge to correlated equilibria. In fact, this proof is essentially just the proof from Calibrated Learning and Correlated Equilibrium by Foster and Vohra (1997), adapted to a logical inductor setting.
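For readers who want the target concept pinned down, here is a minimal sketch of what it means for a joint distribution over action profiles to be a correlated equilibrium of a two-player one-shot game: no player can gain in expectation by deviating from a recommended action. The function name, the payoff encoding, and the Chicken payoffs are illustrative assumptions, not taken from the post, and the check says nothing about logical inductors themselves.

```python
from itertools import product

def is_correlated_equilibrium(payoffs, dist, tol=1e-9):
    """Check the correlated-equilibrium inequalities for a two-player game.

    payoffs[(a1, a2)] = (u1, u2) gives both players' payoffs (hypothetical encoding).
    dist[(a1, a2)] is the probability that the profile (a1, a2) is recommended.
    """
    # Collect each player's action set from the payoff table.
    actions = [sorted({profile[i] for profile in payoffs}) for i in range(2)]
    for player in range(2):
        for recommended, deviation in product(actions[player], repeat=2):
            # Expected gain from playing `deviation` whenever told `recommended`.
            gain = 0.0
            for profile, prob in dist.items():
                if profile[player] != recommended:
                    continue
                deviated = list(profile)
                deviated[player] = deviation
                gain += prob * (payoffs[tuple(deviated)][player] - payoffs[profile][player])
            if gain > tol:
                return False
    return True

# Illustrative game of Chicken: D = dare, C = chicken out (assumed payoffs).
chicken = {
    ("D", "D"): (0, 0),
    ("D", "C"): (7, 2),
    ("C", "D"): (2, 7),
    ("C", "C"): (6, 6),
}

# Equal weight on every profile except mutual daring.
third = {("C", "C"): 1 / 3, ("D", "C"): 1 / 3, ("C", "D"): 1 / 3}
# Uniform over all four profiles.
uniform = {profile: 0.25 for profile in chicken}

print(is_correlated_equilibrium(chicken, third))
print(is_correlated_equilibrium(chicken, uniform))
```

With these payoffs the first distribution passes the check, while the uniform one fails because a player told to dare would rather back down.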
 
    5.  An Untrollable Mathematician   post by Abram Demski 320 days ago  Alex Appel, Sam Eisenstat, Vadim Kosoy, Jack Gallagher, Jessica Taylor, Paul Christiano, Scott Garrabrant and Vladimir Slepnev like this  1 comment  
 Followup to All Mathematicians are Trollable.
It is relatively easy to see that no computable Bayesian prior on logic can converge to a single coherent probability distribution as we update it on logical statements. Furthermore, the nonconvergence behavior is about as bad as could be: someone selecting the ordering of provable statements to update on can drive the Bayesian’s beliefs arbitrarily up or down, arbitrarily many times, despite only saying true things. I called this wild nonconvergence behavior “trollability”. Previously, I showed that if the Bayesian updates on the provability of a sentence rather than updating on the sentence itself, it is still trollable. I left open the question of whether some other side information could save us. Sam Eisenstat has closed this question, providing a simple logical prior and a way of doing a Bayesian update on it which (1) cannot be trolled, and (2) converges to a coherent distribution.
 
   7.  Smoking Lesion Steelman   post by Abram Demski 525 days ago  Tom Everitt, Sam Eisenstat, Vadim Kosoy, Paul Christiano and Scott Garrabrant like this  10 comments  
 It seems plausible to me that any example I’ve seen so far which seems to require causal/counterfactual reasoning is more properly solved by taking the right updateless perspective, and taking the action or policy which achieves maximum expected utility from that perspective. If this were the right view, then the aim would be to construct something like updateless EDT.
I give a variant of the smoking lesion problem which overcomes an objection to the classic smoking lesion, and which is solved correctly by CDT, but which is not solved by updateless EDT.
 
        14.  The Ubiquitous Converse Lawvere Problem   post by Scott Garrabrant 608 days ago  Marcello Herreshoff, Sam Eisenstat, Jessica Taylor and Patrick LaVictoire like this  discuss  
 In this post, I give a stronger version of the open question presented here, and give a motivation for this stronger property. This came out of conversations with Marcello, Sam, and Tsvi.
Definition: A continuous function \(f:X\rightarrow Y\) is called ubiquitous if for every continuous function \(g:X\rightarrow Y\), there exists a point \(x\in X\) such that \(f(x)=g(x)\).
Open Problem: Does there exist a topological space \(X\) with a ubiquitous function \(f:X\rightarrow[0,1]^X\)?
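One immediate consequence of the definition, added here as a sanity check rather than taken from the post: every ubiquitous function is surjective. Fix any \(y\in Y\) and let \(g_y:X\rightarrow Y\) be the constant function \(g_y(x)=y\), which is continuous. Ubiquity gives a point \(x\in X\) with \(f(x)=g_y(x)=y\), so \(y\) lies in the image of \(f\). Ubiquity is therefore at least as strong as surjectivity, since \(f\) must also agree somewhere with every non-constant continuous \(g\).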
 
        21.  A limit-computable, self-reflective distribution   post by Tsvi Benson-Tilsen 1120 days ago  Sam Eisenstat, Vadim Kosoy, Abram Demski, Jessica Taylor, Nate Soares, Patrick LaVictoire, Paul Christiano and Scott Garrabrant like this  1 comment  

We present a \(\Delta_2\)-definable probability distribution \({\Psi}\) that satisfies Christiano’s reflection schema for its own defining formula. The strategy is analogous to the chicken step employed by modal decision theory to obfuscate itself from the eyes of \({\mathsf{PA}}\); we will prevent the base theory \({T}\) from knowing much about \({\Psi}\), so that \({\Psi}\) can be coherent over \({T}\) and also consistently believe in reflection statements. So, the method used here is technical and not fundamental, but it does at least show that limit-computable and reflective distributions exist. These results are due to Sam Eisenstat and me, and this post benefited greatly from extensive notes from Sam; any remaining errors are probably mine.
Prerequisites: we assume familiarity with Christiano’s original result and the methods used there. In particular, we will freely use Kakutani’s fixed point theorem. See Christiano et al.’s paper.
 
     
