Intelligent Agent Foundations Forum
1. Logical Inductor Tiling and Why it's Hard
post by Alex Appel 186 days ago | Sam Eisenstat and Abram Demski like this | discuss

(Tiling result due to Sam, exposition of obstacles due to me)

2. Logical Inductors Converge to Correlated Equilibria (Kinda)
post by Alex Appel 197 days ago | Sam Eisenstat and Jessica Taylor like this | 1 comment

Logical inductors of “similar strength”, playing against each other in a repeated game, will converge to correlated equilibria of the one-shot game, for the same reason that players who react to the past plays of their opponent converge to correlated equilibria. In fact, this proof is essentially just the proof from Calibrated Learning and Correlated Equilibrium by Foster and Vohra (1997), adapted to a logical inductor setting.
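
As a concrete point of comparison, here is a minimal sketch (my illustration, not the post's construction) of the classical dynamic the post alludes to: Hart and Mas-Colell's regret matching, in which each player reacts only to the past plays of their opponent, and whose empirical joint play converges to the set of correlated equilibria. The game (Chicken) and its payoffs are illustrative choices.

    import numpy as np

    # Illustrative game: Chicken (action 0 = swerve, 1 = dare).
    A = np.array([[6.0, 2.0], [7.0, 0.0]])  # row player's payoffs
    B = A.T                                  # column player's payoffs (symmetric game)

    rng = np.random.default_rng(0)
    T, MU = 100_000, 20.0                    # MU: inertia constant, larger than any payoff swing
    regret = [np.zeros((2, 2)), np.zeros((2, 2))]  # cumulative swap regret, indexed [from, to]
    last = [0, 0]                            # each player's most recent action
    joint = np.zeros((2, 2))                 # empirical joint distribution of play

    for t in range(1, T + 1):
        for i in range(2):
            j = last[i]
            # switch actions with probability proportional to the average
            # positive regret for not having made that switch in the past
            if rng.random() < max(regret[i][j, 1 - j], 0.0) / (MU * t):
                last[i] = 1 - j
        a, b = last
        joint[a, b] += 1
        # how much better each player's foregone action would have done this round
        regret[0][a, 1 - a] += A[1 - a, b] - A[a, b]
        regret[1][b, 1 - b] += B[a, 1 - b] - B[a, b]

    print(joint / T)  # approaches the (convex) set of correlated equilibria

The post's claim is that logical inductors of similar strength, which in particular learn each other's past behavior, inherit an analogous guarantee.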

3. Resource-Limited Reflective Oracles
discussion post by Alex Appel 242 days ago | Sam Eisenstat, Abram Demski and Jessica Taylor like this | 1 comment
4. No Constant Distribution Can be a Logical Inductor
discussion post by Alex Appel 246 days ago | Sam Eisenstat, Vadim Kosoy, Abram Demski, Jessica Taylor and Stuart Armstrong like this | 1 comment
5. An Untrollable Mathematician
post by Abram Demski 320 days ago | Alex Appel, Sam Eisenstat, Vadim Kosoy, Jack Gallagher, Jessica Taylor, Paul Christiano, Scott Garrabrant and Vladimir Slepnev like this | 1 comment

Follow-up to All Mathematicians are Trollable.

It is relatively easy to see that no computable Bayesian prior on logic can converge to a single coherent probability distribution as we update it on logical statements. Furthermore, the non-convergence behavior is about as bad as could be: someone selecting the ordering of provable statements to update on can drive the Bayesian’s beliefs arbitrarily up or down, arbitrarily many times, despite only saying true things. I called this wild non-convergence behavior “trollability”. Previously, I showed that if the Bayesian updates on the provability of a sentence rather than updating on the sentence itself, it is still trollable. I left open the question of whether some other side information could save us. Sam Eisenstat has closed this question, providing a simple logical prior and a way of doing a Bayesian update on it which (1) cannot be trolled, and (2) converges to a coherent distribution.

6. Current thoughts on Paul Christiano's research agenda
post by Jessica Taylor 511 days ago | Ryan Carey, Owen Cotton-Barratt, Sam Eisenstat, Paul Christiano, Stuart Armstrong and Wei Dai like this | 15 comments

This post summarizes my thoughts on Paul Christiano’s agenda in general and ALBA in particular.

7. Smoking Lesion Steelman
post by Abram Demski 525 days ago | Tom Everitt, Sam Eisenstat, Vadim Kosoy, Paul Christiano and Scott Garrabrant like this | 10 comments

It seems plausible to me that any example I’ve seen so far which seems to require causal/counterfactual reasoning is more properly solved by taking the right updateless perspective, and taking the action or policy which achieves maximum expected utility from that perspective. If this were the right view, then the aim would be to construct something like updateless EDT.

I give a variant of the smoking lesion problem which overcomes an objection to the classic smoking lesion, and which is solved correctly by CDT, but which is not solved by updateless EDT.
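
For readers who want the baseline numbers, here is a minimal sketch of the classic problem (toy parameters of my choosing, not Abram's variant), showing where EDT's conditional expectations and CDT's interventional expectations come apart:

    # Classic smoking lesion: a lesion both raises the chance of smoking and
    # causes cancer; smoking itself is harmless and worth +1 utility.
    P_LESION = 0.5
    P_SMOKE = {True: 0.9, False: 0.1}    # P(smoke | lesion), P(smoke | no lesion)

    def utility(smoke, lesion):
        return (1.0 if smoke else 0.0) + (0.0 if lesion else 10.0)

    def p_lesion_given(smoke):
        # Bayes: smoking is evidence about the lesion
        like = {l: P_SMOKE[l] if smoke else 1 - P_SMOKE[l] for l in (True, False)}
        num = P_LESION * like[True]
        return num / (num + (1 - P_LESION) * like[False])

    def edt_value(smoke):
        # EDT conditions on the action
        p = p_lesion_given(smoke)
        return p * utility(smoke, True) + (1 - p) * utility(smoke, False)

    def cdt_value(smoke):
        # CDT intervenes on the action; the lesion keeps its prior probability
        return P_LESION * utility(smoke, True) + (1 - P_LESION) * utility(smoke, False)

    for smoke in (True, False):
        print(f"smoke={smoke}: EDT={edt_value(smoke):.2f}  CDT={cdt_value(smoke):.2f}")
    # EDT: 2.00 vs 9.00, so it refuses to smoke; CDT: 6.00 vs 5.00, so it smokes.

The interest of Abram's variant is precisely that the CDT verdict survives the usual objection to this setup while updateless EDT still gets it wrong.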

8. Some Criticisms of the Logical Induction paper
link by Tarn Somervell Fletcher 528 days ago | Alex Mennen, Sam Eisenstat and Scott Garrabrant like this | 10 comments
9. An Approach to Logically Updateless Decisions
discussion post by Abram Demski 567 days ago | Sam Eisenstat, Jack Gallagher and Scott Garrabrant like this | 4 comments
10. Why I am not currently working on the AAMLS agenda
post by Jessica Taylor 575 days ago | Ryan Carey, Marcello Herreshoff, Sam Eisenstat, Abram Demski, Daniel Dewey, Scott Garrabrant and Stuart Armstrong like this | 2 comments

(note: this is not an official MIRI statement, this is a personal statement. I am not speaking for others who have been involved with the agenda.)

The AAMLS (Alignment for Advanced Machine Learning Systems) agenda is a project at MIRI that is about determining how to use hypothetical highly advanced machine learning systems safely. I was previously working on problems in this agenda and am currently not.

11. A correlated analogue of reflective oracles
post by Jessica Taylor 586 days ago | Sam Eisenstat, Vadim Kosoy, Abram Demski and Scott Garrabrant like this | discuss

Summary: Reflective oracles correspond to Nash equilibria. A correlated version of reflective oracles exists and corresponds to correlated equilibria. The set of these objects is convex, which is useful.
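
To see why convexity is worth having, here is a sketch at the level of ordinary games (my illustration; the post's objects live one level up, at the oracle level). Correlated equilibria are cut out by finitely many linear incentive constraints, so they form a convex set, and mixing two equilibria always yields another:

    import numpy as np

    A = np.array([[6.0, 2.0], [7.0, 0.0]])  # Chicken, row payoffs (0 = swerve, 1 = dare)
    B = A.T                                  # column payoffs (symmetric game)

    def is_correlated_eq(mu, tol=1e-9):
        """mu[a, b] = probability a mediator recommends the action pair (a, b)."""
        for a in range(2):                   # row player: obeying recommendation a
            for a2 in range(2):              # must be at least as good as deviating to a2
                if (mu[a] * (A[a] - A[a2])).sum() < -tol:
                    return False
        for b in range(2):                   # column player, symmetrically
            for b2 in range(2):
                if (mu[:, b] * (B[:, b] - B[:, b2])).sum() < -tol:
                    return False
        return True

    mu1 = np.outer([1.0, 0.0], [0.0, 1.0])   # pure Nash: row swerves, column dares
    mu2 = np.outer([0.0, 1.0], [1.0, 0.0])   # pure Nash: row dares, column swerves
    mix = 0.5 * mu1 + 0.5 * mu2              # convex combination of the two
    print(is_correlated_eq(mu1), is_correlated_eq(mu2), is_correlated_eq(mix))
    # True True True -- and `mix` is correlated but not Nash (it is not a product).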

12. Generalizing Foundations of Decision Theory II
post by Abram Demski 595 days ago | Sam Eisenstat, Vadim Kosoy, Jessica Taylor and Patrick LaVictoire like this | 4 comments

As promised in the previous post, I develop my formalism for justifying as many of the decision-theoretic axioms as possible with generalized dutch-book arguments. (I’ll use the term “generalized dutch-book” to refer to arguments with a family resemblance to dutch-book or money-pump.) The eventual goal is to relax these assumptions in a way which addresses bounded processing power, but for now the goal is to get as much of classical decision theory as possible justified by a generalized dutch-book.
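
As a reminder of the argument pattern (a toy of mine, not the formalism developed in the post): an agent whose strict preferences are cyclic, A over B over C over A, will pay a small fee for each "upgrade" and can be led around the cycle indefinitely.

    # Money pump against cyclic preferences: each accepted trade costs a fee,
    # and the bookie's cycle of offers returns the agent to where it started.
    PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}  # (x, y): x strictly preferred to y
    FEE = 0.01

    def accepts(current, offered):
        return (offered, current) in PREFERS        # an "upgrade" is worth the fee

    holding, paid = "A", 0.0
    for offered in ["C", "B", "A"] * 3:             # the bookie cycles its offers
        if accepts(holding, offered):
            holding, paid = offered, paid + FEE
    print(holding, round(paid, 2))                  # "A" 0.09: back where it started, poorer

A generalized dutch-book argument runs this pattern in reverse: an agent immune to such exploitation must (given auxiliary assumptions) satisfy the corresponding axiom, here transitivity.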

13. Two Major Obstacles for Logical Inductor Decision Theory
post by Scott Garrabrant 601 days ago | Alex Mennen, Sam Eisenstat, Abram Demski, Jessica Taylor, Patrick LaVictoire and Tsvi Benson-Tilsen like this | 3 comments

In this post, I describe two major obstacles for logical inductor decision theory: untaken actions are not observable, and there is no updatelessness for computations. I will concretely describe both of these problems in a logical inductor framework, but I believe that both issues are general enough to transcend that framework.

14. The Ubiquitous Converse Lawvere Problem
post by Scott Garrabrant 608 days ago | Marcello Herreshoff, Sam Eisenstat, Jessica Taylor and Patrick LaVictoire like this | discuss

In this post, I give a stronger version of the open question presented here, and give a motivation for this stronger property. This came out of conversations with Marcello, Sam, and Tsvi.

Definition: A continuous function \(f:X\rightarrow Y\) is called ubiquitous if for every continuous function \(g:X\rightarrow Y\), there exists a point \(x\in X\) such that \(f(x)=g(x)\).

Open Problem: Does there exist a topological space \(X\) with a ubiquitous function \(f:X\rightarrow[0,1]^X\)?
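
As a warm-up (my observation, not part of the post): ubiquitous functions do exist for simple targets. Take \(X=[0,1]\) and \(f=\mathrm{id}\). For any continuous \(g:[0,1]\rightarrow[0,1]\), the function \(h(x)=g(x)-x\) satisfies \(h(0)\geq 0\) and \(h(1)\leq 0\), so by the intermediate value theorem there is some \(x\) with \(g(x)=x=f(x)\). The difficulty in the open problem is that the target is the function space \([0,1]^X\) rather than \([0,1]\) itself.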

15. Formal Open Problem in Decision Theory
post by Scott Garrabrant 618 days ago | Marcello Herreshoff, Sam Eisenstat, Vadim Kosoy, Jessica Taylor, Patrick LaVictoire and Stuart Armstrong like this | 13 comments

In this post, I present a new formal open problem. A positive answer would be valuable for decision theory research. A negative answer would be helpful, mostly for figuring out how close we can get to a positive answer. I also give some motivation for the problem, and some partial progress.

Open Problem: Does there exist a topological space \(X\) (in some convenient category of topological spaces) such that there exists a continuous surjection from \(X\) to the space \([0,1]^X\) (of continuous functions from \(X\) to \([0,1]\))?
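
For orientation, here is the standard reason (a sketch of mine, not quoted from the post) that Cantor-style diagonalization does not immediately refute this. Suppose \(f:X\rightarrow[0,1]^X\) is a continuous surjection and \(h:[0,1]\rightarrow[0,1]\) is any continuous map. Then \(q(x)=h(f(x)(x))\) is continuous (assuming evaluation is continuous, as in a convenient category), so \(q=f(x_0)\) for some \(x_0\) by surjectivity, and evaluating at \(x_0\) gives \[f(x_0)(x_0)=h(f(x_0)(x_0)),\] a fixed point of \(h\). This is Lawvere's fixed point theorem, and since every continuous map \([0,1]\rightarrow[0,1]\) really does have a fixed point, no contradiction results.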

16. On motivations for MIRI's highly reliable agent design research
post by Jessica Taylor 684 days ago | Ryan Carey, Sam Eisenstat, Daniel Dewey, Nate Soares, Patrick LaVictoire, Paul Christiano, Tsvi Benson-Tilsen and Vladimir Nesov like this | 10 comments

(this post came out of a conversation between me and Owen Cotton-Barratt, plus a follow-up conversation with Nate)

17. The set of Logical Inductors is not Convex
post by Scott Garrabrant 803 days ago | Sam Eisenstat, Abram Demski and Patrick LaVictoire like this | 3 comments

Sam Eisenstat asked the following interesting question: Given two logical inductors over the same deductive process, is every (rational) convex combination of them also a logical inductor? Surprisingly, the answer is no! Here is my counterexample.

18. Variations of the Garrabrant-inductor
link by Sune Kristian Jakobsen 807 days ago | Sam Eisenstat, Abram Demski, Jessica Taylor, Nate Soares and Scott Garrabrant like this | 1 comment
19. Universal Inductors
post by Scott Garrabrant 816 days ago | Sam Eisenstat, Jack Gallagher, Benja Fallenstein, Jessica Taylor, Patrick LaVictoire and Tsvi Benson-Tilsen like this | discuss

Now that the Logical Induction paper is out, I am directing my attention towards decision theory. The approach I currently think will be most fruitful is attempting to make a logically updateless version of Wei Dai’s Updateless Decision Theory. Abram Demski has posted on here about this, but I think Logical Induction provides a new angle with which we can attack the problem. This post will present an alternate way of viewing Logical Induction which I think will be especially helpful for building a logical UDT. (The Logical Induction paper is a prerequisite for this post.)

20. Strict Dominance for the Modified Demski Prior
post by Abram Demski 1103 days ago | Sam Eisenstat, Jessica Taylor, Patrick LaVictoire and Scott Garrabrant like this | 1 comment

I previously noted that difficulties with the modified Demski prior were less than previously thought: I was failing to distinguish between the two different kinds of update. Here, I analyze more fully what it succeeds at.

21. A limit-computable, self-reflective distribution
post by Tsvi Benson-Tilsen 1120 days ago | Sam Eisenstat, Vadim Kosoy, Abram Demski, Jessica Taylor, Nate Soares, Patrick LaVictoire, Paul Christiano and Scott Garrabrant like this | 1 comment

We present a \(\Delta_2\)-definable probability distribution \({\Psi}\) that satisfies Christiano’s reflection schema for its own defining formula. The strategy is analogous to the chicken step employed by modal decision theory to obfuscate itself from the eyes of \({\mathsf{PA}}\); we will prevent the base theory \({T}\) from knowing much about \({\Psi}\), so that \({\Psi}\) can be coherent over \({T}\) and also consistently believe in reflection statements. So, the method used here is technical and not fundamental, but it does at least show that limit-computable and reflective distributions exist. These results are due to Sam Eisenstat and me, and this post benefited greatly from extensive notes from Sam; any remaining errors are probably mine.
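For reference, the reflection schema in question (as in Christiano et al.'s paper, modulo notation) asserts that for every sentence \(\varphi\) and all rationals \(a<b\): \[a<{\Psi}(\varphi)<b\implies{\Psi}\left(\ulcorner a<{\Psi}(\ulcorner\varphi\urcorner)<b\urcorner\right)=1.\]
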

Prerequisites: we assume familiarity with Christiano’s original result and the methods used there. In particular, we will freely use Kakutani’s fixed point theorem. See Christiano et al.’s paper.

22. Two Questions about Solomonoff Induction
post by Scott Garrabrant 1162 days ago | Sam Eisenstat, Vadim Kosoy and Jessica Taylor like this | 4 comments

In this post, I ask two questions about Solomonoff Induction. I am not sure if these questions are open or not. If you know the answer to either of them, please let me know. I think that the answers may be very relevant to stuff I am currently working on in Asymptotic Logical Uncertainty.

23. Proof Length and Logical Counterfactuals Revisited
post by Patrick LaVictoire 1191 days ago | Sam Eisenstat, Jessica Taylor and Scott Garrabrant like this | 5 comments

Update: This version of the Trolljecture fails too; see the counterexample due to Sam.

In An Informal Conjecture on Proof Length and Logical Counterfactuals, Scott discussed a “trolljecture” from a MIRI workshop, which attempted to justify (some) logical counterfactuals based on the lengths of proofs of various implications. Then Sam produced a counterexample, and Benja pointed to another counterexample.

But at the most recent MIRI workshop, I talked with Jonathan Lee and Holger Dell about a related way of evaluating logical counterfactuals, and we came away with a revived trolljecture!

24. Provability Counterfactuals vs Three Axioms of Galles and Pearl
link by Evan Lloyd 1197 days ago | Sam Eisenstat, Nate Soares, Patrick LaVictoire and Scott Garrabrant like this | discuss
25. Asymptotic Logical Uncertainty: Iterated Resource Bounded Solomonoff Induction
post by Scott Garrabrant 1203 days ago | Sam Eisenstat, Abram Demski and Jessica Taylor like this | discuss

This post is part of the Asymptotic Logical Uncertainty series. In this post, I present a possible fix to the strategy of using resource bounded Solomonoff induction for asymptotic logical uncertainty. This algorithm came out of conversations with Benja Fallenstein when I visited MIRI this week.
