Intelligent Agent Foundations Forum
by Scott Garrabrant 868 days ago | Ryan Carey, Abram Demski and Jessica Taylor like this

I am not optimistic about this project. My primary reason is that decision theory has two parts. First, there is the part that is related to this post, which I’ll call “Expected Utility Theory.” Then, there is the much harder part, which I’ll call “Naturalized Decision Theory.”

I think expected utility theory is pretty well understood, and this post plays around with details of a well-understood theory, while naturalized decision theory is not well understood at all.
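
For concreteness, the well-understood part is roughly the content of the VNM representation theorem: if a preference relation $\succeq$ over lotteries satisfies the VNM axioms (completeness, transitivity, continuity, independence), then there is a utility function $u$ over outcomes, unique up to positive affine transformation, such that

$$L \succeq M \iff \sum_i p_i\,u(x_i) \ge \sum_j q_j\,u(y_j),$$

where lottery $L$ gives outcome $x_i$ with probability $p_i$ and $M$ gives $y_j$ with probability $q_j$. Coherent preferences over lotteries just are expected-utility maximization with respect to some $u$.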

I think we agree that the work in this post is not directly related to naturalized decision theory, but you think it is going to help anyway.

My understanding of your argument (correct me if I am wrong) is that probability theory is to logical uncertainty as expected utility theory is to naturalized decision theory: Dutch books led to progress on logical uncertainty, so VNM-ish arguments should lead to progress on naturalized decision theory.

I challenge this in two ways.

First, Logical Inductors look like Dutch books, but this might be because anything related to probability theory can be talked about with Dutch books. I don’t think that thinking about Dutch books led to the invention of Logical Inductors (although maybe it would have, if I had followed the right path), and I don’t think that the post hoc connection provides much evidence that thinking about Dutch books is useful. Perhaps whenever you have a theory, you can do this formal-justification stuff, but formal justification does not create theories.
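
For concreteness, the kind of Dutch book at issue is the classic one: credences that violate the probability axioms commit an agent to a set of bets that loses money in every possible world. A minimal sketch, with made-up numbers:

```python
# Minimal Dutch-book sketch (made-up numbers): an agent with
# additivity-violating credences P(A) = P(not-A) = 0.6 values a $1 bet
# on an event E at $P(E), so it buys both bets for $1.20 total, yet in
# every world exactly one of the two bets pays out.

cred = {"A": 0.6, "not-A": 0.6}   # incoherent: credences sum to 1.2
price_paid = sum(cred.values())   # agent pays its credence for each $1 bet

for a_holds in (True, False):
    payout = 1.0                  # exactly one of the two bets pays $1
    net = payout - price_paid
    print(f"A={a_holds}: paid {price_paid:.2f}, got {payout:.2f}, net {net:+.2f}")

# Net is -0.20 in both worlds: a guaranteed loss, i.e. a Dutch book.
```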

I realize that I actually do not stand behind this first challenge very much, but I still want to put it out there as a possibility.

Second, I think that in a way Logical Uncertainty is about resource-bounded Probability Theory, and this is why a weakening of Dutch books helped. On the other hand, Naturalized Decision Theory is not about resource-bounded Expected Utility Theory. We made a type of resource-bounded probability theory and magically got some naturalistic reasoning out of it. I expect that we cannot do the same thing for decision theory, because the relationship is more complicated.
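
For reference, the resource bound in Logical Induction enters through the logical induction criterion: a market $\overline{\mathbb{P}} = (\mathbb{P}_1, \mathbb{P}_2, \dots)$ of prices on logical sentences satisfies the criterion iff no efficiently computable trader exploits it, i.e. roughly, no such trader can make unboundedly much money while risking only boundedly much. That is a Dutch-book condition weakened from “no bookie profits off you” to “no polynomial-time bookie profits off you without bound.”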

Expected Utility Theory is about your preferences over various worlds. If you follow the analogy with LI strictly, then success means being able to extend it to preferences over various worlds which contain yourself. This seems very far from a solution to naturalized decision theory. In fact, it does not feel that far from what we might be able to do easily with existing Expected Utility Theory plus Logical Inductors.

Perhaps I am attacking a straw man, and you mean “do the same thing we did with logical induction” less literally than I am interpreting it; but in that case there is far more special sauce in the part where you generalize expected utility theory, so I expect it to be much harder than the Logical Induction case.



by Jessica Taylor 866 days ago | Abram Demski and Scott Garrabrant like this

> On the other hand, Naturalized Decision Theory is not about resource-bounded Expected Utility Theory.

I think there’s a sense in which I buy this, but it might be worth explaining more.

My current suspicion is that “agents that have utility functions over the outcome of the physics they are embedded in” is not the right concept for understanding naturalized agency (in particular, the “motive forces” of the things that emerge from processes like abiogenesis/evolution/culture/AI research and development). This concept is often argued for using Dutch-book arguments (e.g. VNM). I think these arguments are probably invalid when applied to naturalized agents (taken literally, they assume something like a “view from nowhere” that is unachievable from within the physics, unbounded computation, etc.). As such, re-examining what arguments can be made about coherent naturalized agency while avoiding inscription errors* seems like a good path towards recovering the correct concepts for thinking about naturalized agency.

*I’m getting the term “inscription error” from Brian Cantwell Smith (On the Origin of Objects, p. 50):

> It is a phenomenon that I will in general call an inscription error: a tendency for a theorist or observer, first, to write or project or impose or inscribe a set of ontological assumptions onto a computational system (onto the system itself, onto the task domain, onto the relation between the two, and so forth), and then, second, to read those assumptions or their consequences back off the system, as if that constituted an independent empirical discovery or theoretical result.


by Abram Demski 863 days ago | Scott Garrabrant likes this

> I think expected utility theory is pretty well understood, and this post plays around with details of a well-understood theory, while naturalized decision theory is not well understood at all.

I think most of our disagreement actually hinges on this part. My feeling is that I, at least, don’t understand EU well enough; when I look at the foundations which are supposed to argue decisively in its favor, they’re not quite as solid as I’d like.

If I were happy with the VNM assumption of probability theory (which I feel is circular, since the Dutch-book argument assumes EU), I think my position would be similar to this (linked by Alex), which strongly agrees with all of the axioms but continuity, and takes continuity as provisionally reasonable. Continuity would be something to maybe dig deeper into at some point, but not so likely to bear fruit that I’d want to investigate it right away.
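
For reference, the four VNM axioms on a preference relation $\succeq$ over lotteries $L, M, N$:

1. Completeness: $L \succeq M$ or $M \succeq L$.
2. Transitivity: if $L \succeq M$ and $M \succeq N$, then $L \succeq N$.
3. Continuity: if $L \succeq M \succeq N$, then there is some $p \in [0,1]$ with $pL + (1-p)N \sim M$.
4. Independence: $L \succeq M$ iff $pL + (1-p)N \succeq pM + (1-p)N$ for every lottery $N$ and $p \in (0,1]$.

Continuity is the one that rules out, for example, lexicographically ordered preferences, which is why it reads more like a structural convenience than a rationality requirement.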

However, what’s really interesting is a justification of EU and probability theory in one stroke. A justification of the whole thing from money-pump/Dutch-book style arguments alone seems close enough to be tantalizing, while also having enough hard-to-justify parts to make it a real possibility that such a justification would be of an importantly generalized DT.
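
For concreteness, here is a minimal sketch of the money-pump side of those arguments, with made-up preferences and fees: an agent whose strict preferences cycle will pay a small fee for each upgrade it prefers, so a trader can walk it around the cycle, returning it to its starting holding while draining its money.

```python
# Minimal money-pump sketch (made-up preferences): strict preferences
# that cycle (A over B, B over C, C over A) let a trader charge a small
# fee per preferred swap and cycle the agent back to where it started.

prefers = {("A", "B"), ("B", "C"), ("C", "A")}  # (x, y): x strictly preferred to y
fee = 0.01                                      # fee the agent pays per upgrade

holding, wealth = "C", 0.0
for offered in ("B", "A", "C"):                 # each offer beats the current holding
    assert (offered, holding) in prefers
    holding, wealth = offered, wealth - fee     # agent accepts and pays the fee

print(holding, round(wealth, 2))                # "C" again, but 0.03 poorer
```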

> First, […] I don’t think that thinking about Dutch books led to the invention of Logical Inductors (although maybe it would have, if I had followed the right path), and I don’t think that the post hoc connection provides much evidence that thinking about Dutch books is useful.

All I have to say here is that I find it somewhat plausible on an outside view; an insight extracted from a result need not have been the original generator of that result. I think max-margin classifiers in machine learning are like this; the learning theory that came from explaining why they work was then fruitful in producing other algorithms. (I could be wrong here.)

> Second, I think that in a way Logical Uncertainty is about resource-bounded Probability Theory, and this is why a weakening of Dutch books helped. On the other hand, Naturalized Decision Theory is not about resource-bounded Expected Utility Theory.

I don’t think naturalized DT is exactly what I’m hoping to get. The highest hope I have any concrete reason to expect is a logically uncertain DT which is temporally consistent (without a parameter for how long to run the LI).
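
(Spelling out the parameter worry: a logical inductor yields a sequence of belief states $\mathbb{P}_1, \mathbb{P}_2, \dots$, so a naive LI-based agent has to pick some stage $n$ and choose $\operatorname{argmax}_a \mathbb{E}_{\mathbb{P}_n}[U \mid a]$, and in general its choice varies with $n$. I want the theory’s recommendations not to depend on that choice of $n$.)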



