Intelligent Agent Foundations Forum
by Daniel Dewey 1709 days ago | Ryan Carey and Nate Soares like this

Nice example! I think I understood better why this picks out the particular weakness of EDT (and why it’s not a general exploit that can be used against any DT) when I thought of it less as a money-pump and more as “Not only does EDT want to manage the news, you can get it to pay you a lot for the privilege”.
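(For concreteness, here is a minimal sketch of the kind of trade this describes, using the well-known XOR-blackmail toy problem rather than the example from the post itself; all the numbers and helper names below are made up for illustration. A predictor sends a letter iff (the agent's house has termites) XOR (the agent will pay a fee on receiving the letter). Conditioning on its own act, EDT concludes that paying is evidence against termites, so it pays; CDT, holding the termite probability fixed, refuses.)

```python
# Toy XOR-blackmail model -- a standard illustration of EDT "paying to
# manage the news". Numbers and helpers are hypothetical, not from the post.
# World: termites with prior 1%, costing $1,000,000. A reliable predictor
# sends a letter iff (termites XOR the agent pays $1000 on receipt).

TERMITE_COST = 1_000_000
FEE = 1_000

def utility(termites: bool, pays: bool) -> int:
    return -(TERMITE_COST if termites else 0) - (FEE if pays else 0)

def edt_value(pays: bool) -> float:
    """Expected utility conditional on (letter received, agent pays=pays).
    The letter rule plus the agent's own act pins down the termite state:
    letter = termites XOR pays, and the letter was in fact received."""
    termites = not pays
    return utility(termites, pays)

def cdt_value(pays: bool, p_termites: float) -> float:
    """Causal EU: the act cannot change whether termites are present."""
    return (p_termites * utility(True, pays)
            + (1 - p_termites) * utility(False, pays))

print("EDT, pay:   ", edt_value(True))        # -1000: paying is "good news"
print("EDT, refuse:", edt_value(False))       # -1000000
print("CDT, pay:   ", cdt_value(True, 0.01))  # -11000.0: worse than refusing
print("CDT, refuse:", cdt_value(False, 0.01)) # -10000.0
```

Paying causes nothing about the termites; it only changes what the agent's act is evidence for. That is the sense in which EDT can be made to pay for the news, not just manage it.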



by Abram Demski 1708 days ago | Nate Soares likes this

There is a nuance that needs to be mentioned here. If the EDT agent is aware of the researcher’s ploys ahead of time, it will set things up so that emails from the researcher go straight to the spam folder, block the researcher’s calls, and so on. It is not actually happy to pay the researcher for managing the news!

This is less pathological than listening to the researcher and paying up, but it's still an odd news-management strategy that EDT endorses.


by Benja Fallenstein 1708 days ago | Abram Demski and Nate Soares like this

True. This looks to me like an effect of EDT not being stable under self-modification, although here the agent handicaps itself through external means rather than by modifying itself. It's analogous to offering a CDT agent a potion that makes it unable to lift more than one box before it enters Newcomb's problem (i.e., before Omega makes its observation of the agent): the agent will cheerfully take the potion and pay you for the privilege.
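(A rough numerical sketch of the potion scenario, assuming the standard Newcomb payoffs, $1,000 in the transparent box and $1,000,000 in the opaque box iff Omega predicts one-boxing; the potion price and helper names are made up.)

```python
# Sketch of the potion example: Omega observes the agent *after* the
# potion decision, so its prediction tracks whether the agent can still
# two-box. Payoffs are the standard Newcomb numbers; the price is made up.

BOX_A = 1_000        # transparent box, always filled
BOX_B = 1_000_000    # opaque box, filled iff Omega predicts one-boxing

def newcomb_payout(one_boxes: bool, omega_predicts_one_box: bool) -> int:
    filled = BOX_B if omega_predicts_one_box else 0
    return filled if one_boxes else filled + BOX_A

def cdt_outcome(takes_potion: bool, potion_price: int) -> int:
    if takes_potion:
        # Physically unable to two-box; Omega foresees a one-boxer.
        return newcomb_payout(True, True) - potion_price
    # At decision time CDT two-boxes (dominance), and Omega foresees it.
    return newcomb_payout(False, False)

print(cdt_outcome(True, 100))  # 999900
print(cdt_outcome(False, 0))   # 1000
```

On these numbers the agent nets $999,900 with the potion versus $1,000 without, so plain causal reasoning endorses paying anything up to $999,000 for the handicap, since buying the potion (unlike the later box choice) does causally influence Omega's prediction.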


by Benja Fallenstein 1709 days ago

Thanks! I didn't really think about whether or not "money-pump" was the appropriate word (I'm not sure what the exact definition is); I've now changed "way to money-pump EDT agents" to "way to get EDT agents to pay you for managing the news for them".


by Daniel Dewey 1709 days ago

Hm, I don’t know what the definition is either. In my head, it means “can get an arbitrary amount of money from”, e.g. by taking it around a preference loop as many times as you like. In any case, glad the feedback was helpful.
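(Under that reading, a minimal sketch, with hypothetical item names and fee: an agent with the cyclic strict preference A ≺ B ≺ C ≺ A pays a small fee for each "upgrade", and each lap around the loop returns it to its starting holding while extracting three fees, as many times as you like.)

```python
# Money-pump via a preference loop: the agent strictly prefers the second
# item of each pair and will pay `fee` to trade up. Three trades bring it
# back to its starting item, down three fees; repeat arbitrarily.
# (Item names and the fee are made up for illustration.)

PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}

def pump(start: str, laps: int, fee: float) -> float:
    """Walk the agent around the loop `laps` times; return total extracted."""
    holding, extracted = start, 0.0
    for _ in range(3 * laps):
        for worse, better in PREFERS:
            if holding == worse:
                holding, extracted = better, extracted + fee
                break
    return extracted

print(pump("A", laps=10, fee=1.0))  # 30.0 -- grows without bound in laps
```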



