Intelligent Agent Foundations Forumsign up / log in
My recent posts
discussion post by Paul Christiano 180 days ago | Ryan Carey, Jessica Taylor, Patrick LaVictoire, Stuart Armstrong and Tsvi Benson-Tilsen like this | discuss

Over at medium, I’m continuing to write about AI control; here’s a roundup from the last month.

Many of these seem like interesting things to discuss here; would it be better to post each of these as a link when I write it?

Strategy

  • Prosaic AI control argues that AI control research should first consider the case where AI involves no “unknown unknowns.”
  • Handling destructive technology tries to explain the upside of AI control, if we live in a universe where we eventually need to build a singleton anyway.
  • Hard-core subproblems explains a concept I find helpful for organizing research.

Building blocks of ALBA

Terminology and concepts



NEW LINKS

NEW POSTS

NEW DISCUSSION POSTS

RECENT COMMENTS

The "benign induction
by David Krueger on Why I am not currently working on the AAMLS agenda | 0 likes

This comment is to explain
by Alex Mennen on Formal Open Problem in Decision Theory | 0 likes

Thanks for writing this -- I
by Daniel Dewey on AI safety: three human problems and one AI issue | 1 like

I think it does do the double
by Stuart Armstrong on Acausal trade: double decrease | 0 likes

>but the agent incorrectly
by Stuart Armstrong on CIRL Wireheading | 0 likes

I think the double decrease
by Owen Cotton-Barratt on Acausal trade: double decrease | 0 likes

The problem is that our
by Scott Garrabrant on Cooperative Oracles: Nonexploited Bargaining | 1 like

Yeah. The original generator
by Scott Garrabrant on Cooperative Oracles: Nonexploited Bargaining | 0 likes

I don't see how it would. The
by Scott Garrabrant on Cooperative Oracles: Nonexploited Bargaining | 1 like

Does this generalise to
by Stuart Armstrong on Cooperative Oracles: Nonexploited Bargaining | 0 likes

>Every point in this set is a
by Stuart Armstrong on Cooperative Oracles: Nonexploited Bargaining | 0 likes

This seems a proper version
by Stuart Armstrong on Cooperative Oracles: Nonexploited Bargaining | 0 likes

This doesn't seem to me to
by Stuart Armstrong on Change utility, reduce extortion | 0 likes

[_Regret Theory with General
by Abram Demski on Generalizing Foundations of Decision Theory II | 0 likes

It's not clear whether we
by Paul Christiano on Infinite ethics comparisons | 1 like

RSS

Privacy & Terms