Intelligent Agent Foundations Forum
discussion post by Paul Christiano 114 days ago | Ryan Carey, Jessica Taylor, Patrick LaVictoire, Stuart Armstrong and Tsvi Benson-Tilsen like this

Over at Medium, I’m continuing to write about AI control; here’s a roundup from the last month.

Many of these seem like interesting things to discuss here; would it be better to post each of these as a link when I write it?

Strategy

  • Prosaic AI control argues that AI control research should first consider the case where AI involves no “unknown unknowns.”
  • Handling destructive technology tries to explain the upside of AI control, if we live in a universe where we eventually need to build a singleton anyway.
  • Hard-core subproblems explains a concept I find helpful for organizing research.

Building blocks of ALBA

Terminology and concepts


