Intelligent Agent Foundations Forum
by Wei Dai 180 days ago

I think I understand what you’re saying, but my state of uncertainty is such that I put a lot of probability mass on possibilities that wouldn’t be well served by what you’re suggesting. For example, the possibility that we can achieve most value not through the consequences of our actions in this universe, but through their consequences in much larger (computationally richer) universes simulating this one. Or that spreading hedonium is actually the right thing to do and produces orders of magnitude more value than spreading anything that resembles human civilization. Or that value scales non-linearly with brain size, so we should go for either very large or very small brains.
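
To see why non-linear scaling pushes toward extremes: with a fixed resource budget, total value is (budget / size) × v(size), and for power-law scaling this is maximized at one end of the size range or the other whenever v isn’t exactly linear. A minimal sketch in Python; the budget, the candidate sizes, and the power-law form v(s) = s^alpha are all illustrative assumptions, not claims about actual minds:

    # Illustrative sketch (not a claim about actual minds): suppose a fixed
    # resource budget, brains of size s costing s units each, and per-brain
    # value v(s) = s**alpha for some hypothetical scaling exponent alpha.

    def total_value(budget: float, size: float, alpha: float) -> float:
        """Total value from spending the whole budget on brains of one size."""
        n_brains = budget / size            # how many brains the budget buys
        return n_brains * size ** alpha     # count times value per brain

    budget = 1e6
    for alpha in (0.5, 1.0, 2.0):
        value, size = max((total_value(budget, s, alpha), s)
                          for s in (1.0, 1e3, 1e6))
        print(f"alpha={alpha}: best size {size:g} gives total value {value:g}")

    # alpha < 1 (sublinear value): many tiny brains maximize total value.
    # alpha > 1 (superlinear value): one huge brain maximizes total value.
    # Only at exactly alpha = 1 is brain size a matter of indifference.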

While discussing the VR utopia post, you wrote “I know you want to use philosophy to extend the domain, but I don’t trust our philosophical abilities to do that, because whatever mechanism created them could only test them on normal situations.” I have some hope that there is a minimal set of philosophical abilities that would allow us to eventually solve arbitrary philosophical problems, and that we already have it. Otherwise it seems hard to explain the kinds of philosophical progress we’ve made, like realizing that other universes probably exist, and figuring out some ideas about how to make decisions when there are multiple copies of us in this universe and others.

Of course it’s also possible that’s not the case, and that we can’t do better than to optimize the future using our current “low resolution” values, but until we’re a lot more certain of this, any attempt to do so seems to constitute a serious existential risk.


