by Abram Demski

At present, I think the main problem of logical updatelessness is something like: how can we make a principled trade-off between thinking longer to make a better decision, versus stopping earlier so that we exert more logical control over the environment?

For example, in Agent Simulates Predictor, an agent who thinks for a short time and then settles on a policy for how to respond to whatever conclusions it reaches after thinking longer can decide: “If I think longer and see a proof that the predictor thinks I two-box, I can invalidate that proof by one-boxing. Adopting this policy makes the predictor less likely to find such a proof.” (I’m speculating; I haven’t actually written up a thing which does this yet, but I think it would work.) An agent who thinks longer before making a decision can’t see this possibility, because it has already proved that the predictor predicts two-boxing; from the perspective of having thought longer, there doesn’t appear to be any way to invalidate the prediction. Being predicted to two-box is just a fact, not something the agent has control over.
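To make the structure concrete, here is a minimal toy sketch in Python (to be clear, this is not the written-up thing mentioned above; the function names and payoff numbers are all invented for illustration). It compresses “thinking longer” into a single boolean flag for whether the agent has found the proof about the predictor, and models the weaker predictor as only able to inspect the agent’s cheap, early-committed policy:

```python
# Toy model of Agent Simulates Predictor. Illustrative sketch only:
# `PAYOFF`, `predictor`, and both policies are assumptions of this example.

PAYOFF = {  # (predicted_one_box, actually_one_box) -> agent's payoff
    (True, True): 1_000_000,   # predicted to one-box, one-boxes
    (True, False): 1_001_000,  # predicted to one-box, two-boxes
    (False, True): 0,          # predicted to two-box, one-boxes
    (False, False): 1_000,     # predicted to two-box, two-boxes
}

def predictor(early_policy):
    """The predictor is weaker than the agent, so it can only inspect the
    agent's early-committed policy. It predicts two-boxing only if that
    prediction would be self-consistent: i.e., only if the agent would
    still two-box after proving it was predicted to two-box. Otherwise
    the prediction would be invalidated, so it predicts one-boxing."""
    return early_policy(proof_of_predicted_two_box=True)

def updateless_policy(proof_of_predicted_two_box):
    # Commit early: one-box no matter what later proof search turns up.
    # This invalidates any would-be proof that we two-box.
    return True  # one-box

def late_deciding_policy(proof_of_predicted_two_box):
    # Think longer first: if we have already proved the predictor expects
    # two-boxing, the prediction is a fixed fact, so take both boxes.
    return not proof_of_predicted_two_box

for name, policy in [("updateless", updateless_policy),
                     ("late-deciding", late_deciding_policy)]:
    prediction_one_box = predictor(policy)
    # Thinking longer just reveals the actual prediction as a proved fact:
    proof_of_two_box = not prediction_one_box
    action_one_box = policy(proof_of_two_box)
    print(name, "->", PAYOFF[(prediction_one_box, action_one_box)])
```

Running this prints 1,000,000 for the early-committing policy and 1,000 for the late-deciding one, matching the story above: the early commitment is what makes the two-boxing prediction unfindable.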

Similarly, in the Prisoner’s Dilemma, an agent who hasn’t thought too long can adopt the strategy of first thinking longer and then doing whatever it predicts the other agent will do. This is a pretty good strategy, because it makes cooperation the other agent’s best response. However, you have to think long enough to find this particular strategy, but stop early enough that the hypotheticals which show the strategy is a good idea haven’t been closed off yet.
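The same kind of toy sketch works here (again, `mirror`, `best_response_to`, and the payoff matrix are all invented for this example, assuming the opponent can read our committed policy):

```python
# Toy Prisoner's Dilemma with an early-committed "mirror" policy.
# Illustrative sketch only; standard PD payoffs assumed.

PD = {  # (my_move, their_move) -> my payoff; 'C' cooperate, 'D' defect
    ('C', 'C'): 3, ('C', 'D'): 0,
    ('D', 'C'): 5, ('D', 'D'): 1,
}

def mirror(their_move):
    """Early-committed policy: think longer, predict the opponent's move,
    then play exactly that move."""
    return their_move

def best_response_to(policy):
    """An opponent who can read our committed policy picks whichever move
    maximizes its own payoff, knowing we will respond via `policy`."""
    return max(['C', 'D'], key=lambda move: PD[(move, policy(move))])

opponent_move = best_response_to(mirror)
my_move = mirror(opponent_move)
print("opponent:", opponent_move, "me:", my_move,
      "payoffs:", PD[(my_move, opponent_move)], PD[(opponent_move, my_move)])
```

Against the committed mirror, defecting earns the opponent the (D, D) payoff of 1 while cooperating earns the (C, C) payoff of 3, so its best response is to cooperate; but the commitment only has this effect if it is made while the opponent’s deliberation still treats our response as open.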

So, I think there is less conflict between UDT and bounded reasoning than you are implying. However, it’s far from clear how to negotiate the trade-offs sanely.

(However, in both cases, you still want to spend as long a time thinking as you can afford; it’s just that you want to make the policy decision, about how to use the conclusions of that thinking, as early as it can sensibly be made.)


