Intelligent Agent Foundations Forum
Density Zero Exploration
post by Alex Mennen 185 days ago | Abram Demski, Paul Christiano and Scott Garrabrant like this

The idea here is due to Scott Garrabrant. All I did was write it.

Let’s say a logical induction-based agent is making an infinite sequence of decisions, and is using \(\varepsilon\)-exploration on each decision. There are two desirable criteria, which are somewhat in conflict:

First, we want enough exploration that traders who bet that good strategies will have bad outcomes (and thereby prevent the good strategies from being tried, so that the bet never gets settled) lose arbitrarily large amounts of money if they try this every time. This requires an infinite amount of exploration in total. For example, if the agent \(2^{-n}\)-explores on step \(n\), then a sufficiently wealthy malicious trader can bet against a good strategy heavily enough that the agent avoids it every time, without the trader losing all its money, because the actions it is discouraging are taken only finitely many times anyway. But if the agent \(\varepsilon\)-explores on step \(n\) for some fixed \(\varepsilon>0\), then this is not possible, because each action is taken infinitely many times no matter what any of the traders do, so no trader can consistently make some good action appear bad without losing all its money.

Second, we want there to be sufficiently little exploration that the agent does not sacrifice a nontrivial amount of value to it. If actions only have short-term effects, then it is enough for the probability of exploration to approach \(0\) as \(n\rightarrow\infty\), in order for the agent to behave optimally in the limit (if actions can have lasting consequences, then this is not enough; for instance, if there is an action that destroys all value forever if it is ever taken, then that action needs to never be taken; this directly conflicts with the first criterion). For example, if the agent \(\varepsilon\)-explores on step \(n\) for some fixed \(\varepsilon>0\), then it never gets any closer to acting optimally, but if it \(2^{-n}\)-explores on step \(n\), then its probability of acting optimally will approach \(1\) as \(n\rightarrow\infty\).
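To make the tension between the two criteria concrete, here is a minimal sketch that sums the exploration probabilities of the two example schedules over the first \(N\) steps. The function names and the choice \(\varepsilon=0.1\) are illustrative assumptions, not part of the original argument.

```python
def total_exploration(schedule, num_steps):
    """Sum of per-step exploration probabilities over steps 1..num_steps."""
    return sum(schedule(n) for n in range(1, num_steps + 1))


def constant_schedule(n, eps=0.1):
    # Fixed epsilon-exploration; eps=0.1 is an arbitrary illustrative value.
    return eps


def geometric_schedule(n):
    # 2^{-n}-exploration on step n.
    return 2.0 ** -n


for num_steps in (10, 100, 1000):
    print(
        num_steps,
        total_exploration(constant_schedule, num_steps),
        total_exploration(geometric_schedule, num_steps),
    )
```

The constant schedule's total grows without bound (satisfying the first criterion) while its per-step probability never shrinks (violating the second). The geometric schedule's per-step probability vanishes, but its total never exceeds \(1\), so it fails the first criterion.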

Fortunately, there are sequences that converge to \(0\) but whose sum diverges, like \(\frac{1}{n}\), so it is possible to satisfy both of these criteria. However, if there are important differences among what the agent should do on different steps, then this might not be enough. For example, suppose the agent \(\frac{1}{n}\)-explores on step \(n\), and it is particularly important what action the agent takes on steps that are powers of \(2\). A wealthy malicious trader could then bet against good actions on every step that is a power of \(2\). The \(k\)th time the trader does this is step \(2^{k}\), on which the agent \(2^{-k}\)-explores, so the total exploration on these steps is finite and the trader loses only a finite amount of money. We should therefore strengthen the first criterion to ensure that a wealthy malicious trader cannot bet against a good strategy infinitely many times, rather than merely that it cannot bet against a good strategy every time. That is, for every efficiently computable infinite subset \(X\subseteq\mathbb{N}\), we want an infinite amount of exploration to occur on steps in \(X\), so that no malicious trader can bet against a good strategy on every step in \(X\) without running out of money (we only need to consider efficiently computable subsets because only efficiently computable traders participate in the market, and they cannot pick out sets that are not efficiently computable).
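The powers-of-\(2\) example can be checked numerically. The following sketch compares the total \(\frac{1}{n}\)-exploration over all steps up to \(2^{20}\) with the total over only the powers of \(2\) in that range. The helper names and the cutoff \(2^{20}\) are assumptions made for illustration.

```python
def harmonic_schedule(n):
    # 1/n-exploration on step n.
    return 1.0 / n


def exploration_on_steps(schedule, steps):
    """Total exploration probability accumulated over the given steps."""
    return sum(schedule(n) for n in steps)


MAX_EXPONENT = 20  # illustrative cutoff: consider steps 1..2^20

all_steps = range(1, 2**MAX_EXPONENT + 1)
# The k-th power of 2 is 2^k, matching the text's "k-th time" indexing.
powers_of_two = [2**k for k in range(1, MAX_EXPONENT + 1)]

total_all = exploration_on_steps(harmonic_schedule, all_steps)
total_pow2 = exploration_on_steps(harmonic_schedule, powers_of_two)

print("all steps:", total_all)
print("powers of two only:", total_pow2)
```

The total over all steps keeps growing like \(\ln N\), but the total over powers of \(2\) is a geometric series bounded by \(1\), which is why a trader targeting only those steps risks only a bounded loss.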

To do this, pick some computable probability distribution \(p\) over all efficiently computable subsets of \(\mathbb{N}\), which does not assign probability \(0\) to any of them. This can be done computably by picking a probability distribution over programs that provably run in polynomial time. For each efficiently computable set \(X\subseteq\mathbb{N}\), we want to explore with probability at least \(\frac{p\left(X\right)}{k}\) on the \(k\)th element of \(X\). Formally, let \(X_{n}\) be \(\frac{1}{k}\) if \(n\) is the \(k\)th element of \(X\), and \(0\) if \(n\notin X\). On step \(n\), we explore with probability \(\sum_{X}p\left(X\right)X_{n}\). Since we explore with probability at least \(\frac{p\left(X\right)}{k}\) on the \(k\)th element of \(X\), and \(\sum_{k}\frac{p\left(X\right)}{k}\) diverges whenever \(X\) is infinite and \(p\left(X\right)>0\), this satisfies the strengthened version of the first condition. Since \(\sum_{X}p\left(X\right)=1\), \(X_{n}\leq1\) for every \(X\) and \(n\), and \(X_{n}\rightarrow0\) as \(n\rightarrow\infty\) for every \(X\), dominated convergence gives \(\sum_{X}p\left(X\right)X_{n}\rightarrow0\) as \(n\rightarrow\infty\), so the second condition is also satisfied.
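As a rough illustration of the construction, the sketch below computes \(\sum_{X}p\left(X\right)X_{n}\) for a small toy family of sets. The family (all naturals, powers of \(2\), perfect squares) and its weights are hypothetical stand-ins: the actual construction mixes over all efficiently computable subsets, which this finite example does not attempt to enumerate.

```python
import math


def is_power_of_two(n):
    # Powers of two starting from 2^1, so that 2^k is the k-th element.
    return n >= 2 and (n & (n - 1)) == 0


def is_square(n):
    return math.isqrt(n) ** 2 == n


# Hypothetical toy family: (weight p(X), membership test for X).
# Weights sum to 1; this stands in for a distribution over all
# efficiently computable subsets, which is NOT what is enumerated here.
FAMILY = [
    (0.5, lambda n: True),     # X = all of N
    (0.25, is_power_of_two),   # X = {2, 4, 8, ...}
    (0.25, is_square),         # X = {1, 4, 9, ...}
]


def exploration_schedule(family, num_steps):
    """Return {n: sum_X p(X) * X_n} for n in 1..num_steps,
    where X_n = 1/k if n is the k-th element of X, else 0."""
    counts = [0] * len(family)  # how many elements of each X seen so far
    probs = {}
    for n in range(1, num_steps + 1):
        total = 0.0
        for i, (weight, is_member) in enumerate(family):
            if is_member(n):
                counts[i] += 1
                total += weight / counts[i]
        probs[n] = total
    return probs


probs = exploration_schedule(FAMILY, 2**16)
print("step 2^10:", probs[2**10])      # at least 0.25 / 10
print("step 65535:", probs[65535])     # in neither sparse set, so tiny
```

On step \(2^{10}\) the exploration probability is at least \(\frac{0.25}{10}\) rather than the \(2^{-10}\) that plain \(\frac{1}{n}\)-exploration would give, so exploration on the powers of \(2\) now sums like a harmonic series. On steps that belong to none of the sparse sets, the probability is still close to \(0\).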


