Intelligent Agent Foundations Forum
by Patrick LaVictoire 797 days ago

I can prove that for each hypothesis \(A()=a\) there is at most one \(u\) such that \(U()=u\) has a high valuation (in PA+N for sufficiently large \(N\)), with the following caveat: it can sometimes take many steps to prove that \(u\neq u'\) in PA+N, so we’ll need to include the length of that proof in our bound.

In what follows, we will take all subscripts of \(d\) and \(\nu\) to be \(PA+N, A()=a\) for \(N\) large.

For any \(\phi\), \(d(\bot) - d(\neg\phi)\leq d(\phi)\leq d(\bot)\) and thus \[1 - \frac{d(\phi)}{d(\bot)} \leq \nu(\phi) \leq \frac{d(\bot)}{d(\phi)+d(\bot)}.\]
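The two-sided bound on \(\nu\) can be unpacked as follows. This is only a sketch: it assumes the valuation is defined by \(\nu(\phi) = \frac{d(\neg\phi)}{d(\phi)+d(\neg\phi)}\), a definition this comment does not restate, and it ignores any constant overhead from gluing proofs together.

\[\begin{aligned}
\nu(\phi) &= 1 - \frac{d(\phi)}{d(\phi)+d(\neg\phi)} \;\geq\; 1 - \frac{d(\phi)}{d(\bot)} && \text{since } d(\phi)+d(\neg\phi)\geq d(\bot),\\
\nu(\phi) &= \frac{d(\neg\phi)}{d(\phi)+d(\neg\phi)} \;\leq\; \frac{d(\bot)}{d(\phi)+d(\bot)} && \text{since } x\mapsto\frac{x}{d(\phi)+x} \text{ is increasing and } d(\neg\phi)\leq d(\bot).
\end{aligned}\]

The fact \(d(\neg\phi)\leq d(\bot)\) used in the second line is the upper bound \(d(\phi)\leq d(\bot)\) applied to \(\neg\phi\).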

Also, \(d(U()=u)+d(U()=u')+d(u\neq u')\geq d(\bot)\). This implies \(\max\{d(U()=u),d(U()=u')\}\geq \frac12(d(\bot)-d(u\neq u'))\), which implies \[\min\{\nu(U()=u),\nu(U()=u')\}\leq \min\left\{\frac{d(\bot)}{d(U()=u)+d(\bot)},\frac{d(\bot)}{d(U()=u')+d(\bot)}\right\} \leq \frac{2d(\bot)}{3d(\bot)-d(u\neq u')}.\]
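The algebra behind that last inequality can be written out step by step. This is a sketch using only the bounds already stated; the shorthand \(m\) is introduced here purely for illustration.

\[\begin{aligned}
m &:= \max\{d(U()=u),\,d(U()=u')\} \;\geq\; \tfrac12\bigl(d(U()=u)+d(U()=u')\bigr) \;\geq\; \tfrac12\bigl(d(\bot)-d(u\neq u')\bigr),\\
\min\{\nu(U()=u),\,\nu(U()=u')\} &\leq \frac{d(\bot)}{m+d(\bot)} \;\leq\; \frac{d(\bot)}{\tfrac12\bigl(d(\bot)-d(u\neq u')\bigr)+d(\bot)} \;=\; \frac{2d(\bot)}{3d(\bot)-d(u\neq u')}.
\end{aligned}\]

For instance, with the purely illustrative values \(d(\bot)=300\) and \(d(u\neq u')=30\), the bound is \(600/870\approx 0.69\), just above \(2/3\).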

So we see that \(\nu(U()=u)\) and \(\nu(U()=u')\) cannot both be significantly larger than 2/3 if there is a proof that \(u\neq u'\) that is short relative to \(d(\bot)\).


