Intelligent Agent Foundations Forum
by Alex Appel 807 days ago | Patrick LaVictoire likes this

Interestingly enough, the approximate coherence condition doesn’t hold when there is a short proof of φ from ψ, but it does hold when there is a sufficiently short proof of ψ from φ. (Very roughly, the valuation of (φ∧¬ψ) is negligible, while the valuation of (φ∧ψ) is approximately equal to the valuation of φ.) So coherence only works one way.
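As a toy numeric illustration of the parenthetical (the valuation function here is a stand-in for illustration, not the construction from the post): any additive valuation satisfies v(φ) = v(φ∧ψ) + v(φ∧¬ψ), so driving v(φ∧¬ψ) toward zero forces v(φ∧ψ) up to approximately v(φ), and the conditional value of ψ given φ up to approximately 1.

```python
# Toy check of the decomposition v(phi) = v(phi & psi) + v(phi & ~psi).
# The concrete numbers are made up; only the additivity identity matters.

def conditional(v_phi_and_psi, v_phi_and_not_psi):
    """Value of psi given phi, assuming additivity over the two cases."""
    v_phi = v_phi_and_psi + v_phi_and_not_psi
    return v_phi_and_psi / v_phi

# A short proof of psi from phi makes v(phi & ~psi) negligible:
v_phi_and_psi = 0.30
v_phi_and_not_psi = 1e-9  # driven toward 0 by the short proof

v_phi = v_phi_and_psi + v_phi_and_not_psi
print(v_phi)  # essentially all of phi's value sits on phi & psi
print(conditional(v_phi_and_psi, v_phi_and_not_psi))  # close to 1
```

Note there is no matching identity forcing anything in the other direction, which is the asymmetry the comment points at: a short proof of φ from ψ does not make v(ψ∧¬φ) small.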

On a mostly unrelated note, this sort of reasoning doesn’t seem to link well with the strategy of “add important mathematical facts that you discover to your pool of starting axioms to speed up the process of deriving things.” While a set of axioms, and that same set of axioms plus a bunch of utility-tool theorems (Gödel’s second incompleteness theorem, Löb’s theorem, the fundamental theorem of calculus, etc.), may “span” the same space of theorems, the second is much easier to quickly derive new interesting statements from, and seems closer to how humans think about math. The inconsistency involved in getting 10 utility on the 5-and-10 problem is much easier to spot if the second incompleteness theorem is already in your pool of sentences to apply to a new problem.

As with the Russell’s Paradox example, in practice, counterfactuals seem to be mind-dependent, and vary depending on which of the many different lines of reasoning a mind heads down. If you define a subjective distance relative to a particular search algorithm, the objective valuation would just use the shortest possible subjective distance from the statement to a contradiction. The valuation of a statement in naive set theory under the human distance function would be high, because we were slow to notice the contradiction. So that aspect of counterfactual reasoning seems easy to capture.
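A minimal sketch of the mind-dependence point, under loud assumptions: the function names and the decay rule below are hypothetical choices made for illustration, not anything from the post. The idea is only that the same statement gets different values under different searchers, because each searcher takes a different number of steps to reach a contradiction.

```python
# Hypothetical "subjective distance" relative to a particular search
# procedure: the number of steps that searcher takes before hitting a
# contradiction among the statement's consequences.

def subjective_distance(statement, search):
    """Steps this searcher needs to find a contradiction (None if never)."""
    for step, consequence in enumerate(search(statement), start=1):
        if consequence == "contradiction":
            return step
    return None

def valuation(statement, search, decay=0.5):
    """Toy rule: the slower *this* searcher finds the contradiction,
    the higher the statement's value stays."""
    d = subjective_distance(statement, search)
    if d is None:
        return 1.0
    return 1.0 - decay ** d

# A "human-like" searcher that only notices the paradox after many steps:
def slow_search(statement):
    return ["lemma"] * 99 + ["contradiction"]

# A searcher that spots the contradiction immediately:
def fast_search(statement):
    return ["contradiction"]

print(valuation("naive comprehension", slow_search))  # close to 1
print(valuation("naive comprehension", fast_search))  # much lower
```

An objective version would then take the minimum of these subjective distances over all searchers, as the comment suggests.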

Has anyone checked what this does on ASP problems?



by Patrick LaVictoire 797 days ago

ASP has only been formalized in ways that don’t translate to modal universes, so it’s only the analogous extensions that could apply there.

In the formulation here, it’s not enough that the agent one-box; it must do so provably in less than \(N\) steps. I think that both the valuations \(\nu_{\textsf{ZFC+Sound(ZFC)}+A()=1}(U()=1M)\) and \(\nu_{\textsf{ZFC+Sound(ZFC)}+A()=1}(U()=0)\) will probably be about \(1/2\), but I’m not sure.



