Intelligent Agent Foundations Forum
http://agentfoundations.org/

  • Comment on Why I am not currently working on the AAMLS agenda (David Krueger): http://agentfoundations.org/item?id=1491
  • Optimisation in manipulating humans: engineered fanatics vs yes-men (Stuart Armstrong): http://agentfoundations.org/item?id=1489
  • Comment on Formal Open Problem in Decision Theory (Alex Mennen): http://agentfoundations.org/item?id=1488
  • An Approach to Logically Updateless Decisions (Abram Demski): http://agentfoundations.org/item?id=1472
  • Comment on AI safety: three human problems and one AI issue (Daniel Dewey): http://agentfoundations.org/item?id=1486
  • Comment on Acausal trade: double decrease (Stuart Armstrong): http://agentfoundations.org/item?id=1485
  • Comment on CIRL Wireheading (Stuart Armstrong): http://agentfoundations.org/item?id=1484
  • Comment on Acausal trade: double decrease (Owen Cotton-Barratt): http://agentfoundations.org/item?id=1483

Acausal trade: conclusion: theory vs practice (Stuart Armstrong)
http://agentfoundations.org/item?id=1482

When I started this dive into acausal trade, I expected to find subtle and interesting theoretical considerations. Instead, most of the issues are practical.

  • Acausal trade: being unusual (Stuart Armstrong): http://agentfoundations.org/item?id=1404
  • Acausal trade: different utilities, different trades (Stuart Armstrong): http://agentfoundations.org/item?id=1464
  • Acausal trade: trade barriers (Stuart Armstrong): http://agentfoundations.org/item?id=1480
  • Value Learning for Irrational Toy Models (Patrick LaVictoire): http://agentfoundations.org/item?id=1467
  • Comment on Cooperative Oracles: Nonexploited Bargaining (Scott Garrabrant): http://agentfoundations.org/item?id=1479
  • Comment on Cooperative Oracles: Nonexploited Bargaining (Scott Garrabrant): http://agentfoundations.org/item?id=1477
  • Comment on Cooperative Oracles: Nonexploited Bargaining (Scott Garrabrant): http://agentfoundations.org/item?id=1478
  • Acausal trade: full decision algorithms (Stuart Armstrong): http://agentfoundations.org/item?id=1466
  • Acausal trade: universal utility, or selling non-existence insurance too late (Stuart Armstrong): http://agentfoundations.org/item?id=1471
  • Comment on Cooperative Oracles: Nonexploited Bargaining (Stuart Armstrong): http://agentfoundations.org/item?id=1475
  • Comment on Cooperative Oracles: Nonexploited Bargaining (Stuart Armstrong): http://agentfoundations.org/item?id=1474
  • Comment on Cooperative Oracles: Nonexploited Bargaining (Stuart Armstrong): http://agentfoundations.org/item?id=1473

Why I am not currently working on the AAMLS agenda (Jessica Taylor)
http://agentfoundations.org/item?id=1470

(Note: this is a personal statement, not an official MIRI statement. I am not speaking for others who have been involved with the agenda.)

The AAMLS (Alignment for Advanced Machine Learning Systems) agenda is a MIRI project that aims to determine how hypothetical, highly advanced machine learning systems could be used safely. I was previously working on problems in this agenda, and am no longer doing so.

Cooperative Oracles: Nonexploited Bargaining (Scott Garrabrant)
http://agentfoundations.org/item?id=1469

In this post, we formalize and generalize the phenomenon described in Eliezer Yudkowsky’s post “Cooperating with agents with different ideas of fairness, while resisting exploitation”.
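
To give the flavour of the phenomenon, here is a toy illustration of the idea from Yudkowsky’s post (a minimal sketch, not the formalization developed in this post; the pie size, fair point, and epsilon are all hypothetical): if the other agent demands more than what you consider their fair share, accept with a probability just low enough that the demand earns them strictly less than that fair share, so being unfair cannot pay.

```python
def acceptance_probability(their_demand, their_fair_share, epsilon=0.01):
    """Probability of accepting a demand, given my notion of their fair share.

    Fair (or generous) demands are always accepted. Unfair demands are
    accepted with probability slightly below fair_share / demand, so the
    demander's expected payoff stays strictly below their fair share.
    """
    if their_demand <= their_fair_share:
        return 1.0
    return max(0.0, their_fair_share / their_demand - epsilon)

# Splitting a pie of size 10, where I consider a 5/5 split fair.
for demand in (5, 6, 8):
    p = acceptance_probability(demand, their_fair_share=5)
    print(f"demand={demand}: accept with p={p:.3f}, "
          f"expected payoff for demander={p * demand:.2f}")
# demand=5 -> 5.00; demand=6 -> ~4.94; demand=8 -> ~4.92: demanding more
# than the fair point strictly lowers the demander's expected payoff.
```

The cooperator loses something whenever it rejects, but a demander who knows this policy has no incentive to be unfair in the first place.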

Cooperative Oracles: Introduction (Scott Garrabrant)
http://agentfoundations.org/item?id=1468

This is the first in a series of posts introducing a new tool called a Cooperative Oracle. All of these posts are joint work with Sam Eisenstat, Tsvi Benson-Tilsen, and Nisan Stiennon.

Here is my plan for posts in this sequence. I will update this as I go.

  1. Introduction
  2. Nonexploited Bargaining
  3. Stratified and Nearly Pareto Optima
  4. Definition and Existence Proof
  5. Alternate Notions of Dependency
  • Acausal trade: double decrease (Stuart Armstrong): http://agentfoundations.org/item?id=1463

Acausal trade: introduction (Stuart Armstrong)
http://agentfoundations.org/item?id=1465

I’ve never really understood acausal trade. So, in a short series of posts, I’ll attempt to analyse the concept thoroughly enough that I can grasp it, and hopefully so that others can grasp it as well.

  • Comment on Change utility, reduce extortion (Stuart Armstrong): http://agentfoundations.org/item?id=1462

CIRL Wireheading (Tom Everitt)
http://agentfoundations.org/item?id=1459

Cooperative inverse reinforcement learning (CIRL) attracted a lot of attention last year, as it seemed to do a good job of aligning an agent’s incentives with those of its human supervisor. Notably, it led to an elegant solution to the shutdown problem.
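
For background on why an agent that is uncertain about the human’s utility prefers to keep the human in the loop, here is a minimal numeric sketch in the spirit of the off-switch game of Hadfield-Menell et al. (my illustration, not code from the post; the belief and payoff numbers are made up):

```python
# The robot is unsure whether its intended action is good or bad for the human.
possible_utilities = [-1.0, 0.5]   # utility of acting, from the human's view
probabilities = [0.4, 0.6]         # robot's belief over those utilities

# Option 1: act immediately, bypassing the human.
ev_act = sum(p * u for p, u in zip(probabilities, possible_utilities))

# Option 2: defer to the human, who knows the true utility and lets the
# action proceed only when it is positive (being switched off yields 0).
ev_defer = sum(p * max(u, 0.0)
               for p, u in zip(probabilities, possible_utilities))

print(f"E[acting directly] = {ev_act:+.2f}")    # -0.10
print(f"E[deferring]       = {ev_defer:+.2f}")  # +0.30
assert ev_defer >= ev_act  # deferring weakly dominates under uncertainty
```

Because the human filters out exactly the bad outcomes, deferring weakly dominates acting directly whenever the robot’s uncertainty is genuine, and this is the incentive structure CIRL exploits.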

  • Comment on Generalizing Foundations of Decision Theory II (Abram Demski): http://agentfoundations.org/item?id=1458
  • Comment on Infinite ethics comparisons (Paul Christiano): http://agentfoundations.org/item?id=1457
  • Comment on Change utility, reduce extortion (Stuart Armstrong): http://agentfoundations.org/item?id=1456

Infinite ethics comparisons (Stuart Armstrong)
http://agentfoundations.org/item?id=1455

Work done with Amanda Askell; the errors are mine.

It’s very difficult to compare utilities across worlds with infinite populations. For instance, it seems clear that world \(w_1\) is better than \(w_2\), if the numbers indicate the utilities of the various agents (one way to make this precise is sketched after the list):

  • \(w_1 = 1,0,1,0,1,0,1,0,1,0, \ldots\)
  • \(w_2 = 1,0,1,0,0,1,0,0,0,1, \ldots\)
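
A minimal sketch of one intuition behind this judgement (my illustration, not the post’s argument): both worlds contain infinitely many agents at utility 1 and infinitely many at utility 0, so comparing the multisets of utilities tells us nothing, but the limiting fraction of utility-1 agents is 1/2 in \(w_1\) and 0 in \(w_2\):

```python
from itertools import count, islice

def w1():
    # 1, 0, 1, 0, 1, 0, ...
    for i in count():
        yield 1 - i % 2

def w2():
    # 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, ...: the k-th 1 is followed by k zeros.
    for k in count(1):
        yield 1
        for _ in range(k):
            yield 0

def average_utility(world, n):
    """Average utility of the first n agents of the world."""
    return sum(islice(world(), n)) / n

for n in (10, 100, 10_000):
    print(n, average_utility(w1, n), average_utility(w2, n))
# w1's running average converges to 1/2; w2's converges to 0.
```

Density arguments like this one are only a partial fix, since the limit can change, or fail to exist, when the agents are reordered.
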
  • Comment on Intertheoretic utility comparison: simple theory (Stuart Armstrong): http://agentfoundations.org/item?id=1453
  • Intertheoretic utility comparison: outcomes, strategies and utilities (Stuart Armstrong): http://agentfoundations.org/item?id=1449
  • Comment on Intertheoretic utility comparison: simple theory (Stuart Armstrong): http://agentfoundations.org/item?id=1452
  • Comment on Change utility, reduce extortion (Stuart Armstrong): http://agentfoundations.org/item?id=1448
  • Comment on Formal Open Problem in Decision Theory (Alex Mennen): http://agentfoundations.org/item?id=1447
  • Comment on Intertheoretic utility comparison: simple theory (Alex Mennen): http://agentfoundations.org/item?id=1446
  • Finding reflective oracle distributions using a Kakutani map (Jessica Taylor): http://agentfoundations.org/item?id=1444

A correlated analogue of reflective oracles (Jessica Taylor)
http://agentfoundations.org/item?id=1435

Summary: Reflective oracles correspond to Nash equilibria. A correlated version of reflective oracles exists and corresponds to correlated equilibria. The set of these objects is convex, which is useful.
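
As game-theoretic background for that correspondence (standard material, not the post’s construction), here is a small checker that verifies the correlated-equilibrium conditions for a candidate distribution in the game of Chicken, using a standard set of payoffs:

```python
# Payoffs indexed as payoff[player][(row_action, col_action)];
# action 0 = Dare, action 1 = Swerve.
payoff = [
    {(0, 0): 0, (0, 1): 7, (1, 0): 2, (1, 1): 6},  # row player
    {(0, 0): 0, (0, 1): 2, (1, 0): 7, (1, 1): 6},  # column player
]

# Candidate correlated equilibrium: 1/3 each on (D,S), (S,D), (S,S).
dist = {(0, 1): 1/3, (1, 0): 1/3, (1, 1): 1/3, (0, 0): 0.0}

def is_correlated_equilibrium(payoff, dist, actions=(0, 1)):
    # No player should gain by deviating from any recommended action,
    # in expectation over the profiles where that action is recommended.
    for player in (0, 1):
        for recommended in actions:
            for deviation in actions:
                gain = 0.0
                for profile, prob in dist.items():
                    if profile[player] != recommended:
                        continue
                    deviated = list(profile)
                    deviated[player] = deviation
                    gain += prob * (payoff[player][tuple(deviated)]
                                    - payoff[player][profile])
                if gain > 1e-9:  # a profitable deviation exists
                    return False
    return True

print(is_correlated_equilibrium(payoff, dist))  # True
```

Because these conditions are linear inequalities in the distribution, the set of correlated equilibria is convex, mirroring the convexity property the summary highlights.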

  • Comment on Intertheoretic utility comparison: simple theory (Stuart Armstrong): http://agentfoundations.org/item?id=1442
  • Comment on Intertheoretic utility comparison: simple theory (Stuart Armstrong): http://agentfoundations.org/item?id=1441
  • Comment on Two Major Obstacles for Logical Inductor Decision Theory (Vladimir Slepnev): http://agentfoundations.org/item?id=1440
  • Comment on Formal Open Problem in Decision Theory (Alex Mennen): http://agentfoundations.org/item?id=1438

Change utility, reduce extortion (Stuart Armstrong)
http://agentfoundations.org/item?id=1402

EDIT: This method is not intended to solve extortion, just to reduce the likelihood of extremely terrible outcomes (and to slightly reduce the vulnerability to extortion).

  • A permutation argument for comparing utility functions (Stuart Armstrong): http://agentfoundations.org/item?id=1405
  • Intertheoretic utility comparison: examples (Stuart Armstrong): http://agentfoundations.org/item?id=1426
  • Intertheoretic utility comparison: simple theory (Stuart Armstrong): http://agentfoundations.org/item?id=1418

Two Major Obstacles for Logical Inductor Decision Theory (Scott Garrabrant)
http://agentfoundations.org/item?id=1399

In this post, I describe two major obstacles for logical inductor decision theory: untaken actions are not observable, and there is no updatelessness for computations. I will describe both of these problems concretely in a logical inductor framework, but I believe that both issues are general enough to transcend that framework.

Where's the first benign agent? (Jacob Kopczynski)
http://agentfoundations.org/item?id=1394