Index of some decision theory posts
discussion post by Tsvi Benson-Tilsen

## What this is

An index of posts outlining some ideas in decision theory. I plan to make them available both on the forum and on GitHub as PDFs. Depending on interest and my time, I might turn this into a general index of agent-foundations-related decision theory research.

## Index

### Index of some decision theory posts

Forum post: https://agentfoundations.org/item?id=1026

### Notation for induction and decision theory

A reference for notation that might be useful for using (universal) Garrabrant inductors as models of bounded reasoning, and some notation for modelling agents. (Not posted on the forum because it contains tables.)

### Desiderata for decision theory

A list of desiderata for a theory of optimal decision-making for bounded rational agents in general environments.

Forum post: https://agentfoundations.org/item?id=1053

### An inductive setting for decision theory

A discussion of the appropriate setting for studying decision theory.

Forum post: todo

GitHub PDF: todo

### Training a universal Garrabrant inductor to predict counterfactuals

A proposal for training UGIs to predict action- and policy-counterfactuals by learning from the consequences of actions taken by similar (“logically previous”) agents.

Forum post: https://agentfoundations.org/item?id=1054

### Open problem: very thin logical priors

An open problem relevant to decision theory and to understanding bounded reasoning: is there a very easily computable prior over logical facts that, when updated on the results of computations, performs well in some sense? (A toy sketch of the naive approach appears below.)

Forum post: todo

GitHub PDF: todo
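
To make the flavor of the question concrete, here is a toy sketch of the naive approach (my own illustration, not from the post; all names are made up). It puts a uniform prior over truth assignments that respect some known logical constraints, then conditions on the results of computations. The prior itself is trivially cheap to state, but this brute-force update enumerates all 2^n assignments, which is exactly the cost a "very thin" prior would need to avoid.

```python
from itertools import product

def posterior(sentences, constraints, observations, query):
    """P(query | observations) under a uniform prior over truth
    assignments satisfying every constraint. `constraints` is a list
    of predicates on the assignment dict; `observations` maps a
    sentence to the truth value observed by running a computation."""
    consistent = 0
    query_true = 0
    for values in product([False, True], repeat=len(sentences)):
        a = dict(zip(sentences, values))
        if not all(c(a) for c in constraints):
            continue  # outside the prior's support
        if any(a[s] != v for s, v in observations.items()):
            continue  # ruled out by the observed computation results
        consistent += 1
        query_true += a[query]
    return query_true / consistent

# Example: we know phi -> psi; computing phi and observing "true"
# forces psi to probability 1.
S = ["phi", "psi"]
C = [lambda a: (not a["phi"]) or a["psi"]]
print(posterior(S, C, {"phi": True}, "psi"))  # 1.0
```

The sketch only exhibits the tradeoff: the update is well-behaved but exponential, and the open problem asks whether something comparably well-behaved can be computed very cheaply.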
