Intelligent Agent Foundations Forum
Logical uncertainty and mathematical uncertainty
link by Alex Mennen 143 days ago | discuss
Using lying to detect human values
link by Stuart Armstrong 250 days ago | discuss
Intuitive examples of reward function learning?
link by Stuart Armstrong 259 days ago | discuss
Funding for independent AI alignment research
link by Paul Christiano 262 days ago | discuss
Beyond algorithmic equivalence: self-modelling
link by Stuart Armstrong 264 days ago | discuss
Beyond algorithmic equivalence: algorithmic noise
link by Stuart Armstrong 264 days ago | discuss
Goodhart Taxonomy
link by Scott Garrabrant 325 days ago | discuss
Announcing the AI Alignment Prize
link by Vladimir Slepnev 381 days ago | Vadim Kosoy likes this | discuss
Metamathematics and probability
link by Alex Mennen 425 days ago | Abram Demski likes this | discuss
Funding opportunity for AI alignment research
link by Paul Christiano 451 days ago | Vadim Kosoy likes this | 3 comments
Open Problems Regarding Counterfactuals: An Introduction For Beginners
link by Alex Appel 491 days ago | Vadim Kosoy, Tsvi Benson-Tilsen, Vladimir Nesov and Wei Dai like this | 2 comments
Some Criticisms of the Logical Induction paper
link by Tarn Somervell Fletcher 510 days ago | Alex Mennen, Sam Eisenstat and Scott Garrabrant like this | 10 comments
Where's the first benign agent?
link by Jacob Kopczynski 584 days ago | Patrick LaVictoire and Paul Christiano like this | 15 comments
Neural nets designing neural nets
link by Stuart Armstrong 671 days ago | Vadim Kosoy likes this | discuss
The universal prior is malign
link by Paul Christiano 720 days ago | Ryan Carey, Vadim Kosoy, Jessica Taylor and Patrick LaVictoire like this | 4 comments
(Non-)Interruptibility of Sarsa(λ) and Q-Learning
link by Richard Möhn 735 days ago | Jessica Taylor and Patrick LaVictoire like this | 5 comments
Asymptotic Decision Theory
link by Jack Gallagher 767 days ago | Abram Demski, Jessica Taylor, Patrick LaVictoire, Paul Christiano and Tsvi Benson-Tilsen like this | 2 comments
Variations of the Garrabrant-inductor
link by Sune Kristian Jakobsen 788 days ago | Sam Eisenstat, Abram Demski, Jessica Taylor, Nate Soares and Scott Garrabrant like this | 1 comment
Two Agent Mild Optimization
link by Norman Perlmutter 841 days ago | Abram Demski and Jessica Taylor like this | discuss
A Layman's Explanation of "Safely Interruptible Agents"
link by Zach Weems 842 days ago | Jessica Taylor and Patrick LaVictoire like this | discuss
Improbable Oversight, An Attempt at Informed Oversight
link by William Saunders 850 days ago | Jessica Taylor and Patrick LaVictoire like this | 8 comments
A new proposal for logical counterfactuals
link by Jack Gallagher 866 days ago | Jessica Taylor, Patrick LaVictoire and Scott Garrabrant like this | 3 comments
An Alternative Setting for Resource-Bounded Löb's Theorem
link by Siddharth Bhaskar 867 days ago | Patrick LaVictoire and Scott Garrabrant like this | discuss
Working on a series of safety environments for OpenAI gym. Would love comments and ideas.
link by Rafael Cosman 892 days ago | Daniel Dewey, Jessica Taylor, Patrick LaVictoire and Tsvi Benson-Tilsen like this | discuss
every function can be computable
link by Ramana Kumar 927 days ago | Patrick LaVictoire likes this | discuss

RECENT COMMENTS

[Note: This comment is three
by Ryan Carey on A brief note on factoring out certain variables | 0 likes

There should be a chat icon
by Alex Mennen on Meta: IAFF vs LessWrong | 0 likes

Apparently "You must be
by Jessica Taylor on Meta: IAFF vs LessWrong | 1 like

There is a replacement for
by Alex Mennen on Meta: IAFF vs LessWrong | 1 like

Regarding the physical
by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

I think that we should expect
by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

I think I understand your
by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

This seems like a hack. The
by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

After thinking some more,
by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Yes, I think that we're
by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

My intuition is that it must
by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

To first approximation, a
by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Actually, I *am* including
by Vadim Kosoy on The Learning-Theoretic AI Alignment Research Agend... | 0 likes

Yeah, when I went back and
by Alex Appel on Optimal and Causal Counterfactual Worlds | 0 likes

> Well, we could give up on
by Jessica Taylor on The Learning-Theoretic AI Alignment Research Agend... | 0 likes
