Intelligent Agent Foundations Forum
Funding opportunity for AI alignment research
link by Paul Christiano 23 days ago | Vadim Kosoy likes this | discuss
Open Problems Regarding Counterfactuals: An Introduction For Beginners
link by Alex Appel 63 days ago | Vadim Kosoy, Tsvi Benson-Tilsen, Vladimir Nesov and Wei Dai like this | 2 comments
Some Criticisms of the Logical Induction paper
link by Tarn Somervell Fletcher 83 days ago | Alex Mennen, Sam Eisenstat and Scott Garrabrant like this | 10 comments
Where's the first benign agent?
link by Jacob Kopczynski 157 days ago | Patrick LaVictoire and Paul Christiano like this | 15 comments
Neural nets designing neural nets
link by Stuart Armstrong 244 days ago | Vadim Kosoy likes this | discuss
The universal prior is malign
link by Paul Christiano 293 days ago | Ryan Carey, Vadim Kosoy, Jessica Taylor and Patrick LaVictoire like this | 4 comments
(Non-)Interruptibility of Sarsa(λ) and Q-Learning
link by Richard Möhn 307 days ago | Jessica Taylor and Patrick LaVictoire like this | 5 comments
Asymptotic Decision Theory
link by Jack Gallagher 339 days ago | Abram Demski, Jessica Taylor, Patrick LaVictoire, Paul Christiano and Tsvi Benson-Tilsen like this | 2 comments
Variations of the Garrabrant-inductor
link by Sune Kristian Jakobsen 361 days ago | Sam Eisenstat, Abram Demski, Jessica Taylor, Nate Soares and Scott Garrabrant like this | 1 comment
Two Agent Mild Optimization
link by Norman Perlmutter 414 days ago | Abram Demski and Jessica Taylor like this | discuss
A Layman's Explanation of "Safely Interruptible Agents"
link by Zach Weems 415 days ago | Jessica Taylor and Patrick LaVictoire like this | discuss
Improbable Oversight, An Attempt at Informed Oversight
link by William Saunders 423 days ago | Jessica Taylor and Patrick LaVictoire like this | 8 comments
A new proposal for logical counterfactuals
link by Jack Gallagher 439 days ago | Jessica Taylor, Patrick LaVictoire and Scott Garrabrant like this | 3 comments
An Alternative Setting for Resource-Bounded Lob's Theorem
link by Siddharth Bhaskar 440 days ago | Patrick LaVictoire and Scott Garrabrant like this | discuss
Working on a series of safety environments for OpenAI gym. Would love comments and ideas.
link by Rafael Cosman 464 days ago | Daniel Dewey, Jessica Taylor, Patrick LaVictoire and Tsvi Benson-Tilsen like this | discuss
every function can be computable
link by Ramana Kumar 499 days ago | Patrick LaVictoire likes this | discuss
Goal completion prior art: feature construction
link by Stuart Armstrong 525 days ago | discuss
An approach to the Agent Simulates Predictor problem
link by Alex Mennen 528 days ago | Vadim Kosoy, Abram Demski, Gary Drescher, Jessica Taylor and Patrick LaVictoire like this | 11 comments
Analysis of Algorithms and Partial Algorithms
link by Andrew MacFie 594 days ago | Patrick LaVictoire and Scott Garrabrant like this | 3 comments
Another toy model of the control problem
link by Paul Christiano 599 days ago | Jessica Taylor likes this | discuss
My current take on logical uncertainty
link by Paul Christiano 599 days ago | Jessica Taylor and Patrick LaVictoire like this | discuss
Some work on connecting UDT and Reinforcement Learning
link by David Krueger 642 days ago | Patrick LaVictoire and Paul Christiano like this | 5 comments
Sequential Extensions of Causal and Evidential Decision Theory
link by Tom Everitt 705 days ago | Kaya Stechly and Patrick LaVictoire like this | discuss
What's logical coherence for anyway?
link by Pedro Carvalho 717 days ago | Jessica Taylor and Patrick LaVictoire like this | discuss
Probabilities Small Enough To Ignore: An attack on Pascal's Mugging
link by Kaj Sotala 734 days ago | discuss

RECENT COMMENTS

Note that the problem with
by Vadim Kosoy on Open Problems Regarding Counterfactuals: An Introd... | 0 likes

Typos on page 5: *
by Vadim Kosoy on Open Problems Regarding Counterfactuals: An Introd... | 0 likes

Ah, you're right. So gain
by Abram Demski on Smoking Lesion Steelman | 0 likes

> Do you have ideas for how
by Jessica Taylor on Autopoietic systems and difficulty of AGI alignmen... | 0 likes

I think I understand what
by Wei Dai on Autopoietic systems and difficulty of AGI alignmen... | 0 likes

>You don’t have to solve
by Wei Dai on Autopoietic systems and difficulty of AGI alignmen... | 0 likes

Your confusion is because you
by Vadim Kosoy on Delegative Inverse Reinforcement Learning | 0 likes

My confusion is the
by Tom Everitt on Delegative Inverse Reinforcement Learning | 0 likes

> First of all, it seems to
by Abram Demski on Smoking Lesion Steelman | 0 likes

> figure out what my values
by Vladimir Slepnev on Autopoietic systems and difficulty of AGI alignmen... | 0 likes

I agree that selection bias
by Jessica Taylor on Autopoietic systems and difficulty of AGI alignmen... | 0 likes

>It seems quite plausible
by Wei Dai on Autopoietic systems and difficulty of AGI alignmen... | 0 likes

> defending against this type
by Paul Christiano on Autopoietic systems and difficulty of AGI alignmen... | 0 likes

2. I think that we can avoid
by Paul Christiano on Autopoietic systems and difficulty of AGI alignmen... | 0 likes

I hope you stay engaged with
by Wei Dai on Autopoietic systems and difficulty of AGI alignmen... | 0 likes
