Intelligent Agent Foundations Forum
Where's the first benign agent?
link by Jacob Kopczynski 43 days ago | Patrick LaVictoire and Paul Christiano like this | 15 comments
Neural nets designing neural nets
link by Stuart Armstrong 130 days ago | Vadim Kosoy likes this | discuss
The universal prior is malign
link by Paul Christiano 179 days ago | Ryan Carey, Vadim Kosoy, Jessica Taylor and Patrick LaVictoire like this | 4 comments
(Non-)Interruptibility of Sarsa(λ) and Q-Learning
link by Richard Möhn 193 days ago | Jessica Taylor and Patrick LaVictoire like this | 5 comments
Asymptotic Decision Theory
link by Jack Gallagher 226 days ago | Abram Demski, Jessica Taylor, Patrick LaVictoire, Paul Christiano and Tsvi Benson-Tilsen like this | 2 comments
Variations of the Garrabrant-inductor
link by Sune Kristian Jakobsen 247 days ago | Sam Eisenstat, Abram Demski, Jessica Taylor, Nate Soares and Scott Garrabrant like this | 1 comment
Two Agent Mild Optimization
link by Norman Perlmutter 300 days ago | Abram Demski and Jessica Taylor like this | discuss
A Layman's Explanation of "Safely Interruptible Agents"
link by Zach Weems 301 days ago | Jessica Taylor and Patrick LaVictoire like this | discuss
Improbable Oversight, An Attempt at Informed Oversight
link by William Saunders 309 days ago | Jessica Taylor and Patrick LaVictoire like this | 8 comments
A new proposal for logical counterfactuals
link by Jack Gallagher 325 days ago | Jessica Taylor, Patrick LaVictoire and Scott Garrabrant like this | 3 comments
An Alternative Setting for Resource-Bounded Löb's Theorem
link by Siddharth Bhaskar 326 days ago | Patrick LaVictoire and Scott Garrabrant like this | discuss
Working on a series of safety environments for OpenAI gym. Would love comments and ideas.
link by Rafael Cosman 351 days ago | Daniel Dewey, Jessica Taylor, Patrick LaVictoire and Tsvi Benson-Tilsen like this | discuss
every function can be computable
link by Ramana Kumar 385 days ago | Patrick LaVictoire likes this | discuss
Goal completion prior art: feature construction
link by Stuart Armstrong 411 days ago | discuss
An approach to the Agent Simulates Predictor problem
link by Alex Mennen 414 days ago | Vadim Kosoy, Abram Demski, Gary Drescher, Jessica Taylor and Patrick LaVictoire like this | 11 comments
Analysis of Algorithms and Partial Algorithms
link by Andrew MacFie 480 days ago | Patrick LaVictoire and Scott Garrabrant like this | 3 comments
Another toy model of the control problem
link by Paul Christiano 485 days ago | Jessica Taylor likes this | discuss
My current take on logical uncertainty
link by Paul Christiano 485 days ago | Jessica Taylor and Patrick LaVictoire like this | discuss
Some work on connecting UDT and Reinforcement Learning
link by David Krueger 528 days ago | Patrick LaVictoire and Paul Christiano like this | 5 comments
Sequential Extensions of Causal and Evidential Decision Theory
link by Tom Everitt 591 days ago | Kaya Stechly and Patrick LaVictoire like this | discuss
What's logical coherence for anyway?
link by Pedro Carvalho 603 days ago | Jessica Taylor and Patrick LaVictoire like this | discuss
Probabilities Small Enough To Ignore: An attack on Pascal's Mugging
link by Kaj Sotala 620 days ago | discuss
Provability Counterfactuals vs Three Axioms of Galles and Pearl
link by Evan Lloyd 638 days ago | Sam Eisenstat, Nate Soares, Patrick LaVictoire and Scott Garrabrant like this | discuss
Attempting to refine "maximization" with 3 new -izers
link by Pasha Kamyshev 656 days ago | Kaya Stechly and Patrick LaVictoire like this | 1 comment
Relating Modal Polymorphism to PA with soundness
link by Siddharth Bhaskar 666 days ago | Abram Demski, Nate Soares, Patrick LaVictoire and Scott Garrabrant like this | 2 comments
