|19.||In memoryless Cartesian environments, every UDT policy is a CDT+SIA policy|
| post by Jessica Taylor 834 days ago | Vadim Kosoy and Abram Demski like this | 3 comments|
Summary: I define memoryless Cartesian environments (which can model many familiar decision problems), note the similarity to memoryless POMDPs, and define a local optimality condition for policies, which can be roughly stated as “the policy is consistent with maximizing expected utility using CDT and subjective probabilities derived from SIA”. I show that this local optimality condition is necessary but not sufficient for global optimality (UDT).
|22.||Lagrangian duality for constraints on expectations|
| post by Jessica Taylor 872 days ago | Patrick LaVictoire likes this | discuss|
Summary: It’s possible to set up a zero-sum game between two agents so that, in any Nash equilibrium, one agent picks a policy to optimize a particular objective subject to some constraints on expected features of the state resulting from the policy. This seems potentially useful for getting an approximate agent to maximize some objective subject to constraints.