

















 19.  In memoryless Cartesian environments, every UDT policy is a CDT+SIA policy  
 Summary: I define memoryless Cartesian environments (which can model many familiar decision problems), note the similarity to memoryless POMDPs, and define a local optimality condition for policies, which can be roughly stated as “the policy is consistent with maximizing expected utility using CDT and subjective probabilities derived from SIA”. I show that this local optimality condition is necessary but not sufficient for global optimality (UDT).
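
 As a rough illustration of the local optimality condition, the sketch below uses the absent-minded driver problem, which is an assumed example and is not named in the summary. The payoffs (0 for exiting at the first intersection, 4 for exiting at the second, 1 for continuing past both) and all function names are chosen here for illustration. It checks that at the globally optimal policy, both actions have the same expected utility under CDT with SIA-derived probabilities, so the globally optimal mixed policy is consistent with CDT+SIA.

```python
def global_utility(p):
    """Planning-stage expected payoff of 'continue with probability p'."""
    # Exit at X pays 0; continue at X then exit at Y pays 4; continue twice pays 1.
    return 4 * p * (1 - p) + 1 * p * p


def sia_probabilities(p):
    """SIA weights each node by its expected number of visits under the policy."""
    # X is always visited once; Y is visited with probability p.
    total = 1 + p
    return {"X": 1 / total, "Y": p / total}


def cdt_sia_values(p):
    """CDT value of each action now, holding the policy at other visits fixed at p."""
    prob = sia_probabilities(p)
    # If we are at X and continue, the later decision at Y still follows p.
    value_after_continuing_at_x = (1 - p) * 4 + p * 1
    exit_value = prob["X"] * 0 + prob["Y"] * 4
    continue_value = prob["X"] * value_after_continuing_at_x + prob["Y"] * 1
    return exit_value, continue_value


# Coarse grid search for the globally optimal continuation probability.
grid = [i / 1000 for i in range(1001)]
best_p = max(grid, key=global_utility)

exit_value, continue_value = cdt_sia_values(2 / 3)
print(best_p, exit_value, continue_value)
```

 This only illustrates the "necessary" direction on one small problem; it does not show the failure of sufficiency, which requires an environment with a locally optimal policy that is not globally optimal.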
 



 22.  Lagrangian duality for constraints on expectations  
 Summary: It’s possible to set up a zero-sum game between two agents so that, in any Nash equilibrium, one agent picks a policy to optimize a particular objective subject to some constraints on expected features of the state resulting from the policy. This seems potentially useful for getting an approximate agent to maximize some objective subject to constraints.
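
 One way such a game could be written down is sketched below. The notation is assumed here rather than taken from the post: $\pi$ is the first agent's policy, $s$ the resulting state, $f$ the objective, $g_i$ the constrained features (with the constraints taken, for illustration, to be inequalities $\mathbb{E}_\pi[g_i(s)] \ge 0$), and $\lambda$ the second agent's choice of multipliers.

```latex
\begin{align*}
  &\text{Constrained problem:} &
  \max_{\pi}\ & \mathbb{E}_{\pi}[f(s)]
  \quad \text{s.t.} \quad \mathbb{E}_{\pi}[g_i(s)] \ge 0,\ \ i = 1,\dots,k \\
  &\text{Zero-sum payoff:} &
  L(\pi, \lambda) &= \mathbb{E}_{\pi}[f(s)] + \sum_{i=1}^{k} \lambda_i\, \mathbb{E}_{\pi}[g_i(s)],
  \qquad \lambda_i \ge 0 \\
  &\text{Equilibrium } (\pi^*, \lambda^*): &
  L(\pi, \lambda^*) &\le L(\pi^*, \lambda^*) \le L(\pi^*, \lambda)
  \quad \text{for all } \pi,\ \lambda \ge 0 \\
  &\text{Feasibility:} &
  \mathbb{E}_{\pi^*}[g_i(s)] &\ge 0
  \quad \text{(otherwise increasing } \lambda_i \text{ lowers } L \text{ without bound)} \\
  &\text{Complementary slackness:} &
  \lambda_i^*\, \mathbb{E}_{\pi^*}[g_i(s)] &= 0 \\
  &\text{Optimality, for feasible } \pi: &
  \mathbb{E}_{\pi}[f(s)] &\le L(\pi, \lambda^*) \le L(\pi^*, \lambda^*) = \mathbb{E}_{\pi^*}[f(s)]
\end{align*}
```

 Under these assumptions the first agent maximizes $L$ and the second minimizes it, and the last line shows that any equilibrium policy $\pi^*$ is feasible and at least as good as every other feasible policy.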
 





