Jessica Taylor and Chris Olah have a post on “Maximizing a quantity while ignoring effect through some channel”. I’ll briefly present a different way of doing this, and compare the two approaches.
Essentially, the AI’s utility is given by a function \(U\) of a variable \(C\). The AI’s actions are a random variable \(A\), but we want to ‘factor out’ another random variable \(B\).
If we have a probability distribution \(Q\) over actions, then, given background evidence \(E\), the standard way to maximise \(U(C)\) would be to maximise:
 \(\sum_{a,b,c} U(c)P(C=c,B=b,A=a\mid e) \\ = \sum_{a,b,c} U(c)P(C=c\mid B=b,A=a,e)P(B=b\mid A=a,e)Q(A=a\mid e)\).
The most obvious idea, for me, is to replace \(P(B=b\mid A=a,e)\) with \(P(B=b\mid e)\), making \(B\) artificially independent of \(A\) and giving the expression:
 \(\sum_{a,b,c} U(c)P(C=c\mid B=b,A=a,e)P(B=b\mid e)Q(A=a\mid e)\).
If \(B\) is dependent on \(A\) (if it isn’t, then factoring it out is not interesting), then \(P(B=b\mid e)\) needs some implicit probability distribution over \(A\), one that is independent of \(Q\). So, in essence, this approach relies on two distributions over the possible actions: one that the agent is optimising, and one that is left unoptimised. In terms of Bayes nets, this just seems to be cutting the link from \(A\) to \(B\).
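As a concrete illustration of the cut-link idea, here is a minimal sketch on a toy model of my own (binary \(A\), \(B\), \(C\); not an example from either post): the standard expected utility lets \(B\) track \(A\), while the factored version computes \(P(B=b)\) from a separate implicit distribution over actions.

```python
import itertools

# Hypothetical toy model: A, B, C each take values 0 or 1.
# B tends to copy A; C is more likely to be 1 when A and B agree.
actions = [0, 1]

def p_b_given_a(b, a):
    return 0.9 if b == a else 0.1        # B copies A with probability 0.9

def p_c_given_ab(c, a, b):
    p1 = 0.8 if a == b else 0.2          # chance that C = 1
    return p1 if c == 1 else 1.0 - p1

def U(c):
    return float(c)                      # utility is just the value of C

def standard_eu(Q):
    # sum over a, b, c of U(c) P(c|a,b) P(b|a) Q(a)
    return sum(U(c) * p_c_given_ab(c, a, b) * p_b_given_a(b, a) * Q[a]
               for a, b, c in itertools.product(actions, actions, [0, 1]))

def cut_link_eu(Q, P0):
    # B is made artificially independent of A: its marginal is computed
    # from the implicit, unoptimised distribution P0 over actions.
    p_b = {b: sum(p_b_given_a(b, a) * P0[a] for a in actions) for b in actions}
    return sum(U(c) * p_c_given_ab(c, a, b) * p_b[b] * Q[a]
               for a, b, c in itertools.product(actions, actions, [0, 1]))

Q = {0: 0.0, 1: 1.0}      # the distribution the agent optimises
P0 = {0: 0.5, 1: 0.5}     # the implicit, unoptimised distribution
print(standard_eu(Q))     # higher: B correlates with A, so agreement is likely
print(cut_link_eu(Q, P0)) # lower: the A -> B correlation has been severed
```

The gap between the two numbers is exactly the utility the agent was getting “through” the \(A \to B\) channel, which is what the factoring is meant to remove.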
Jessica and Chris’s approach also relies on two distributions. But, as far as I understand their approach, the two distributions are taken to be the same; instead, it is assumed that \(U(C)\) cannot be improved by changes to the distribution of \(A\) if one keeps the distribution of \(B\) constant. This has the feel of a differential condition: the infinitesimal impact on \(U(C)\) of changes to \(A\), but not \(B\), is non-positive.
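A rough numerical rendering of my reading of that condition (my own toy construction, not code from their post): freeze the marginal of \(B\) at its value under the candidate distribution \(q\), then check that no reweighting of \(A\) does better against that frozen marginal.

```python
# Hedged sketch: a coordination-style toy on {0, 1}, with utility
# depending directly on A and B (no separate C, for simplicity).
def p_b_given_a(b, a):
    return 0.9 if b == a else 0.1        # B tends to copy A

def u(a, b):
    return 1.0 if a == b else 0.0        # reward agreement

def eu_frozen_b(q_a, p_b):
    # expected utility when A ~ q_a but B's marginal is frozen at p_b
    return sum(q_a[a] * p_b[b] * u(a, b) for a in (0, 1) for b in (0, 1))

q = {0: 0.0, 1: 1.0}                     # candidate: pure action A = 1
p_b = {b: sum(p_b_given_a(b, a) * q[a] for a in (0, 1)) for b in (0, 1)}
current = eu_frozen_b(q, p_b)
# expected utility is linear in q_a, so checking the pure extremes suffices
best_alt = max(eu_frozen_b({0: 1.0 - t, 1: t}, p_b) for t in (0.0, 1.0))
print(current >= best_alt - 1e-9)        # True: q passes the frozen-B check
```

Here the pure distribution at \(A=1\) satisfies the condition, matching the claim below that all pure distributions come out optimal under the differential approach.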
I suspect my version might have some odd behaviour (the implicit distribution over \(A\) does not seem to have a natural canonical choice), but I’m not sure of the consistency properties of the differential approach.
A very dull coordination game
Suppose that \(A\) is an integer in the range \(1\) to \(100\). \(B\) is then simply set to the value of \(A\), and the utility is equal to \(A\) if \(A=B\) and \(0\) otherwise.
For the differential approach, all pure distributions \(P(A=a)=1\) are optimal. For the approach presented here, it depends on the choice of implicit distribution over \(A\) (and hence over \(B\)). If the distribution is uniform, then \(A=100\) is the outcome. If there is a default \(A=a\), then \(A=a\) is the outcome. A third possibility is that the agent gets to select both the implicit distribution and the true distribution; in that case \(A=100\) is the outcome.
Mixing in some anticoordination gets different results (though I’m still not clear if, for the differential method, \(d=d'\) means \(A=A'\) or simply that they have the same distribution).
