by Jack Gallagher 203 days ago

When considering an embedder $F$, in universe $U$, in response to which SADT picks policy $\pi$, I would be tempted to apply the following coherence condition:

$$E[F(\pi)] = E[F(DDT)] = E[U]$$

(all approximately, of course). I’m not sure if this would work, though. This is definitely a necessary condition for reasonable counterfactuals, but not obviously sufficient. A potentially useful augmentation is to use the expected absolute difference:

$$E[|F(\pi) - F(DDT)|] = E[|F(DDT) - U|] = 0$$
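To see why the second condition is stronger than the first, here is a minimal sketch on a made-up two-world toy. The worlds, probabilities, and payoff tables `X` and `Y` are all hypothetical stand-ins (for, say, $F(\pi)$ and $U$) and are not taken from the comment above. The two quantities agree in expectation, yet disagree in every world, so the expected absolute difference is far from zero.

```python
from fractions import Fraction


def expectation(outcomes, f):
    """Expected value of f over a finite list of (probability, world) pairs."""
    return sum(p * f(w) for p, w in outcomes)


# Hypothetical toy distribution: two worlds, each with probability 1/2.
worlds = [(Fraction(1, 2), "a"), (Fraction(1, 2), "b")]

# Hypothetical payoffs; X and Y play the roles of two quantities being compared.
X = {"a": 0, "b": 1}
Y = {"a": 1, "b": 0}

mean_x = expectation(worlds, lambda w: X[w])
mean_y = expectation(worlds, lambda w: Y[w])
abs_gap = expectation(worlds, lambda w: abs(X[w] - Y[w]))

print(mean_x == mean_y)  # the expectations match
print(abs_gap)           # but the expected absolute difference does not vanish
```

Matching expectations lets errors of opposite sign cancel across worlds; requiring $E[|X - Y|] = 0$ rules that out, since it forces $X = Y$ with probability 1.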

