Over at Medium, I’m continuing to write about AI control; here’s a roundup from the last month.
Many of these seem like interesting things to discuss here; would it be better to post each of these as a link when I write it?
- Prosaic AI control argues that AI control research should first consider the case where AI involves no “unknown unknowns.”
- Handling destructive technology tries to explain the upside of AI control, if we live in a universe where we eventually need to build a singleton anyway.
- Hard-core subproblems explains a concept I find helpful for organizing research.
Building blocks of ALBA
Terminology and concepts