AI ALIGNMENT FORUM

Moderation Log

Comment Author	Post	Deleted By User	Deleted Date	Deleted Public	Reason
Lawrence Chan	How We Picture Bayesian Agents	LawrenceC	18d	`false`	Whoops, Gwern already mentioned this work, my bad.
Luke H Miles	LLMs for Alignment Research: a safety priority?	lukehmiles	1mo	`false`
leogao	Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small	leogao	3mo	`false`
Clément Dumas	Discussion: Challenges with Unsupervised LLM Knowledge Discovery	Clément Dumas	5mo	`true`	Sorry I didn't understand you were confused because of the visualization
Daniel Kokotajlo	Evaluating the historical value misspecification argument	Daniel Kokotajlo	5mo	`true`	Accidental duplicate
Daniel Kokotajlo	Evaluating the historical value misspecification argument	Daniel Kokotajlo	5mo	`true`	Accidental duplicate
Ben Pace	TurnTrout's shortform feed	Ben Pace	6mo	`false`
Ben Pace	TurnTrout's shortform feed	Ben Pace	6mo	`false`
Fabien Roger	Coup probes: Catching catastrophes with probes trained off-policy	Fabien Roger	6mo	`false`
Fabien Roger	Preventing Language Models from hiding their reasoning	Fabien Roger	6mo	`true`

Author	Post	Banned Users
michaelcohen	Asymptotically Unambitious AGI	GPT2

ID	Banned From Frontpage	Banned From Personal Posts
quila		JBlack
[deactivated]		Shankar Sivarajan, Jim Babcock
Noosphere89		Phil Tanny, Eliezer Yudkowsky, shminux
rank-biserial		Ruben Bloom
mike_hawke		Viliam, PatrickDFarley, Stuart Anderson, Ericf, Liav Koren, [DEACTIVATED] Duncan Sabien
frontier64		Kaj Sotala
Thomas Kwa	MadHatter, bharathk98	Zack M. Davis, Said Achmiz
lsusr	Christian Kleineidam, shminux, Jason Maguire	Christian Kleineidam, shminux, Jason Maguire
DirectedEvolution	Christian Kleineidam, Said Achmiz, TAG	Christian Kleineidam, Said Achmiz, TAG
TurnTrout	Repetitive Experimenter, Ofer