In this post I’ll discuss two things I’ve been doing incorrectly with respect to mitigating xrisk, in the hope that pointing them out and thinking about them will help me update towards not making these mistakes.
Mistake 0 — not talking about this more.
It’d be great to have more smart, passionate people who care about making the universe a good place aware of this issue and trying to do stuff to improve our odds. I can do more here. Being more vocal about this is good in part because it gives others “permission” to feel less crazy about taking this seriously: it’s not just Nobel prize winners and the CEOs of openai / anthropic that take xrisk seriously; even the author of skyspace3 thinks this is not a good situation!
See good arguments for suggestions on how to discuss this issue. To be clear, this usually isn’t an argument; it’s usually something the other party has never thought about before. But the techniques still transfer, because people have a natural skepticism to work through.
Mistake 1 — optimizing for “feeling helpful” rather than solving the problem.
To be fair, this is a pretty easy Goodhart’s law trap to fall into: optimizing for feeling personally helpful is a reasonable proxy if what I really care about is that humanity survives. But letting go of this notion opens me up to other strategies which might be more effective.
And working on alignment only because I think it’s personally morally imperative that I do so just has kind of bad vibes, especially when it turns into a double standard.