• I’ve historically been too critical of Anthropic specifically, and this was a mistake.

There are several things I mean by this. First of all:

If you’re an AI capabilities worker and you can’t be persuaded to work on AI safety or to do something non-AI-related, then please go work on capabilities at Anthropic. DeepMind is my second pick.

(Why? One reason is that Anthropic and DeepMind at least have safety teams --- afaik OpenAI doesn’t currently. They also have a better safety culture, do expensive signalling to indicate that they care about safety, and employ a large number of safety-conscious people.)

  • In the past I’ve criticized Anthropic along the following lines:

Dario puts Pr(AI causes ~human extinction) at around 25%, iirc. He’s working to build ASI. That’s not good.

But, I’m actually very glad that Dario is transparent about this belief.

Put another way:

true beliefs + coherent actions > true beliefs + incoherent actions > false beliefs

Sometimes becoming coherent takes time. Maybe you have to move from false beliefs to true beliefs first, and only then gradually adopt the right actions.

Punishing people for having true beliefs + incoherent actions puts pressure on them to fall back into false beliefs.

I should apply this principle to myself too :) --- many of my actions are incoherent, and I’d like to slowly make my actions coherent rather than make my beliefs false.