In this post I once again remind myself what EM is. It seems like a really cool idea, but it hasn’t totally stuck yet.

A simple example:

  • Suppose we have some labelled data $(x_i, y_i)$, where $x_i \in \mathbb{R}^d$ is a feature vector and $y_i \in \{0, 1\}$ is a class.
  • We might try logistic regression.
  • This means finding the $\theta$ which minimizes the following expression: $\sum_i -\log p_\theta(y_i \mid x_i)$, where $p_\theta(y = 1 \mid x) = \sigma(\theta^\top x)$.
  • Unfortunately no single such model fits the data.
  • There’s a twist: the actual model that the data was generated from is as follows (see the sketch after this list):
    • There is a hidden quantity $z_i \in \{1, \dots, K\}$.
    • There are vectors $\theta_1, \dots, \theta_K$.
    • The data is actually generated by first choosing $z_i$ (say with probabilities $\pi_1, \dots, \pi_K$),
    • and then, based on $z_i$, choosing $y_i$ as Bernoulli
    • with parameter $\sigma(\theta_{z_i}^\top x_i)$.
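Here is a minimal numpy sketch of that generative process. The component count, mixing weights, and parameter values are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


# Made-up example: K = 2 hidden components, d = 3 features.
n, d, K = 1000, 3, 2
pi = np.array([0.5, 0.5])              # mixing probabilities for z
thetas = rng.normal(size=(K, d))       # one logistic-regression vector per component

X = rng.normal(size=(n, d))            # feature vectors x_i
z = rng.choice(K, size=n, p=pi)        # hidden component z_i
p = sigmoid(np.einsum("nd,nd->n", X, thetas[z]))  # sigma(theta_{z_i}^T x_i)
y = rng.binomial(1, p)                 # observed class y_i

# We only get to see (X, y); z stays hidden.
```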

Hmm. Now how could we fit this? We'll use the EM algorithm, which is great for this kind of hidden-variable setup.

Define the pseudo log likelihood, for any choice of distributions $q_i$ over the hidden $z$:

$$\tilde\ell(\theta, q) = \sum_i \sum_z q_i(z) \log \frac{p_\theta(y_i, z \mid x_i)}{q_i(z)}.$$

Rewriting this a bit:

$$\tilde\ell(\theta, q) = \sum_i \log p_\theta(y_i \mid x_i) - \sum_i \mathrm{KL}\big(q_i(z) \,\|\, p_\theta(z \mid x_i, y_i)\big).$$

In other words: pseudo log likelihood = actual log likelihood minus the KL divergence from each $q_i$ to the true posterior over $z_i$.

Since KL divergence is nonnegative, the pseudo log likelihood is $\leq$ the actual log likelihood, with equality when $q_i$ is actually the correct posterior $p_\theta(z \mid x_i, y_i)$.
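A quick numerical check of that bound, with toy numbers of my own (one data point, two hidden components):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# One data point, K = 2 components, made-up parameters.
x = np.array([1.0, -0.5, 2.0])
y = 1
pi = np.array([0.3, 0.7])
thetas = np.array([[0.5, 1.0, -1.0],
                   [-2.0, 0.1, 0.8]])

p_y1 = sigmoid(thetas @ x)                       # P(y=1 | x, z) per component
joint = pi * np.where(y == 1, p_y1, 1 - p_y1)    # P(y, z | x) per component
log_lik = np.log(joint.sum())                    # actual log likelihood, log P(y | x)
posterior = joint / joint.sum()                  # true posterior P(z | x, y)

def pseudo_log_lik(q):
    return np.sum(q * (np.log(joint) - np.log(q)))

print(pseudo_log_lik(np.array([0.5, 0.5])) <= log_lik)    # True: it's a lower bound
print(np.isclose(pseudo_log_lik(posterior), log_lik))     # True: tight at the posterior
```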

Another perspective: the same bound drops out of Jensen's inequality, since $\log$ is concave:

$$\log p_\theta(y_i \mid x_i) = \log \sum_z q_i(z) \frac{p_\theta(y_i, z \mid x_i)}{q_i(z)} \geq \sum_z q_i(z) \log \frac{p_\theta(y_i, z \mid x_i)}{q_i(z)}.$$

Why EM works: each iteration can't decrease the actual log likelihood. At the start of an iteration we set each $q_i$ to the current posterior, so the pseudo log likelihood touches the actual log likelihood at the current $\theta$. The M-step then pushes the pseudo log likelihood up, and since it is a lower bound everywhere, the actual log likelihood at the new $\theta$ can only go up too.
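Spelled out (writing $\ell(\theta) = \sum_i \log p_\theta(y_i \mid x_i)$ for the actual log likelihood, with $q$ fixed to the E-step posterior at $\theta^{\text{old}}$):

$$\ell(\theta^{\text{new}}) \;\geq\; \tilde\ell(\theta^{\text{new}}, q) \;\geq\; \tilde\ell(\theta^{\text{old}}, q) \;=\; \ell(\theta^{\text{old}}),$$

where the first inequality is the lower bound, the second holds because the M-step maximizes $\tilde\ell(\cdot, q)$ over $\theta$, and the final equality is the tightness of the bound at the posterior.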
For the M-step, we can take the pseudo log likelihood and drop the entropy term $-\sum_i \sum_z q_i(z) \log q_i(z)$, a constant that doesn't matter for the optimization over $\theta$. What's left is $Q(\theta) = \sum_i \sum_z q_i(z) \log p_\theta(y_i, z \mid x_i)$.

The algorithm is now:

  • E-step: take $q_i(z) = p_{\theta^{\text{old}}}(z \mid x_i, y_i)$, the posterior over the hidden $z_i$ under the current parameters.
  • M-step: take $\theta^{\text{new}} = \arg\max_\theta Q(\theta) = \arg\max_\theta \sum_i \sum_z q_i(z) \log p_\theta(y_i, z \mid x_i)$.
  • Repeat until the parameters (or the log likelihood) stop changing.

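Here is a rough numpy sketch of that loop for the hidden-class logistic regression model above. It's my own illustration: the M-step just takes a few gradient steps on the weighted logistic regression objective rather than solving it exactly, and there are no convergence checks or numerical safeguards.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def em_mixture_logreg(X, y, K=2, n_iters=50, m_steps=100, lr=0.5, seed=0):
    """EM for a mixture of K logistic regressions (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    thetas = rng.normal(scale=0.1, size=(K, d))
    pi = np.full(K, 1.0 / K)

    for _ in range(n_iters):
        # E-step: responsibilities q_i(z) = p(z | x_i, y_i) under the current parameters.
        p_y1 = sigmoid(X @ thetas.T)                       # (n, K): P(y=1 | x_i, z=k)
        lik = np.where(y[:, None] == 1, p_y1, 1 - p_y1)    # (n, K): P(y_i | x_i, z=k)
        joint = pi[None, :] * lik                          # (n, K): P(y_i, z=k | x_i)
        q = joint / joint.sum(axis=1, keepdims=True)       # (n, K): posterior over z_i

        # M-step: maximize sum_i sum_z q_i(z) log p_theta(y_i, z | x_i).
        pi = q.mean(axis=0)                                # closed form for the mixing weights
        for _ in range(m_steps):                           # gradient ascent on each theta_k
            p_y1 = sigmoid(X @ thetas.T)
            grad = ((y[:, None] - p_y1) * q).T @ X / n     # (K, d) weighted logistic gradient
            thetas += lr * grad

    return pi, thetas, q
```

With the synthetic `X, y` generated earlier this can be run as `pi_hat, thetas_hat, q_hat = em_mixture_logreg(X, y, K=2)`; as usual with EM there's no guarantee of reaching the global optimum.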

Gaussian mixture model:

This is the classic EM example. Each point is generated by choosing a hidden component $z_i$ with probabilities $\pi_1, \dots, \pi_K$ and then drawing $x_i \sim \mathcal{N}(\mu_{z_i}, \Sigma_{z_i})$, so the density is

$$p(x) = \sum_k \pi_k \, \mathcal{N}(x; \mu_k, \Sigma_k).$$

The same E-step/M-step recipe applies, and here the M-step has a closed form: weighted means, weighted covariances, and the average responsibilities as mixing weights.
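And a rough numpy/scipy sketch of EM for a GMM, again my own illustration with no numerical safeguards (no covariance regularization, no log-space computations):

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K=2, n_iters=100, seed=0):
    """EM for a Gaussian mixture model (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, size=K, replace=False)]     # init means at random data points
    sigma = np.stack([np.eye(d) for _ in range(K)])  # init covariances at the identity
    pi = np.full(K, 1.0 / K)

    for _ in range(n_iters):
        # E-step: responsibilities q_i(k) = p(z_i = k | x_i).
        dens = np.stack([multivariate_normal.pdf(X, mu[k], sigma[k]) for k in range(K)], axis=1)
        joint = pi[None, :] * dens                   # (n, K): pi_k * N(x_i; mu_k, sigma_k)
        q = joint / joint.sum(axis=1, keepdims=True)

        # M-step: closed-form updates (weighted means, covariances, mixing weights).
        nk = q.sum(axis=0)                           # effective number of points per component
        pi = nk / n
        mu = (q.T @ X) / nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            sigma[k] = (q[:, k, None] * diff).T @ diff / nk[k]

    return pi, mu, sigma, q
```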

OK, I don't really have time to read the rest right now; I got to page 10/14.