Well red-black trees, AVL trees, and other standard balanced binary search trees are

So, can we do better?

Definition. Splay Tree: a “self-balancing” binary search tree. In other words, it achieves \(O(\log n)\) ammortized update / find operations. and it’s super simple.

find(v):
  search the tree for v
  splay(v)

ok so what’s a splay operation? It’s a sequence of double rotations, termed zig-zag, zag-zig, zig-zig, zag-zag. In particular, these operations are performed a path from the splayed node to the root, and result in the splayed node becoming the root (or at least becoming really close to the root, e.g. distance \(1\) away from the root).

So why splay? Well, the key feature of any lazy data structure is that expensive operations in your data structure need to clean up the data structure. So for example if there is a find operation that takes a really long time, then the found node is moved to the top of the tree, causing subsequent searches on that node to be less expensive. This serves to shorten paths that are too long.

Some more intuition for splay trees: if a nodes subtrees are unbalanced in size, e.g. one subtree has more than \(2/3\) the weight of the full tree then that’s an indication that we are not super balanced. So, we want to make “huge subtrees” a high potential state, and then we can extract potential from them to fix them. We can fix a huge subtree by rotating it to be higher in the tree.

Definition. The rank of a node \(x\) is \(r(x) = \log\) size of subtree rooted at \(x\).

Lemma. The amortized cost to splay node \(x\) to the root \(t\) is \(3 (r(t) -r(x)) + 1\)

Proof. just analyze the potential change from each double-rotation operation. It’s not too complicated.

Corollary. We get amortized \(\log n\) operations for the splay tree.

Remark. OK. Splay Trees are actually super weird though. There is this thing called the Dynamic Optimality conjecture, which postulates that splay trees are just literally the best at everything.

Some data supporting the conjecture includes:

  • if you had some access frequencies splay trees achieve (up to constants) entropy for average search time.
  • static finger theorem: if you search for stuff and you are measuring how far you have to move to get to stuff
  • cache optimality

credit

Splay Trees are were invented by Sleator and Tarjan My understanding of Splay Trees is thanks to Professor Karger