Compression
WORK IN PROGRESS
Its important
makes sending stuff fast
Entropy
Definition. Entropy is \[H = \sum_i p_i \log p_i\]
Remark. You can’t beat the entropy in encryption. Probably Shannon proved this.
Remark. No compression scheme works all the time. But who cares.
Huffman coding
Cool compression technique. Almost achieves the entropy. Assign short symbols to more common letters in your alphabet.
Python implementation:
class Node():
"""this is a node in a tree"""
def __init__(self, val, count):
self.val = val # e.g. "a"
self.count = count # e.g. 300
self.left = None # will be a Node
self.right = None
def printTree(self, prefix=""):
print(f"val: {self.val}, count: {self.count}, prefix: {prefix}")
if self.left:
self.left.printTree(prefix+"0")
if self.right:
self.right.printTree(prefix+"1")
= ["a", "b", "c", "d", "e"]
alphabet = [100,200,300,5000,1]
counts
= [Node(alphabet[i], counts[i]) for i in range(len(alphabet))]
S
while len(S) > 1:
=lambda x: x.count)
S.sort(key= S[0], S[1]
x, y = S[2:]
S # merge the 2 least counts nodes
= Node("dumb internal node", x.count+y.count)
mergeNode = x
mergeNode.left = y
mergeNode.right
S.append(mergeNode)
0].printTree() S[